Potential research projects!

This is where we post ideas for research projects that are outside the main work of the lab. These could be suitable for undergraduate students to do as independent studies, or MS projects, or just for fun! Some of them may be a little off the wall. If you see something that strikes you as interesting and you would like to find out more, get in touch!


Suitable for students from CS or a related discipline. You bring the skills, we provide the cool applied problems!

Dynamic fish tracking and identification

Given a video of fish swimming in an aquarium, develop a system to both track and identify the individual fish in real time, creating a dynamic, live overlay of labels on the video. This would mainly involve putting together a number of existing technologies, since good (mainly neural-network-based) systems already exist separately for object tracking and image identification. Suitable for a student with experience in machine learning, especially CNNs. Potential commercial applications.
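As a concrete illustration of the "putting together" part: the glue between an off-the-shelf detector and a live overlay is the association step that matches each frame's detections to existing tracks. Here is a minimal sketch in Python of greedy IoU-based matching; the detector, ID classifier, and box format ((x1, y1, x2, y2) tuples keyed by track ID) are assumptions for illustration, not a fixed design.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_detections(tracks, detections, threshold=0.3):
    """Greedily match tracks to new detections by best overlap.

    tracks: {track_id: last_known_box}; detections: list of boxes.
    Returns {track_id: detection_index}; unmatched detections would
    start new tracks, unmatched tracks would be coasted or dropped.
    """
    pairs = sorted(
        ((iou(box, det), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True)
    assigned, used = {}, set()
    for score, tid, di in pairs:
        if score < threshold:
            break  # remaining pairs overlap too little to be the same fish
        if tid not in assigned and di not in used:
            assigned[tid] = di
            used.add(di)
    return assigned
```

A real system would likely replace this greedy pass with Hungarian assignment and add appearance features from the ID network, but the data flow is the same.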

Machine learning for ranking based on conditional choices

Machine learning techniques, including deep neural networks, have proved themselves excellent at a variety of ‘classic’ tasks such as numerical prediction (regression) and classification; presumably this extends to logistic regression, in which the probability of a discrete event is modeled from numerical inputs. But my own work (and ecology more broadly) often involves conditional choices. Think of it as trying to learn relative rankings from observations of choices made between only a subset of the possible options.

To give a toy example: suppose we want to know the relative popularity of pets from the list {snake, rabbit, cat, dog}, but our data come from a survey in which each person is presented with only two animals to choose between. On average, people presented with the choice of {snake, cat} tend to choose cat, but people presented with {cat, dog} tend to choose dog: the cat choice is conditional on the other options. My question is: is there (or could there be) a neural network architecture that addresses this problem, i.e., comes up with preference scores for all the animals based on a dataset of subset choices?

If there is, then I would like to apply it using array input, i.e., like the pet example but with the images themselves rather than categories as inputs, so that the modeled preferences would be based directly on learned features of the image rather than a label like ‘cat’. The overall application would be to learn how animals choose between aspects of their environment represented as multidimensional arrays. For example: suppose an animal looks in two different directions; can we predict which way it will move based on what it ‘sees’ in each direction (some kind of array or image)?

This is what the standard technique of conditional logistic regression does; the generalization would be to replace the one-dimensional vector of variables associated with each choice with multidimensional features learned by a CNN front end.
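To make the conditional-logit idea concrete, here is a toy sketch in pure Python: the probability of choosing option i from a presented subset S is modeled as exp(s_i) / Σ_{j∈S} exp(s_j), and the scalar scores are fit by gradient ascent on the likelihood. In the proposed project the scores would be produced by a CNN from image input; here they are free parameters, and the survey counts are invented for illustration.

```python
import math

def fit_scores(options, choices, lr=0.1, epochs=500):
    """Fit one preference score per option from subset-choice data.

    choices: list of (presented_subset, chosen_option) pairs.
    Maximizes the conditional-logit likelihood by gradient ascent.
    """
    s = {o: 0.0 for o in options}
    for _ in range(epochs):
        grad = {o: 0.0 for o in options}
        for subset, chosen in choices:
            z = sum(math.exp(s[o]) for o in subset)
            for o in subset:
                p = math.exp(s[o]) / z  # model probability of choosing o
                grad[o] += (1.0 if o == chosen else 0.0) - p
        for o in options:
            s[o] += lr * grad[o]
    return s

# Made-up pairwise survey: cat usually beats snake, dog usually beats cat.
data = ([({'snake', 'cat'}, 'cat')] * 8 + [({'snake', 'cat'}, 'snake')] * 2
        + [({'cat', 'dog'}, 'dog')] * 7 + [({'cat', 'dog'}, 'cat')] * 3)
scores = fit_scores(['snake', 'cat', 'dog'], data)
```

Even though dog and snake were never presented together, the fitted scores rank dog above cat above snake, which is exactly the transitive inference the project is after.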

Texture mixtures for plant competition

I would like to be able to train a CNN to classify complex textures from images, and then take an image containing a blend of textures and estimate the relative proportion of each kind. The application is this: I want to understand how different plant species spread and invade each other’s territories. Step one would be to plant experimental plots containing single species and teach a system (e.g., a CNN) to classify the species from top-down imagery (say from a drone). The different species would be planted in a checkerboard pattern, so that there are straight borders between all possible pairs of species, and monitored over time (again with aerial imagery). During that time, some species would spread out of their squares into other squares, but not with a hard boundary: more like diffusion, so that border areas would end up with a mix of species. The goal would be to monitor this process at high resolution from the imagery, i.e., to measure the mixing. The long-term goal is to come up with a matrix of spatial competition coefficients that will allow us to predict what will happen given any starting mix of species, by plugging them into a simulation model we have developed. There are a number of potential commercial applications.
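One simple way the mixing measurement could work, once a classifier exists: tile the border-region imagery, obtain a class probability vector per tile from the trained classifier, and estimate area proportions as the mean probability over tiles. A sketch in Python; the species names, probability values, and the `estimate_proportions` helper are all invented for illustration.

```python
def estimate_proportions(prob_vectors, species):
    """Estimate per-species area proportions for a region.

    prob_vectors: one dict {species: probability} per image tile,
    as produced by a (hypothetical) softmax texture classifier.
    The mean tile probability is an estimate of areal cover.
    """
    n = len(prob_vectors)
    return {sp: sum(p[sp] for p in prob_vectors) / n for sp in species}

# Invented classifier outputs for four tiles across a diffuse border:
tiles = [
    {'fescue': 0.9, 'clover': 0.1},
    {'fescue': 0.6, 'clover': 0.4},
    {'fescue': 0.2, 'clover': 0.8},
    {'fescue': 0.1, 'clover': 0.9},
]
mix = estimate_proportions(tiles, ['fescue', 'clover'])
```

For the example above, `mix` comes out to roughly 45% fescue and 55% clover. Repeating this over time for each border strip gives the mixing trajectory from which competition coefficients could be estimated.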

Physical platforms for field imaging

One of our research areas is automated ‘technical’ ID systems: systems for identifying species from images of specific parts, such as bees from their wings or trees from their leaves. As robust as modern deep learning networks are to noise, disambiguating closely related species often depends on very subtle differences, so it is still important to get good, consistent images for training, for testing, and ultimately for deployed systems. This can be a challenge. We suspect that there might be relatively simple hardware accessories that would improve the process dramatically. Here are two cases:

Bee wings

Bees are identified from preserved specimens under a microscope. Obtaining a clear image of a bee’s wing is a challenge: generally the wing is detached from the body and sandwiched under a cover slip, which is fiddly and time-consuming. The resulting images often have glare spots, which are noise for the ID algorithm. So the design challenge is this: is there a specially shaped platform (in microscope terms, a ‘stage’) that would allow a high-quality image of a bee wing (or really any kind of insect wing) to be taken quickly without detachment?


Tree leaves

The leaves of some species have distinct outlines, but most are ‘simple,’ and the key to distinguishing the leaves of different species is the pattern of venation. It is therefore necessary to acquire an image of a leaf that emphasizes this, which can be achieved by oblique lighting (usually of the underside of the leaf). This is easy to achieve in a lab, but we want to be able to do it in the field. So we are thinking about a sort of light box: a leaf would go inside, and an iPhone or similar would sit in a specially designed slot on top. The box would provide consistent oblique lighting (plus a plain background) for a good leaf image, allowing field researchers to capture many images in a short time without having to bring the leaves back to the lab.

Environmental data science and communication

Suitable for students from STS, design, and/or computing backgrounds, probably working in teams.

‘State of NJ’ dashboard

Create a live display of key environmental indicators for the state of NJ, drawn from public records: forest coverage, soil loss, recycling efficiency, energy consumption balance, etc. Requires some ‘investigative journalism’ skills.

Augmented reality for environmental information

Take existing environmental datasets and figure out how to represent normally invisible environmental variables and processes using location-based AR. Example 1: show sea surface temperature anomalies as an overlay on the actual ocean. Example 2: show gas exchange around a tree, for example as visible particles as in the ‘carbon tree’ simulation (from hiilipuu.fi). Requires someone with skills in (or interested in learning) Apple’s ARKit or Google’s ARCore.