Inverted Encoding Models

A Brief Introduction to My Project: 

Our brains are sensitive to features of what we see. Imagine that you are in one of those big lecture halls with lots of projector screens. The projectors are fixed at certain locations, and your professor moves their hands while explaining things. You can detect your professor's motion at each of those projectors. What you see passes through your eyes, and the signals are transmitted to your brain. 

Our ability to detect motion at a certain location rests on the motion and spatial-location selectivity of neurons in our visual cortex. At the cellular level, the evidence comes from single-cell recordings conducted in monkeys. In humans, functional magnetic resonance imaging (fMRI) can reveal the general feature selectivity of our visual cortex, such as the motion-selective regions TO1/TO2. 

Given that the feature selectivity of the brain roughly resembles a Gaussian curve, and that fMRI is a powerful tool for extracting neural activity patterns, we can build an encoding model to predict the responses the brain generates when shown a visual stimulus, and a decoding model to uncover the stimulus presented to the brain and potentially reconstruct its representation. 
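As a toy illustration of the encoding side of this idea, here is a minimal sketch of a Gaussian tuning curve predicting one voxel's response from a stimulus's position. The preferred position and tuning width below are hypothetical values chosen for illustration, not parameters from our data:

```python
import numpy as np

# Hypothetical Gaussian tuning curve for a single voxel
# (preferred position and width are illustrative, not fitted values)
preferred_position = 10.0   # deg of visual angle, assumed
tuning_width = 3.0          # deg, assumed

def predicted_response(stimulus_position):
    """Encoding model: predicted response to a stimulus at a given position."""
    return np.exp(-0.5 * ((stimulus_position - preferred_position) / tuning_width) ** 2)

# Response is maximal at the preferred position and falls off with distance
peak = predicted_response(10.0)      # → 1.0
nearby = predicted_response(13.0)    # smaller response one width away
```

A decoding model runs this logic in reverse: given the observed responses of many such tuned units, infer which stimulus most likely produced them.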

One model that combines encoding and decoding is the Inverted Encoding Model (IEM), which leverages neural activation patterns evoked by stimuli within a given feature space to reconstruct representations of those stimuli. Previous studies have shown that the IEM can reconstruct the representation of individual visual features (Serences & Saproo, 2014; Sprague & Serences, 2013).
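To make the two steps concrete, here is a minimal simulated sketch of an IEM for a single circular feature (motion direction). Everything here is illustrative: the channel count, basis shape, and simulated data are assumptions for the demo, not our actual analysis pipeline. The encoding step fits channel-to-voxel weights by least squares on training trials; the inversion step recovers channel response profiles from held-out voxel patterns:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from real data)
n_channels = 8            # hypothetical feature channels tiling motion direction
n_voxels = 50
n_train, n_test = 200, 40

# Channel basis: half-wave-rectified cosines raised to a power,
# evenly tiling 0-360 deg of motion direction (a common IEM choice)
centers = np.linspace(0, 360, n_channels, endpoint=False)

def channel_responses(directions):
    """Idealized channel responses to each stimulus direction (trials x channels)."""
    d = np.deg2rad(directions[:, None] - centers[None, :])
    return np.maximum(np.cos(d), 0) ** (n_channels - 1)

# Simulate voxel patterns as noisy linear mixtures of channel responses
W_true = rng.normal(size=(n_voxels, n_channels))
train_dirs = rng.uniform(0, 360, n_train)
test_dirs = rng.uniform(0, 360, n_test)
C_train = channel_responses(train_dirs)
B_train = C_train @ W_true.T + 0.1 * rng.normal(size=(n_train, n_voxels))
B_test = channel_responses(test_dirs) @ W_true.T

# Encoding step: estimate weights W_hat solving B_train = C_train @ W_hat
W_hat, *_ = np.linalg.lstsq(C_train, B_train, rcond=None)

# Inversion step: recover channel responses from held-out voxel patterns
C_hat = B_test @ np.linalg.pinv(W_hat)   # n_test x n_channels

# The reconstructed channel profile should peak near the true direction
recon_peak = centers[np.argmax(C_hat, axis=1)]
```

With real fMRI data, `B_train` and `B_test` would be voxel activity patterns from independent scanning runs, and the reconstructed profiles in `C_hat` would be aligned and averaged across trials rather than read out with a simple argmax.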

But in the real world we encounter combinations of features, such as the example above: the professor waving their hands on a projector screen at a particular location. In fact, specific visual regions contain neural populations that are selective for both space and motion (Albright et al., 1984; Huk & Heeger, 2002; St-Yves & Naselaris, 2018). However, the IEM has yet to be shown to reconstruct a joint representation of multiple visual features. My project therefore aims to extend the IEM to reconstruct the combined representation of spatial location and motion in visual cortex. 

Our Poster at VSS 2023: