Resolving object recognition in space and time

Radoslaw M. Cichy (Free University Berlin)
Dimitrios Pantazis (MIT)
Aude Oliva (MIT)

Abstract

A comprehensive picture of object processing in the human brain requires combining both spatial and temporal information about brain activity. Here, we acquired human magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) responses to 92 object images. Multivariate pattern classification applied to MEG revealed the time course of object processing: whereas individual images were discriminated by visual representations early, ordinate and superordinate category levels emerged relatively later. Using representational similarity analysis, we combined human fMRI and MEG to show content-specific correspondence between early MEG responses and primary visual cortex (V1), and later MEG responses and inferior temporal (IT) cortex. We identified transient and persistent neural activities during object processing with sources in V1 and IT. Finally, human MEG signals were correlated to single-unit responses in monkey IT. Together, our findings provide an integrated space- and time-resolved view of human object categorization during the first few hundred milliseconds of vision.


[paper][journal link]
[supplementary materials]

Multivariate pattern classification on MEG data reveals time course of object processing

We recorded MEG and fMRI data while participants viewed a set of 92 different objects (image set previously used by Kiani et al., 2007, and Kriegeskorte et al., 2008).

92 object image set

We conducted time-resolved multivariate pattern classification on the MEG data, pairwise for all combinations of conditions. This yielded, at each time point, a 92 × 92 matrix of decoding results, indexed in rows and columns by the conditions (images) classified. The movie below shows the results. Until ~55 ms after image onset, the matrix contains no information and shows no structure. Then, a stable structure emerges. Over time, different categorical distinctions emerge.
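
The pairwise, time-resolved decoding described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline used in the study: the data layout (conditions × trials × sensors × time points), the function name `pairwise_decoding`, the two-fold split, and the simple nearest-class-mean classifier are all choices made here for brevity, and the data are synthetic.

```python
import numpy as np

def pairwise_decoding(data, n_folds=2):
    """Return an (n_times, n_cond, n_cond) array of pairwise decoding accuracies in %.

    data: array of shape (n_cond, n_trials, n_sensors, n_times) (assumed layout).
    Classifier: nearest class mean with Euclidean distance (an illustrative choice).
    """
    n_cond, n_trials, _, n_times = data.shape
    acc = np.zeros((n_times, n_cond, n_cond))
    folds = np.array_split(np.arange(n_trials), n_folds)
    for t in range(n_times):
        x = data[:, :, :, t]  # sensor patterns at time t: (n_cond, n_trials, n_sensors)
        for i in range(n_cond):
            for j in range(i + 1, n_cond):
                correct, total = 0, 0
                for test_idx in folds:
                    train_idx = np.setdiff1d(np.arange(n_trials), test_idx)
                    mean_i = x[i, train_idx].mean(axis=0)
                    mean_j = x[j, train_idx].mean(axis=0)
                    for cond, label in ((i, 0), (j, 1)):
                        test = x[cond, test_idx]
                        d_i = np.linalg.norm(test - mean_i, axis=1)
                        d_j = np.linalg.norm(test - mean_j, axis=1)
                        pred = (d_j < d_i).astype(int)  # 1 means "closer to condition j"
                        correct += int(np.sum(pred == label))
                        total += len(test_idx)
                acc[t, i, j] = acc[t, j, i] = 100.0 * correct / total
    return acc

# Synthetic example: 4 conditions, 10 trials, 8 sensors, 5 time points.
rng = np.random.default_rng(0)
n_cond, n_trials, n_sensors, n_times = 4, 10, 8, 5
data = rng.standard_normal((n_cond, n_trials, n_sensors, n_times))
# Condition-specific signal is present only from time index 2 onward.
patterns = 5.0 * rng.standard_normal((n_cond, n_sensors))
data[:, :, :, 2:] += patterns[:, None, :, None]

acc = pairwise_decoding(data)
off_diagonal = ~np.eye(n_cond, dtype=bool)
time_course = acc[:, off_diagonal].mean(axis=1)  # mean pairwise accuracy per time point
print(np.round(time_course, 1))
```

In this toy example, accuracy hovers around chance (50%) before the signal is added and rises once condition-specific patterns are present, which mirrors the emergence of structure in the decoding matrix over time.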

Fusion of fMRI with MEG data by RSA localizes sources of observed dynamics

To localize the sources of the observed MEG signals, we used representational similarity analysis (RSA) to fuse the MEG data with fMRI data. We found that early MEG signals had representations similar to those in V1, and later MEG signals had representations similar to those in IT.

Results of fMRI and MEG ROI fusion
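
The core idea of the fusion can be sketched as follows: at each time point, the MEG representational dissimilarity matrix (RDM) is compared with the fMRI RDM of a region of interest, yielding a time course of representational similarity for that region. The sketch below is a minimal illustration under assumptions, not the study's code: the function names (`average_ranks`, `spearman`, `fuse`, `random_rdm`), the use of Spearman correlation on the upper triangle, and the synthetic RDMs are all introduced here.

```python
import numpy as np

def average_ranks(values):
    """Rank a 1-D array (1-based), assigning tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)
    first = last - counts + 1
    return ((first + last) / 2.0)[inverse]

def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    return np.corrcoef(average_ranks(a), average_ranks(b))[0, 1]

def fuse(meg_rdms, fmri_rdm):
    """Correlate each time point's MEG RDM with one fMRI ROI RDM.

    meg_rdms: (n_times, n_cond, n_cond); fmri_rdm: (n_cond, n_cond).
    Only the upper triangle (excluding the diagonal) enters the comparison.
    """
    iu = np.triu_indices(fmri_rdm.shape[0], k=1)
    fmri_vec = fmri_rdm[iu]
    return np.array([spearman(rdm[iu], fmri_vec) for rdm in meg_rdms])

def random_rdm(rng, n):
    """Symmetric random matrix with a zero diagonal, standing in for a real RDM."""
    a = rng.random((n, n))
    rdm = (a + a.T) / 2.0
    np.fill_diagonal(rdm, 0.0)
    return rdm

# Synthetic example: MEG RDMs resemble "V1" early and "IT" later.
rng = np.random.default_rng(1)
n_cond = 10
v1_rdm, it_rdm = random_rdm(rng, n_cond), random_rdm(rng, n_cond)
w_v1 = [0.0, 1.0, 0.5, 0.0, 0.0, 0.0]  # hypothetical weights per time point
w_it = [0.0, 0.0, 0.5, 1.0, 0.5, 0.0]
meg_rdms = np.array([
    a * v1_rdm + b * it_rdm + 0.05 * rng.standard_normal((n_cond, n_cond))
    for a, b in zip(w_v1, w_it)
])

v1_fusion = fuse(meg_rdms, v1_rdm)
it_fusion = fuse(meg_rdms, it_rdm)
print("V1 peak at time index", int(np.argmax(v1_fusion)))
print("IT peak at time index", int(np.argmax(it_fusion)))
```

Because the comparison happens at the level of RDMs rather than raw signals, the two modalities do not need to share sensors, voxels, or units; they only need to share the same set of conditions.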

Transient and persistent aspects of visual representations

We determined the persistence of visual representations, i.e., the similarity of representations over time. Besides evidence for transient signals throughout, we found evidence for persistent signals at two time-point combinations (indicated by dotted white ellipses below).
Results of time-time analysis
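
One way to picture a time-time analysis is as a matrix whose entry (t1, t2) measures how similar the representation at time t1 is to the one at time t2. The sketch below is a hedged illustration that correlates RDMs across all pairs of time points; the exact procedure used in the study may differ (other operationalizations, such as training a classifier at one time point and testing it at another, are also possible). The function names and the synthetic data are assumptions made here.

```python
import numpy as np

def average_ranks(values):
    """Rank a 1-D array (1-based), assigning tied values their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)
    first = last - counts + 1
    return ((first + last) / 2.0)[inverse]

def time_time_similarity(rdms):
    """Spearman correlation between RDMs for every pair of time points.

    rdms: (n_times, n_cond, n_cond). Returns an (n_times, n_times) matrix.
    """
    iu = np.triu_indices(rdms.shape[1], k=1)
    vecs = rdms[:, iu[0], iu[1]]                      # (n_times, n_pairs)
    ranked = np.apply_along_axis(average_ranks, 1, vecs)
    return np.corrcoef(ranked)                        # rows are treated as variables

def random_rdm(rng, n):
    """Symmetric random matrix with a zero diagonal, standing in for a real RDM."""
    a = rng.random((n, n))
    rdm = (a + a.T) / 2.0
    np.fill_diagonal(rdm, 0.0)
    return rdm

# Synthetic example: one structure persists over time indices 2 to 4,
# while every other time point has its own unrelated (transient) structure.
rng = np.random.default_rng(2)
n_times, n_cond = 6, 10
persistent = random_rdm(rng, n_cond)
rdms = np.array([
    persistent + 0.05 * rng.standard_normal((n_cond, n_cond)) if 2 <= t <= 4
    else random_rdm(rng, n_cond)
    for t in range(n_times)
])

tt = time_time_similarity(rdms)
print(np.round(tt, 2))
```

In the resulting matrix, transient representations show high similarity only on the diagonal, whereas a persistent representation produces a block of high off-diagonal values (here between time indices 2 and 4).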


Comparing these results to the fMRI data in V1 and IT, we found that one of the persistent signals originated from V1, and the other from IT.
Sources of persistent activity in time-time analysis

MEG and fMRI Data

We provide the fMRI and MEG data for the 92 object image set. Please refer to the readme.txt at each link for further information.
MEG RDMs (representational dissimilarity matrices)
fMRI beta- and t-value maps

Acknowledgements

This work was funded by National Eye Institute grant EY020484 and a Google Research Faculty Award (to A.O.), the McGovern Institute Neurotechnology Program (to D.P. and A.O.), a Feodor Lynen Scholarship of the Humboldt Foundation (to R.M.C.) and an Emmy Noether grant of the DFG (CI 241/1-1), and was conducted at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research, Massachusetts Institute of Technology. We thank Chen Yi and Carsten Allefeld for helpful comments on methodological issues, and Seyed-Mahdi Khaligh-Razavi and Santani Teng for helpful comments on the manuscript.