I'm interested in sequence models, multimodality, and visual reasoning. Much of my work continues to be inspired by my time at MIT with Bill Freeman and Ruth Rosenholtz. Under their guidance, I explored how we can understand visual representation learning through the lens of human intelligence.
I continue to be excited by the question of how we can build efficient and general visual systems through a blend of data, architecture, and learning constraints.
A family of foundation models featuring a sparse hybrid architecture, optimized for on-device deployment. Includes multimodal variants for vision and audio across multiple scales.
Peripheral vision dataset to evaluate and train deep neural networks.
Object Detection in Deep Neural Networks Differs from Humans in the Periphery
Anne Harrington,
Vasha DuTell,
Mark Hamilton,
Ayush Tewari,
Simon Stent,
William T. Freeman,
Ruth Rosenholtz
ATTRIB @ NeurIPS, 2023
Psychophysics experiments testing object detection in humans and deep neural networks.