Erin Grant
she / her

I am a Senior Research Fellow at the
Sainsbury Wellcome Centre & Gatsby Computational Neuroscience Unit
at University College London (UCL).

Research summary: I study prior knowledge and learning mechanisms in minds and machines using a combination of behavioral experiments, computational simulations, and analytical techniques. See my research page for more.


active / ongoing

🔬 We’re looking for pre-doctoral Project Research Interns at the Gatsby! We review applications 4 times a year but the start date is flexible. We especially encourage applications from students who have faced barriers to participation in research. See the eligibility criteria and apply here.

Sep 2023

I graduated from reviewer to area chair for ICLR.

Aug 2023

I attended CCN in Oxford.

May 2023

I graduated from reviewer to area chair for NeurIPS.

Apr 2023

I am serving a renewed term on the WiML Board of Directors.

Mar 2023

🥶 I attended COSYNE for the first time, and sadly failed to ski.

Recent Publications

For a complete list, see my research page or my CV.

The transient nature of emergent in-context learning in transformers.
Aaditya K. Singh*, Stephanie C.Y. Chan*, Ted Moskovitz, Erin Grant, Andrew Saxe, Felix Hill.
In Advances in Neural Information Processing Systems (NeurIPS), 2023.

Transformer neural networks can exhibit a surprising capacity for in-context learning (ICL), despite not being explicitly trained for it. Prior work has provided a deeper understanding of how ICL emerges in transformers, e.g., through the lens of mechanistic interpretability, Bayesian inference, or by examining the distributional properties of training data. However, in each of these cases, ICL is treated largely as a persistent phenomenon; namely, once ICL emerges, it is assumed to persist asymptotically. Here, we show that the emergence of ICL during transformer training is, in fact, often transient. We train transformers on synthetic data designed so that both ICL and in-weights learning (IWL) strategies can lead to correct predictions. We find that ICL first emerges, then disappears and gives way to IWL, all while the training loss decreases, indicating an asymptotic preference for IWL. The transient nature of ICL is observed in transformers across a range of model and dataset sizes, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models. We find that L2 regularization may offer a path to more persistent ICL that removes the need for early stopping based on ICL-style validation tasks.

Gaussian process surrogate models for neural networks.
Michael Y. Li, Erin Grant, Thomas L. Griffiths.
In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2023.

The lack of insight into deep learning systems hinders their systematic design. In science and engineering, modeling is a methodology used to understand complex systems whose internal processes are opaque. Modeling replaces a complex system with a simpler surrogate that is more amenable to interpretation. Drawing inspiration from this, we construct a class of surrogate models for neural networks using Gaussian processes. Rather than deriving the kernels for certain limiting cases of neural networks, we learn the kernels of the Gaussian process empirically from the naturalistic behavior of neural networks. We first evaluate our approach with two case studies inspired by previous theoretical studies of neural network behavior in which we capture neural network preferences for learning low frequencies and identify pathological behavior in deep neural networks. In two further practical case studies, we use the learned kernel to predict the generalization properties of neural networks.

Distinguishing rule- and exemplar-based generalization in learning systems.
Ishita Dasgupta*, Erin Grant*, Thomas L. Griffiths.
In Proceedings of the International Conference on Machine Learning (ICML), 2022.

Machine learning systems often do not share the same inductive biases as humans and, as a result, extrapolate or generalize in ways that are inconsistent with our expectations. The trade-off between exemplar- and rule-based generalization has been studied extensively in cognitive psychology; in this work, we present a protocol inspired by these experimental approaches to probe the inductive biases that control this trade-off in category-learning systems such as artificial neural networks. We isolate two such inductive biases: feature-level bias (differences in which features are more readily learned) and exemplar-vs-rule bias (differences in how these learned features are used for generalization of category labels). We find that standard neural network models are feature-biased and have a propensity towards exemplar-based extrapolation; we discuss the implications of these findings for machine-learning research on data augmentation, fairness, and systematic generalization.

Recent Service

Area Chair
Diversity, Equity, & Inclusion Chair