Wes Gurnee

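Wes Gurnee: MIT interpretability researcher and a core contributor to sparse probing methods and to research on spatiotemporal representations in LLMs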


ENTITY · WES GURNEE · MIT · LLM INTERPRETABILITY · SPATIOTEMPORAL WORLD MODEL · SPARSE PROBING


MIT interpretability researcher — systematic validation combining large-scale empirics, linear probing and causal intervention

With Max Tegmark, Gurnee used 6 spatiotemporal datasets (over 180k samples in total) to show systematically that Llama-2 carries linear representations of real-world geographic and historical-time coordinates, and to localize individual “space neurons” and “time neurons”. Reported probe performance: spatial R² = 0.911, temporal R² = 0.835.

Key Spatiotemporal Results (Llama-2 family)

Spatial Representation: R² = 0.911. Linear probe on geographic coordinates, validated on the spatial datasets.

Temporal Representation: R² = 0.835. Linear probe on historical time coordinates, validated on chronological sequences.
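To make the R² figures concrete, here is a minimal sketch of a linear probe, assuming synthetic data rather than real Llama-2 activations. The helper names (`fit_ridge_probe`, `r2_score`, `make_synthetic`), the ridge penalty `alpha`, and the toy dimensions are illustrative assumptions, not the paper's code or settings.

```python
import numpy as np

def fit_ridge_probe(X, Y, alpha=1.0):
    """Closed-form ridge regression from activations X to targets Y."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    d = X.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(d), Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def r2_score(Y_true, Y_pred):
    """Coefficient of determination, pooled over all target dimensions."""
    ss_res = ((Y_true - Y_pred) ** 2).sum()
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def make_synthetic(n=2000, d=64, noise=0.1, seed=0):
    """Toy 'activations' that linearly embed 2-D coordinates (an assumption)."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(-1, 1, size=(n, 2))   # stand-in for (lat, lon)
    embed = rng.normal(size=(2, d))            # hypothetical embedding directions
    acts = coords @ embed + noise * rng.normal(size=(n, d))
    return acts, coords

acts, coords = make_synthetic()
split = 1500                                   # simple train / held-out split
W, b = fit_ridge_probe(acts[:split], coords[:split])
test_r2 = r2_score(coords[split:], acts[split:] @ W + b)
print(f"held-out R^2 = {test_r2:.3f}")
```

A held-out R² near 1 on this toy data only shows that the probe recovers a planted linear signal; on a real model, the score depends on the layer, the dataset, and the regularization chosen.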

Neuron ablation supports causality: intervening on “space neurons” changes spatial predictions, which indicates a causal role rather than mere correlation.
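The idea behind an ablation check can be sketched on a toy example. Everything here is assumed for illustration: the activations are random, the "model" is a single linear readout, and neuron index 3 is arbitrarily designated as the "space neuron". It is not the intervention procedure from the paper.

```python
import numpy as np

def ablate(acts, neuron, mode="zero"):
    """Return a copy of acts with one neuron set to zero or to its mean."""
    out = acts.copy()
    out[:, neuron] = 0.0 if mode == "zero" else acts[:, neuron].mean()
    return out

rng = np.random.default_rng(0)
n, d = 500, 16
acts = rng.normal(size=(n, d))            # toy hidden activations

readout = rng.normal(scale=0.05, size=d)  # toy downstream weights
space_neuron = 3                          # hypothetical neuron of interest
readout[space_neuron] = 1.0               # downstream output leans on it

def model_output(a):
    """Toy stand-in for everything downstream of the ablated layer."""
    return a @ readout

baseline = model_output(acts)
# Mean absolute change in the output when a single neuron is zeroed.
effect_target = np.abs(model_output(ablate(acts, space_neuron)) - baseline).mean()
effect_control = np.abs(model_output(ablate(acts, 0)) - baseline).mean()
print(f"target neuron: {effect_target:.3f}, control neuron: {effect_control:.3f}")
```

Comparing against a control neuron is the key design choice: a large change for the target and a small change for the control is what separates a causal role from a neuron that merely correlates with the feature.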

Methodological Signature

Large-Scale Datasets

180k+ samples across 6 spatiotemporal datasets, in contrast to small-sample case studies

Full-Layer Probe Sweep

Linear probes applied to every transformer layer — identifies the layer-wise distribution of representations
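A layer sweep can be sketched as one probe fitted per layer, with held-out R² recorded for each. The example below assumes synthetic per-layer activations in which the signal strength grows with depth; `probe_r2`, `synthetic_layers`, and all sizes are hypothetical choices made for illustration.

```python
import numpy as np

def probe_r2(X_train, y_train, X_test, y_test, alpha=1.0):
    """Fit a ridge probe on the train split and return held-out R^2."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean(axis=0)
    Xc = X_train - mu_x
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(Xc.shape[1]),
                        Xc.T @ (y_train - mu_y))
    pred = (X_test - mu_x) @ W + mu_y
    ss_res = ((y_test - pred) ** 2).sum()
    ss_tot = ((y_test - y_test.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def synthetic_layers(n=1200, d=32, n_layers=6, seed=0):
    """Toy residual streams: the target is encoded more strongly in later layers."""
    rng = np.random.default_rng(seed)
    target = rng.uniform(-1, 1, size=(n, 1))   # stand-in for a normalized year
    direction = rng.normal(size=(1, d))        # hypothetical feature direction
    layers = []
    for layer in range(n_layers):
        strength = layer / (n_layers - 1)      # 0 at the first layer, 1 at the last
        layers.append((strength * target) @ direction + rng.normal(size=(n, d)))
    return layers, target

layers, target = synthetic_layers()
split = 900
r2_per_layer = [
    probe_r2(acts[:split], target[:split], acts[split:], target[split:])
    for acts in layers
]
for i, r2 in enumerate(r2_per_layer):
    print(f"layer {i}: R^2 = {r2:.3f}")
```

Plotting `r2_per_layer` against layer index gives the layer-wise profile; in this toy setup the curve rises with depth by construction, whereas on a real model the shape is an empirical finding.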

Sparse Probing (2023)

Finding Neurons in a Haystack — uses a tiny set of neurons to localize where specific information is encoded
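A simplified sketch of the sparse probing idea: pick the k neurons that best separate two classes, then fit a classifier on those k alone. It assumes synthetic activations with a feature planted in three neurons, ranks neurons by standardized class-mean difference, and uses a least-squares classifier as a stand-in for logistic regression. These are illustrative choices, not necessarily the selection methods used in the paper.

```python
import numpy as np

def top_k_neurons(X, y, k):
    """Rank neurons by absolute standardized class-mean difference; keep the top k."""
    diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    score = np.abs(diff) / (X.std(axis=0) + 1e-8)
    return np.argsort(score)[::-1][:k]

def fit_linear_classifier(X, y):
    """Least-squares fit to +/-1 labels (simple stand-in for logistic regression)."""
    A = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    w, *_ = np.linalg.lstsq(A, 2.0 * y - 1.0, rcond=None)
    return w

def predict(X, w):
    A = np.hstack([X, np.ones((len(X), 1))])
    return (A @ w > 0).astype(int)

rng = np.random.default_rng(0)
n, d = 2000, 200
planted = np.array([7, 42, 133])               # hypothetical feature neurons
y = rng.integers(0, 2, size=n)                 # binary feature label
X = rng.normal(size=(n, d))
X[:, planted] += 2.0 * y[:, None]              # plant the feature in 3 neurons

train, test = slice(0, 1500), slice(1500, n)
selected = top_k_neurons(X[train], y[train], k=3)
w = fit_linear_classifier(X[train][:, selected], y[train])
accuracy = (predict(X[test][:, selected], w) == y[test]).mean()
print("selected neurons:", sorted(selected.tolist()), "accuracy:", accuracy)
```

Sweeping k from 1 upward and watching where accuracy saturates is one way to read off how many neurons a feature is spread across.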

→ Spatiotemporal World Model · Linear Representation · Max Tegmark

ICLR 2024 · arXiv:2310.02207

Wes Gurnee

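Affiliation: Massachusetts Institute of Technology (MIT)

Research areas: LLM interpretability, internal representations of neural networks, sparse probing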

Main Contributions

Wes Gurnee is one of the core researchers in LLM interpretability.

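Language Models Represent Space and Time (2023)

Written with Max Tegmark and published at ICLR 2024. The paper gives the first systematic evidence that Llama-2 forms internal linear representations of real-world geographic coordinates and historical time coordinates, and it localizes individual “space neurons” and “time neurons”.

See also: Spatiotemporal World Model, Linear Representation Hypothesis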

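Finding Neurons in a Haystack (2023)

Introduces the sparse probing method, which uses a very small number of neurons to localize where specific information (such as gender, occupation, or year) is encoded inside the model.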

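Research Style

Gurnee tends to build large-scale empirical datasets and to validate findings systematically by combining linear probes with causal interventions, rather than starting from theory. Representative approach: 6 spatiotemporal datasets (more than 180k samples in total), a probe sweep over every layer, and neuron ablation as validation.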

References

sources/arxiv_papers/2310.02207-language-models-represent-space-and-time.md