I work on research and special projects at the Center for AI Safety, advised by Dan Hendrycks. I studied Computer Science (Penn Engineering) and Economics (Wharton) at the University of Pennsylvania.
I have co-led the most comprehensive empirical meta-analysis of AI safety benchmarks to date (Safetywashing, NeurIPS '24) as well as the development of an AI honesty benchmark (MASK). My co-1st-authored work has been presented at the UK Government AI Safety Institute (by invitation), cited by the Singapore Consensus on AI Safety Priorities, and used by researchers at xAI, OpenAI, and Anthropic.
- The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems – co-1st author, in collaboration with Scale AI
- Remote Labor Index: Measuring AI Automation of Remote Work – in collaboration with Scale AI
- Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs – ICML 2025 Spotlight
- Humanity's Last Exam: A Benchmark of Expert-Level Academic Questions – Nature, in collaboration with Scale AI
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? – co-1st author, NeurIPS 2024 D&B
- Localizing Lying in Llama – co-1st author, NeurIPS 2023 SoLaR Workshop
- Representation Engineering: A Top-Down Approach to AI Transparency – co-2nd author, 766+ citations
- High-Efficiency Scattering Probe Design for S-Polarized Near-Field Microscopy – Applied Physics Express, 2021
- Validity of ML in Quantitative Analysis of Complex SNOM Signals – Physical Review Applied, 2021
Website · Google Scholar · LinkedIn · X · Substack · Email

