Hello
I spend my time thinking about how to make powerful AI systems honest and aligned with human values. I work on things like honesty training, dangerous capability evaluations, monitoring, and control.
I’m currently a Master’s student in ML at UCL and teach at ARENA, an ML engineering program that upskills people to do technical AI safety work. Previously, I was the director of the Cambridge AI Safety Hub, where I ran educational and research programs like CaMLAB and MARS. I have a BA Hons in psychology & neuroscience from the University of Cambridge, where I was interested in computational cognitive neuroscience.
Publications & other work
- Chloe Li, Mary Phuong, Noah Y. Siegel (2025). LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring. In Proceedings of the ICML Workshop on Technical AI Governance, Vancouver, Canada. (Oral Presentation)
- ARENA LLM Evaluations Curriculum