I am a PhD student studying computer science at the University of California, Santa Cruz, advised by Dr. Chenguang Wang. Previously, I completed my Bachelor’s degree in Data Science from the Mathematics Department at Washington University in St. Louis, where I was advised by Dr. Ulugbek Kamilov and graduated with Highest Distinction.

My research focuses on developing methods grounded in linear algebra to interpret and steer the internal activations of large language models (LLMs) to improve their safety and reliability.

I am always looking for collaborators or students to work with me on our projects. Feel free to email me regarding collaboration or, if you are a UCSC student, fill out our lab intake form here!

Research Interests: LLM Interpretability, Alignment & Safety, Agentic AI

📢 Announcements

  • December 2025: Our workshop on Agent Safety has been accepted to ICLR. See you all in Brazil! I will be at NeurIPS this week as well.
  • September 2025: RepIt and SteeringSafety are now on arXiv!
  • August 2025: I will be moving with my advisor to UCSC to continue my PhD! Additionally, AgentVigil was accepted to EMNLP 2025!
  • March 2025: Excited to share that COSMIC was accepted to ACL 2025! 🎉

RepIt overview

RepIt: Steering Language Models with Concept-Specific Refusal Vectors

Vincent Siu, Nathan W. Henry, Nicholas Crispino, Yang Liu, Dawn Song, Chenguang Wang

ResponsibleFM, NeurIPS 2025

Paper (arXiv)

SteeringSafety result table

SteeringSafety: A Systematic Safety Evaluation Framework of Representation Steering in LLMs

Vincent Siu*, Nicholas Crispino*, David Park, Nathan W. Henry, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang

ResponsibleFM, NeurIPS 2025

Paper (arXiv) | Code (GitHub) | Data (HuggingFace)

Find the full list of publications here

Open Source Software

  • MassGen — Contributor
    Open-source framework for scaling multi-agent LLM systems. I contribute by setting and tracking long-term development goals, reviewing code, and ensuring releases stay aligned with project direction.
    ⭐ 400+ GitHub stars