joycenerd.cs09[AT]nycu.edu.tw EC129 Guangfu Campus
Hi! I am a research fellow at the Oxford AI Governance Initiative (AIGI) at the University of Oxford, working with Fazl Barez. I'm also an incoming PhD student at CISPA under the supervision of Mario Fritz. Previously, I was a research assistant in the Reinforcement Learning and Bandits Lab (REAL) at National Yang Ming Chiao Tung University, working closely with Ping-Chun Hsieh and Pin-Yu Chen from IBM.
My research focuses on developing trustworthy and interpretable generative AI systems. I work on understanding and controlling the internal mechanisms of large generative models, particularly text-to-image diffusion models and LLMs, to build more robust safety systems. My published work includes developing red-teaming tools for safe text-to-image model development. Currently, I'm developing automated interpretability frameworks for analyzing model misbehavior, focusing on identifying which internal components are responsible for generating problematic content in generative models.
I am broadly interested in AI safety, mechanistic interpretability, and multimodal generative models, with a focus on red-teaming methodologies and alignment. For more details about me and my work, please see my CV and Google Scholar. I'd love to connect and discuss research; feel free to reach out at joycenerd.cs09[AT]nycu.edu.tw for potential collaborations or discussions.
For a comprehensive list of my publications, please refer to my Google Scholar.
(† indicates equal contribution)
Besides research, I am an opera and classical crossover singer who performs both soprano and alto pieces. I spend most of my free time running and have completed three half marathons and one full marathon. I also enjoy reading and exploring dessert and coffee shops.