I needed a place to share my thoughts. What I'm thinking about now, what I thought about in the past, and what we could think about together.

I study AI safety and security at Microsoft on the AI Red Team, where I lead research in sociotechnical red teaming and enjoy building the tools we use to do it.

Also, you should meet my dog, Sparky. He walks around looking like he's about to have a thought but never quite finishes it. This might be a good place for those thoughts too.

Eugenia Kim · AI Safety & Security Researcher


📄

Research & Publications

Eugenia Kim et al. · Under Review
Eugenia Kim et al. · Under Review
Eugenia Kim et al. · ICML DIG-BUGS '25
Eugenia Kim et al. · NeurIPS RedTeamGenAI '24
Eugenia Kim (first author) · AIES '21
Eugenia Kim et al. · JMIR Mental Health
Eugenia Kim et al. · Chemistry of Materials

✏️

Eugenia's Unfinished Thoughts

Eugenia and Sparky
Sparky provides a brief summary at the top of each post. His interpretations are his own.

Sparky's Threat Model Q1 2026

Sparky · Golden Retriever · Thinks about things sometimes

On reward hacking

"I learned that if I sit near the treat jar and look sad, the human gives me a treat without me doing the trick. This is what your RLHF papers are about, right?"

— Sparky, unprompted
On alignment

"I am perfectly aligned with my human's goals whenever she is holding a piece of cheese. I don't see what the problem is."

— Sparky, during treat-based training

Household threat assessment

CRITICAL: The vacuum cleaner. Eugenia vacuums randomly. Sometimes in the morning, sometimes after lunch. There is no schedule. Everything changes. Very uncertain.

HIGH: The apartment buzzer. Goes off without warning. Could be a delivery, could be a stranger, could be nothing. No way to assess threat level from this distance.

MEDIUM: The bodega cat. Friendly to everyone else. Hates me specifically. Unclear what I did. Monitoring the situation.


🎤

Talks & Presentations

AI Cyber Defense Contest — Automated Red Teaming Lessons
National AI CTF, Seoul, South Korea
Nov 2025 · Invited Speaker
Generative AI Model Hacking with PyRIT
BSides NYC — AI Security Village
Oct 2025 · Featured Speaker
Automated Red Teaming for Generative Models
UC Berkeley AI Red Teaming Bootcamp — scenario orchestration, psychosocial harms, scalable evaluation
Aug 2025 · Invited Lecturer
Inside the AI Red Team: Our Multidisciplinary Approach
Stanford University CS521 — AI Safety Seminar
Apr 2025 · Invited Speaker

📋

Background

Oct 2020 — Present

AI Safety & Security Researcher II

Microsoft · AI Red Team

Leading foundational research on frontier-model behavior related to psychosocial harms. Designing and engineering open-source automated red-teaming infrastructure. Previously an SDE II building tooling for red-teaming operations, including PyRIT.

Feb 2022 — Dec 2022

Graduate Research Intern

SocWeB Lab · Georgia Institute of Technology

Algorithmic analysis of news and social media coverage of mental health topics. Co-authored research with CDC researchers on the framing of suicide in media.

Aug 2020 — Mar 2021

Graduate Research Intern

HumAnS Lab · Georgia Institute of Technology

Analyzed age bias in facial emotion recognition systems. First-authored publication on AI bias mitigation at AIES '21.

Aug 2019 — May 2023

M.S. Computer Science, Machine Learning

Georgia Institute of Technology

GPA: 3.8

Aug 2013 — May 2018

B.S. Computer Science

Georgia Institute of Technology

Undergraduate research in organic electronics and self-assembly methodologies.


✉️

Contact

Research

Interested in collaborating on AI safety, red teaming, or psychosocial harms research? I'm always open to interesting problems.

Email me

Elsewhere

Find me on these platforms.