Hi, I am Xinyue (pronounced “Shin-Yueh” 🔊). I am a final-year Ph.D. candidate at CISPA Helmholtz Center for Information Security, advised by Michael Backes and Yang Zhang. I earned my B.S. from the University of Electronic Science and Technology of China (UESTC). Before joining CISPA, I worked for two years as an algorithm engineer at Alibaba.
My research interests lie in Trustworthy AI, with a focus on the security, safety, and responsibility of generative AI systems. My recent work focuses on three main directions:
Understanding real-world misuse of AI systems, such as in-the-wild jailbreaks and LLM agent abuse.
Proactively detecting and mitigating harmful outputs from AI systems, such as hate speech, hateful memes, unsafe images, stereotypes, and AI-generated content (AIGC).
Identifying emerging security risks, such as prompt stealing attacks and knowledge file leakage.
Recognitions and Awards: My research has been acknowledged by Google, Microsoft, and OpenAI, and featured in major media outlets such as New Scientist and Deutschlandfunk Nova. My work has been integrated into major AI systems, including Nvidia’s Garak and OpenAI’s GPT-4.5, o3-mini, and o1, earning 3K+ GitHub stars and 53K+ downloads on Hugging Face. I have been honored with several awards, including the KAUST Rising Star in AI (2025), Machine Learning and Systems Rising Star (2025), Heidelberg Laureate Forum Young Researcher (2024), and Best Machine Learning and Security Paper in Cybersecurity Award (2025).
Teaching, Mentoring, and Outreach: I am passionate about teaching and mentoring. To help lower the barriers to starting research or pursuing a Ph.D. in this area, I host weekly office hours open to everyone (please sign up via Calendly!). I also write sci-fi novels and popular-science articles to make AI and cybersecurity more accessible to the general public, especially the next generation.
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang;
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang;
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, Yang Zhang;
📦 Artifact Badges: Available, Functional, Results Reproduced
GPTracker: A Large-Scale Measurement of Misused GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang;
✨ Our findings helped the platform owner take down thousands of misused GPTs
When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang;