
Xinyue (Vera) Shen

Incoming Assistant Professor

Cheriton School of Computer Science, University of Waterloo

Email  /  Scholar  /  GitHub  /  Twitter  /  LinkedIn  /  🔊 Feel free to call me Xinyue (pronounced "Shin-Yueh") or Vera.

Hello! I'm an incoming Assistant Professor at the Cheriton School of Computer Science, University of Waterloo, and a member of the Cryptography, Security, and Privacy (CrySP) group, starting in late 2026. I am currently finishing my PhD at the CISPA Helmholtz Center for Information Security, advised by Michael Backes and Yang Zhang.

My research interests lie in Trustworthy AI, specifically focusing on three directions:

🔍 AI Misuse in the Wild
Discovering real-world AI misuse, such as jailbreaks, misused agents, and harmful skills.
🛡️ AI Security and Safety
Algorithmically mitigating AI-driven harms, including hate speech, unsafe images, model bias, and deepfakes.
🌍 AI in Society
Identifying risks in the broader AI ecosystem, like prompt marketplaces and agent OSNs.

My research has been recognized by Google, Microsoft, and OpenAI, and adopted into the security evaluation pipelines of major AI systems (e.g., Nvidia's Garak, OpenAI's GPT-4.5/o3-mini/o1), with 3K+ GitHub stars and 100K+ HuggingFace downloads. My work has also been honored with the ML and Systems Rising Star 2025, the KAUST Rising Star in AI 2025, and the Best ML and Security Paper in Cybersecurity Award 2025.

I am looking for highly motivated PhD and research-based master's (MMath) students to join my group! If you are interested, please email me (xinyue.shen@uwaterloo.ca) with your CV and list me as a potential advisor in your application.

News


Publications

The complete publication list is available on Google Scholar.

2026 ACL
Open Schrödinger's Closed Box: Identifying Retrieval Augmented Generation in API-Accessible Large Language Model Services
Yukun Jiang, Xinyue Shen, Michael Backes, Zheng Li, Yang Zhang
2026 ACL (Findings)
The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training
Rui Zhang, Hongwei Li, Yun Shen, Xinyue Shen, Wenbo Jiang, Guowen Xu, Yang Liu, Michael Backes, Yang Zhang
2025 IEEE S&P
GPTracker: A Large-Scale Measurement of Misused GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
2025 IEEE S&P
On the Effectiveness of Prompt Stealing Attacks on In-The-Wild Prompts
Yicong Tan, Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
2025 USENIX Security
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, Yang Zhang
2025 USENIX Security
From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
Yihan Ma, Xinyue Shen, Yiting Qu, Ning Yu, Michael Backes, Savvas Zannettou, Yang Zhang
2025 ACM CCS
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
Yiting Qu, Xinyue Shen, Yixin Wu, Michael Backes, Savvas Zannettou, Yang Zhang
2025 ACL
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs (Oral)
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
2025 ACL
When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
2025 ACL
Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media
Zhen Sun, Zongmin Zhang, Xinyue Shen, Ziyi Zhang, Yule Liu, Michael Backes, Yang Zhang, Xinlei He
2024 ACM CCS
"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
Best Machine Learning & Security Paper in Cybersecurity Award 2025
2024 ACM CCS
MGTBench: Benchmarking Machine-Generated Text Detection
Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
2024 USENIX Security
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang
Recognized by Microsoft Vulnerability Severity Classification for AI Systems
2024 EMNLP
The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang
2024 EMNLP
ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang
2024 ICWSM
Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang
2023 ACM CCS
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang
2022 ICWSM
On Xing Tian and the Perseverance of Anti-China Sentiment Online
Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang
2021 USENIX Security
Evil Under the Sun: Understanding and Discovering Attacks on Ethereum Decentralized Applications
Liya Su*, Xinyue Shen*, Xiangyu Du, Xiaojing Liao, XiaoFeng Wang, Luyi Xing, Baoxu Liu

* Equal contribution.

Selected Awards & Honors

  • Best Machine Learning and Security Paper in Cybersecurity Award, 2025
  • Machine Learning and Systems Rising Star, 2025
  • KAUST Rising Star in AI, 2025 (7.8%)
  • Heidelberg Laureate Forum Young Researcher, 2024
  • Abbe Grant, Carl-Zeiss-Stiftung Foundation, 2024
  • Outstanding Popular Science Work Award, China Science Writers Association, 2024
  • First Prize, Intel National College Student Software Competition, 2017 (2.0%)
  • Excellent Volunteer, National Games for Persons with Disabilities & National Special Olympics Games, 2015

Service

  • Program Committee:
    • 2027: USENIX Security, AsiaCCS, ACL, ICWSM
    • 2026: ACL, COLM, SaTML, ICWSM
    • 2025: USENIX Security, SaTML, AISec, ACL, ICWSM
    • 2024: ICWSM, AISec
  • Poster Program Committee: IEEE S&P (2023, 2024, 2025), USENIX Security (2024)
  • Artifact Evaluation Committee: ACM CCS (2024)
  • Journal Reviewer: Nature Human Behaviour, IEEE S&P Magazine, Pattern Recognition, TIFS, TOPS, TSE
  • Session Chair: USENIX Security (2025)
  • Organizing and Chairing: LAMPS workshop @ ACM CCS (2024)
Copyright © Xinyue (Vera) Shen