Xinyue (Vera) Shen

PhD Candidate

CISPA Helmholtz Center for Information Security

Email  /  CV  /  Scholar  /  GitHub  /  Twitter  /  LinkedIn

🔊 Feel free to call me Xinyue (pronounced "Shin-Yueh") or Vera.

Publications

The complete publication list is available on Google Scholar.

Recent Preprints

  • "Humans welcome to observe": A First Look at the Agent Social Network Moltbook
    Yukun Jiang*, Yage Zhang*, Xinyue Shen*, Michael Backes, Yang Zhang
    [arXiv: 2602.10127]   [website]   [dataset]
    Coverage: [TechXplore] [AI Era]
  • Real Money, Fake Models: Deceptive Model Claims in Shadow APIs
    Yage Zhang, Yukun Jiang, Zeyuan Chen, Michael Backes, Xinyue Shen, Yang Zhang
    [arXiv: 2603.01919]
    Coverage: [Jiqizhixin]
  • Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks
    Junjie Chu, Xinyue Shen, Ye Leng, Michael Backes, Yun Shen, Yang Zhang
    [arXiv: 2603.04459]
  • OrgAgent: Organize Your Multi-Agent System like a Company
    Yiru Wang, Xinyue Shen, Yaohui Han, Michael Backes, Pin-Yu Chen, Tsung-Yi Ho
    [arXiv: 2604.01020]
  • HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?
    Yukun Jiang, Yage Zhang, Michael Backes, Xinyue Shen, Yang Zhang
    [arXiv: 2604.15415]   [code]   [dataset]

2026

  • Open Schrödinger's Closed Box: Identifying Retrieval Augmented Generation in API-Accessible Large Language Model Services
    Yukun Jiang, Xinyue Shen, Michael Backes, Zheng Li, Yang Zhang
    ACL 2026   [pdf]
  • The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training
    Rui Zhang, Hongwei Li, Yun Shen, Xinyue Shen, Wenbo Jiang, Guowen Xu, Yang Liu, Michael Backes, Yang Zhang
    ACL 2026 (Findings)   [pdf]

2025

  • GPTracker: A Large-Scale Measurement of Misused GPTs
    Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
    IEEE S&P 2025   [pdf]   [dataset]   [code]
  • On the Effectiveness of Prompt Stealing Attacks on In-The-Wild Prompts
    Yicong Tan, Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
    IEEE S&P 2025   [pdf]   [code]
  • HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
    Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, Yang Zhang
    USENIX Security 2025   [pdf]   [website]   [dataset]   [code]
    Coverage: [CNIL, France's Data Protection Authority]
  • From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
    Yihan Ma, Xinyue Shen, Yiting Qu, Ning Yu, Michael Backes, Savvas Zannettou, Yang Zhang
    USENIX Security 2025   [pdf]   [dataset]   [code]
  • When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
    Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
    ACL 2025   [pdf]   [arXiv]   [website]
  • Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media
    Zhen Sun, Zongmin Zhang, Xinyue Shen, Ziyi Zhang, Yule Liu, Michael Backes, Yang Zhang, Xinlei He
    ACL 2025   [pdf]   [dataset]   [code]
  • JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
    Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
    ACL 2025   [pdf]   [website]   [code]   (Oral)
  • UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
    Yiting Qu, Xinyue Shen, Yixin Wu, Michael Backes, Savvas Zannettou, Yang Zhang
    ACM CCS 2025   [pdf]   [website]   [dataset]   [code]

2024

  • "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
    Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
    ACM CCS 2024   [pdf]   [website]   [dataset]   [code]
    Best Machine Learning & Security Paper in Cybersecurity Award 2025
    Coverage: [New Scientist] [German Federal Office for Information Security] [NIST] [Deutschlandfunk Nova] [Spektrum.de]
  • MGTBench: Benchmarking Machine-Generated Text Detection
    Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
    ACM CCS 2024   [pdf]   [dataset]   [code]
  • Prompt Stealing Attacks Against Text-to-Image Generation Models
    Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang
    USENIX Security 2024   [pdf]   [video]   [dataset]   [code]
    Recognized by Microsoft Vulnerability Severity Classification for AI Systems
    Coverage: [German Federal Office for Information Security] [NIST] [CISPA News]
  • The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
    Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang
    EMNLP 2024   [pdf]
  • ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
    Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang
    EMNLP 2024   [pdf]
  • Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
    Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang
    ICWSM 2024   [pdf]   [poster]

2023

2022

  • On Xing Tian and the Perseverance of Anti-China Sentiment Online
    Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang
    ICWSM 2022   [pdf]

Before PhD

  • Evil Under the Sun: Understanding and Discovering Attacks on Ethereum Decentralized Applications
    Liya Su*, Xinyue Shen*, Xiangyu Du, Xiaojing Liao, XiaoFeng Wang, Luyi Xing, Baoxu Liu
    USENIX Security 2021   [pdf]

* Equal contribution.

Copyright © Xinyue (Vera) Shen