Selected Publications

The complete publication list is available on Google Scholar.

  • GPTracker: A Large-Scale Measurement of Misused GPTs
    Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang; arXiv

  • On the Effectiveness of Prompt Stealing Attacks on In-The-Wild Prompts
    Yicong Tan, Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang; arXiv

  • HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
    Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, Yang Zhang; arXiv
    [pdf] [arXiv] [website] [dataset] [code] [artifact appendix]

  • From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
    Yihan Ma, Xinyue Shen, Yiting Qu, Ning Yu, Michael Backes, Savvas Zannettou, Yang Zhang; arXiv
    [pdf] [dataset] [code]

  • “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
    Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang; arXiv
    [pdf] [arXiv] [website] [dataset] [code]
    🏆 Award
    🎙️ Coverage: New Scientist, German Federal Office for Information Security, NIST, Deutschlandfunk Nova, Spektrum.de

  • MGTBench: Benchmarking Machine-Generated Text Detection
    Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang; arXiv
    [pdf] [arXiv] [dataset] [code]
    🏆 Award

  • Prompt Stealing Attacks Against Text-to-Image Generation Models
    Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang; arXiv
    [pdf] [arXiv] [slides] [video] [dataset] [code]
    🏆 Award
    🎙️ Coverage: German Federal Office for Information Security, NIST, CISPA News

  • The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
    Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang; arXiv
    [pdf]

  • ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
    Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang; arXiv
    [pdf] [arXiv]

  • Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
    Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang; arXiv
    [pdf] [arXiv] [poster]

  • Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
    Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang; arXiv
    [pdf] [arXiv] [code]
    🎙️ Coverage: Montreal AI Ethics Institute, German Federal Office for Information Security

  • On Xing Tian and the Perseverance of Anti-China Sentiment Online
    Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang; arXiv
    [pdf] [arXiv] [slides]

Before PhD:

  • Evil Under the Sun: Understanding and Discovering Attacks on Ethereum Decentralized Applications
    Liya Su, Xinyue Shen (co-first author), Xiangyu Du, Xiaojing Liao, XiaoFeng Wang, Luyi Xing, Baoxu Liu; arXiv
    [pdf] [slides]

Invited Talks

  • King Abdullah University of Science and Technology (KAUST), Understand and Mitigate AI System Misuse in the Real World, 2025.
  • Hong Kong University of Science and Technology (Guangzhou), Emerging Attacks in the Era of Generative AI (Guest Lecture), 2024.
  • Heidelberg Laureate Forum (HLF), “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models, 2024.
  • Google, Emerging Attacks in the Era of Generative AI, 2024.
  • The Ohio State University, Emerging Attacks in the Era of Generative AI, 2024.
  • AEGIS Symposium on Cyber Security, Emerging Attacks in the Era of Generative AI, 2024.
  • Shanghai Jiao Tong University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • Fudan University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • Sichuan University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • University of Electronic Science and Technology of China, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • AEGIS Symposium on Cyber Security, Measuring the Reliability of ChatGPT, 2023.
  • Hack In The Box Conference (HITBConf), Solving The Last Mile Problem Between Machine Learning and Security Operations, 2018. [pdf] [link]