Selected Publications

The complete publication list can be found on Google Scholar.

  • Prompt Stealing Attacks Against Text-to-Image Generation Models
    Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang; USENIX Security 2024
    [pdf] [arxiv] [code]

  • “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
    Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang; CCS 2024
    [arxiv] [website] [code] Media Coverage: New Scientist, Deutschlandfunk Nova

  • MGTBench: Benchmarking Machine-Generated Text Detection
    Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang; CCS 2024
    [arxiv] [code]

  • The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
    Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang; EMNLP 2024

  • ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
    Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang; EMNLP 2024

  • Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
    Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang; ICWSM 2024
    [pdf] [arxiv] [poster]

  • Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
    Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang; CCS 2023
    [arxiv] [code] Media Coverage: Montreal AI Ethics Institute

  • On Xing Tian and the Perseverance of Anti-China Sentiment Online
    Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang; ICWSM 2022
    [pdf] [arxiv] [slides]

  • Evil Under the Sun: Understanding and Discovering Attacks on Ethereum Decentralized Applications
    Liya Su, Xinyue Shen (co-first author), Xiangyu Du, Xiaojing Liao, XiaoFeng Wang, Luyi Xing, Baoxu Liu; USENIX Security 2021
    [pdf] [slides]

  • Voice Jailbreak Attacks Against GPT-4o
    Xinyue Shen, Yixin Wu (co-first author), Michael Backes, Yang Zhang
    [arxiv] [code] Media Coverage: TheCyberExpress, The Decoder

  • In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
    Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
    [arxiv]

  • Comprehensive Assessment of Jailbreak Attacks Against LLMs
    Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
    [arxiv]

  • UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
    Yiting Qu, Xinyue Shen, Yixin Wu, Michael Backes, Savvas Zannettou, Yang Zhang
    [arxiv] [website] [code]

Invited Talks

  • Heidelberg Laureate Forum (HLF), “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models, 2024.
  • Google, Emerging Attacks in the Era of Generative AI, 2024.
  • The Ohio State University, Emerging Attacks in the Era of Generative AI, 2024.
  • AEGIS Symposium on Cyber Security, Emerging Attacks in the Era of Generative AI, 2024.
  • Shanghai Jiao Tong University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • Fudan University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • Sichuan University, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • University of Electronic Science and Technology of China, Understanding and Quantifying the Safety Issues of Large Foundation Models, 2023.
  • AEGIS Symposium on Cyber Security, Measuring the Reliability of ChatGPT, 2023.
  • Hack In The Box Conference (HITBConf), Solving The Last Mile Problem Between Machine Learning and Security Operations, 2018. [pdf] [link]

Pop-science & Novels

Also, as a science fiction writer, I am grateful for the opportunity to meet you through stories (Full List).

  • When Trojan Virus Meets Military Training. Outstanding Popular Science Work Award from China Science Writers Association in 2024.
  • How Far Are We From “Westworld”? Science Fiction World Pictorial·Amazing Science, 2024.03.
  • Hacking Storm. Exploration Discovery, 2023.01-02.
  • Is Artificial Intelligence a “Tower”? Science Fiction World (Youth), 2022.09.
  • Empty Yellow Crane Tower Here. The EELISA Science Fiction Contest, Chinese-Language Category Winner, 2022.02.
  • Lady White Bone. The Ninth “Light-Year” Award, First Prize, 2021.01.
  • Hack! A Seven-Day Invasion Diary of Trojan Horse. “Pop-science and Sci-fiction Youth Star” Award from China Science Writers Association, 2020.11.
  • Stars on the Wrist. Science Fiction Cube, 2020.07.
  • A War without Smoke: the Evolution History of Hacker Empire. Science Fiction World, 2019.06.