The complete publication list can be found at Google Scholar.
GPTracker: A Large-Scale Measurement of Misused GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
On the Effectiveness of Prompt Stealing Attacks on In-The-Wild Prompts
Yicong Tan, Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns
Xinyue Shen, Yixin Wu, Yiting Qu, Michael Backes, Savvas Zannettou, Yang Zhang
From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models
Yihan Ma, Xinyue Shen, Yiting Qu, Ning Yu, Michael Backes, Savvas Zannettou, Yang Zhang
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
MGTBench: Benchmarking Machine-Generated Text Detection
Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
Prompt Stealing Attacks Against Text-to-Image Generation Models
Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang
The Death and Life of Great Prompts: Analyzing the Evolution of LLM Prompts from the Structural Perspective
Yihan Ma, Xinyue Shen, Yixin Wu, Boyang Zhang, Michael Backes, Yang Zhang
ModScan: Measuring Stereotypical Bias in Large Vision-Language Models from Vision and Language Modalities
Yukun Jiang, Zheng Li, Xinyue Shen, Yugeng Liu, Michael Backes, Yang Zhang
Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming
Yukun Jiang, Xinyue Shen, Rui Wen, Zeyang Sha, Junjie Chu, Yugeng Liu, Michael Backes, Yang Zhang
Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models
Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang
On Xing Tian and the Perseverance of Anti-China Sentiment Online
Xinyue Shen, Xinlei He, Michael Backes, Jeremy Blackburn, Savvas Zannettou, Yang Zhang
Before PhD: