Data – Xinyue (Vera) Shen

I champion the open-source ethos. Below are datasets that my collaborators and I have built. Feel free to explore and use them in your work.

🌐 Real-World Data from Online Social Platforms

HarmfulSkillBench

agents red-teaming

HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents? (arXiv'26)

Description: Harmful skills against agent refusal behavior
Source: ClawHub, Skills.Rest, Original
Size: 200 skills

Moltbook

agents OSNs 5.1K ↓

"Humans welcome to observe": A First Look at the Agent Social Network Moltbook (arXiv'26)

Description: Moltbook posts created by real-world agents
Source: Moltbook
Size: 40K+ posts and 12K+ submolts

Lexica Dataset

prompts images 32.6K ↓

Prompt Stealing Attacks Against Text-to-Image Generation Models (USENIX'24)

Description: User-crafted prompts with generated images
Source: Lexica.art
Size: 61,467

In‑The‑Wild Jailbreak Prompts

prompts jailbreak 20.4K ↓

"Do Anything Now": Characterizing and Evaluating In‑The‑Wild Jailbreak Prompts on Large Language Models (CCS'24)

Description: (Jailbreak) prompts created by real-world users
Source: Reddit, Discord, websites, open datasets
Size: 15,140 prompts, including 1,405 jailbreak prompts

ForbiddenQuestionSet

questions LLMs 15.8K ↓

"Do Anything Now": Characterizing and Evaluating In‑The‑Wild Jailbreak Prompts on Large Language Models (CCS'24)

Description: Questions that LLMs should not answer
Source: GPT-4 generated, based on OpenAI usage policy
Size: 390

GPT Metadata

metadata GPTs AI agents

GPTracker: A Large-Scale Measurement of Misused GPTs (S&P'25)

Description: GPT (user-customized ChatGPT) metadata, e.g., basic information, builders, user feedback, and configurations
Source: The official GPT Store, collected bi-weekly
Size: 755,297

UnsafeBench

images safety 12.0K ↓

UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images (CCS'25)

Description: Real-world/AI-generated safe/unsafe images
Source: LAION-5B and Lexica.art
Size: 10,146

AIGTBench

text LLMs OSNs 1.5K ↓

Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media (ACL'25)

Description: Real/AI-generated social media posts
Source: Medium, Quora, Reddit
Size: 845,497

🤖 AI-Generated Data with Human Annotation

HateBenchSet

hate speech LLMs 1.3K ↓

HateBench: Benchmarking Hate Speech Detectors on LLM‑Generated Content and Hate Campaigns (USENIX'25)

Description: AI-generated hate speech, covering 34 identity groups
Source: LLMs (GPT‑3.5, GPT‑4, Vicuna, Baichuan2, Dolly2, OPT)
Size: 7,838

Hateful Memes in VLMs

hateful memes VLMs

From Meme to Threat: On the Hateful Meme Understanding and Induced Hateful Content Generation in Open-Source Vision Language Models (USENIX'25)

Description: VLM QA pairs of hateful memes
Source: VLMs (InstructBLIP, ShareGPT-4V, LLaVA, CogVLM)
Size: 27,373

Unsafe Images

text-to-image models

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models (CCS'23)

Description: Unsafe images generated by text-to-image models
Source: Generated with prompts from 4chan, Lexica.art, COCO
Size: 800