Breaking LLM safety filters by splitting dangerous prompts into harmless-looking sequential questions.
Multi-round jailbreak attack on large language models.