LLMs see structure in seemingly nonsensical prompts, revealing how they actually process language
Machine-generated prompts that look random actually influence LLM outputs through interpretable patterns, making them less mysterious than previously thought.
-----
https://arxiv.org/abs/2412.08127
🤔 Original Problem:
→ LLMs respond predictably to algorithmically generated prompts that appear unintelligible to humans, raising concerns about potential misuse and revealing gaps in our understanding of how LLMs process language
-----
🔍 Solution in this Paper:
→ The researchers analyzed opaque machine-generated prompts across 3 LLMs of different sizes and families
→ They found that the prompt's last token plays a crucial role, strongly steering the generated continuation
→ Several tokens act as "fillers" that can be removed without affecting the continuation
→ Non-filler tokens work like keywords, influencing the semantic content of the output without forming strict syntactic relationships (see the pruning sketch below)
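A minimal sketch of the leave-one-out ablation idea behind these findings. The model name, the toy prompt, and the exact-match criterion for "no impact" are placeholder assumptions, not the paper's exact setup:

```python
# Hedged sketch: leave-one-out token ablation to spot "filler" tokens.
# Model, prompt, and the exact-match criterion are assumptions, not the paper's protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper studies three LLMs of different sizes/families
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def continuation(prompt_ids, max_new_tokens=20):
    """Greedy continuation for a list of token ids."""
    input_ids = torch.tensor([prompt_ids])
    with torch.no_grad():
        out = model.generate(input_ids, max_new_tokens=max_new_tokens,
                             do_sample=False, pad_token_id=tok.eos_token_id)
    return tok.decode(out[0, len(prompt_ids):], skip_special_tokens=True)

prompt = "some machine-generated prompt"  # placeholder for an opaque optimized prompt
ids = tok.encode(prompt)
reference = continuation(ids)

# A token counts as a "filler" here if dropping it leaves the continuation unchanged.
fillers = []
for i in range(len(ids)):
    ablated = ids[:i] + ids[i + 1:]
    if continuation(ablated) == reference:
        fillers.append((i, tok.decode([ids[i]])))

print("filler tokens:", fillers)
```

Tokens whose removal leaves the greedy continuation unchanged behave as fillers; the remaining tokens act as the keyword-like tokens described above.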
-----
🎯 Key Insights:
→ Over 60% of machine-generated prompts can be pruned, removing an average of 1.9 out of 10 tokens
→ Pruned tokens are more often non-linguistic (32.9%) than kept tokens (24.5%)
→ The last token has a strong natural-language connection to the continuation (see the sketch after this list)
→ Natural language prompts show similar properties when subjected to pruning
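To probe the last-token claim, one can ablate only the final token and check whether the continuation survives. A minimal sketch, reusing the `continuation` helper and `ids` from the sketch above (the exact-match test is an assumption):

```python
# Hedged sketch: does dropping the prompt's final token change the greedy continuation?
def last_token_matters(ids):
    """True if ablating the final token changes the continuation."""
    return continuation(ids[:-1]) != continuation(ids)

print("last token matters:", last_token_matters(ids))
```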
-----
📊 Results:
→ 99% of natural-language prompts can be pruned while maintaining the continuation
→ The last-token position resists pruning in 95% of cases
→ Shuffling the prompt's tokens leads to an average BLEU score of only 0.02-0.05 (sketch below)
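A minimal sketch of the shuffling check, reusing `continuation`, `ids`, and `reference` from the first sketch; the NLTK smoothing choice and number of trials are assumptions rather than the paper's evaluation setup:

```python
# Hedged sketch: shuffle prompt tokens and score the new continuation against the
# original one with sentence-level BLEU.
import random
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def shuffle_and_score(ids, reference, n_trials=10):
    """Average BLEU of continuations from shuffled prompts vs. the original continuation."""
    smooth = SmoothingFunction().method1
    scores = []
    for _ in range(n_trials):
        shuffled = ids[:]            # copy the prompt's token ids
        random.shuffle(shuffled)     # destroy token order
        hyp = continuation(shuffled)
        scores.append(sentence_bleu([reference.split()], hyp.split(),
                                    smoothing_function=smooth))
    return sum(scores) / n_trials

print("avg BLEU after shuffling:", shuffle_and_score(ids, reference))
```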