Smart model combination at inference time stops LLMs from reproducing memorized copyrighted content.
CP-Fuse, proposed in this paper, adaptively combines models trained on disjoint sets of copyrighted material at inference time, preventing LLMs from regurgitating protected content while maintaining generation quality.
-----
https://arxiv.org/abs/2412.06619v1
🤔 Original Problem:
LLMs often reproduce copyrighted training data verbatim, exposing providers to infringement claims and lawsuits. Existing safeguards such as differentially private training are computationally expensive and degrade model performance.
-----
🔧 Solution in this Paper:
→ CP-Fuse combines the outputs of multiple LLMs, each trained on a disjoint subset of the copyrighted material, at inference time
→ It adaptively aggregates the models' logits according to a balancing property that minimizes reproduction of protected content
→ When one model starts to dominate generation, CP-Fuse shifts weight to the other models, cutting off memorized continuations
→ The approach works post hoc, so it integrates seamlessly with other protection techniques (a decoding sketch follows this list)
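To make the mechanism concrete, here is a minimal, illustrative Python sketch of one fused decoding step. It is not the authors' code: the function name `cp_fuse_step`, the grid search over a single weight α, and the greedy token choice are my simplifications of the paper's adaptive fusion.

```python
import torch
import torch.nn.functional as F

def cp_fuse_step(logits_a, logits_b, cum_lp_a, cum_lp_b, grid_size=21):
    """One fused decoding step (illustrative sketch, not the paper's code).

    logits_a, logits_b: 1-D vocab-sized next-token logits from two
    models trained on disjoint copyright splits. cum_lp_a, cum_lp_b:
    cumulative log-probabilities of the prefix generated so far under
    each model. Returns (token_id, new_cum_lp_a, new_cum_lp_b).
    """
    log_pa = F.log_softmax(logits_a, dim=-1)
    log_pb = F.log_softmax(logits_b, dim=-1)

    best = None
    for alpha in torch.linspace(0.0, 1.0, grid_size):
        # Weighted geometric mean of the two next-token distributions.
        fused = alpha * log_pa + (1.0 - alpha) * log_pb
        fused = fused - torch.logsumexp(fused, dim=-1)
        tok = int(fused.argmax())  # greedy choice under this weight
        # Score the weight by the LARGER cumulative log-likelihood:
        # a continuation memorized by one model is extremely likely
        # under that model alone, so minimizing the max penalizes it.
        obj = max(cum_lp_a + float(log_pa[tok]),
                  cum_lp_b + float(log_pb[tok]))
        if best is None or obj < best[0]:
            best = (obj, tok, float(log_pa[tok]), float(log_pb[tok]))

    _, tok, lpa, lpb = best
    return tok, cum_lp_a + lpa, cum_lp_b + lpb

# Toy usage: random logits stand in for two fine-tuned LMs.
torch.manual_seed(0)
la, lb = torch.randn(32_000), torch.randn(32_000)
tok, cum_a, cum_b = cp_fuse_step(la, lb, cum_lp_a=0.0, cum_lp_b=0.0)
```

Because the search runs per token over logits the models already produce, the overhead is a handful of vector operations per step, which is what lets the method stay purely post hoc.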
-----
💡 Key Insights:
→ Model fusion can effectively prevent copyright violations without compromising generation quality
→ Adaptive weighting based on the generation history is crucial for suppressing regurgitation
→ Post-processing approaches can be more practical than training-time methods
→ The balancing property mathematically guarantees reduced regurgitation (a paraphrased formalization follows this list)
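Roughly formalized (my paraphrased notation, not the paper's exact statement): with models p₁, p₂ fine-tuned on disjoint copyrighted splits and y₁:ₜ the tokens generated so far, CP-Fuse picks per-step weights to solve a min-max problem over the fused distribution.

```latex
% Paraphrase of the CP-Fuse balancing objective (assumed notation).
\min_{\alpha_t \in [0,1]} \; \max_{i \in \{1,2\}} \; \log p_i(y_{1:t}),
\qquad
p_{\alpha_t}(y_t \mid y_{<t}) \;\propto\;
    p_1(y_t \mid y_{<t})^{\alpha_t} \, p_2(y_t \mid y_{<t})^{1-\alpha_t}.
```

At the optimum the two log-likelihoods stay roughly balanced; since a memorized string is highly likely under exactly one model, the balanced mixture keeps its probability low.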
-----
📊 Results:
→ Reduces copyrighted content reproduction by over 25x compared to baselines
→ Matches the pass@1 code-generation scores of the original models
→ Matches the text-generation fluency scores of unprotected models
→ Shows robustness against prompt-based extraction attacks