Numbering the words, sentences, or paragraphs an LLM reads and writes gives it explicit awareness of position.
Adding these position markers to LLM inputs and outputs enables exact length control and accurate copy-paste operations.
📚 https://arxiv.org/abs/2410.07035
Original Problem 🔍:
LLMs struggle with length control and precise copy-paste operations because they lack positional awareness.
The authors trace this root cause to two factors: LLMs operate on tokens rather than words, and their training data rarely imposes strict length constraints, so they never learn to count what they read or write.
-----
Solution in this Paper 🛠️:
• PositionID Prompting: Assigns sequential IDs to words/sentences/paragraphs during generation (a code sketch follows this list)
• PositionID Fine-Tuning: Trains models on a mix of normal and PositionID modes
• PositionID CP Prompting: Enables accurate copy-paste via a three-stage tool-use mechanism (sketched at the end of this post)
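To make the prompting idea concrete, here is a minimal word-level sketch. The bracket format and prompt wording are my own illustration, not the paper's exact template:

```python
import re

def build_positionid_prompt(instruction: str, target_words: int) -> str:
    """Ask the model to tag each word it writes with a sequential ID,
    so it can count as it generates and stop at the target length.
    (Hypothetical template, not the paper's exact prompt.)"""
    return (
        f"{instruction}\n\n"
        f"Write exactly {target_words} words. Append each word's position "
        f"in square brackets as you write, e.g. 'The[1] quick[2] brown[3]', "
        f"and stop once you reach [{target_words}]."
    )

def strip_position_ids(tagged_text: str) -> str:
    """Remove the [N] markers to recover clean output text."""
    return re.sub(r"\[\d+\]", "", tagged_text)
```

Because each ID reflects the running word count, the model gets explicit length feedback at every step instead of having to track it implicitly across tokens.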
-----
Key Insights from this Paper 💡:
• Explicit positional awareness enhances LLMs' length control and copy-paste abilities
• PositionID techniques work for both closed-source and open-source models
• Mixed-mode training transfers positional awareness to normal generation mode (see the data sketch below)
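How might that mixed-mode training data be built? A hedged sketch; the field names and the 50/50 mixing ratio are assumptions for illustration, not the paper's exact recipe:

```python
import random

def make_training_example(instruction: str, response: str,
                          positionid_ratio: float = 0.5) -> dict:
    """Emit a training example in PositionID mode or normal mode at random.
    Mixing both modes is what lets the positional awareness learned from
    annotated examples transfer to plain, unannotated generation.
    (Field names and ratio are illustrative.)"""
    if random.random() < positionid_ratio:
        tagged = " ".join(f"{w}[{i}]"
                          for i, w in enumerate(response.split(), start=1))
        return {"instruction": instruction + " Tag each word with its position.",
                "response": tagged}
    return {"instruction": instruction, "response": response}
```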
-----
Results 📊:
• PositionID Prompting: Best ROUGE-L (23.2) and MAE scores at every granularity level
• PositionID Fine-Tuning: Outperforms CFT and InstructCTG on MAE metrics
• PositionID CP Prompting: 80.8% CP success rate, 18.4 ROUGE-L, 8.4 PPL
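And for the copy-paste mechanism, a rough sketch of the tool the model would call in the final stage. The function name and the 1-indexed, inclusive span convention are assumptions:

```python
def copy_span(source_text: str, start_id: int, end_id: int) -> str:
    """Stage-3 tool call: return the words between two position IDs
    (1-indexed, inclusive). In stages 1-2 the model annotates the source
    with IDs, then emits the (start_id, end_id) span it wants to copy."""
    words = source_text.split()
    return " ".join(words[start_id - 1:end_id])

# Example: copy_span("The quick brown fox jumps", 2, 4) -> "quick brown fox"
```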