LLMs struggle in realistic environments due to a lack of high-quality agent data. LEARN-BY-INTERACT synthesizes this data automatically by having LLMs interact with environments and refining instructions based on interaction histories.
-----
Paper - https://arxiv.org/abs/2501.10893
Original Problem 🤔:
→ Existing LLMs perform poorly on complex, real-world agentic tasks.
→ Annotating agent data for training is expensive and hard to scale.
→ Current methods for adapting LLMs to new environments remain limited without such high-quality data.
-----
Solution in this Paper 💡:
→ LEARN-BY-INTERACT creates synthetic training data through LLM interaction with environments.
→ It first generates task instructions via self-instruct, guided by environment documentation.
→ LLMs attempt to complete tasks, generating interaction trajectories.
→ Backward construction then derives refined instructions by summarizing or abstracting sub-trajectories, fixing the misalignment between the initial instructions and the trajectories the LLM actually produced (a minimal sketch follows this list).
→ Agentic retrieval methods optimize how the synthesized data is used, both for training and for In-Context Learning (ICL).
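Here is a minimal sketch of the synthesis loop described above. It assumes a generic `llm(prompt) -> str` helper and a toy `env` with `reset()`/`step()`; all names are illustrative stand-ins, not the authors' implementation.

```python
# Illustrative sketch of the LEARN-BY-INTERACT synthesis loop.
# `llm` and `env` are hypothetical stand-ins: llm(prompt) -> str,
# env.step(action) -> observation.

def synthesize_examples(llm, env, docs, max_steps=20):
    examples = []

    # 1) Self-instruct: draft a task instruction from environment docs.
    seed_instruction = llm(
        f"Given this documentation, propose a realistic task a user might ask:\n{docs}"
    )

    # 2) Roll out a trajectory: the LLM acts in the environment step by step.
    trajectory = []  # list of (action, observation) pairs
    observation = env.reset()
    for _ in range(max_steps):
        action = llm(
            f"Task: {seed_instruction}\nHistory: {trajectory}\n"
            f"Observation: {observation}\nNext action:"
        )
        observation = env.step(action)
        trajectory.append((action, observation))
        if action.strip() == "STOP":
            break

    # 3) Backward construction: re-derive instructions from what actually
    #    happened, so instruction and trajectory are aligned. Each
    #    sub-trajectory becomes its own (instruction, trajectory) example.
    for end in range(1, len(trajectory) + 1):
        sub_traj = trajectory[:end]
        new_instruction = llm(
            "Summarize this interaction history as a concise task instruction "
            f"that it correctly completes:\n{sub_traj}"
        )
        examples.append({"instruction": new_instruction, "trajectory": sub_traj})

    return examples
```

The key design choice is step 3: instead of filtering trajectories that fail the original instruction, the instruction itself is rewritten to match what the agent actually did, turning otherwise noisy rollouts into aligned training or ICL examples.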
-----
Key Insights from this Paper 😲:
→ Backward construction is crucial for generating high-quality training data.
→ Agentic retrieval improves performance and efficiency in ICL (see the retrieval sketch after this list).
→ Synthesized data improves LLM performance across diverse environments.
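A rough sketch of what "agentic" retrieval means here: rather than retrieving once with the raw task description, the LLM iteratively writes its own queries from the current interaction state. `retrieve(query, corpus, k)` and `llm(prompt)` are hypothetical helpers, and this is an approximation of the idea, not the paper's exact pipeline.

```python
# Illustrative sketch of agentic retrieval for ICL demonstrations.
# `llm(prompt) -> str` and `retrieve(query, corpus, k) -> list[dict]`
# are assumed, hypothetical helpers.

def agentic_retrieve(llm, retrieve, corpus, instruction, history, k=5, rounds=2):
    """Let the LLM write its own search queries over the synthesized corpus,
    conditioned on the task and the interaction so far."""
    demos = []
    query = instruction
    for _ in range(rounds):
        # The LLM reformulates the query using the current interaction state.
        query = llm(
            "Write a search query for past agent examples that would help with:\n"
            f"Task: {instruction}\nRecent interaction: {history}\n"
            f"Previous query: {query}"
        )
        demos.extend(retrieve(query, corpus, k))

    # Deduplicate while preserving order, then use as in-context examples.
    seen, unique = set(), []
    for d in demos:
        key = d["instruction"]
        if key not in seen:
            seen.add(key)
            unique.append(d)
    return unique[:k]
```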
-----
Results 💯:
→ Improves baseline results by up to 12.2% for ICL with Claude-3.5 and 19.5% for training with Codestral-22B.
→ Backward construction provides up to 14.0% improvement in training.
→ Agentic retrieval outperforms conventional retrieval methods.