Making API documentation digestible for LLMs through intelligent chunking and retrieval.
The paper introduces a technique for working within LLM input token limits when processing API documentation: it intelligently chunks OpenAPI specifications and combines RAG with an agent-based approach.
https://arxiv.org/abs/2411.19804
Original Problem 🤔:
→ LLM-based service composition faces strict input token limitations, making it challenging to process comprehensive API documentation efficiently.
→ Current methods struggle to preprocess and chunk API specifications while preserving essential information.
Solution in this Paper 🔧:
→ The paper proposes OpenAPI RAG, a system that creates optimized chunks from API documentation using token-based and LLM-based strategies.
→ A Discovery Agent extends this by decomposing queries into subtasks and retrieving endpoint details on demand.
→ The system employs embedding models for semantic similarity matching, reducing token count while maintaining information relevance.
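The chunking step can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the toy spec, the whitespace "tokenizer", and the token budget are all assumptions standing in for a real OpenAPI document and tokenizer.

```python
import json

def chunk_openapi(spec: dict, max_tokens: int = 200) -> list[str]:
    """Split an OpenAPI spec into one chunk per endpoint (path + method),
    then cap each chunk at a token budget (endpoint splitting)."""
    chunks = []
    for path, methods in spec.get("paths", {}).items():
        for method, details in methods.items():
            text = f"{method.upper()} {path}\n{json.dumps(details)}"
            tokens = text.split()  # whitespace split stands in for a real tokenizer
            for i in range(0, len(tokens), max_tokens):
                chunks.append(" ".join(tokens[i:i + max_tokens]))
    return chunks

# Toy spec for illustration (hypothetical endpoints, not from the paper)
spec = {
    "paths": {
        "/search": {"get": {"summary": "Search for tracks"}},
        "/albums/{id}": {"get": {"summary": "Get an album by ID"}},
    }
}

chunks = chunk_openapi(spec)
for c in chunks:
    print(c[:60])
```

Keeping one endpoint per chunk is what lets the retriever return only the operations relevant to a query instead of the whole specification.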
Key Insights 💡:
→ LLM-based summary strategies outperform naive chunking approaches
→ Agent-based approach significantly reduces token usage while improving precision
→ Token-based chunking with endpoint splitting provides balanced performance
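The retrieval side of the comparison above can be sketched as a similarity ranking over chunks. Here a bag-of-words cosine similarity is a deliberately crude stand-in for the embedding model the paper uses; the chunks and query are illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words term counts
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "GET /search search for tracks by keyword",
    "GET /albums/{id} get an album by id",
    "POST /playlists create a playlist for the user",
]
print(retrieve("search tracks keyword", chunks, k=1))
```

Only the top-k endpoint chunks are passed to the LLM, which is how the token count drops relative to feeding it the full specification.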
Results 📊:
→ Discovery Agent achieved 70.29% precision on Spotify API tests
→ Reduced token count by 62% compared to baseline RAG approach
→ Maintained F1 score of 68.30% while using fewer tokens
→ Demonstrated scalability on the TMDB API, which has 54 endpoints