Parallel tool calls: The key to unlocking the full potential of LLMs in real-world tasks.
DTA-Llama (Divide-Then-Aggregate) improves LLM tool use by enabling parallel tool calls, boosting both efficiency and performance. It transforms tree-based tool-call sequences into DAGs, trains on the resulting parallel data, and uses a Process/Threads framework for parallel inference.
-----
https://arxiv.org/abs/2501.12432
Original Problem 😥:
→ Current LLMs struggle with efficient tool use for complex real-world tasks.
→ Existing methods like CoT/ReAct use serial tool calls, limiting their scope and efficiency.
→ Tree-based methods suffer from backtracking, increasing cost and time.
-----
Solution in this Paper 💡:
→ DTA-Llama allows parallel tool calls within each round of tool planning.
→ It transforms tree-based tool sequences into Directed Acyclic Graphs (DAGs).
→ It creates DTA-Tool, a parallel tool invocation dataset based on ToolBench.
→ DTA-Llama is trained on this dataset to perform divide-then-aggregate tool invocation.
→ A Process/Threads framework handles parallel tool invocation during inference: the Process plans each round and divides the task into independent tool calls, and the Threads execute those calls concurrently (see the sketch after this list).
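Below is a minimal Python sketch of the divide-then-aggregate loop, not the paper's implementation. Everything here is assumed for illustration: `DEPS` is a hypothetical tool plan already expressed as a DAG, `call_tool` stands in for a real tool API, and each DAG level plays the role of one planning round (divide), with its independent calls run on threads and their outputs collected (aggregate).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical plan: the two searches are independent; booking needs both.
# A serial agent would make these three calls in three rounds; here it takes two.
DEPS = {
    "search_flights": [],
    "search_hotels": [],
    "book_trip": ["search_flights", "search_hotels"],
}

def call_tool(name, inputs):
    """Placeholder for a real tool/API call."""
    return f"{name} result (given {inputs})"

def dag_levels(deps):
    """Group nodes into levels; nodes within a level are mutually independent."""
    levels, done = [], set()
    while len(done) < len(deps):
        ready = [n for n in deps
                 if n not in done and all(d in done for d in deps[n])]
        done.update(ready)
        levels.append(ready)
    return levels

def divide_then_aggregate(deps):
    results = {}
    for level in dag_levels(deps):           # divide: one round per DAG level
        with ThreadPoolExecutor() as pool:   # Threads: run this round's calls concurrently
            futures = {n: pool.submit(call_tool, n,
                                      [results[d] for d in deps[n]])
                       for n in level}
        results.update({n: f.result() for n, f in futures.items()})
    return results                           # aggregate: all tool outputs

print(divide_then_aggregate(DEPS))
```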
-----
Key Insights from this Paper 🤔:
→ Parallel tool invocation can significantly improve LLM efficiency and performance.
→ The DAG structure enables more efficient tool planning than tree-based methods (a toy round-count illustration follows this list).
→ The Process/Threads framework provides a robust mechanism for parallel inference.
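A toy round-count illustration (hypothetical dependency graph, not from the paper): a serial agent needs one planning round per tool call, whereas a DAG-level executor needs only as many rounds as the DAG is deep.

```python
# Hypothetical 6-call plan: serial calling needs 6 rounds;
# level-wise DAG calling needs rounds equal to the DAG's depth.
deps = {"a": [], "b": [], "c": [], "d": [], "e": ["a", "b"], "f": ["c", "d", "e"]}

def depth(node):
    # A node's round index is 1 + the latest round among its dependencies.
    return 1 + max((depth(d) for d in deps[node]), default=0)

print(max(depth(n) for n in deps))  # -> 3 rounds instead of 6
```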
-----
Results 📊:
→ DTA-Llama built on Llama2-7B achieves performance comparable to GPT-3.5's function calling.
→ Reduces token consumption and inference time compared to existing methods.
→ Shows improvements in solvable pass rate (SoPR) and solvable win rate (SoWR).