This paper introduces Domaino1s, an LLM framework that enhances reasoning and explainability in high-stakes domains like finance and law through supervised fine-tuning and tree search.
-----
Paper - https://arxiv.org/abs/2501.14431
Solution in this Paper 💡:
→ Domaino1s uses supervised fine-tuning on newly created CoT-stock-2k and CoT-legal-2k datasets.
→ These datasets are built using GPT-4o to decompose reasoning into structured steps.
→ Domaino1s employs Selective Tree Exploration during inference.
→ Selective Tree Exploration uses perplexity to guide the search for optimal reasoning paths at each step.
→ This method balances reasoning quality and computational cost by selectively expanding reasoning paths when needed.
→ A new metric, PROOF-Score, is introduced to evaluate explainability considering reasoning, safety, and factual accuracy.
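The perplexity-guided expansion idea behind Selective Tree Exploration can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `sample_step` stands in for an LLM decoding call returning a candidate reasoning step with its per-token log-probabilities, and the threshold value is illustrative.

```python
import math

def perplexity(logprobs):
    """Perplexity from per-token log-probabilities."""
    return math.exp(-sum(logprobs) / len(logprobs))

def selective_step(sample_step, threshold=2.0, beam_size=3):
    """Choose one reasoning step via selective expansion.

    sample_step() -> (text, per-token logprobs) is a hypothetical
    stand-in for a model call. The tree is expanded (more candidates
    sampled) only when the first sample's perplexity exceeds the
    threshold, trading extra compute for quality only when uncertain.
    """
    text, lps = sample_step()
    ppl = perplexity(lps)
    if ppl <= threshold:
        # Model is confident: keep the single sample, no expansion.
        return text, ppl
    # Model is uncertain: expand the beam and keep the best path.
    candidates = [(text, ppl)]
    for _ in range(beam_size - 1):
        t, l = sample_step()
        candidates.append((t, perplexity(l)))
    return min(candidates, key=lambda c: c[1])

# Toy usage: the first sample is high-perplexity, so the beam expands
# and the lowest-perplexity candidate wins.
samples = iter([
    ("step A", [-2.0, -2.0]),   # ppl ≈ 7.39, above threshold
    ("step B", [-0.1, -0.1]),   # ppl ≈ 1.11
    ("step C", [-1.0, -1.0]),   # ppl ≈ 2.72
])
best, best_ppl = selective_step(lambda: next(samples))
# best == "step B"
```

Applied at every reasoning step, this gives the cost/quality trade-off the paper describes: cheap single-path decoding when the model is confident, wider search only where it hesitates.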
-----
Key Insights from this Paper 🧠:
→ Multi-stage reasoning, like in o1-type models, improves accuracy compared to single-pass Chain-of-Thought.
→ Fine-tuning with structured reasoning data enhances domain-specific reasoning capabilities of Large Language Models.
→ Selective Tree Exploration improves reasoning efficiently by widening the search for better solution paths only when needed.
→ Evaluating Large Language Models in high-stakes domains requires metrics beyond accuracy; explainability is critical.
-----
Results 📈:
→ Domaino1s-legal achieves 88.64% average accuracy on legal reasoning tasks, outperforming baselines like Qwen-2.5-Instruct at 73.86%.
→ Domaino1s-finance achieves 51.98% accuracy and 0.021 MCC on stock investment recommendation, exceeding baselines.
→ Domaino1s achieves the highest PROOF-Score on both stock and legal tasks, indicating superior explainability.
→ Selective Tree Exploration with beam size 3 achieves 89.14% accuracy on the SCALR dataset with 15.18s inference time.