
"Demystifying Domain-adaptive Post-training for Financial LLMs"

A podcast on this paper was generated with Google's Illuminate.

Making LLMs finance-savvy

FINDAP introduces a systematic approach to adapting LLMs for finance, combining a comprehensive evaluation framework with novel training methods that improve domain-specific performance while preserving general capabilities.

-----

https://arxiv.org/abs/2501.04961

Original Problem 🤔:

Domain adaptation of LLMs must balance specialized knowledge against general capabilities. Existing financial LLMs often lack systematic evaluation criteria and effective training strategies across varying data configurations.

-----

Solution in this Paper 🔧:

→ FINDAP first identifies core capabilities required for financial domain expertise: domain knowledge, task performance, reasoning, and instruction-following.

→ It introduces a novel preference data distillation method leveraging process signals from a generative reward model.

→ The solution implements three-stage training: continual pre-training, instruction tuning, and preference alignment.

→ Training uses a mix of domain-specific and general knowledge to prevent catastrophic forgetting.

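The data-mixing idea above can be sketched as a simple sampler. This is an illustrative sketch, not the paper's implementation; the ratio and helper names are assumptions.

```python
import random

def mix_training_data(domain_examples, general_examples, domain_ratio=0.7, seed=0):
    """Build a training mix of domain-specific and general examples.

    Interleaving general data with financial data is one common way to
    mitigate catastrophic forgetting during continual pre-training.
    The default 0.7 ratio is illustrative, not the paper's setting.
    """
    rng = random.Random(seed)
    n_total = len(domain_examples) + len(general_examples)
    n_domain = int(n_total * domain_ratio)
    n_general = n_total - n_domain
    # Sample (with replacement) from each pool, then shuffle the mix.
    mixed = (
        rng.choices(domain_examples, k=n_domain)
        + rng.choices(general_examples, k=n_general)
    )
    rng.shuffle(mixed)
    return mixed

# Usage: blend financial filings with general web text.
finance = ["10-K excerpt", "earnings call"]
general = ["wiki article", "forum post"]
batch = mix_training_data(finance, general, domain_ratio=0.6)
```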
-----

Key Insights 💡:

→ Joint training of continual pre-training and instruction tuning outperforms sequential training

→ Parameter-efficient fine-tuning works well for task adaptation, but knowledge transfer requires full-model training

→ Process rewards from GenRM significantly improve reasoning capabilities

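The process-reward insight can be illustrated with a toy preference-pair builder: step-level scores from a reward model are aggregated to pick chosen/rejected responses. A minimal sketch, assuming a stand-in `reward_fn` for the generative reward model (GenRM); the scoring rule and function names are hypothetical.

```python
def score_steps(reward_fn, steps):
    """Score each reasoning step with a (generative) reward model stub."""
    return [reward_fn(s) for s in steps]

def build_preference_pair(candidates, reward_fn):
    """Select chosen/rejected responses by mean per-step process reward.

    The paper distills such process signals from a GenRM into preference
    data used for alignment; here the aggregation is a simple average.
    """
    scored = []
    for steps in candidates:
        rewards = score_steps(reward_fn, steps)
        scored.append((sum(rewards) / len(rewards), steps))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    chosen, rejected = scored[0][1], scored[-1][1]
    return chosen, rejected

# Toy reward: more detailed steps score higher (illustrative only).
toy_reward = lambda step: len(step)
cand_a = ["Identify the ratio", "Compute ROE = net income / equity"]
cand_b = ["Guess"]
chosen, rejected = build_preference_pair([cand_a, cand_b], toy_reward)
```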
-----

Results 📊:

→ Outperforms all 8B baseline models on 91.13% of financial tasks

→ Achieves state-of-the-art performance on CFA-Challenge with 55.56% accuracy

→ Maintains a competitive MT-Bench score of 7.36, preserving instruction-following capability
