A smart routing system that picks which LLM to use just by looking at the models' outputs
A method to select the best LLM for different inputs without needing labeled data, using a weak supervision approach that estimates model quality through output comparisons.
-----
https://arxiv.org/abs/2412.04692v1
🤔 Original Problem:
→ Current LLM routing methods require human-annotated data to decide which model works best for which input
→ Engineers need a way to select optimal LLMs for different tasks without expensive labeling
-----
🔧 Solution in this Paper:
→ SMOOTHIE constructs a latent variable graphical model over embeddings of LLM outputs and unknown true outputs
→ It models embedding differences between LLM outputs and true outputs as multivariate Gaussian distributions
→ The method uses other LLM outputs as "voters" to estimate quality through weak supervision
→ SMOOTHIE comes in two variants: Global (estimates one quality score per LLM from all test data) and Local (estimates sample-specific scores from each input's nearest neighbors)
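To make the "voters" idea concrete, here is a minimal Python sketch of a Global-style estimate. It is not the authors' implementation. It assumes each model's embedding error relative to the unknown true output is independent of the others, so the expected squared distance between two models' outputs is the sum of their individual squared errors; with three or more models, each error can then be solved for without labels. The function names `estimate_sq_errors` and `quality_scores`, and the choice of inverse error as the score, are illustrative assumptions.

```python
import numpy as np
from itertools import combinations


def estimate_sq_errors(embeddings):
    """Estimate E||lambda_i - y||^2 for each model without seeing y.

    embeddings: array of shape (m models, n samples, d dims) holding the
    embedded outputs of each model on the same n inputs. Needs m >= 3.
    """
    m = embeddings.shape[0]
    if m < 3:
        raise ValueError("need at least three models to form a triplet")

    # delta[i, j] = mean squared distance between outputs of models i and j.
    diffs = embeddings[:, None, :, :] - embeddings[None, :, :, :]
    delta = (diffs ** 2).sum(axis=-1).mean(axis=-1)

    errors = np.zeros(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        # Under the independence assumption, delta[i, j] = err_i + err_j,
        # so err_i = 0.5 * (delta[i, j] + delta[i, k] - delta[j, k]).
        # Average that estimate over every pair of "voter" models (j, k).
        estimates = [
            0.5 * (delta[i, j] + delta[i, k] - delta[j, k])
            for j, k in combinations(others, 2)
        ]
        errors[i] = np.mean(estimates)
    return errors


def quality_scores(embeddings, eps=1e-8):
    """Illustrative score: inverse of the estimated squared error."""
    errors = np.maximum(estimate_sq_errors(embeddings), eps)
    return 1.0 / errors
```

Routing globally then amounts to picking `np.argmax(quality_scores(embeddings))` once for the whole test set. Clipping the error at `eps` is a guard added here because a finite-sample estimate can dip below zero for a very accurate model.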
-----
🎯 Key Insights:
→ LLM quality can be estimated without labeled validation data
→ Sample-specific routing outperforms global model selection
→ Embedding-based comparison enables unsupervised quality assessment
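Sample-specific routing can be sketched the same way. The Python below is a hedged illustration, not the paper's code: for each input it takes the nearest neighbors in input-embedding space, estimates each model's error on just those neighbors from pairwise output distances, and routes to the model with the lowest estimate. The names `triplet_sq_errors` and `route_local`, the Euclidean distance, and the default `n_neighbors` are all assumptions made for this example.

```python
import numpy as np


def triplet_sq_errors(out_emb):
    """Label-free estimate of each model's mean squared embedding error.

    out_emb: array of shape (m models, n samples, d dims), with m >= 3.
    Assumes model errors are independent, so the squared distance between
    two models' outputs is (in expectation) the sum of their errors.
    """
    m = out_emb.shape[0]
    diffs = out_emb[:, None] - out_emb[None, :]
    delta = (diffs ** 2).sum(axis=-1).mean(axis=-1)  # (m, m) mean sq. distances
    errors = np.empty(m)
    for i in range(m):
        others = [j for j in range(m) if j != i]
        estimates = [
            0.5 * (delta[i, j] + delta[i, k] - delta[j, k])
            for a, j in enumerate(others)
            for k in others[a + 1:]
        ]
        errors[i] = np.mean(estimates)
    return errors


def route_local(input_emb, out_emb, n_neighbors=20):
    """Pick one model index per sample using only that sample's neighborhood.

    input_emb: (n samples, p dims) embeddings of the inputs.
    out_emb:   (m models, n samples, d dims) embeddings of model outputs.
    """
    n = input_emb.shape[0]
    # Pairwise squared Euclidean distances between inputs, shape (n, n).
    sq_dist = ((input_emb[:, None, :] - input_emb[None, :, :]) ** 2).sum(axis=-1)
    choices = np.empty(n, dtype=int)
    for s in range(n):
        # Nearest neighbors of sample s (the sample itself is included).
        nbrs = np.argsort(sq_dist[s])[:n_neighbors]
        local_errors = triplet_sq_errors(out_emb[:, nbrs, :])
        choices[s] = int(np.argmin(local_errors))
    return choices
```

A global selector would return the same model index for every sample; `route_local` can return different indices for different regions of the input space, which is the behavior the second insight describes. The full pairwise distance matrix keeps the sketch short but uses O(n²) memory, so a nearest-neighbor index would be a sensible swap for large test sets.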
-----
📊 Results:
→ Estimated quality scores correlate with ground-truth model quality (correlation = 0.72)
→ Identifies the optimal model on 9 of 14 tasks
→ Outperforms routing baselines by up to 10 accuracy points
→ The Local variant improves performance by up to 7 points over the Global variant