Discussion about this post

Neural Foundry

The Souper-Model approach is fascinating because it shows that thoughtful model merging can actually outperform individual models without adding inference cost. What really stands out is that they treated it as an optimization problem rather than just blending checkpoints randomly. This could change how we think about getting better performance from existing models.
