Discussion about this post

Neural Foundry

The OlmoTrace feature is what makes this release genuinely different from the usual open-weight releases. Being able to map model outputs back to specific training data means we can actually debug why a model behaves a certain way instead of just throwing more data at the problem. The fact that the Think model achieves competitive performance with 6x fewer training tokens suggests they're finding efficiencies in the training pipeline that everyone else is missing.

