Rohan's Bytes
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment
Rohan Paul
Nov 10, 2024
GenARM guides LLMs using token-level rewards without retraining the base model.
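To make the one-line summary concrete, here is a minimal sketch of token-level reward guidance at decoding time. It assumes the guidance works by adding a scaled per-token reward to the frozen base model's next-token logits before sampling; the post itself does not spell out the formula. The function names, the weight `beta`, and the toy numbers are all hypothetical and only illustrate the idea that the base model's weights are never updated.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def guided_next_token_probs(base_logits, token_rewards, beta=1.0):
    """Combine frozen base-model logits with per-token rewards.

    Assumption (not stated in the post): guidance is additive in logit
    space, with rewards scaled by 1 / beta. A smaller beta means the
    reward signal has more influence on the next-token distribution.
    """
    if len(base_logits) != len(token_rewards):
        raise ValueError("base_logits and token_rewards must have equal length")
    if beta <= 0:
        raise ValueError("beta must be positive")
    combined = [b + r / beta for b, r in zip(base_logits, token_rewards)]
    return softmax(combined)


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Toy vocabulary of three tokens (illustrative numbers only).
base_logits = [2.0, 1.0, 0.0]    # what the frozen base model prefers
token_rewards = [0.0, 2.0, 0.0]  # hypothetical reward scores per candidate token

base_choice = argmax(softmax(base_logits))
guided_probs = guided_next_token_probs(base_logits, token_rewards, beta=1.0)
guided_choice = argmax(guided_probs)
```

In this toy case the base model alone would pick token 0, while the reward-guided distribution favors token 1. Only the decoding step changes; nothing about the base model is retrained.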