RewardBench lead author here, a couple notes:
* We're working on your training correlation caveat :)
* Now we're at the phase where we are getting closed models added to the benchmark, to show the gap that open models need to close (because good alignment capabilities are important for good societal outcomes).
* The leaderboard reflects a design space that isn't well explored yet: DPO models are overrepresented simply because they're popular to train. I don't expect this to change too much, but more people are already training RMs since release! (A specific training blog post here: https://www.notion.so/Reward-Modeling-for-RLHF-abe03f9afdac42b9a5bee746844518d0?pvs=21)
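(For anyone wondering how a DPO-trained model can be scored on a reward-model benchmark at all: DPO defines an implicit reward as the scaled log-ratio of the policy's likelihood of a response to a frozen reference model's likelihood. A minimal sketch below; the log-probabilities are made-up numbers purely for illustration, not real model outputs.)

```python
def dpo_implicit_reward(logp_policy: float, logp_ref: float, beta: float = 0.1) -> float:
    """Implicit DPO reward for one (prompt, response) pair:
    r(x, y) = beta * (log pi(y|x) - log pi_ref(y|x))."""
    return beta * (logp_policy - logp_ref)

# A preference pair counts as "correct" if the chosen response
# scores higher than the rejected one under the implicit reward.
chosen = dpo_implicit_reward(logp_policy=-12.0, logp_ref=-15.0)    # illustrative numbers
rejected = dpo_implicit_reward(logp_policy=-20.0, logp_ref=-14.0)  # illustrative numbers
pair_correct = chosen > rejected
print(chosen, rejected, pair_correct)
```

So evaluating a DPO model only needs per-response log-probs from the policy and its reference model, no separate reward head.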
Great summary of the benchmark. Keep up the great work.
https://huggingface.co/spaces/allenai/reward-bench
Thanks for the informative comment, and I am looking forward to that correlation analysis (haha, but no rush!)