Reinforcement Learning from Human Feedback (arxiv.org)

30 points by onurkanbkrc 3 hours ago

2 comments:

by klelatti 2 hours ago

Web version with links, etc.:

https://rlhfbook.com/

by verdverm 44 minutes ago

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

Data from: Hacker News, provided by Hacker News (unofficial) API