Microsoft announced today that it will include results from a Large Language Model based on GPT-3 in Bing results. They will also release a new version of the Edge browser that will include a ChatGPT-like bot.
GPT-3 has been around for more than two years. What has caused this sudden leap forward in the capabilities of Large Language Models 🤔?
The answer is – *Reinforcement Learning From Human Feedback* or RLHF.
By combining the capabilities of a large language model with those of another model trained on end users' preferences, we end up with the uncannily accurate results that ChatGPT seems to produce.
Ok – but how does RLHF work? Let me try and explain with a (ridiculous) analogy.
In the Star Trek series, the Replicator is a device that can produce pretty much anything on demand.
When Captain Picard says, “Tea, Earl Grey, Hot!” it produces the perfect cup of tea. But how might you train a Replicator? With RLHF, of course!
Let’s see how:
1. Feed the Replicator with all the beverage recipes in the known universe.
2. Train it to predict the recipe when given a prompt. For example, when a user says “Tea, Earl Grey, Hot!”, it should be able to predict what goes into the beverage.
3. Train *another* model – let’s call it the “Tea Master 2000” – with Captain Picard’s preferences.
4. When the Replicator generates a beverage, the Tea Master responds with a score. +10 for a perfect cup of tea, -10 for mediocre swill.
5. We now use Reinforcement Learning (RL) to optimize the Replicator to get a perfect ten score.
6. After much optimization, the Replicator can generate the perfect cup of tea – tuned to Captain Picard’s preferences.
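
Here is a toy sketch of steps 3 to 6 in Python. Everything in it is made up for illustration: the `RECIPES` menu, the hard-coded `tea_master_score` function (a real preference model would be learned from human ratings), and the plain REINFORCE update, which stands in for the more elaborate RL algorithms, such as PPO, that real RLHF pipelines tend to use. Treat it as a sketch of the idea, not as how any production system works.

```python
import math
import random

# Hypothetical menu the "pretrained" Replicator already knows how to make.
RECIPES = ["Earl Grey, hot", "Earl Grey, lukewarm", "Chamomile, hot", "Coffee, hot"]


def tea_master_score(recipe):
    """Stand-in for the Tea Master 2000: scores a beverage from -10 to +10.

    Assumption: scores are hard-coded here. In real RLHF this would be a
    learned preference model.
    """
    scores = {"Earl Grey, hot": 10.0, "Earl Grey, lukewarm": 0.0}
    return scores.get(recipe, -10.0)


def softmax(logits):
    """Turn raw logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_replicator(steps=2000, lr=0.01, seed=0):
    """Tune the Replicator's recipe probabilities with a REINFORCE update."""
    rng = random.Random(seed)
    logits = [0.0] * len(RECIPES)  # start with no preference among recipes
    baseline = 0.0                 # running average score, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        # Step 4: the Replicator makes a beverage and the Tea Master scores it.
        choice = rng.choices(range(len(RECIPES)), weights=probs)[0]
        score = tea_master_score(RECIPES[choice])
        advantage = score - baseline
        # Step 5: nudge the logits so well-scored beverages become more likely.
        # For a softmax policy, d log p(choice) / d logit_i = 1[i == choice] - p_i.
        for i in range(len(logits)):
            indicator = 1.0 if i == choice else 0.0
            logits[i] += lr * advantage * (indicator - probs[i])
        baseline += 0.1 * (score - baseline)
    return dict(zip(RECIPES, softmax(logits)))


if __name__ == "__main__":
    # Step 6: after training, most of the probability should sit on Picard's order.
    for recipe, prob in train_replicator().items():
        print(f"{recipe:22s} {prob:.3f}")
```

The Replicator starts out equally likely to serve any recipe it knows. Each round it serves one, gets a score from the Tea Master, and shifts its probabilities toward whatever scored above its running average.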
If you substitute the Replicator with an LLM like GPT-3, and substitute the Tea Master with another ML model called the *Preference* model, then you have seen RLHF in action!
It is a lot more complicated, but I will take any opportunity to generate Star Trek TNG-themed content 🖖.
Further Reading
Hugging Face has a fantastic blog post explaining RLHF in detail: https://huggingface.co/blog/rlhf
For those more visually inclined, Hugging Face also has a YouTube video about RLHF: https://www.youtube.com/live/2MBJOuVq380?feature=share
Anthropic has a paper that goes into a lot of detail on how they use RLHF to train their AI Assistant: https://arxiv.org/abs/2204.05862