Introduction to OpenAI’s Achievements
OpenAI is the company that created ChatGPT. It launched its first generative pre-trained transformer model, named GPT-1, in 2018. That is what the GPT in ChatGPT stands for: Generative Pre-trained Transformer. Two big datasets were used as training data.
The first was Common Crawl, a dataset of millions of internet webpages. The second, named BookCorpus, contained about 11,000 books. GPT-1 was a huge achievement in Natural Language Processing, but it had many shortcomings: it generated a lot of repetitive text, and it could only handle fairly short text sequences. Still, it worked as a foundation.
Advancements with GPT-2 and GPT-3
In February 2019, OpenAI launched its successor, GPT-2. It had 1.5 billion parameters, while GPT-1 had only 117 million. Parameters are basically the internal values the GPT model adjusts to recognize patterns. For example, in the linear equation y = mx + b, x and y are the variables, while m and b are the parameters.
With more parameters, patterns in a big training dataset can be recognized more easily, and more accurate predictions can be made. Then, in 2020, came the GPT-3 model, which used about 100 times more parameters: 175 billion. This is where the biggest improvement can be seen.
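To make the variable/parameter distinction concrete, here is a minimal sketch that fits the line y = mx + b to data with gradient descent. The x and y values are the variables (the data), while m and b are the parameters the model learns. This is an illustrative toy, not how GPT itself is trained.

```python
# Fit y = m*x + b by gradient descent: x, y are variables (data),
# m, b are parameters (values the model learns).
def fit_line(xs, ys, lr=0.01, steps=5000):
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to m and b
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

# Data generated from y = 2x + 1; the fit recovers m close to 2 and b close to 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
m, b = fit_line(xs, ys)
```

A language model does the same thing at vastly larger scale: instead of two parameters, GPT-2 adjusts 1.5 billion of them.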
The training data fed to the GPT-3 model included not only Common Crawl and BookCorpus, but also Wikipedia pages and many other books and articles. The total training dataset for GPT-3 came to about 570 GB. For comparison, GPT-2 was trained on roughly 40 GB of data.
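The scale jumps described above can be sanity-checked with quick arithmetic, using the parameter counts quoted in this article:

```python
# Parameter counts as quoted above.
gpt1_params = 117e6   # GPT-1: 117 million
gpt2_params = 1.5e9   # GPT-2: 1.5 billion
gpt3_params = 175e9   # GPT-3: 175 billion

# GPT-3 has roughly 117x the parameters of GPT-2 --
# the "about 100 times more" mentioned above.
print(gpt3_params / gpt2_params)  # ≈ 116.7
print(gpt2_params / gpt1_params)  # ≈ 12.8
```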
The Emergence of ChatGPT and Reinforcement Learning
The next revolutionary step came when OpenAI released ChatGPT, powered by GPT-3.5, to the public in November 2022. It was only after this that the whole world came to know about ChatGPT. There was a very critical difference between GPT-3 and GPT-3.5: GPT-3.5 used a specific technique called Reinforcement Learning from Human Feedback (RLHF).
With this technique implemented on top of the transformer architecture, ChatGPT became even more accurate and fluent, to the point that it started to look like magical technology. So, let's understand this RLHF technique in detail.
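As a preview, the core RLHF idea can be sketched in a few lines of toy Python: humans rank candidate responses, a reward model scores responses to match those rankings, and the language model is then tuned to prefer high-reward outputs. Everything below (the function names and the length-based ranking heuristic) is a simplified illustration, not OpenAI's actual implementation.

```python
# Toy RLHF loop: human preference ranking -> reward signal -> policy update.
# All names and the ranking heuristic here are illustrative only.

def human_ranks(responses):
    # Stand-in for real human feedback: pretend annotators prefer
    # longer, more detailed answers.
    return sorted(responses, key=len, reverse=True)

def reward(response, ranked):
    # Reward = how highly humans ranked this response (best = highest score).
    return len(ranked) - ranked.index(response)

candidates = [
    "42.",
    "The answer is 42.",
    "The answer to your question is 42, and here is why.",
]
ranked = human_ranks(candidates)
rewards = {r: reward(r, ranked) for r in candidates}
best = max(candidates, key=rewards.get)
# A real system would now update the model's weights (e.g. with PPO)
# so that high-reward responses become more likely.
```

The key design point is that the reward signal comes from human judgments rather than from the raw text-prediction objective, which is what pushed ChatGPT toward answers people actually find helpful.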