Unboxing ChatGPT: A Deep-Dive on How This AI-Driven Chatbot Was Trained

ChatGPT, OpenAI’s latest dialogue model, has taken the internet by storm, surpassing 1 million users in just 5 days. It chats fluently, writes poetry and code, and can even play along with an imaginary operating system, and the results are striking. How did conversational AI become so much better so quickly? OpenAI appears to have cracked the nut using Reinforcement Learning from Human Feedback (RLHF), a method that uses human demonstrations and preference comparisons to guide the model toward desired behavior. In this article, we’ll unpack ChatGPT’s training techniques and take a closer look at what goes on under the hood. You can find the full article, which I wrote for Weights & Biases (wandb), here.