Unboxing ChatGPT: A Deep-Dive on How This AI-Driven Chatbot Was Trained
ChatGPT, OpenAI’s latest dialogue model, has taken the internet by storm, surpassing 1 million users in just 5 days. From seamless chatting to writing poetry, and from producing code to simulating an imaginary OS, its performance is truly mind-blowing. How did conversational AI become so much better so quickly? OpenAI appears to have cracked the nut using Reinforcement Learning from Human Feedback (RLHF), a method that uses human demonstrations and human rankings of model outputs to guide the model toward desired behavior. In this article, we’ll unpack ChatGPT’s training techniques and take a deeper look at what goes on under the hood. Find the wandb article written by me here.