
Someone finally made clear the status of GPT!

$Microsoft (MSFT.US)$ Watching Andrej Karpathy's presentation from today and taking Twitter notes, come along for the ride:
Andrej Karpathy starts with the training stages:
1 - Pre-training - months on thousands of GPUs
2, 3, 4 - Fine-tuning stages that take hours or days
Before pre-training happens, there are 2 preparation steps.
Data collection - gather tons of data from different sources (here Andrej shows the LLaMA data mixture)
Tokenization - a lossless translation between pieces of words and integers.
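Not from the talk, but to make the tokenization step concrete, here's a tiny sketch using OpenAI's tiktoken library; the round trip back to the exact original string is what makes it lossless:

```python
# Minimal tokenization sketch (my illustration, not from the talk), using OpenAI's tiktoken.
import tiktoken

enc = tiktoken.get_encoding("gpt2")   # the BPE vocabulary used by GPT-2
tokens = enc.encode("Someone finally made clear the status of GPT!")
print(tokens)              # a list of integers, one per token
print(enc.decode(tokens))  # decoding restores the exact original string
```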
"You shouldn't judge the power of the model just by the number of parameters it contains"
LLaMA was trained on 1-1.4 trillion tokens vs 300B tokens for GPT-3.
"I don't have enough time to go into how transformers work unfortunately" Gotta love Andrej thirst for teaching!
I cannot summarize this into a tweet tbh.
Here's an example from NYT, which trained a GPT model on Shakespeare.
You can see continued improvement over many iterations as the LM gets better at predicting which word would come next in a Shakespeare text.
Ok, STRONGLY paraphrasing here, but: every iteration, the model in training tries to predict which token/integer comes next after the green one (in the image), and this is captured by the training curve - how well it is able to predict the next tokens compared to the original text.
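A minimal sketch of that next-token objective (my illustration in PyTorch, not Andrej's code): the loss at each position is just how badly the model predicted the token that actually came next.

```python
# Next-token prediction loss sketch (my illustration, not code from the talk).
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq_len, vocab) model outputs; tokens: (batch, seq_len) input ids.
    Position t is trained to predict the token at position t + 1."""
    pred = logits[:, :-1, :]   # predictions for positions 0..T-2
    target = tokens[:, 1:]     # the "next" tokens at positions 1..T-1
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)), target.reshape(-1))
```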
Around GPT-2, the industry noticed that if we structure our prompts in a specific way and provide a few examples (few-shot prompting), then the base model will be "tricked" into autocompleting the instructions we provided in the prompt.
Andrej repeats this several times: the best open-source model to learn from right now is probably LLaMA from
$Meta Platforms (META.US)$ AI (since OpenAI didn't release anything about GPT-4).
GPT-2 - released, including weights
GPT-3 - base model available via API (davinci)
GPT-4 - base model not available via API
Base models are not assistants; they don't "do what you ask them" in the basic sense. They just autocomplete text.
But if you structure your document with few-shot prompts, it will "trick" the base model into thinking it is autocompleting a chat between an AI and a human.
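As an illustration (my own toy example, not a prompt from the talk), a document structured like this can coax a pure base model into acting like an assistant, because the most likely continuation is simply another helpful reply:

```python
# Hypothetical few-shot "fake chat transcript" prompt (my example, not from the talk).
# A base model only autocompletes, so the likeliest continuation of this document
# is another helpful assistant turn.
prompt = """The following is a conversation between a human and a helpful AI assistant.

Human: What is the capital of France?
Assistant: The capital of France is Paris.

Human: How many legs does a spider have?
Assistant: A spider has eight legs.

Human: Why is the sky blue?
Assistant:"""
# Send `prompt` to a base completion model and it will most likely answer the last question.
```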
But this trick is not enough. So we're moving to step 2:
Supervised Fine-tuning.
Collect small but high-quality (think human contractors) datasets of instruction-response pairs,
then continue training the model on this swapped-in dataset, and we get the SFT (supervised fine-tuning) model.
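Roughly speaking, the objective stays the same next-token prediction; only the data is swapped. Here's a sketch of how one instruction-response pair might be turned into a training example (my formatting assumption, not Andrej's exact recipe):

```python
# SFT example-building sketch (my assumption about the formatting, not from the talk).
# `tokenizer` is any tokenizer with an encode() method returning a list of ints.
def build_sft_example(prompt: str, response: str, tokenizer) -> dict:
    prompt_ids = tokenizer.encode(prompt)
    response_ids = tokenizer.encode(response)
    input_ids = prompt_ids + response_ids
    # Common trick: mask prompt tokens with -100 so the loss is only computed
    # on the contractor-written response the model should imitate.
    labels = [-100] * len(prompt_ids) + response_ids
    return {"input_ids": input_ids, "labels": labels}
```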
The SFT model is... not great yet, definitely not ChatGPT quality. So the training continues:
generate outputs to questions with the SFT model, have people review and compare 3 versions and rank which was the best, and then retrain the model on those human selections.
This is done by weighting the better-voted responses more heavily. For example, when you hit thumbs-up or thumbs-down in ChatGPT, or choose to regenerate a response, those signals are great for RLHF.
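Under the hood this usually means training a reward model on those pairwise preferences; a minimal sketch of the standard ranking loss (my illustration, not code from the talk):

```python
# Reward-model ranking loss sketch (my illustration, not from the talk).
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Scalar rewards the model assigned to the human-preferred and the rejected response.
    Minimizing this pushes the preferred response's reward above the rejected one's."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()
```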
Andrej goes into the potential reasons why RLHF models "feel" better to us, at least in terms of being a good assistant.
Here again, if anyone's still reading, I'll refer you to the video.
Interestingly, Andrej notes that RLHF models are not strictly improvements over base models. RLHF models have lower entropy, so they are potentially less "inventive".
For that, base models are still better because they are still chaotic.
This is the current state of models as ranked by folks from Berkeley, based on ELO ratings.
Interestingly, Karpathy says that GPT-4 is the best "by far", but on the chart it's 1274 to Claude's 1224 ELO rating, which doesn't seem "by far".
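For what it's worth, a quick back-of-the-envelope check (my calculation, not from the talk) of what a 50-point ELO gap means in head-to-head terms:

```python
# Expected win probability from an ELO gap (my calculation, not from the talk).
def elo_win_prob(r_a: float, r_b: float) -> float:
    """Expected probability that the player rated r_a beats the player rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

print(elo_win_prob(1274, 1224))  # ~0.57, i.e. GPT-4 preferred roughly 57% of the time
```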
RLHF models are ranked higher: all of the top 3 are RLHF models, and the rest (to his knowledge) are SFT models.
Woohoo! We're through the first half of the talk. Moving on to applying these models to problems.
Andrej then goes fairly in depth into how differently a human and a GPT go about writing a statement like
"California's population is 53 times that of Alaska"
A human brain goes through loops, fact checks, calculation, and reflection.
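(Rough sanity check of that claim with approximate 2020 census figures - my numbers, not from the slide: California has roughly 39.5M people, Alaska roughly 0.73M, and 39.5 / 0.73 is about 54, so "53 times" holds up.)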
A GPT, meanwhile, is just trying to autocomplete; there is no internal dialogue in a GPT.
It spends the same amount of "compute" per token, no matter whether the token is a number it needs to look up or a fact it needs to check. On the other hand, GPTs have vast knowledge and a perfect working memory (the context window).
Methods like chain of thought provide models with "more tokens", or "more time to think", by asking "let's think step by step",
which makes the model show its work and gives it "time to think" toward a better answer.
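A hypothetical example of what that looks like in practice (my example, not a prompt from the talk):

```python
# Chain-of-thought style prompt (my example, not from the talk).
# Appending "Let's think step by step." lets the model spend tokens on intermediate
# reasoning before committing to a final answer.
question = "A jacket costs $120 and is discounted by 25%. What is the sale price?"
cot_prompt = f"Q: {question}\nA: Let's think step by step."
# Fed to the model, this typically produces the intermediate steps
# (25% of 120 is 30, 120 - 30 = 90) before the final answer of $90.
```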
Now Andrej goes into self-reflection as a method.
Models can get "stuck" because they have no way to take back tokens they have already sampled.
Imagine yourself saying the wrong word, stopping yourself mid-sentence with "let me rephrase", and restarting the sentence.
Models don't have that luxury, so they can get stuck down that wrong path...
But examples like self-reflection show that asking the model to review its own output and judge it gives the model a "second chance", another pass over the reasoning of the output, which improves results!
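A sketch of what such a second pass could look like (my illustration; `complete` is a hypothetical function that sends a prompt to an LLM and returns its text completion, not an API from the talk):

```python
# Self-reflection sketch (my illustration, not from the talk).
# `complete` is a hypothetical prompt-in, text-out LLM call.
def answer_with_reflection(question: str, complete) -> str:
    draft = complete(f"Question: {question}\nAnswer:")
    critique = complete(
        f"Question: {question}\nDraft answer: {draft}\n"
        "Review the draft. List any mistakes or missing requirements."
    )
    return complete(
        f"Question: {question}\nDraft answer: {draft}\nCritique: {critique}\n"
        "Write an improved final answer:"
    )
```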
I love it: Andrej applies the Thinking, Fast and Slow analogy - system 1 and system 2 modes of our thinking - to LLMs.
Techniques like CoT, self-reflection, and the recently released Tree of Thoughts are our attempt to build system 2: the slower, more deliberate thinking.