How LLMs Work
A deep dive into how LLMs work.
GPT
Generative Pre-Trained Transformer
- Generative - Unlike a search engine, these LLMs can generate the next set of SEQUENCES (tokens) based on your input.
- Pre-Trained - To do that, the model requires pre-trained data.
- Transformer - Transformers are the architecture at the heart of LLMs.
"Listen" -> [Transformer] -> "Arabic"
The transformer predicts one token at a time; each prediction is fed back in with the input until it emits a stop token (a runnable sketch of this loop follows the example):
"Hi" -> [Transformer] -> "H"
"Hi H" -> [Transformer] -> "e"
"Hi He" -> [Transformer] -> "l"
"Hi Hel" -> [Transformer] -> "l"
"Hi Hell" -> [Transformer] -> "o"
"Hi Hello" -> [Transformer] -> <end of sequence>
Attention Is All You Need
This was the transformer model paper, published by Google in 2017. [Link]
Google built it for Google Translate; OpenAI saw it and used the architecture to build GPT.
Computers don't understand human languages like English or Hindi; they only understand numbers.
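Here is a toy sketch of what that looks like, assuming a made-up character-level vocabulary; real LLMs use subword tokenizers (like BPE), but the idea is the same: every piece of text becomes a list of integer token IDs.

```python
text = "Hi"
# Hypothetical character-level vocabulary: each character gets an integer ID.
vocab = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ !?")}

token_ids = [vocab[ch] for ch in text]
print(token_ids)  # [33, 8], and these numbers are all the model ever sees
```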
Now let's understand how this transformer model works
Step 1: Encoding
Vector Embedding
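A minimal sketch of vector embedding, assuming a tiny made-up vocabulary and embedding size: each token ID indexes a row of an embedding matrix, so the plain number becomes a dense vector the model can do math on.

```python
import numpy as np

vocab_size, embed_dim = 60, 8                 # toy sizes; real models are far larger
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))  # learned during training in a real model

token_ids = [33, 8]                           # "Hi" from the tokenization step above
embeddings = embedding_matrix[token_ids]      # one 8-dim vector per token
print(embeddings.shape)                       # (2, 8)
```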
Step 2: Positional Encoding
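A sketch of the sinusoidal positional encoding described in the paper; it is added to the token embeddings so the model knows the order of the tokens (sizes here are toy values, not real model dimensions).

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """PE(pos, 2i) = sin(pos / 10000^(2i/d_model)), PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))."""
    positions = np.arange(seq_len)[:, None]          # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]         # even dimension indices
    angles = positions / np.power(10000, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                     # even dimensions use sin
    pe[:, 1::2] = np.cos(angles)                     # odd dimensions use cos
    return pe

# Added to the token embeddings so "Hi there" and "there Hi" look different to the model.
print(positional_encoding(seq_len=2, d_model=8).shape)   # (2, 8)
```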
Step 3: Multi-Head Self-Attention
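A minimal sketch of scaled dot-product attention for a single head, using random stand-in weights; multi-head attention runs several of these in parallel with different learned projections and concatenates the results.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x: np.ndarray, Wq: np.ndarray, Wk: np.ndarray, Wv: np.ndarray) -> np.ndarray:
    Q, K, V = x @ Wq, x @ Wk, x @ Wv            # project each token to query, key, value
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # how strongly each token attends to every other
    weights = softmax(scores)                   # each row sums to 1
    return weights @ V                          # weighted mix of the value vectors

rng = np.random.default_rng(0)
d = 8
x = rng.normal(size=(2, d))                     # 2 token vectors (embedding + position)
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)      # (2, 8)
```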
TODO: There are many things abstracted away here; read the Transformer paper to understand them properly.