
How LLMs Work

A deep dive into how LLMs work.

GPT

Generative Pre-trained Transformer

  • Generative - unlike a search engine, an LLM can generate the next sequence of tokens based on your input (see the examples below).
  • Pre-trained - to do that, the model is first trained on a huge amount of data.
  • Transformer - the transformer architecture is the heart of LLMs.
A transformer maps an input sequence to an output sequence. It can translate:

"Listen" -> [Transformer] -> its Arabic translation

or generate a reply one token at a time. Given the input "Hi", it might spell out "Hello":

"Hi" -> [Transformer] -> "H"
"Hi" -> [Transformer] -> "e"
"Hi" -> [Transformer] -> "l"
"Hi" -> [Transformer] -> "l"
"Hi" -> [Transformer] -> "o"
"Hi" -> [Transformer] -> <end> (a stop signal: the reply is complete)

Attention Is All You Need

This is the paper that introduced the transformer model, published by Google in 2017. [Link]

Google built it for Google Translate; OpenAI saw it and used the same architecture to build GPT.

Computers don't understand human languages like English or Hindi; they only understand numbers. So the first step is always to convert text into numbers.
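This text-to-numbers step is called tokenization. A quick sketch using the tiktoken library (one common tokenizer for GPT-style models; the specific encoding name below is just an example choice, not something this article prescribes):

```python
# Convert text into the numbers (token IDs) a GPT-style model actually sees.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # one of OpenAI's tokenizer encodings

ids = enc.encode("Hi, how are you?")
print(ids)              # a list of integers, one per token
print(enc.decode(ids))  # -> "Hi, how are you?" (decoding reverses encoding)
```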

Now let's understand how the transformer model works, step by step.

Step 1: Encoding

Each token ID is then mapped to a vector embedding: a list of numbers that represents the token's meaning, so that similar words end up with similar vectors.
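A minimal sketch of an embedding lookup. The sizes are toy values chosen for illustration; a real model uses a vocabulary of tens of thousands of tokens, hundreds or thousands of dimensions, and learns the table during training:

```python
# A vector embedding is just a lookup table: token ID -> vector of numbers.
import numpy as np

vocab_size, d_model = 1000, 8
rng = np.random.default_rng(0)
embedding_table = rng.standard_normal((vocab_size, d_model))  # learned in training

token_ids = [13, 42, 7]               # output of the tokenizer
vectors = embedding_table[token_ids]  # shape (3, 8): one vector per token
print(vectors.shape)
```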

Step 2: Positional Encoding

Attention by itself has no notion of word order, so a positional encoding is added to each token's embedding to tell the model where in the sequence that token sits.
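A sketch of the sinusoidal positional encoding from the original paper, where PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)):

```python
# Sinusoidal positional encoding from "Attention Is All You Need".
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    positions = np.arange(seq_len)[:, None]   # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]  # (1, d_model/2)
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions use sine
    pe[:, 1::2] = np.cos(angles)  # odd dimensions use cosine
    return pe

pe = positional_encoding(seq_len=10, d_model=8)
# This matrix is simply added to the token embeddings from step 1.
```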

Step 3: Multi-Head Self-Attention

Self-attention lets every token look at every other token in the sequence and decide how much each one matters. "Multi-head" means several attention operations run in parallel, each learning a different kind of relationship.
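At the core of each head is scaled dot-product attention: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. A minimal single-head sketch; in a real model Q, K, and V come from learned linear projections of the embeddings, which are omitted here:

```python
# Scaled dot-product self-attention (single head), the core transformer op.
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how much each token attends to each other
    weights = softmax(scores)        # each row sums to 1
    return weights @ V               # weighted mix of value vectors

# Toy example: 4 tokens, 8-dimensional vectors.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out = attention(x, x, x)  # self-attention: Q, K, V all come from the same tokens
print(out.shape)          # (4, 8): one updated vector per token
```

A multi-head version runs this several times on different projections of x and concatenates the results.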

TODO: Many details are abstracted away here; read the Transformer paper to understand them properly.