Transformer Neural Networks are the heart of pretty much everything exciting in AI right now. ChatGPT, Google Translate, and many other cool things are based on Transformers. This StatQuest cuts through all the hype and shows you how a Transformer works, one step at a time.
NOTE: If you're interested in learning more about Backpropagation, check out these 'Quests:
The Chain Rule: [ Link ]
Gradient Descent: [ Link ]
Backpropagation Main Ideas: [ Link ]
Backpropagation Details Part 1: [ Link ]
Backpropagation Details Part 2: [ Link ]
If you're interested in learning more about the SoftMax function, check out:
[ Link ]
If you're interested in learning more about Word Embedding, check out: [ Link ]
If you'd like to learn more about calculating similarities in the context of neural networks and the Dot Product, check out:
Cosine Similarity: [ Link ]
Attention: [ Link ]
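(Just to give you a taste of what those videos cover, here's a tiny Python sketch of how the Dot Product and Cosine Similarity compare two word embeddings. The two vectors are made-up examples, not values from any video:

import math

a = [1.0, 2.0, 3.0]   # made-up embedding for word 1
b = [2.0, 0.5, 1.0]   # made-up embedding for word 2

dot = sum(x * y for x, y in zip(a, b))   # dot product: bigger = more similar
cosine = dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

print(dot)     # 6.0
print(cosine)  # ~0.70, the dot product rescaled to ignore vector length

BAM!)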
If you'd like to support StatQuest, please consider...
Patreon: [ Link ]
...or...
YouTube Membership: [ Link ]
...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
[ Link ]
...or just donating to StatQuest!
PayPal: [ Link ]
Venmo: @JoshStarmer
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
[ Link ]
0:00 Awesome song and introduction
1:26 Word Embedding
7:30 Positional Encoding
12:53 Self-Attention
23:37 Encoder and Decoder defined
23:53 Decoder Word Embedding
25:08 Decoder Positional Encoding
25:50 Transformers were designed for parallel computing
27:13 Decoder Self-Attention
27:59 Encoder-Decoder Attention
31:19 Decoding numbers into words
32:23 Decoding the second token
34:13 Extra stuff you can add to a Transformer
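P.S. If you want to tinker with the Self-Attention math from 12:53 after you watch, here's a minimal NumPy sketch of scaled dot-product self-attention. The toy sizes and random weights are my own assumptions for illustration, not values from the video:

import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model = 3, 4                   # 3 encoded words, 4 numbers per word

X = rng.normal(size=(n_tokens, d_model))   # word embeddings + positional encoding
W_q = rng.normal(size=(d_model, d_model))  # learned in practice; random here
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v        # queries, keys, and values

scores = Q @ K.T / np.sqrt(d_model)        # dot-product similarities, scaled
weights = np.exp(scores)                   # SoftMax, one row per word...
weights /= weights.sum(axis=1, keepdims=True)

attention = weights @ V                    # weighted sum of the values
print(attention.shape)                     # (3, 4): one new vector per word

Each row of the output is a new vector for one word that takes the other words into account, which is what the Self-Attention sections build up to.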
#StatQuest #Transformer #ChatGPT