All you need to know about the transformer architecture: how to structure the inputs, attention (Queries, Keys, Values), positional embeddings, and residual connections. Bonus: an overview of the differences between Recurrent Neural Networks (RNNs) and transformers.
9:19 Correction: the order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector). Otherwise we do not get the 1x3 dimensionality at the end. Sorry for messing up the animation!
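For anyone who wants to check the shapes, here is a minimal numpy sketch of the corrected order of multiplication. The 4-dimensional input size is purely an assumption for illustration; only the 1x3 size of the resulting query vector q1 is stated above.

import numpy as np

# Corrected order: row vector times projection matrix, not the other way around.
x1 = np.random.rand(1, 4)   # token embedding x1 as a 1x4 row vector (input size assumed)
Wq = np.random.rand(4, 3)   # learned query projection matrix Wq, shape 4x3 (size assumed)
q1 = x1 @ Wq                # x1 (1x4) * Wq (4x3) = q1 (1x3)
print(q1.shape)             # (1, 3)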
Check this out for a super cool transformer visualisation! 👏 [ Link ]
➡️ AI Coffee Break Merch! 🛍️ [ Link ]
Outline:
00:00 Transformers explained
00:47 Text inputs
02:29 Image inputs
03:57 Next word prediction / Classification
06:08 The transformer layer: 1. MLP sublayer
06:47 2. Attention explained
07:57 Attention vs. self-attention
08:35 Queries, Keys, Values (see the code sketch after the outline)
09:19 Correction: order of multiplication should be the opposite: x1 (vector) * Wq (matrix) = q1 (vector).
11:26 Multi-head attention
13:04 Attention scales quadratically
13:53 Positional embeddings
15:11 Residual connections and Normalization Layers
17:09 Masked Language Modelling
17:59 Difference to RNNs
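As a companion to the attention chapters above (Queries, Keys, Values; multi-head attention; quadratic scaling), here is a minimal single-head numpy sketch of scaled dot-product self-attention as defined in the Vaswani et al. paper referenced below. All sizes are assumptions for illustration, and multi-head attention, masking, and positional embeddings are left out.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product self-attention (Vaswani et al., 2017).
    Q = X @ Wq                        # one query vector per input token
    K = X @ Wk                        # one key vector per input token
    V = X @ Wv                        # one value vector per input token
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # n x n score matrix -> quadratic in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                # each output is a weighted mix of value vectors

# Toy example: 5 tokens with 8-dimensional embeddings (all sizes assumed).
n, d_model, d_k = 5, 8, 3
X = np.random.rand(n, d_model)
out = self_attention(X, np.random.rand(d_model, d_k),
                        np.random.rand(d_model, d_k),
                        np.random.rand(d_model, d_k))
print(out.shape)  # (5, 3)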
Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏
Dres. Trost GbR, Siltax, Vignesh Valliappan, @Mutual_Information , Kshitij
Our old Transformer explained 📺 video: [ Link ]
📺 Tokenization explained: [ Link ]
📺 Word embeddings: [ Link ]
📽️ Replacing Self-Attention: [ Link ]
📽️ Position embeddings: [ Link ]
@SerranoAcademy Transformer series: [ Link ]
📄 Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." Advances in neural information processing systems 30 (2017).
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: [ Link ]
Ko-fi: [ Link ]
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔗 Links:
AICoffeeBreakQuiz: [ Link ]
Twitter: [ Link ]
Reddit: [ Link ]
YouTube: [ Link ]
#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
Music 🎵 : Sunset n Beachz - Ofshane
Video editing: Nils Trost
Transformers explained | The architecture behind LLMs
Tags
transformer explained, attention explained, queries keys values explained, position embeddings explained, masked language modelling, next word prediction, difference between transformers and RNNs, transformers versus recurrent neural networks, ChatGPT architecture explained, neural network, AI, visualized, easy, explained, basics, research, algorithm, example, machine learning research, aicoffeebean, animated, illustrated transformer, annotated transformer, LLMs explained, how LLMs work