πŸ“’μžλ£Œ

https://medium.com/@hugmanskj/transformer의-큰-κ·Έλ¦Ό-이해-기술적-λ³΅μž‘ν•¨-없이-핡심-아이디어-νŒŒμ•…ν•˜κΈ°-5e182a40459d

https://www.blossominkyung.com/deeplearning/transfomer-positional-encoding

https://www.blossominkyung.com/deeplearning/transformer-mha

https://www.blossominkyung.com/deeplearning/transfomer-last

https://medium.com/@mansoorsyed05/understanding-transformers-architecture-c571044a1c21

https://wikidocs.net/31379

Attention Is All You Need: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf

πŸŽ―ν•΅μ‹¬ ν‚€μ›Œλ“œ

self-attention, positional encoding, Multi-Head Attention, Scaled Dot-Product Attention, Cross Attention

κΈ°μ‘΄ λͺ¨λΈμ˜ ν•œκ³„

<aside>

πŸ’‘νŠΈλžœμŠ€ν¬λ¨Έμ˜ 핡심 아이디어:

이 μ–΄ν…μ…˜μ„ RNN의 보정을 μœ„ν•œ μš©λ„λ‘œμ„œ μ‚¬μš©ν•˜λŠ” 것이 μ•„λ‹ˆλΌ μ–΄ν…μ…˜λ§ŒμœΌλ‘œ 인코더와 디코더λ₯Ό λ§Œλ“€μ–΄λ³΄λ©΄ μ–΄λ–¨κΉŒ?

</aside>

Architecture

transformer1.png

νŠΈλžœμŠ€ν¬λ¨ΈλŠ” RNN을 μ‚¬μš©ν•˜μ§€ μ•Šμ§€λ§Œ 기쑴의 seq2seq처럼 인코더-디코더 ꡬ쑰λ₯Ό μœ μ§€ν•˜κ³  μžˆλ‹€.

Positional Encoding