Excellent explanation of Transformers, cited in
Natural Language Processing in Action (https://isidore.co/calibre/#panel=book_details&book_id=10472) §9.2.2 as "a mind-expanding walk through the modern GPT architecture", "3Blue1Brown (https://www.3blue1brown.com/topics/neural-networks) visualizations and explanations by Grant Sanderson":
(from the full Neural Nets playlist (https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi))
AquinasLatinEnglish (https://huggingface.co/datasets/Geremia23/AquinasLatinEnglish) / AquinasLatinEnglishModel (https://huggingface.co/Geremia23/AquinasLatinEnglishModel) uses Transformers and Byte-Pair Encoding (https://github.com/eole-nlp/eole/discussions/379#discussioncomment-16616895).
original Transformers paper:
- Vaswani, Ashish, Noam Shazeer, Niki Parmar, et al. "Attention Is All You Need (https://isidore.co/misc/Physics%20papers%20and%20books/Zotero/storage/Z2YMNZTJ/Vaswani%20et%20al.%20-%202023%20-%20Attention%20Is%20All%20You%20Need.pdf)." arXiv:1706.03762. Preprint, arXiv, August 2, 2023 [1st ed.: 2017].
original byte-pair encoding (BPE) (https://github.com/eole-nlp/eole/discussions/379#discussioncomment-16616895) paper:
- Sennrich, Rico, Barry Haddow, and Alexandra Birch. "Neural Machine Translation of Rare Words with Subword Units (https://isidore.co/misc/Physics%20papers%20and%20books/Zotero/storage/L3ZDXZRS/Sennrich%20et%20al.%20-%202016%20-%20Neural%20Machine%20Translation%20of%20Rare%20Words%20with%20Subword%20Units.pdf)." arXiv:1508.07909. Preprint, arXiv, June 10, 2016.
Latin 🇻🇦 → English 🇬🇧 Translator (https://huggingface.co/spaces/Geremia23/eole-Latin-English-translator) is now live.
Quote from: Geremia on April 18, 2026, 11:25:11 PM
original Transformers paper:
- Vaswani, Ashish, Noam Shazeer, Niki Parmar, et al. "Attention Is All You Need (https://isidore.co/misc/Physics%20papers%20and%20books/Zotero/storage/Z2YMNZTJ/Vaswani%20et%20al.%20-%202023%20-%20Attention%20Is%20All%20You%20Need.pdf)." arXiv:1706.03762. Preprint, arXiv, August 2, 2023 [1st ed.: 2017].
Another good explanation of Transformers:
Quote from: Geremia on April 11, 2026, 01:52:05 AM
3Blue1Brown (https://www.3blue1brown.com/topics/neural-networks)
Another good video from 3Blue1Brown (https://www.3blue1brown.com/topics/neural-networks):