Excellent explanation of Transformers, cited in
Natural Language Processing in Action (https://isidore.co/calibre/#panel=book_details&book_id=10472) §9.2.2 as "a mind-expanding walk through the modern GPT architecture", "3Blue1Brown (https://www.3blue1brown.com/topics/neural-networks) visualizations and explanations by Grant Sanderson":
(from the full Neural Nets playlist (https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi))
AquinasLatinEnglish (https://huggingface.co/datasets/Geremia23/AquinasLatinEnglish) / AquinasLatinEnglishModel (https://huggingface.co/Geremia23/AquinasLatinEnglishModel) uses Transformers.