Generative Models: What They Are, How They Work and Where They Came From
A jargon-free explanation of how AI large language models work
Welch Labs YouTube explainers, from an early intro to visual AI, to how DeepSeek rewrote the Transformer
How Transformer LLMs Work: a DeepLearning.AI course (free registration required)
An amazing series from Distill summarising the pre-ChatGPT evolution with beautiful visuals
A YouTube video that walks through Transformers using Jay Alammar's explainer
A NotebookLM-generated ‘podcast’ explanation of the ‘Attention Is All You Need’ paper
Why Machines Learn: The Elegant Maths Behind Modern AI, Anil Ananthaswamy
Quick assessment of the various models (March 2024)
Resources for a deeper dive
Michael Nielsen's wonderful free ebook guide to neural networks (recommended by 3Blue1Brown)
Andrej Karpathy's Winter 2016 Stanford CS231n course; start with lecture 2
Free online version of the Deep Learning textbook (Goodfellow, Bengio and Courville), which seems to be widely regarded as the gold standard; the print edition costs £65 on Amazon
I haven’t looked at this Stanford course in depth, but someone on a useful Reddit thread said it was good!
How to Use Them?
What are the dangers of using them?
Cognitive debt - an MIT study suggests you get a better work product, but you become less smart because you didn't do the work yourself
Personal Favourite Use Cases
Reading non-English books and creating excellent parallel translations (often better than the published English translations I own) live using Claude on my phone. It's very natural to have your phone with you when you read, and easy to read one text alongside the other. The natural chunking into paragraphs also makes an unfamiliar language less daunting to read. (A rough code sketch of the same idea follows below.)
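If you wanted to script that paragraph-by-paragraph workflow rather than use the Claude app, a minimal sketch with the Anthropic Python SDK might look like this. It assumes an ANTHROPIC_API_KEY is set in your environment; the model name, prompt wording and example paragraph are placeholders of mine, not a recommendation.

    # Minimal sketch: paragraph-by-paragraph parallel translation with the
    # Anthropic Python SDK. The model name, prompt wording and example text
    # below are placeholders, not the author's actual setup.
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def translate_paragraph(paragraph: str, source_language: str = "German") -> str:
        """Return an English translation of a single paragraph."""
        response = client.messages.create(
            model="claude-sonnet-4-20250514",  # placeholder model name
            max_tokens=1024,
            system=("You are a careful literary translator. Translate faithfully, "
                    "preserving tone and register."),
            messages=[{
                "role": "user",
                "content": f"Translate this {source_language} paragraph into English:\n\n{paragraph}",
            }],
        )
        return response.content[0].text

    if __name__ == "__main__":
        original = "Ein Absatz aus dem Buch in der Originalsprache."  # example input
        # Print the original and its translation one after the other,
        # mirroring the side-by-side reading described above.
        print(original)
        print("---")
        print(translate_paragraph(original))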