Generative Models: What They Are, How They Work and Where They Came From

  1. A jargon-free explanation of how AI large language models work

  2. Foundations of Large Language Models

  3. Welch Labs YouTube explainers, from an early intro to visual AI to how DeepSeek rewrote the Transformer

  4. How Transformer LLMs Work: a DeepLearning.AI course (free registration required)

  5. An amazing series from Distill summarising the pre-ChatGPT evolution with beautiful visuals

  6. The Illustrated Transformer (Jay Alammar)

  7. A YouTube video that walks through Transformers using the Jay Alammar explainer

  8. A NotebookLM-generated ‘podcast’ explanation of the Attention Is All You Need paper

  9. The Moment We Stopped Understanding AI (AlexNet)

  10. The Busy Person’s Intro to LLMs (Andrej Karpathy)

  11. Why Machines Learn: The Elegant Maths Behind Modern AI, Anil Ananthaswamy

  12. Helen Toner’s LLM explainer

  13. Anthropic Interpretability explainer

  14. Quick assessment of the various models (March 2024)

  15. The Bitter Lesson

  16. Gwern on Scaling

Resources for a deeper dive

  1. Michael Nielsen’s wonderful free ebook guide to neural networks (recommended by 3Blue1Brown)

  2. Andrej Karpathy’s Winter 2016 Stanford CS231n course; start with lecture 2

  3. Free online version of the Deep Learning textbook that seems to be widely regarded as the gold standard and which costs £65 on Amazon

  4. I haven’t looked at this Stanford course in depth, but someone on a useful Reddit thread said it was good!

  5. Ilya Sutskever’s AI reading list for John Carmack

How to Use Them?

  1. Andrej Karpathy’s guide to LLM use cases

  2. March 2025 review of model capabilities

  3. Why Gemini is actually very good

  4. A land use researcher works (successfully) with Deep Research

  5. An economist works (successfully) with Deep Research

  6. How Tyler Cowen uses the different models

What are the dangers of using them?

  1. Cognitive debt: an MIT study suggests you get a better work product, but you become less smart because you didn’t do the work yourself

Personal Favourite Use Cases

  1. Reading non-English books and creating excellent parallel translations (often better than the published English translations I own) live using Claude on my phone. It is very natural to have your phone with you when you read, and easy to read one text beside the other. Natural chunking into paragraphs also makes it less daunting to read an unfamiliar language.