Generative Models: What They Are, How They Work and Where They Came From

  1. 3Blue1Brown’s YouTube series on neural networks, which has undoubtedly been the most helpful resource for me

  2. And to dig a layer down, the 3Blue1Brown series on linear algebra was also really helpful (even if I was a bit lost beyond video 5 in the series)

  3. A jargon-free explanation of how AI large language models work

  4. Foundations of Large Language Models

  5. Welch Labs’ YouTube explainers, from an early intro to visual AI to how DeepSeek rewrote the Transformer

  6. How Transformer LLMs Work: a DeepLearning.AI course (free registration required)

  7. An amazing series from Distill summarising the pre-ChatGPT evolution with beautiful visuals

  8. The Illustrated Transformer (Jay Alammar)

  9. A YouTube video walking through Jay Alammar’s illustrated explainer of Transformers

  10. A NotebookLM-generated ‘podcast’ explanation of the Attention Is All You Need paper

  11. The Moment We Stopped Understanding AI (AlexNet)

  12. The Busy Person’s Intro to LLMs (Andrej Karpathy)

  13. Why Machines Learn: The Elegant Maths Behind Modern AI, Anil Ananthaswamy

  14. Helen Toner’s LLM explainer

  15. Anthropic Interpretability explainer

  16. Quick assessment of the various models (March 2024)

  17. The Bitter Lesson

  18. Gwern on Scaling

Resources for a deeper dive

  1. Michael Nielsen’s wonderful free ebook guide to neural networks (recommended by 3Blue1Brown)

  2. Andrej Karpathy’s Winter 2016 Stanford CS231n course; start with lecture 2

  3. Free online version of the Deep Learning textbook, which seems to be widely regarded as the gold standard (and costs £65 on Amazon)

  4. I haven’t looked at this Stanford course in depth, but someone on a useful Reddit thread said it was good!

  5. Ilya Sutskever’s AI reading list for John Carmack

How to Use Them?

  1. Andrej Karpathy’s guide to LLM use cases

  2. March 2025 review of model capabilities

  3. Why Gemini is actually very good

  4. A land use researcher works (successfully) with Deep Research

  5. An economist works (successfully) with Deep Research

  6. How Tyler Cowen uses the different models

What are the dangers of using them?

  1. Cognitive debt - an MIT study suggests you get a better work product, but you learn less because you didn’t do the work yourself

Personal Favourite Use Cases

  1. Reading non-English books and creating excellent parallel translations live, using Claude on my phone (often better than the published English translations I own). It feels very natural to have your phone with you when you read, and it’s easy to read one text alongside the other. The natural chunking into paragraphs also makes an unfamiliar language less daunting to read.