About this project
This is a graphical, interactive explainer of the Transformer decoder. Type your own tokens, watch every operation execute, and sample the next token end-to-end. Every matrix you see is the actual value the server computed — the visualisations are not a separate model.
It exists for two readers. End users learning the Transformer get a step-by-step walk-through with three layers of depth (concept, maths, code) that they can toggle inside any section. Readers of the source get a real-world Next.js full-stack codebase that reads like a textbook, with shapes, formulas, and references in the JSDoc of every operation.
Built with
- Next.js 14 (App Router)
- TypeScript (strict)
- Tailwind CSS
- D3.js
- Drizzle ORM + Postgres
- Auth.js v5
- next-mdx-remote
- Vitest + Playwright
References
- Vaswani et al., 2017 — Attention Is All You Need. The original Transformer paper. The decoder block structure, sinusoidal positional encoding, and scaled dot-product attention in this codebase follow this paper directly; note that the original uses post-norm, while this codebase uses the pre-norm ordering popularised by GPT-2.
- Radford et al., 2019 — Language Models are Unsupervised Multitask Learners (GPT-2). The decoder-only architecture and pre-LayerNorm pattern follow this work.
- Hendrycks & Gimpel, 2016 — Gaussian Error Linear Units (GELUs). The tanh approximation used in `lib/transformer/gelu.ts`.
- Andrej Karpathy — nanoGPT and “Let's build GPT”. Reference implementations checked against during development.
- 3blue1brown — Neural Networks series. Pedagogical inspiration for the visual intuition layer.
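As an illustration of the GELU reference above, the tanh approximation can be sketched in a few lines of TypeScript. This is a minimal sketch of the formula from Hendrycks & Gimpel (2016), not the actual contents of `lib/transformer/gelu.ts`:

```typescript
// GELU, tanh approximation (Hendrycks & Gimpel, 2016):
//   gelu(x) ≈ 0.5 · x · (1 + tanh(√(2/π) · (x + 0.044715 · x³)))
const SQRT_2_OVER_PI = Math.sqrt(2 / Math.PI);

function gelu(x: number): number {
  return 0.5 * x * (1 + Math.tanh(SQRT_2_OVER_PI * (x + 0.044715 * x ** 3)));
}

// Applied element-wise, e.g. to a row of feed-forward pre-activations.
// Large positive inputs pass through almost unchanged; large negative
// inputs are squashed towards zero.
const activated = [-3, 0, 3].map(gelu);
```

Unlike ReLU, the curve is smooth through zero, which is why the approximation needs `tanh` rather than a simple `max`.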
Licence
MIT. Use the code as a learning resource, fork it, ship it. See LICENSE for the text.