Large Language Models

Deep dive into the architecture, optimization, and engineering of large language models. From tokenization to attention mechanisms, understand how LLMs work under the hood.

4
Core Concepts
45min
Total Reading
Interactive
Visualizations
Practical
Code Examples

Suggested Learning Path

1. Tokenization: Start with understanding how text becomes numbers
2. Context Windows: Learn about attention patterns and memory limits
3. KV Cache: Understand inference optimization through caching
4. Flash Attention: Master advanced GPU optimization techniques