My Workflow for Understanding LLM Architectures
A learning-oriented workflow for understanding new open-weight model releases
Apr 18 • Sebastian Raschka, PhD
Components of A Coding Agent
How coding agents use tools, memory, and repo context to make LLMs work better in practice
Apr 4 • Sebastian Raschka, PhD
March 2026
A Visual Guide to Attention Variants in Modern LLMs
From MHA and GQA to MLA, sparse attention, and hybrid architectures
Mar 22 • Sebastian Raschka, PhD
February 2026
A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026
A Round Up And Comparison of 10 Open-Weight LLM Releases in Spring 2026
Feb 25 • Sebastian Raschka, PhD
January 2026
Categories of Inference-Time Scaling for Improved LLM Reasoning
And an Overview of Recent Inference-Scaling Papers
Jan 24 • Sebastian Raschka, PhD
December 2025
The State Of LLMs 2025: Progress, Problems, and Predictions
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.
Dec 30, 2025 • Sebastian Raschka, PhD
LLM Research Papers: The 2025 List (July to December)
In June, I shared a bonus article with my curated and bookmarked research paper lists to the paid subscribers who make this Substack possible.
Dec 30, 2025 • Sebastian Raschka, PhD
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Understanding How DeepSeek's Flagship Open-Weight Models Evolved
Dec 3, 2025 • Sebastian Raschka, PhD
November 2025
Beyond Standard LLMs
Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers
Nov 4, 2025 • Sebastian Raschka, PhD
October 2025
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)
Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples
Oct 5, 2025 • Sebastian Raschka, PhD
September 2025
Understanding and Implementing Qwen3 From Scratch
A Detailed Look at One of the Leading Open-Source LLMs
Sep 6, 2025 • Sebastian Raschka, PhD
August 2025
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3
Aug 9, 2025 • Sebastian Raschka, PhD