The Big LLM Architecture Comparison
From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design
Jul 19, 2025
•
Sebastian Raschka, PhD
Categories of Inference-Time Scaling for Improved LLM Reasoning
And an Overview of Recent Inference-Scaling Papers
Jan 24
•
Sebastian Raschka, PhD
The State Of LLMs 2025: Progress, Problems, and Predictions
A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.
Dec 30, 2025
•
Sebastian Raschka, PhD
LLM Research Papers: The 2025 List (July to December)
In June, I shared a bonus article with my curated and bookmarked research paper lists to the paid subscribers who make this Substack possible.
Dec 30, 2025
•
Sebastian Raschka, PhD
From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates
Understanding How DeepSeek's Flagship Open-Weight Models Evolved
Dec 3, 2025
•
Sebastian Raschka, PhD
Most Popular
Understanding Reasoning LLMs
Feb 5, 2025
•
Sebastian Raschka, PhD
Understanding and Coding Self-Attention, Multi-Head Attention, Causal-Attention, and Cross-Attention in LLMs
Jan 14, 2024
Understanding Large Language Models
Apr 16, 2023
•
Sebastian Raschka, PhD
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
Aug 9, 2025
•
Sebastian Raschka, PhD
Recent posts
Beyond Standard LLMs
Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers
Nov 4, 2025
•
Sebastian Raschka, PhD
Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)
Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples
Oct 5, 2025
•
Sebastian Raschka, PhD
Understanding and Implementing Qwen3 From Scratch
A Detailed Look at One of the Leading Open-Source LLMs
Sep 6, 2025
•
Sebastian Raschka, PhD
From GPT-2 to gpt-oss: Analyzing the Architectural Advances
And How They Stack Up Against Qwen3
Aug 9, 2025
•
Sebastian Raschka, PhD
LLM Research Papers: The 2025 List (January to June)
A topic-organized collection of 200+ LLM research papers from 2025
Jul 1, 2025
•
Sebastian Raschka, PhD
Understanding and Coding the KV Cache in LLMs from Scratch
KV caches are one of the most critical techniques for efficient inference in LLMs in production.
Jun 17, 2025
•
Sebastian Raschka, PhD
Coding LLMs from the Ground Up: A Complete Course
Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot…
May 10, 2025
•
Sebastian Raschka, PhD
The State of Reinforcement Learning for LLM Reasoning
Understanding GRPO and New Insights from Reasoning Model Papers
Apr 19, 2025
•
Sebastian Raschka, PhD
First Look at Reasoning From Scratch: Chapter 1
Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been…
Mar 29, 2025
•
Sebastian Raschka, PhD
The State of LLM Reasoning Model Inference
Inference-Time Compute Scaling Methods to Improve Reasoning Models
Mar 8, 2025
•
Sebastian Raschka, PhD
Understanding Reasoning LLMs
Methods and Strategies for Building and Refining Reasoning Models
Feb 5, 2025
•
Sebastian Raschka, PhD
Noteworthy AI Research Papers of 2024 (Part Two)
Six influential AI papers from July to December
Jan 15, 2025
•
Sebastian Raschka, PhD