Support Independent AI Research

Ahead of AI is a reader-supported, independent project dedicated to providing clear, insightful updates on AI and machine learning. By subscribing, you directly support this independent research and help me continue delivering high-quality, unbiased content.

For those who wish to support me beyond the subscription, please consider purchasing a copy of my books. If you find them insightful and beneficial, please feel free to recommend them to your friends and colleagues.

If you have a few moments, a review on Amazon would really help, too!

Your support means a great deal and is tremendously helpful in continuing this journey as an independent researcher. Thank you!

Build a Large Language Model (From Scratch)

Link to the official source code repository on GitHub
Book page at Manning
Amazon book page
ISBN-13: 978-1633437166

One of the best ways to understand LLMs is to code one from scratch! In Build a Large Language Model (From Scratch), I’ll guide you step by step through creating your own LLM, explaining each stage with clear text, diagrams, and examples.

The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT. The book uses Python and PyTorch for all its coding examples.

‘Build a Large Language Model from Scratch’ by Sebastian Raschka @rasbt has been an invaluable resource for me, connecting many dots and sparking numerous ‘aha’ moments.
This book comes highly recommended for gaining a hands-on understanding of large language models.
– Via Faisal Alsrheed, AI researcher

While learning a new concept, I have always felt more confident about my understanding of the concept if I'm able to code it myself from scratch. Most tutorials tend to cover the high-level concept and leave out the minor details, and the absence of these details is acutely felt when you try to put these concepts into code. That's why I really appreciate Sebastian Raschka, PhD's latest book - Build a Large Language Model (from scratch).
At a time when most LLM implementations tend to use high-level packages (transformers, timm), it's really refreshing to see the progressive development of an LLM by coding the core building blocks using basic PyTorch elements. It also makes you appreciate how some of the core building blocks of SOTA LLMs can be distilled down to relatively simple concepts.
– Roshan Santhosh, Data Scientist at Meta

A high-level, no-code overview that explains the development of an LLM, featuring numerous figures from the book, which itself focuses on the underlying code that implements these processes:

Build a Large Language Model (From Scratch) Video Course

A 17-hour companion video course where I code through each chapter of my Build a Large Language Model (From Scratch) book. The course is organized into chapters and sections that mirror the book’s structure, so it can be used as a standalone alternative to the book or as a complementary code-along resource.

Machine Learning Q and AI

Order directly from No Starch Press
Order from Amazon.com
Link to the Supplementary Materials and Discussions on GitHub
ISBN-13: 978-1718503762

If you’re ready to venture beyond introductory concepts and dig deeper into machine learning, deep learning, and AI, the question-and-answer format of Machine Learning Q and AI will make things fast and easy for you, without a lot of mucking about.

Each brief, self-contained chapter journeys through a fundamental question in AI, unraveling it with clear explanations, diagrams, and exercises.

Topics covered include:
Multi-GPU training paradigms
Finetuning transformers
Differences between encoder- and decoder-style LLMs
Concepts behind vision transformers
Confidence intervals for ML
And many more!

Reviews

“Sebastian has a gift for distilling complex, AI-related topics into practical takeaways that can be understood by anyone. His new book, Machine Learning Q and AI, is another great resource for AI practitioners of any level.”
– Cameron R. Wolfe, Writer of Deep (Learning) Focus
“Sebastian uniquely combines academic depth, engineering agility, and the ability to demystify complex ideas. He can go deep into any theoretical topics, experiment to validate new ideas, then explain them all to you in simple words. If you’re starting your journey into machine learning, Sebastian is your guide.”
– Chip Huyen, Author of Designing Machine Learning Systems
“One could hardly ask for a better guide than Sebastian, who is, without exaggeration, the best machine learning educator currently in the field. On each page, Sebastian not only imparts his extensive knowledge but also shares the passion and curiosity that mark true expertise.”
– Chris Albon, Director of Machine Learning, The Wikimedia Foundation

Machine Learning with PyTorch and Scikit-Learn

About this book

This 700-page book is a comprehensive resource on the fundamental concepts of machine learning and deep learning. The first half of the book introduces readers to machine learning using scikit-learn, the de facto standard for working with tabular datasets. The second half focuses on deep learning with PyTorch, including applications to natural language processing and computer vision.

While basic knowledge of Python is required, this book takes readers from understanding machine learning from the ground up to training advanced deep learning models.

Reviews

“I’m confident that you will find this book invaluable both as a broad overview of the exciting field of machine learning and as a treasure of practical insights. I hope it inspires you to apply machine learning for the greater good in your problem area, whatever it might be.”
– Dmytro Dzhulgakov, PyTorch Core Maintainer
“This 700-page book covers most of today’s widely used machine learning algorithms, and will be especially useful to anybody who wants to understand modern machine learning through examples of working code. It covers a variety of approaches, from basic algorithms such as logistic regression to very recent topics in deep learning such as BERT and GPT language models and generative adversarial networks. The book provides examples of nearly every algorithm it discusses in the convenient form of downloadable Jupyter notebooks that provide both code and access to datasets. Importantly, the book also provides clear instructions on how to download and start using state-of-the-art software packages that take advantage of GPU processors, including PyTorch and Google Colab.”
– Tom M. Mitchell, professor, founder and former Chair of the Machine Learning Department at Carnegie Mellon University (CMU)