How Good Are the Latest Open LLMs? And Is DPO Better Than PPO?
Discussing the Latest Model Releases and AI Research in April 2024
May 12 • Sebastian Raschka, PhD
April 2024
Using and Finetuning Pretrained Transformers
What are the different ways to use and finetune pretrained large language models (LLMs)? The most common ways to use and finetune pretrained LLMs…
Apr 20 • Sebastian Raschka, PhD
March 2024
Tips for LLM Pretraining and Evaluating Reward Models
Discussing AI Research Papers in March 2024
Mar 31 • Sebastian Raschka, PhD
Research Papers in February 2024: A LoRA Successor, Small Finetuned LLMs Vs Generalist LLMs, and Transparent LLM Research
Once again, this has been an exciting month in AI research. This month, I'm covering two new openly available LLMs, insights into small finetuned LLMs…
Mar 3 • Sebastian Raschka, PhD
February 2024
Improving LoRA: Implementing Weight-Decomposed Low-Rank Adaptation (DoRA) from Scratch
Low-rank adaptation (LoRA) is a machine learning technique that modifies a pretrained model (for example, an LLM or vision transformer) to better suit a…
Feb 18 • Sebastian Raschka, PhD
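For readers skimming this archive, here is a minimal sketch of the LoRA idea that the article builds on. It is illustrative PyTorch only, not the article's own code; the class names and the rank/alpha defaults are placeholder choices.

```python
import torch
import torch.nn as nn

class LoRALayer(nn.Module):
    """Low-rank update branch: returns alpha * (x @ A @ B) with small A, B."""
    def __init__(self, in_dim, out_dim, rank=8, alpha=16):
        super().__init__()
        # A starts with small random values, B with zeros,
        # so the LoRA branch is initially a no-op.
        self.A = nn.Parameter(torch.randn(in_dim, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, out_dim))
        self.alpha = alpha

    def forward(self, x):
        return self.alpha * (x @ self.A @ self.B)

class LinearWithLoRA(nn.Module):
    """Wraps a (frozen) pretrained nn.Linear and adds the trainable LoRA branch."""
    def __init__(self, linear, rank=8, alpha=16):
        super().__init__()
        self.linear = linear
        self.lora = LoRALayer(linear.in_features, linear.out_features, rank, alpha)

    def forward(self, x):
        return self.linear(x) + self.lora(x)

# Usage: swap in the wrapped layer and train only the LoRA parameters.
layer = LinearWithLoRA(nn.Linear(512, 512), rank=8, alpha=16)
out = layer(torch.randn(2, 512))  # shape: (2, 512)
```

DoRA, the subject of the article, extends this idea by decomposing the weight into magnitude and direction components before applying the low-rank update.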
Research Papers in Jan 2024: Model Merging, Mixtures of Experts, and Towards Smaller LLMs
Feb 3 • Sebastian Raschka, PhD
January 2024
Understanding and Coding Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs
This article will teach you about self-attention mechanisms used in transformer architectures and large language models (LLMs) such as GPT-4 and Llama…
Jan 14
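As a quick pointer to what that article covers, here is a minimal scaled-dot-product self-attention sketch in PyTorch. This is illustrative only; the function name and the causal-mask construction are placeholder choices, not code taken from the article.

```python
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, seq_len, head_dim); multi-head variants add a heads dimension
    d_k = q.size(-1)
    scores = (q @ k.transpose(-2, -1)) / d_k**0.5       # attention logits
    if mask is not None:
        scores = scores.masked_fill(mask, float("-inf"))  # e.g., causal masking
    weights = torch.softmax(scores, dim=-1)              # rows sum to 1
    return weights @ v                                   # weighted sum of values

# Causal (decoder-style) attention over a toy sequence:
q = k = v = torch.randn(1, 4, 16)
causal_mask = torch.triu(torch.ones(4, 4, dtype=torch.bool), diagonal=1)
out = scaled_dot_product_attention(q, k, v, mask=causal_mask)  # shape: (1, 4, 16)
```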
December 2023
Ten Noteworthy AI Research Papers of 2023
This year has felt distinctly different. I've been working in, on, and with machine learning and AI for over a decade, yet I can't recall a time when…
Dec 30, 2023 • Sebastian Raschka, PhD
Research Papers in Nov 2023: Tackling Hallucinations, Boosting Reasoning Abilities, and New Insights into the Transformer Architecture
This month, I want to focus on three papers that address three distinct problem categories of Large Language Models (LLMs): Reducing hallucinations…
Dec 9, 2023 • Sebastian Raschka, PhD
November 2023
Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)
Things I Learned From Hundreds of Experiments
Nov 19, 2023 • Sebastian Raschka, PhD
Research Papers in Oct 2023: A Potential Successor to RLHF for Efficient LLM Alignment and the Resurgence of CNNs
From Vision Transformers to innovative large language model finetuning techniques, the AI community has been very active with lots of interesting…
Nov 4, 2023 • Sebastian Raschka, PhD
October 2023
AI and Open Source in 2023
The Highs and Lows: A Year in Review
Oct 23, 2023 • Sebastian Raschka, PhD