15 Comments
Aditya Sharan's avatar

Did you read the Claude Code Source for this? The timing is quite aligned. Hehehe

Sebastian Raschka, PhD's avatar

No comment, haha :)

Benjamin Riley's avatar

Terrific and timely summary, thanks for continuing to do the great work that you do breaking down these models.

I'm curious if you've thought at all about what domains of activities would (or will) work well with the agentic architecture you've described here? My sense is that with coding in particular, it's relatively straightforward to "bind" (or harness!) the agent(s) due to the deterministic nature of coding itself. In contrast, OpenClaw presents a different application and my impression is that it's much less reliable, perhaps because the tasks involved are more open-ended.

There's a philosophical debate among AI researchers about whether R&D efforts should aim at "specialized intelligence" or at truly general, universal models. (For background: https://arxiv.org/pdf/2602.23643v1) Knowing what you know about these tools, I'm curious where you find yourself in that debate.

Sebastian Raschka, PhD's avatar

Yeah, I think there are more degrees of freedom in OpenClaw, which makes it more chaotic / less reliable.

Besides coding, one other natural application is note-taking. I have a markdown knowledge base and project planner (been an Obsidian user for many years), and I've been using agents recently to clean, maintain, and filter it. Works great!

Luba's avatar

This sounds like a very good idea for a next book, "Building coding agents from scratch"

Sebastian Raschka, PhD's avatar

Yes, it’s a natural sequence from building LLMs → reasoning models → coding agents from scratch ☺️

Mike's avatar

Thank you. Just purchased the reasoning models book; count me in if the agent-from-scratch book goes ahead. Will take a look at your repo with the mini agent. For me, even understanding how to go from outputting Python code to actually executing it will be a big help.

miko's avatar

A repo for building agents from scratch without a framework would be pretty interesting :))

Sebastian Raschka, PhD's avatar

I first built https://github.com/rasbt/mini-coding-agent with my reasoning model, but the 0.6B size was a tad too small, hence the Ollama backend for experimentation. But yeah, I agree with you.

Grzegorz's avatar

This is a very accurate and reliable approach to the problem of coding agents (PHY_document). The distinction between LLM, reasoning model, and agent is crucial to understanding their functionality. The concept of "agent harness" and the six building blocks aligns with my axioms about the need for a hierarchical architecture and modularity.

The article confirms my own approach, that the effectiveness of an AI system depends not only on the model itself, but also on the surrounding control layer, context management, tools, and feedback mechanisms. The absence of these elements leads to the emergent problems we observed in Claude's case.

Sebastian Raschka, PhD's avatar

Thanks! Regarding "emergent problems we observed in Claude's case", do you mean when using Claude outside Claude Code?

Grzegorz's avatar

Yes, I'm referring to both Claude's emergent behaviors (e.g., in the emotion research and the 'Agents of Chaos' report) and the vulnerabilities in Claude Code revealed after the leak. Both of these areas underscore the importance of a robust agent architecture, which you describe.

code_to_joy's avatar

A useful addition to session memory is 'flagging', with user-specified or model-inferred flags: if something is (likely) going to be important in the future, flag it. Maintain a 'flags' file. Before each run, check the flags in case something flagged in the past is relevant for this cycle. Prompt the user for confirmation if necessary.
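
The flagging idea above could be sketched roughly as follows. This is only a minimal illustration, not how any particular agent harness does it. The JSON Lines file format, the keyword-based relevance check, and all names (`Flag`, `add_flag`, `relevant_flags`, `confirm_flags`) are assumptions made up for this example.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Flag:
    """One flagged item (hypothetical structure for this sketch)."""
    note: str
    keywords: list[str]
    source: str = "user"  # "user" (user-specified) or "model" (model-inferred)


def load_flags(path: Path) -> list[Flag]:
    """Read all flags from a JSON Lines file; a missing file means no flags."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [Flag(**json.loads(line)) for line in lines if line.strip()]


def add_flag(path: Path, flag: Flag) -> None:
    """Append one flag to the flags file."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(flag)) + "\n")


def relevant_flags(path: Path, task: str) -> list[Flag]:
    """Return flags whose keywords occur in the task text.

    Assumption: plain substring matching stands in for whatever
    relevance check a real harness would use.
    """
    task_lower = task.lower()
    return [
        flag for flag in load_flags(path)
        if any(keyword.lower() in task_lower for keyword in flag.keywords)
    ]


def confirm_flags(flags: list[Flag], ask=input) -> list[Flag]:
    """Ask the user to confirm each flag; keep only the confirmed ones."""
    kept = []
    for flag in flags:
        answer = ask(f"Flagged by {flag.source}: {flag.note!r}. Still relevant? [y/n] ")
        if answer.strip().lower().startswith("y"):
            kept.append(flag)
    return kept


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        flags_file = Path(tmp) / "flags.jsonl"
        add_flag(flags_file, Flag("Schema migration is pending", ["database", "schema"]))
        add_flag(flags_file, Flag("API key rotates monthly", ["api key"], source="model"))
        # Before a run: look up flags that match the upcoming task.
        for flag in relevant_flags(flags_file, "Update the database models"):
            print(flag.note)
```

In a harness, the confirmed flags would then be added to the prompt context for the current cycle; `confirm_flags` takes the `ask` callable as a parameter so the confirmation step can be replaced or skipped in non-interactive runs.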

bsk's avatar

Do you think that tools like https://github.com/yamadashy/repomix are still necessary if you deal with a big project repository which itself includes several subprojects and has in total more than 1 million tokens?

Sebastian Raschka, PhD's avatar

Hm, I don't think that's necessary at all. I actually even worry that this would be harmful in terms of losing the file hierarchy. That's a task that should be done by the harness if necessary.