Understanding Large Language Models

Apr 16, 2023

Awesome, I am glad to hear. And thanks for the feedback!!

Mine too!!

Awesome compilation! Really helpful for folks who are starting out. It would be amazing if you could write a similar blog post for computer vision to catch up on SOTA topics like diffusion models, vision transformers, etc.

Aug 7, 2023

Glad you liked it! And yes, I would love to do that one day (haha, I have quite a long list of things I'd love to write about :)). In the meantime, I highlighted a few interesting papers from CVPR here: https://magazine.sebastianraschka.com/p/ahead-of-ai-10-state-of-computer

Great work! Congratulations!!

One minor typo: in the BERT section you write "..The BERT paper above introduces the original concept of masked-language modeling, and next-sentence prediction remains an influential decoder-style architecture..", which I think should be "..encoder-style architecture...".

Jul 5, 2023

Awesome, thanks for the note. Fixed it!

I vote YES as well. Please keep going on ML & AI topics. Thanks for sharing.

May 12, 2023

Thanks for the feedback, glad to hear!

Stunning explanations! Thanks

May 9, 2023

Thanks for the kind words. It's very nice to hear this!!

Thanks for the article!

A minor note: I think there is a typo in "BlenderBot 3: A Deployed Conversational Agent that Continually Learns to Responsibly Rngage".

Apr 24, 2023

Thanks! That should have been "Engage" (not "Rngage") of course. Fixed it!

I like your idea of posting some additional articles related to machine learning and AI.

Apr 21, 2023

Thanks for the feedback!

Hi Sebastian,

I love your blog. Your posts are very helpful for keeping me updated on emerging technology.

I would like to request a blog post on:

1) Why/how LLMs learn in-context without training on data (zero-shot learning)

2) Prompt Chaining

Apr 21, 2023

Thanks for the feedback. Nice, I was indeed planning on writing something on in-context learning vs finetuning this weekend.

This is really good. I also enjoyed reading your book on ML with PyTorch and scikit-learn. I recommend it to everyone.

Really liked the article. Shall be following the links for further reading. Much thanks! One thing: in "Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning" you misspelt Deshpande, one of the authors; the p and h are interchanged. Great read tho! I would love more ML articles. Even historical ones.

Apr 17, 2023

Thanks for the feedback! Glad to hear you liked the article! (Also thanks for mentioning the misspelling, just fixed it right away!)

This is a really great and detailed article. It seems like everything is converging towards transformers.

Transformers are literally taking over AI.

Do you think that transformers will be the monad for AI for the coming decade?

Yes, looking at their high performance, they will surely be the monad for the coming century, not just the decade 👍

Apr 17, 2023

Glad to hear it was useful. I would say that pretty much everything is converging towards transformers. Even a big field such as computer vision is now heavily driven by attention layers / transformer-based architectures. Whether it's going to last for a whole decade remains to be seen. New methods can always emerge unexpectedly (e.g., the recent trend from GANs to diffusion models).

Love your content and eagerly look forward to it. Keep it going!

For sure, your content is always a great read!

Apr 16, 2023

Thanks, glad to hear!

Yes I really enjoyed it

The way you write is really amazing, and it inspires me to learn more from you. I would love to read your insights on CNNs and older topics in DL and ML as well. I tried reaching out to you.

Oct 26

Thanks for the kind words, Sahasra! Regarding the older topics, I may return to them one day... Right now, I am very focused on LLMs because that's what I am mainly working on. I wish I had more time to write about other topics as well, but yeah, it's not easy to find the time to write given my day job.

Thanks for responding Dr. Raschka
