48 Comments
Apr 16, 2023 · Liked by Sebastian Raschka, PhD

This is fantastic. Really looking forward to going through each of these papers. The rate of progress is so fast that collections like these are essential so that people who are not at the core of the field can keep up with the key insights.

Apr 16, 2023 · Liked by Sebastian Raschka, PhD

I'd love to read additional ML & AI articles from you, outside of your existing newsletter format! So you've got my vote ✅

author

Awesome, I am glad to hear. And thanks for the feedback!!


Mine too!!

Aug 6, 2023 · Liked by Sebastian Raschka, PhD

Awesome compilation! Really helpful for folks who are starting out. It would be amazing if you could write a similar blog for computer vision to catch up on SOTA like diffusion models, vision transformers, etc.

author

Glad you liked it! And yes, I would love to do that one day (haha, I have quite a long list of things I love to write about :)). In the meantime, I highlighted a few interesting papers from CVPR here: https://magazine.sebastianraschka.com/p/ahead-of-ai-10-state-of-computer

Jul 5, 2023 · Liked by Sebastian Raschka, PhD

Great work! Congratulations!!

One minor typo: in the BERT section you write "..The BERT paper above introduces the original concept of masked-language modeling, and next-sentence prediction remains an influential decoder-style architecture..", which I think should be "..encoder-style architecture..."

author

Awesome, thanks for the note. Fixed it!

May 12, 2023 · Liked by Sebastian Raschka, PhD

I vote YES as well. Please keep going on ML & AI topics. Thanks for sharing.

author

Thanks for the feedback, glad to hear!

May 8, 2023 · Liked by Sebastian Raschka, PhD

Stunning explanations! Thanks

author

Thanks for the kind words. It's very nice to hear this!!

Apr 24, 2023 · Liked by Sebastian Raschka, PhD

Thanks for the article!

A minor note: I think there is a typo in "BlenderBot 3: A Deployed Conversational Agent that Continually Learns to Responsibly Rngage".

author

Thanks! That should have been "Engage" (not "Rngage") of course. Fixed it!

Apr 21, 2023 · Liked by Sebastian Raschka, PhD

I like your idea of posting some additional articles related to machine learning and AI.

author

Thanks for the feedback!

Apr 21, 2023 · Liked by Sebastian Raschka, PhD

Hi Sebastian,

I love your blog. Your posts are very helpful for keeping me updated on emerging technology.

I'd like to request blog posts on:

1) why/how LLMs learn in-context without training on the data (zero-shot learning)

2) Prompt Chaining

author

Thanks for the feedback. Nice, I was indeed planning on writing something on in-context learning vs finetuning this weekend.

Apr 17, 2023 · Liked by Sebastian Raschka, PhD

This is really good. I also enjoyed reading your book on ML with PyTorch and scikit-learn. Recommended to everyone.

Apr 17, 2023 · Liked by Sebastian Raschka, PhD

Really liked the article. Shall be following the links for further reading. Much thanks! One thing: in "Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning" you misspelled Deshpande, one of the authors; the p and h are interchanged. Great read though! I would love more ML articles, even historical ones.

author

Thanks for the feedback! Glad to hear you liked the article! (Also thanks for mentioning the misspelling, just fixed it right away!)

Apr 16, 2023 · Liked by Sebastian Raschka, PhD

This is a really great and detailed article. It seems like everything is converging towards transformers.

Transformers are literally taking over AI.

Do you think that transformers will be the monad for AI for the coming decade?

Jan 5 · Liked by Sebastian Raschka, PhD

Yes, looking at their high performance, it will surely be the monad for the coming century, not just the decade 👍

author

Glad to hear it was useful. I would say that pretty much everything is converging towards transformers. Even a big field such as computer vision is now heavily driven by attention layers / transformer-based architectures. Whether it's going to last for a whole decade remains to be seen. New methods can always emerge unexpectedly (e.g., the recent shift from GANs to diffusion models).

Apr 16, 2023 · Liked by Sebastian Raschka, PhD

Love your content and eagerly look forward to it. Keep it going!

Apr 16, 2023 · Liked by Sebastian Raschka, PhD

For sure, your content is always a great read!

author

Thanks, glad to hear!


Yes I really enjoyed it

Jul 23 · Liked by Sebastian Raschka, PhD

I learned a lot from Dr. Sebastian Raschka, thanks!

Jun 9 · Liked by Sebastian Raschka, PhD

Hi, just loved reading your article. Due to the nature of my work, I have been reading and testing small LLMs like TinyLlama or Phi-3 from Microsoft. In particular, the latter's success as a small model is attributed to the high-quality training data used during training. Do you have any experience with these models? Will you post anything about them? In any case, many thanks for sharing your knowledge in such a clear and concise way.

author

Glad you liked it! Yes, these are nice, small models. Fun fact: the TinyLlama model and paper were actually based on the LitLlama/LitGPT framework I helped develop. I wrote about Phi-3 a bit here (https://magazine.sebastianraschka.com/p/how-good-are-the-latest-open-llms) in Section 1.3, if useful. I should probably also add some of these small-but-good models to this article at some point.
