This is fantastic. Really looking forward to going through each of these papers. The rate of progress is so fast that collections like these are essential so that people who are not at the core of the field can keep up with the key insights.
I'd love to read additional ML & AI articles from you, outside of your existing newsletter format! So you've got my vote ✅
Awesome, I am glad to hear. And thanks for the feedback!!
Mine too!!
Awesome compilation! Really helpful for folks who are starting out. It would be amazing if you could write a similar blog post for computer vision to catch up on SOTA topics like diffusion models, vision transformers, etc.
Glad you liked it! And yes, I would love to do that one day (haha, I have quite a long list of things I love to write about :)). In the meantime, I highlighted a few interesting papers from CVPR here: https://magazine.sebastianraschka.com/p/ahead-of-ai-10-state-of-computer
Great work! Congratulations!!
One minor typo: in the BERT section you write "...The BERT paper above introduces the original concept of masked-language modeling, and next-sentence prediction remains an influential decoder-style architecture...", which I think should be "...encoder-style architecture..."
Awesome, thanks for the note. Fixed it!
I vote YES as well. Please keep going on ML & AI topics. Thanks for sharing.
Thanks for the feedback, glad to hear!
Stunning explanations! Thanks
Thanks for the kind words. It's very nice to hear this!!
Thanks for the article!
A minor note: I think there is a typo in "BlenderBot 3: A Deployed Conversational Agent that Continually Learns to Responsibly Rngage".
Thanks! That should have been "Engage" (not "Rngage") of course. Fixed it!
I like your idea of posting some additional articles related to machine learning and AI.
Thanks for the feedback!
Hi Sebastian,
I love your blog. Your posts are very helpful for keeping me updated on emerging technology.
I'd like to request blog posts on
1) why/how LLMs learn in-context without training on the data (zero-shot learning)
2) prompt chaining
Thanks for the feedback. Nice, I was indeed planning on writing something on in-context learning vs finetuning this weekend.
This is really good. I also enjoyed reading your book on ML with PyTorch and Scikit-Learn. I recommend it to everyone.
Really liked the article. I shall be following the links for further reading. Many thanks! One thing: in "Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning" you misspelled Deshpande, one of the authors; the p and h are interchanged. Great read though! I would love more ML articles, even historical ones.
Thanks for the feedback! Glad to hear you liked the article! (Also thanks for mentioning the misspelling, just fixed it right away!)
This is a really great and detailed article. It seems like everything is converging towards transformers.
Transformers are literally taking over AI.
Do you think that transformers will be the monad for AI for the coming decade?
Yes, looking at their high performance, they will surely be the monad for the coming century, not just the coming decade 👍
Glad to hear it was useful. I would say that pretty much everything is converging towards transformers. Even a big field such as computer vision is now heavily driven by attention layers / transformer-based architectures. Whether it's going to last for a whole decade remains to be seen. New methods can always emerge unexpectedly (e.g., the recent shift from GANs to diffusion models).
Love your content and eagerly look forward to it. Keep it going!
For sure, your content is always a great read!
Thanks, glad to hear!
Yes, I really enjoyed it!
I really admire the way you write; it inspires me to learn more from you. I would love to read your insights on CNNs and older topics in DL and ML as well. I tried reaching out to you.
Thanks for the kind words Sahasra! Regarding the older topics, I may return to them one day... Right now, I am very focused on LLMs because that's what I am mainly working on. I wish I had more time to write about other topics as well, but yeah, it's not easy to find the time to write given my day job.
Thanks for responding, Dr. Raschka!
Enjoyed this. I am looking forward to future articles from you. If you ever do anything on AI for normal people, I'd love to see it; I'm hunting for something for my staffers who are afraid of "evil AI."
Thanks, glad to hear! In my opinion, the best way to really understand AI is to actually implement (/code) one, so that one can develop a better intuition for how it works under the hood, and what the limitations are. In this regard, my Build a Large Language Model From Scratch book may come in handy: https://sebastianraschka.com/books/
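For anyone curious what "implementing one" looks like in practice, here is a minimal sketch (illustrative only, not an excerpt from the book) of the scaled dot-product self-attention at the heart of an LLM; the dimensions and variable names below are toy assumptions:

import torch

torch.manual_seed(123)

d_in, d_out = 8, 4            # toy embedding sizes (illustrative)
x = torch.randn(6, d_in)      # 6 tokens, each an 8-dim embedding

# Projections for queries, keys, and values (would be learnable in practice)
W_q, W_k, W_v = (torch.randn(d_in, d_out) for _ in range(3))
queries, keys, values = x @ W_q, x @ W_k, x @ W_v

# Scaled attention scores: how strongly each token attends to the others
scores = queries @ keys.T / d_out**0.5

# Causal mask: each token may only attend to itself and earlier tokens
mask = torch.triu(torch.ones(6, 6, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))

weights = torch.softmax(scores, dim=-1)
context = weights @ values    # attention-weighted mix of value vectors
print(context.shape)          # torch.Size([6, 4])

Coding even a small piece like this tends to demystify what attention actually computes.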