11 Comments
Nov 5, 2023 · Liked by Sebastian Raschka, PhD

Non-AI guy with a few questions here: What does training a model look like? How do you build your own models? Which open-source models can work on your data to answer questions in a commercial setting?

author

Thanks for your interest! I gave a talk at SDSC that outlines typical usage and training, which might answer your questions:

https://www.youtube.com/watch?si=IANKlKOY0Df1gF5s&v=B1RSE73RdRE&feature=youtu.be

Regarding open-source solutions, many people use either the Hugging Face libraries or Lit-GPT. The first is a more packaged solution, while the second is more of a collection of scripts that focuses on keeping the code lean and customizable.

Thank you for linking that talk. I watched the full video; it covers a lot about LLMs at a high level, but I am guessing I am going to need to take multiple courses on deep learning, TensorFlow, PyTorch, RNNs, CNNs, RLHF, etc. Any recommendations?

author

I’d say that learning PyTorch is definitely a good time investment since it’s the most widely used deep learning framework at the moment. I have a course here that might be useful to get started: https://lightning.ai/courses/deep-learning-fundamentals/ (it’s entirely free).

Nov 6, 2023 · Liked by Sebastian Raschka, PhD

Amazing read, subscribed. Thanks for sharing!

Nov 4, 2023 · Liked by Sebastian Raschka, PhD

Fantastic article. Thanks for sharing your thoughts.

Nov 1, 2023 · Liked by Sebastian Raschka, PhD

Great summary!

Such a well-written piece! Thanks! Regarding RLHF, what do you think is stopping it from being widely adopted? Data availability? Algorithm or implementation complexity?

Hi Sebastian, any idea whether the LLaVA 1.5 weights are open to commercial use?

author

The LLaVA repo is licensed under Apache 2.0, but I would reach out to the author to double-check whether that's also true for the weights: https://github.com/haotian-liu/LLaVA
