11 Comments
pirateapp's avatar

Non-AI guy with a few questions here: What does training a model look like? How do you build your own models? Which open-source models can work on your data to answer questions in a commercial setting?

Sebastian Raschka, PhD's avatar

Thanks for your interest. I gave a talk at SDSC that outlines typical usage and training, which might answer your questions:

https://www.youtube.com/watch?si=IANKlKOY0Df1gF5s&v=B1RSE73RdRE&feature=youtu.be

Regarding open-source solutions, many people use either the Hugging Face libraries or Lit-GPT. The former is a more packaged solution, while the latter is more of a collection of scripts that focuses on keeping the code lean and customizable.

pirateapp's avatar

Thank you for linking that talk. I watched the full video; it covers a lot about LLMs at a high level, but I'm guessing I'll need to take multiple courses on deep learning, TensorFlow, PyTorch, RNNs, CNNs, RLHF, etc. Any recommendations?

Sebastian Raschka, PhD's avatar

I’d say that learning PyTorch is definitely a good time investment since it’s the most widely used deep learning framework at the moment. I have a course here that might be useful to get started: https://lightning.ai/courses/deep-learning-fundamentals/ (it’s entirely free).

Nacho Dramis's avatar

Amazing read, subscribed. Thanks for sharing!

Umesh Bhatt's avatar

Fantastic article. Thanks for sharing your thoughts.

Filippo B's avatar

Great summary!

Lei Tang's avatar

Such a well-written piece! Thanks! Regarding RLHF, what do you think stops it from being widely adopted? Data availability? Algorithmic or implementation complexity?

Papasani Mohansrinivas's avatar

Hi Sebastian, any idea whether the LLaVA 1.5 weights are open to commercial use?

Sebastian Raschka, PhD's avatar

The LLaVA repo is licensed under Apache 2.0, but I would reach out to the author to double-check whether that's also true for the weights: https://github.com/haotian-liu/LLaVA