Discussion about this post

Bufort

Amazing summary, thank you!

Just a quick question regarding the Qwen 2 training.

I read in the report:

"Similar to previous Qwen models, high-quality multi-task instruction data is integrated into the Qwen2 pre-training process to enhance in-context learning and instruction-following abilities."

=> That means some of the pre-training data is in a QA format, no? (So it is more than a simple quality stage.)

Maria Mouschoutzi

This deep dive into LLM pre-training and post-training paradigms is fascinating. It's amazing to see how much the field has evolved with different models like Qwen 2, Apple's AFM, and Llama. Definitely learned a lot—thanks for sharing this! 🙏

