A well-curated list. Another noteworthy paper in October was Microsoft's BitNet. It demonstrated something quite remarkable: running a 100B-parameter language model on a single CPU at human reading speed (5-7 tokens per second) using 1.58-bit quantization. This breakthrough has huge implications for making large language models accessible on local devices without requiring specialized hardware.
https://arxiv.org/abs/2410.16144
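For context, "1.58-bit" refers to ternary weights restricted to {-1, 0, +1} (log2(3) ≈ 1.58 bits per weight). A minimal, illustrative sketch of absmean-style ternary quantization in that spirit (not the paper's actual implementation; function and variable names are made up here):

```python
import numpy as np

def ternary_quantize(W, eps=1e-5):
    """Rough sketch of absmean ternary quantization:
    scale weights by their mean absolute value, then round
    each entry to the nearest value in {-1, 0, +1}."""
    scale = np.mean(np.abs(W)) + eps            # per-tensor absmean scale
    W_q = np.clip(np.round(W / scale), -1, 1)   # ternary weights in {-1, 0, +1}
    return W_q, scale                           # dequantize via W_q * scale

# Illustrative usage on a random weight matrix
W = np.random.randn(4, 4).astype(np.float32)
W_q, scale = ternary_quantize(W)
print(W_q)          # entries are only -1, 0, or +1 (~1.58 bits each)
print(W_q * scale)  # approximate reconstruction of W
```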
Thanks! And yes, you are absolutely right regarding BitNet. Had it on my paper bookmark list (https://magazine.sebastianraschka.com/p/llm-research-papers-the-2024-list) but ultimately decided to pick the scaling laws for November because that's currently a bit more relevant for my work. Not saying that BitNet isn't super (!) impressive, but it was a bit tough to pick only one for each month 😅.
So much progress and so many novel papers; it's definitely hard to pick just one.
Thank you for the wealth of information!