This month, I want to focus on three papers that address three distinct problem categories of Large Language Models (LLMs):
1. Reducing hallucinations.
2. Enhancing the reasoning capabilities of small, openly available models.
3. Deepening our understanding of, and potentially simplifying, the transformer architecture.
Hi Sebastian, regarding your hallucination review: when you say "the good news is that their approach is fully automated and, therefore, could easily be scaled to larger datasets,"
I'm not so sure.
It was simple because they used Wikipedia as the source of truth. If they want to answer questions beyond "bios," they would have to add more and more sources of truth (SOTs), then develop a priority/voting mechanism to arbitrate between these SOTs. I don't think this scales.
Also, I wonder whether there wouldn't be more benefit in starting the training of the LLM with a richer dataset, i.e., going for quality in addition to quantity, or even using metadata within the model to track sources/authors.
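To make the scaling concern above concrete, here is a minimal sketch of what a priority/voting mechanism across multiple sources of truth might look like. All source names and weights here are hypothetical illustrations, not anything from the paper under review:

```python
from collections import defaultdict

# Assumed per-source trust weights (higher = more authoritative).
# In practice, choosing and maintaining these is exactly the hard,
# hand-tuned part that the comment argues does not scale.
SOURCE_PRIORITY = {
    "wikipedia": 3.0,
    "news_archive": 2.0,
    "web_crawl": 1.0,
}

def arbitrate(answers):
    """Given (source, answer) pairs for one factual query,
    return the answer with the highest total priority weight."""
    scores = defaultdict(float)
    for source, answer in answers:
        # Unknown sources get a small default weight.
        scores[answer] += SOURCE_PRIORITY.get(source, 0.5)
    return max(scores, key=scores.get)

answers = [
    ("wikipedia", "1976"),
    ("news_archive", "1976"),
    ("web_crawl", "1975"),
]
print(arbitrate(answers))  # "1976" wins: 3.0 + 2.0 vs. 1.0
```

Even this toy version shows the arbitration problem: every new domain needs new sources, new weights, and a policy for ties and missing coverage.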
Another key revelation from the Orca 2 paper is how tailored system instructions during training significantly enhance the response quality and accuracy of LLMs like GPT-4, highlighting the need for smaller models to adopt task-specific strategies rather than merely mimicking larger models.
hmmm, "More concretely, changing "invented" to "created" would be acceptable. However, changing "Apple" to "Orange" or "Samsung" (or changing the date) would obviously be a significant error."
I don't think you intended this to be deliberately incendiary. Changing "invented" to "created" is actually accurate. They created products specced to my vision, scrupulously accurately, to this day. I was in direct contact with Jobs for years, until shortly before he died, and my "inventions", the fruits of my brow, powered the "second coming" of Apple. You see my design language writ large, and my nomenclature used by millions (billions?). My middle name is "Ian" and it is my preferred name; hence the "i" prefixing devices and apps "I" "incepted". This is a thing with me: I like to use "easter eggs" in things I incept: WWW, Wii, lots more...

Now, here is the point of my diatribe: can you use an AI to deduce whether what I am saying is true? I have tried, but things I know to be true can be, and are actively being, cancelled. Hijacked, even. This can be done in real time, and the fakes are getting more and more realistic all the time. A scary example of such false provenance is "The Last President", and the prototype "Search and Destroy" by Skunk Anansie supposedly (but NOT) by way of Iggy Pop and the Stooges. The shenanigans were strong with engineering the false provenance of these "tulpa" (and other creations). Can you see the problem? It is existential...
Hi Sebastian. Eddie Siman here. Nice job.
Thanks - a very nice overview of recent developments and insights! More reading for me to do... :)
Thanks for your work. It's really helpful.
Awesome!
I wonder whether training smaller models on reasoning steps injects some notion of planning into the model.