A good synopsis. Lots of value can be delivered through smaller models focused on specific problems and domains.
Founder & CEO of KYield. Pioneer in Artificial Intelligence, Data Physics and Knowledge Engineering.
The title should have been: "In AI systems, smaller is almost always better." Good to see this article on small language models at the WSJ; they are the optimal method for internal chatbots run on enterprise data. Unfortunately, it still misses the bigger issue that language models have limited use, and it doesn't mention the efficiency, accuracy, and productivity gained by providing relevant data in the first place, tailored to each entity.

Even if reporting is limited to language models (which shouldn't be done when attempting to cover all of AI systems), please go beyond LLM firms and big tech companies, as they have natural conflicts of interest: they are scale dependent. Mentioning big tech and LLM firms is like citing fast food giants for stories on good nutrition. Yes, one can find an occasional story, but that's not where most of the value is, and it gives readers the wrong impression. There is an entire health food industry out there, and the same is true for responsible AI. That said, it's an improvement over the LLM hype-storm.

~~~~~

“It shouldn’t take quadrillions of operations to compute 2 + 2,” said Illia Polosukhin.

“If you’re doing hundreds of thousands or millions of answers, the economics don’t work” to use a large model, Shoham said.

“You end up overpaying and have latency issues” with large models, Shih said. “It’s overkill.”
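To make Shoham's point about inference economics concrete, here is a minimal back-of-envelope sketch. The query volume, tokens per query, and per-token prices below are purely illustrative assumptions, not figures from the article; the only point is that per-query cost multiplied by millions of answers diverges quickly between a large frontier model and a small task-focused model.

```python
# Back-of-envelope inference cost comparison: large frontier model vs. small
# task-focused model. All prices and token counts are illustrative assumptions.

def monthly_cost(queries: int, tokens_per_query: int, price_per_million_tokens: float) -> float:
    """Total cost of serving `queries` requests at a given per-token price."""
    total_tokens = queries * tokens_per_query
    return total_tokens / 1_000_000 * price_per_million_tokens

QUERIES_PER_MONTH = 5_000_000   # "hundreds of thousands or millions of answers"
TOKENS_PER_QUERY = 1_000        # assumed prompt + completion size

# Hypothetical price points (USD per million tokens).
LARGE_MODEL_PRICE = 15.00
SMALL_MODEL_PRICE = 0.50

large = monthly_cost(QUERIES_PER_MONTH, TOKENS_PER_QUERY, LARGE_MODEL_PRICE)
small = monthly_cost(QUERIES_PER_MONTH, TOKENS_PER_QUERY, SMALL_MODEL_PRICE)

print(f"Large model: ${large:,.0f} / month")
print(f"Small model: ${small:,.0f} / month")
print(f"Ratio: {large / small:.0f}x")
```

Under these assumed numbers the large model costs roughly 30 times more per month for the same workload, which is the "overpaying" dynamic Shih describes.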