Tarik Dzekman’s Post


Lead AI Engineer @ Affinda

I keep reading that “AI progress is slowing down”. But it took more than five years to get from the paper “Attention Is All You Need” to the release of ChatGPT. That’s over five years of incremental updates to reach an AI model that regular consumers would find useful.

ChatGPT is now 2 years old. Maybe it only seems like incremental improvement in that time, but I’ve seen plenty of use cases where the models finally got good enough to be useful. Not only that, we’re getting better at using them.

“But aren’t we running out of training data?”

No. No, we are not. The big players in AI have all sorts of ways of getting more data. That’s not the biggest problem right now. The biggest problem is that additional data is only giving incremental gains. But everyone knows that. It’s not some secret. New papers come out every week about how to get more out of these models.

It also seems to take more researchers, more engineers, and more compute to make each incremental update. That’s because everyone is focused on one paradigm, and we keep finding ways to squeeze more juice out of it.

Paradigm shifts happen on slower time scales, while incrementalism gets less and less effective. But incrementalism is how we discover paradigm shifts. “Attention Is All You Need” was itself an insight that came from trying to squeeze more performance out of the previous paradigm.
