👀 As this graph shows, the OpenAI “o” model family represents a significant shift, but what does this mean for businesses and the average user?

1. It seems the scaling “law” of large models still holds, but where it used to mean scaling compute during training, it now means scaling compute at inference time (i.e., when you’re actually using the model).

2. The o models spend more time on inference, and that buys better performance on specific tasks. That’s the “o1 preview” point changing the curve on this graph, and the subsequent models, including o3, follow this new, steeper trajectory. The performance improvements seem extraordinary, making these models even more helpful for businesses.

3. But this comes at a cost, and the type of cost differs. Scaling compute at training time is a fixed cost borne by the model providers. Scaling compute at inference time, however, is a variable cost that grows with how much you use the model. When the o models spend far more time on inference to improve their performance, they drive up that variable cost, and it is pushed to you (rough numbers sketched below).

So, if the reported performance improvements hold once businesses start using these models, they will become much more useful, but also much more expensive to run. Let’s hope the general cost per token continues its rapid race to the bottom while the models’ usefulness keeps heading north. 👏
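To make point 3 concrete, here is a minimal back-of-the-envelope sketch in Python. All prices and token counts are illustrative assumptions, not actual OpenAI pricing; the point is simply that if a reasoning model’s hidden “thinking” tokens are billed like output tokens, the same question gets substantially more expensive the harder the model thinks.

```python
# Back-of-the-envelope: how inference-time scaling shifts cost to the user.
# All prices and token counts are illustrative assumptions, not real pricing.

def query_cost(input_tokens: int, output_tokens: int,
               price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one query, given per-million-token prices."""
    return (input_tokens / 1e6) * price_in_per_m \
         + (output_tokens / 1e6) * price_out_per_m

# A conventional model answers directly: output tokens = the answer itself.
standard = query_cost(input_tokens=1_000, output_tokens=500,
                      price_in_per_m=2.50, price_out_per_m=10.00)

# A reasoning ("o"-style) model may also emit thousands of hidden
# reasoning tokens, billed here (by assumption) at the output rate.
reasoning = query_cost(input_tokens=1_000, output_tokens=500 + 8_000,
                       price_in_per_m=2.50, price_out_per_m=10.00)

print(f"standard:  ${standard:.4f} per query")   # $0.0075
print(f"reasoning: ${reasoning:.4f} per query")  # $0.0875, roughly 12x
```

The training-compute analogue would be a one-time fixed cost on the provider’s books; this per-query figure, by contrast, scales linearly with usage, which is exactly why heavier inference-time “thinking” shows up on your bill.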