Vitaly Kleban’s Post

Vitaly Kleban

ML and large-scale telco infra @ Everynet | Forbes Author | TinyML & LoRaWAN Contributor | Tech due diligence for VCs, UN and World Bank

Cost-aware inference is important, but LLM prompts are not yet interoperable. Prompts that work well on one LLM may perform worse on another, even on a more advanced model or on a different version of the same LLM. Once the interoperability problem is solved, it will also become possible to implement cost-aware inference that strategically manages the computational resources spent on prompt processing, keeping costs sustainable. This approach includes model routing https://www.notdiamond.ai/ — automatically selecting the most appropriate model for each task (a rough sketch of the idea is below). Startups in this area are actively backed by industry veterans.
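To make the routing idea concrete, here is a minimal sketch of a cost-aware router: estimate how hard a prompt is, then send it to the cheapest model that is likely capable enough. The model names, prices, and difficulty heuristic are hypothetical placeholders for illustration only; they are not Not Diamond's actual API, catalog, or pricing.

```python
# Minimal sketch of cost-aware model routing (illustrative only).
# Model names, prices, and the scoring heuristic are hypothetical.

from dataclasses import dataclass


@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # USD, hypothetical
    capability: int            # rough quality tier, higher = stronger


MODELS = [
    Model("small-fast", 0.0005, 1),
    Model("mid-tier",   0.0030, 2),
    Model("frontier",   0.0150, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Crude heuristic: long or multi-step prompts need a stronger model."""
    score = 1
    if len(prompt.split()) > 200:
        score += 1
    if any(k in prompt.lower() for k in ("prove", "derive", "multi-step", "plan")):
        score += 1
    return score


def route(prompt: str) -> Model:
    """Pick the cheapest model whose capability meets the estimated difficulty."""
    needed = estimate_difficulty(prompt)
    eligible = [m for m in MODELS if m.capability >= needed]
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


if __name__ == "__main__":
    print(route("Summarize this paragraph in one sentence.").name)      # small-fast
    print(route("Derive a multi-step plan to prove the theorem.").name)  # mid-tier
```

A production router would replace the keyword heuristic with a learned scorer trained on per-model evaluation data, which is exactly where prompt interoperability matters: the router needs to know how the same prompt performs across different models.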


