Servando Torres’ Post

ML, AI Consulting & Development | Founder ControlThrive

💸 Revenue Team Wants AI Costs, But Your MVP's Still Loading... 🤯

A Founding MLE Guide to Pre-MVP Cost Estimation

🔮 Initial Cost Estimation

1. Select Representative Models 📊:
   Choose models at various scales. For NLP, consider BERT, GPT-J, (FLAN-)T5-XXL, and Falcon-40B.
2. Match Models to Hardware 🖥️:
   Pair each model with appropriate GPU hardware. Example: GPT-J on A10G, Falcon-40B on A100 40GB.
3. Estimate Request Completion Time ⏱️:
   Approximate the time each model takes to complete a request. Example:
   - GPT-J: 1 second on A10G (made up number!)
   - Falcon-40B: 10 seconds on A100 40GB (made up number!)
4. Calculate Hourly Costs 💸:
   Research current GPU pricing. Example:
   - A10G: ~$2 per hour (Modal Labs pricing)
   - A100 40GB: ~$5 per hour (Modal Labs pricing)
5. Compute Cost per 1000 Requests 🧮:
   Use the formula: (Seconds per request * Cost per hour) / (3600 seconds) * 1000
   Examples:
   - GPT-J: (1 * $2) / 3600 * 1000 ≈ $0.56 to serve 1000 requests
   - Falcon-40B: (10 * $5) / 3600 * 1000 ≈ $13.89 to serve 1000 requests
6. Provide Order of Magnitude (OOM) Estimates 📏:
   Present a range of costs based on different models. In this case, about $0.56 to $13.89 per 1000 requests.
7. Factor in SLAs and Latency Requirements ⚡:
   SLAs affect costs and can help constrain the solution space. For example, achieving a p99 latency of Xms might be 10x more expensive because a machine has to be kept warm.

🔧 Ongoing Cost Optimization (thanks to Outerbounds for the beautiful post)

- Analyze Top-line Costs 📈:
   Regularly review cloud bills to focus optimization efforts.
- Identify Cost-driving Instances and Workloads 🔍:
   Use tools to pinpoint expensive instances and tasks.
- Monitor Resource Utilization 📊:
   Avoid over-provisioning; pay attention to actual usage.
- Optimize Workloads 🎛️:
   Right-size resource requests based on real usage patterns.
- Choose Optimal Execution Environments 🌐:
   Leverage multi-cloud strategies for cost advantages.
- Refine Based on Specific Needs 🎯:
   Narrow estimates by understanding customer problems and required model scales.
- Explore Serverless Options ☁️:
- Stay Informed on Pricing 📚:

Valuable resources:
https://lnkd.in/gwragJBw
https://lnkd.in/gVkGenhi
https://lnkd.in/gq8pi4qd

This framework is inspired by countless interactions with the ML/AI community.

#MachineLearning #CostEstimation #DataScience #AI #CloudOptimization
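
The cost-per-1000-requests formula from step 5 can be sketched in a few lines of Python. This is a minimal illustration, not a pricing tool: the function name `cost_per_1000_requests` and the `scenarios` dictionary are hypothetical, the timings are the post's made-up numbers, and the hourly prices are the post's approximate figures. It also assumes one request at a time on a fully utilized GPU, with no batching and no idle time billed.

```python
SECONDS_PER_HOUR = 3600


def cost_per_1000_requests(seconds_per_request: float, cost_per_hour: float) -> float:
    """Dollars to serve 1000 requests, assuming sequential requests on one busy GPU."""
    return seconds_per_request * cost_per_hour / SECONDS_PER_HOUR * 1000


# Hypothetical inputs: (seconds per request, dollars per GPU-hour).
# Timings are made up; prices are rough figures, not quotes.
scenarios = {
    "GPT-J on A10G": (1.0, 2.0),
    "Falcon-40B on A100 40GB": (10.0, 5.0),
}

estimates = {
    name: cost_per_1000_requests(seconds, price)
    for name, (seconds, price) in scenarios.items()
}

for name, cost in estimates.items():
    print(f"{name}: ${cost:.2f} per 1000 requests")

# Order-of-magnitude range across the candidate models.
low, high = min(estimates.values()), max(estimates.values())
print(f"OOM range: ${low:.2f} to ${high:.2f} per 1000 requests")
```

With these inputs the sketch gives about $0.56 for GPT-J and about $13.89 for Falcon-40B per 1000 requests. Swapping in measured latencies and current prices is the only change needed to reuse it; a warm-machine SLA would need idle time added on top, which this sketch does not model.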

Ivan Falco

Head of Demand Gen @ ColdIQ | AI-powered Acquisition Funnels | GTM Systems & Sales Tech

5mo

Good stuff!
