Modal

Software Development

New York City, New York 6,349 followers

The serverless platform for AI, data and ML teams.

About us

Deploy generative AI models, large-scale batch jobs, job queues, and more on Modal's platform. We help data science and machine learning teams accelerate development, reduce costs, and effortlessly scale workloads across thousands of CPUs and GPUs. Our pay-per-use model ensures you're billed only for actual compute time, down to the CPU cycle. No more wasted resources or idle costs—just efficient, scalable computing power when you need it.

Industry: Software Development
Company size: 11-50 employees
Headquarters: New York City, New York
Type: Privately Held

Updates

  • 🖼️ Interested in productionizing diffusion models? Our developer advocate Charles Frye is hosting a webinar on Wednesday, Dec 18. Using a live example of building an artsy QR code generator, we'll cover general ML topics (iteration, evaluation, and GPU acceleration) as well as Modal-specific ones (repo structure, job handling, and CI/CD). Link to register below. https://lnkd.in/gH_AxfwW

    Productionizing Diffusion Models on Modal: QArt Code Deep Dive

    crowdcast.io

  • NVIDIA L40S GPUs are now available on Modal at $1.95/hr. Dropping just in time for the holiday content freeze 🫡 L40S GPUs are juicier than our current most popular inference-focused accelerator, the A10. They are priced between A10s and A100s on our platform. When might you use them? ⬆️ To run bigger models (like Flux schnell) that don't fit on an A10. At 48GB, the L40S has 2x the memory of the A10. ⚡ To run faster inference. Compared to using A10s, users can get up to a 1.4x speedup for memory-bound jobs like small-batch inference and up to a 2x speedup for compute-bound jobs. We ran a quick benchmark on LLaMA 3.1 8B with vLLM and were able to achieve a 1.2x speedup on requests per second without optimizations. Give them a try!

  • Ever wondered what CUDA kernels actually get compiled to? Or tried to figure out just what all the components of the CUDA Toolkit do? Or the difference between "CUDA Cores" & "Tensor Cores"? We've put together a one-stop shop with all the answers! Introducing: the GPU Glossary. The glossary covers the entire stack, from device hardware like warp schedulers and register files, through device software like PTX and kernels, up to host software like the CUDA Toolkit and Nsight Systems. We've included explanatory diagrams for key concepts, like the paired thread and memory hierarchies of the CUDA programming model and how they map onto the hardware. Check out the full resource here: https://lnkd.in/gZcpfBy2

  • ⚽️ New case study ft. one of the top football teams in the world right now ⚽️ Really cool to get an insider's look at how ML is applied to sports and entertainment. The best teams are leveraging computer vision data to get an edge when making decisions both on and off the field. Before Modal, this team tried using a GPU cluster from a hyperscaler for their data processing. The inflexible instance types and long cold starts made scaling and developer iteration very slow, though. By switching to Modal, this team was able to: 💰 Cut their costs in half with Modal’s usage-based serverless pricing ⏳ Reduce data processing time from hours to minutes with Modal’s fast autoscaling Excited to be a small part of this team's quest to be top of the table!

    Advay Pal

    advaypal.com

    I've followed and played ⚽ for decades. So naturally, when I saw a chance to help one of the best teams in the world, I jumped at the opportunity! Computer vision and data analytics have revolutionized the world of sports in recent years. Using Modal, one of the world's best teams collects 3.5 million spatio-temporal data points on players and the ball for each game. This data is turned into high-dimensional embeddings and is used to answer questions like:
    - Was it the right time to take a shot?
    - How effective was the positioning of the players?
    - How do other teams handle such situations?
    Read more at https://lnkd.in/g-awZYt8

  • 🧬 Biotech is one of the fastest-growing industry segments on Modal. The team at Chai Discovery uses Modal to build state-of-the-art computational bio models. Last week, they updated the license of their Chai-1 molecular structure prediction model so that it can be used commercially. Today, we added a detailed example on how you can run this model yourself on Modal! If you're a biotech company looking to speed up batch data preprocessing or parallelize experimentation on GPUs, we'd love to chat!

  • Dive into our newest blog post, where Eric Zhang breaks down the technical process of implementing static IP address capabilities for serverless containers! ✏️

    Eric Zhang

    Founding Engineer, Modal

    Excited to share a low-level networking project we've been working on at Modal: static IPs for serverless function containers using WireGuard proxies. Engineers often need to balance design tradeoffs, and one of the most common hesitations about using serverless functions we've heard at Modal is the inability to set IP whitelists on your databases. This is important for security and regulatory compliance! We've engineered a plug-and-play solution that encrypts traffic and transparently masks the source IP, no matter how many containers you run around the world. The blog post explains how this is done (to my knowledge, it's the first of its kind) and includes some network diagrams :) https://lnkd.in/esrsmAWc

    WireGuard at Modal: Static IPs for Serverless Containers

    modal.com

  • Modal reposted this

    Charles Frye

    Building useful technology with large neural networks

    Last-minute sponsor addition: Modal! Compete to win the coveted "But I'm Not a Wrapper" prize for best self-hosted solution. $5k in credits & a stuffed llama.

    Senso

    2,731 followers

    Voice/Video AI is transforming our daily digital interactions. We're excited to channel this momentum into CreatorsCorner's third #GenAI Agents Hackathon 🔥 Following previous successes, Amazon Web Services (AWS), Tandem, Unified.to, Retell AI, Marly, Coval (YC S24), Temporal Technologies, Simli, SpeedLegal, Senso, and other pioneers are joining forces again to provide you with cutting-edge tools to build the next generation of AI solutions. 📅 When: Saturday, December 7th, 2024 ⏰ Time: 10 AM - 8 PM 📍 Location: Downtown San Francisco Prize Pool: $20,000+ in Credits & Cash Whether you're a seasoned AI builder or just starting your journey, this is your chance to help shape the future of human-AI interaction through voice and video. Apply now: https://lu.ma/i8bow7sr

  • Yesterday was a fun day for text-to-video models 🍵 On the non-protest-leak side of the house, Genmo released LoRA fine-tuning scripts for its Mochi model. Mochi is open-source and currently the top trending video diffusion model on Hugging Face. You can get started by running these scripts on Modal—no waiting around to get a GPU or fussing with dependencies. We tried this out ourselves on fine-tuning Mochi to create "explosion" videos of popular logos!

Funding

Modal: 2 total rounds

Last round: Series A, US$ 16.0M