Tianshu Cheng’s Post

Name: Tianshu Cheng on LinkedIn: #ai #genai #genaiapp #docker #gpu #cloud #llm #whisper #opensource…
Uploaded: 2024-12-05T18:30:29.872Z
Channel: Tianshu Cheng

Tianshu Cheng

Make GenAI inference affordable @Baseten | ex-mosaicML(Databricks) | ex-Twitter

Super excited to share my first project as a Forward Deployed Engineer at Baseten. 🚀

With the launch of Custom Servers, developers can now deploy a production-ready model server from any Docker image (open source or in-house) onto a highly reliable, auto-scaling GPU cloud with ONE CLICK.

💡 "You bring your Docker image, we make it production-ready from day 1."

Try it today!

Quick demo: see post below 👇
Blog: https://lnkd.in/gggmrsmu
Doc: https://lnkd.in/gXuKqi7F

#ai #genai #genaiapp #Docker #GPU #cloud #LLM #whisper #opensource #startup #aistartup #developers #genaidevelopers

Baseten

5,629 followers

1w Edited

We’re excited to introduce Custom Servers on Baseten!

To run customers’ mission-critical inference workloads, Baseten had to be great at 2 things:

1: Performance optimizations at the model level
2: Massive-scale infrastructure with cross-cloud horizontal scaling

All wrapped in an expressive DevEx.

We've built extensive tooling for performance optimizations, like our optimized Engine Builder in Truss. However, some of our customers come to us with pre-optimized models, mainly wanting to take advantage of the seamless autoscaling Baseten provides.

With the launch of Custom Servers, you can bring your Docker image, untouched, and turn on cross-cloud autoscaling, fast cold starts, and low-latency chaining of models for compound AI systems.

This makes it easier than ever to leverage our world-class infra, with blazing-fast inference and effortless autoscaling for any demand, coupled with the most expressive DevEx.

Shoutout to our engineers Tianshu C., Sidharth Shanker, and Bola Malek across our Forward-Deployed and Core Product teams for this feature!

Learn more in the launch blog: https://lnkd.in/e_6_GyAh

Check out the demo by the lead engineer behind Custom Servers, Tianshu C.!

3 Comments

Tianshu Cheng

Make GenAI inference affordable @Baseten | ex-mosaicML(Databricks) | ex-Twitter

More background story from our founders. https://www.linkedin.com/posts/amirhaghighat_its-launch-day-but-first-the-backstory-activity-7270523259308126208-_WQq?utm_source=share&utm_medium=member_ios

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

The emphasis on "production-ready from day 1" suggests a streamlined deployment pipeline leveraging infrastructure as code tools like Terraform or Pulumi. Auto-scaling GPU cloud instances likely utilize container orchestration platforms like Kubernetes, ensuring efficient resource allocation for diverse AI workloads. Given the focus on open-source and in-house Docker images, how would Baseten's platform address potential security vulnerabilities stemming from untrusted custom models during deployment?

Jiaqi Wang

Director of Product @Mobvoi | AIGC creator | ex-TikTok

cool man!


More Relevant Posts

Collabnix - Docker, Kubernetes and IoT

1,141 followers
4mo Edited
Leverage Karpenter for clusters with workloads experiencing fluctuating resource demands or diverse compute requirements. For static workloads, consider Managed Node Groups or Autoscaling Groups. https://lnkd.in/guCRX75i

Benefits of Karpenter: Simplifying Kubernetes Cluster Autoscaling

https://collabnix.com
Paul Brazell

Senior Technical Account Manager at Amazon Web Services (AWS)
4mo
Karpenter is a flexible, efficient, and high-performance Kubernetes compute management solution that helps improve application availability, reduce operational overhead, and increase cluster compute utilization. Alongside the graduation from beta, this 1.0 release includes three new features for Karpenter:

1/ the ability to specify disruption reasons (e.g. underutilization, emptiness, drift) for disruption budgets,
2/ a forceful disruption mode that helps customers balance application availability against security requirements, and
3/ an expansion of consolidateAfter, which lets customers better tune Karpenter’s consolidation feature to meet their cost-efficiency and application availability requirements.

https://lnkd.in/drDi3rXY

Announcing Karpenter 1.0 - AWS

aws.amazon.com
Syself

902 followers
2mo
Who wants to run AI workloads in Kubernetes? 🙋 With Syself Autopilot and the new GPU servers from Hetzner, we make it possible to train AI models in a GDPR-compliant location. With the scalability of Kubernetes, fully managed. 🚀🤖

Check out the GPU servers here: https://lnkd.in/eqYv_VCK
Contact us today to get your trial: https://syself.com/demo

New to this?
🔶 We offer a declarative, fully managed approach on Kubernetes, so you can focus on using Kubernetes.
🔶 No need for provisioning, maintaining, patching servers, or ensuring compatibility while upgrading Kubernetes.
🔶 We even test upgrades extensively, so you don’t have to spend any time on them.
🔶 No need to use Terraform: just use Kubernetes to manage Kubernetes and your tooling (e.g., kubectl, ArgoCD, Flux).
🔶 Our intelligent software manages your infrastructure, heals itself, and removes unhealthy servers.
JS-techies

5 followers
1mo
Serverless computing can reduce overhead and increase scalability. What’s your experience with serverless solutions, and what are the main advantages or challenges? #Serverless #CloudArchitecture #ScalableTech
Sinapi LLC

7,686 followers
1mo
Serverless Computing: Is it Right for Your Project?

Wondering if serverless is the perfect fit for your next project? This blog post breaks down the pros and cons.

Key takeaways:
• Cost-effective: Pay only for what you use.
• Scalable: Easily handle fluctuating workloads.
• Faster development: Focus on code, not infrastructure.

Click here to read the full blog post: https://lnkd.in/eYvQGBs5

#serverless #cloudcomputing #softwaredevelopment #blog #blogpost #sinapiblog
Lyon Till

🔧 Software Engineer @ Microsoft
6mo
#Azure - We recently announced the public preview of Standby Pools for Virtual Machine Scale Sets with Flexible Orchestration. Standby Pools is a new service that enables you to increase your scaling performance by creating a pool of pre-provisioned virtual machines from which your scale set can pull when scaling out. Standby pools reduce the time to scale out by performing various initialization steps, such as installing applications/software or loading substantial amounts of data, ahead of time. These initialization steps are performed on the virtual machines in the standby pool before they are moved into the scale set. #Compute #VirtualMachines #Scaling

Announcing the Public Preview of Standby Pools for Virtual Machine Scale Sets

techcommunity.microsoft.com
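The standby-pool idea in the post above can be sketched as a toy model. This is a minimal illustration in Python, not Azure's API or implementation: the `ScaleSet` class, its method names, and the timing values are all hypothetical, and it assumes the instances in one scale-out start in parallel, so the wait is set by the slowest one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ScaleSet:
    """Toy scale set that prefers pre-initialized standby instances."""

    cold_start_seconds: float  # hypothetical: provision and initialize a new VM
    warm_start_seconds: float  # hypothetical: move a ready VM out of the pool
    standby: deque = field(default_factory=deque)
    active: list = field(default_factory=list)
    created: int = 0  # counter used only to give each VM a unique name

    def _new_vm(self) -> str:
        name = f"vm-{self.created}"
        self.created += 1
        return name

    def prewarm(self, count: int) -> None:
        """Do the slow initialization ahead of demand and park the results."""
        for _ in range(count):
            self.standby.append(self._new_vm())

    def scale_out(self, count: int) -> float:
        """Add `count` instances and return the seconds until all are ready.

        Instances are assumed to start in parallel, so one cold start is
        enough to make the whole scale-out take the cold-start time.
        """
        if count <= 0:
            return 0.0
        used_cold = False
        for _ in range(count):
            if self.standby:
                # Fast path: the VM was initialized before it was needed.
                self.active.append(self.standby.popleft())
            else:
                # Slow path: the pool is empty, so provision from scratch.
                self.active.append(self._new_vm())
                used_cold = True
        return self.cold_start_seconds if used_cold else self.warm_start_seconds


if __name__ == "__main__":
    scale_set = ScaleSet(cold_start_seconds=300.0, warm_start_seconds=20.0)
    scale_set.prewarm(3)
    print("first scale-out:", scale_set.scale_out(2), "seconds")
    print("second scale-out:", scale_set.scale_out(2), "seconds")
```

With three pre-initialized instances in the pool, a scale-out of two is served entirely from the pool and takes the warm-start time. Once the pool runs dry, the next scale-out falls back to the cold-start time.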
AntStack

4,714 followers
2mo
Why does Lambda get all the attention? Watch Sheen Brisals explain how Lambda reaches its limits, drives new use cases, and when containers become the ideal choice for heavier workloads. Catch the full conversation here: https://bit.ly/3MEFR0k #TechTrends2024 #AIForBusiness #CloudComputing #Serverless #DigitalTransformation #AutomationTools #Productivity #TechTalks #WebTech #AntStack #AppModernization
Sinapi LLC

7,686 followers
1mo Edited
Serverless Computing: Is it Right for Your Project?

Wondering if serverless is the perfect fit for your next project? This blog post breaks down the pros and cons.

Key takeaways:
• Cost-effective: Pay only for what you use.
• Scalable: Easily handle fluctuating workloads.
• Faster development: Focus on code, not infrastructure.

Click here to read the full blog post: https://lnkd.in/eRwPc35z

#serverless #cloudcomputing #softwaredevelopment
Kritech Technologies Pvt ltd

303 followers
7mo
Unleash Your Development Potential: Dive into Serverless Computing

Serverless computing is revolutionizing the way applications are built and deployed. Say goodbye to managing servers and hello to a world of scalability, cost-efficiency, and faster development cycles.

Focus on Code, Not Infrastructure: Serverless removes the burden of server management, allowing developers to focus on writing great code.
Pay-Per-Use Model: Only pay for the resources your application uses, making serverless ideal for event-driven workloads and short-lived applications.
Scalability Made Simple: Serverless infrastructure scales automatically based on demand, ensuring your application can handle any traffic surge.

Ready to break free from the chains of traditional development? Explore serverless computing and see how it can empower your development team!

#ServerlessComputing #CloudNative #CloudDevelopment #KritechTechnologies