Gremlin

Software Development

San Jose, California 11,434 followers

The Reliability Management Platform for high-velocity engineering teams

Discover all 65 employees

About us

Gremlin’s Reliability Management Platform enables high-velocity engineering teams to standardize and automate reliability across their organizations without slowing down software delivery. Gremlin's Reliability Score sets the standard for reliability so there's no guesswork, and an automated suite of Reliability Management tools makes it easy to integrate reliability throughout the software lifecycle so there's no slowdown.

Website: http://www.gremlin.com
Industry: Software Development
Company size: 51-200 employees
Headquarters: San Jose, California
Type: Privately Held
Founded: 2016
Specialties: Distributed Systems, Resilience, Failures as a Service, DevOps, and Chaos Engineering

Locations

Primary

55 S Market St

Ste 1205

San Jose, California 95113, US

555 Montgomery St

Ste 811

San Francisco, California 94111, US

Updates

Gremlin

19h
📈 72% of organizations were using AI at the start of 2024, and that number is likely to keep growing throughout 2025. With expectations around AI reaching historic highs, the reliability of these systems is more important than ever. Enter Gremlin GPU: the easiest way to build more resilient machine learning and AI models and to test and validate the scalability of your systems. Learn more at the link in the comments.

Gremlin

20h
2025 is fast approaching, which means it’s the perfect time to look at how you can build a culture of reliability within your organization. Urgent troubleshooting burns out teams and creates a reactive environment. Introduce proactive reliability efforts and you start to build a calm, collaborative culture. So where do you start? See how Ritchie Bros. created a culture of reliability and drove innovation within their team: https://lnkd.in/gmcj26jH

How Ritchie Bros Creates a Culture of Reliability

gremlin.com

Gremlin

1d
2025 is just a few weeks away. How are you going to change your reliability efforts going into the new year? 🚀 Where to start: run reliability tests to check for cracks in your system. And while you’re at it, make regular reliability management part of your team’s workflow (we recommend a weekly cadence). Get ahead of potential issues and scale effectively with Gremlin’s Reliability Management Quick Start Guide (link in the comments). ⬇️ 🚀 Bonus: Once you’ve run the essential tests, put those insights to work. Make a plan for addressing any uncovered risks, and create reports for stakeholders in your organization. A more reliable 2025 starts today.

Gremlin

5d
No matter what type of AI model you use, they all have one thing in common: they need to crunch a lot of data, and GPUs are the most effective tool. 💡 But what happens when those GPUs are busy? Can your infrastructure scale to meet changing demand? ➡️ Enter Gremlin GPU experiments, the best way to test and validate the scalability of your systems. If you have an LLM deployed, run a GPU experiment alongside it to simulate heavy loads or additional workloads. While the experiment is running, monitor the performance, throughput, and availability of your LLM to determine what (if any) impact there is. It’s easy to get started: find it in the Gremlin web app, or learn more (and get started with your free trial) at the link in the comments.
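
The before-and-during comparison described above could be summarized along these lines. This is only a sketch under stated assumptions: the `Probe` record and the `summarize` and `impact` helpers are hypothetical names, not part of Gremlin's product or API, and collecting the probes is left to whatever load-testing or monitoring tool you already use.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Probe:
    """One request sent to the LLM endpoint (hypothetical record shape)."""
    latency_s: float  # wall-clock time for the request, in seconds
    ok: bool          # True if the request succeeded
    tokens: int       # tokens returned (0 on failure)


def summarize(probes: list[Probe], window_s: float) -> dict[str, float]:
    """Reduce raw probes to availability, throughput, and median latency."""
    if not probes or window_s <= 0:
        raise ValueError("need at least one probe and a positive window")
    good = [p for p in probes if p.ok]
    return {
        # Share of requests that succeeded.
        "availability": len(good) / len(probes),
        # Tokens delivered per second over the observation window.
        "tokens_per_s": sum(p.tokens for p in good) / window_s,
        # Median latency of successful requests only.
        "median_latency_s": median(p.latency_s for p in good) if good else float("inf"),
    }


def impact(baseline: dict[str, float], during: dict[str, float]) -> dict[str, float]:
    """Per-metric change from the baseline run to the run under GPU load."""
    return {name: during[name] - baseline[name] for name in baseline}
```

Collect one set of probes before the experiment and one while it runs, pass each to `summarize` with the length of its observation window, and hand both results to `impact`. A drop in availability or throughput, or a rise in median latency, suggests the GPU contention is affecting the LLM.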

Gremlin

1w
Even the most robust systems can experience downtime. This year’s CrowdStrike outage, along with smaller outages across the industry, served as a warning: no system is immune. So, how do you prepare for the unpredictable?

1️⃣ Run reliability tests regularly. Identify weak spots before they become critical failures.
2️⃣ Simulate outages. Test your disaster recovery plan with realistic simulations.
3️⃣ Automate where possible. Reduce human error and recover faster.

These outages aren’t just wake-up calls; they’re opportunities. The companies that proactively invest in reliability today will minimize customer impact and protect their reputations tomorrow. 💡 See how the team at Ritchie Bros. uses Chaos Engineering to prepare for the unexpected and ensure resilience: https://lnkd.in/gmcj26jH

How Ritchie Bros Creates a Culture of Reliability

gremlin.com

Gremlin

1w
The end of the year is the perfect time to implement proactive reliability measures that will pay dividends in 2025. Why?

🔹 Teams are less overwhelmed by day-to-day incidents.
🔹 It’s the ideal moment to reflect, plan, and prepare.
🔹 Setting up proactive measures now creates breathing room for innovation later.

Start here:

✔️ Set up a regular cadence for reliability testing in 2025.
✔️ Automate the low-hanging fruit to avoid unnecessary toil.
✔️ Build a roadmap for handling risks uncovered during testing.

Let Gremlin help you take the first steps (link in the comments).

Gremlin

1w
GPU Gremlin is officially here! 🎉 Our latest addition will help you build more resilient machine learning and AI models, and make simulations and video streaming much more reliable. Available now to all Gremlin customers. Learn more at the link in the comments! #AI #MachineLearning #GPU

Gremlin

1w
Black Friday 2024 has come and gone. Given everything you learned, how do you start preparing for Black Friday 2025? Proactive reliability means that even the busiest days will feel manageable, because you’ve already established the solid foundations to guide your team. Here are 3 resources to help you get started as you look towards next year: 🚀 https://lnkd.in/gipz7Wnz 🚀 https://lnkd.in/gjs7QYRM 🚀 https://lnkd.in/guAdtn7b

Seven tests to measure and improve reliability: what matters and how it works

gremlin.com

Funding

Gremlin 3 total rounds

Last Round

Series B Oct 28, 2018

US$ 18.0M

Investors

Redpoint

Employees at Gremlin

Josh Leslie

CEO of Gremlin | Making applications more reliable | GTM-focused investor & advisor

Jason Heller

Helping teams build more reliable systems

Stefano Pirovano

IT Sales leader with a special focus on Startups - 4x IPOs

Kolton Andrus

CTO and founder of Gremlin Inc.
