At the beginning of this month, the infra team at Orb gathered for our Q2 infra offsite at HanaHaus in Palo Alto. Orb's customer growth in Q1 far exceeded even our optimistic projections and, as is often the case in any org, this put a tremendous amount of load on Orb's infrastructure stack. Here's what is top of mind for our team as we look to the next few months:

1. We take uptime & reliability extremely seriously and have a strong track record here: we proactively invest in our stack to avoid scrambling when issues happen. Over the last six months, we've invested in read/write workload isolation, more robust rate limiting, and making it much easier to spin up tenant-specific clusters of our core services and datastores. This has already been a huge win for our largest customers, and in Q2 we're driving towards complete tenant workload isolation all the way from the API layer down.

2. Infrastructure often gets saddled with a really tough oncall rotation. At Orb, we share this workload across engineering with a larger rotation, but infrastructure owns the horizontal platform that makes product engineers successful. In Q1, the infra team invested heavily in making oncall better, and in the last month the number of overnight pages has dramatically decreased. Example investments include automated alerting and triage for long transactions and expensive queries.

3. When we first started building out Orb, we were careful to be pragmatic about where we spent energy on scaling efforts. One of the choices we made was to use RDS Postgres as a job queue. Although this has worked very well up to thousands of asynchronous jobs each minute, we've hit the EOL signs you'd expect (https://lnkd.in/gyQVEhKQ). We're spending time this quarter on moving to a different job queueing solution.

4. In Q1, we rearchitected parts of our ingestion and compute services to pre-compute data more aggressively, which avoids variable load through the API. We're finding that this will also help us ship much faster alerting features in Q2, enabling Orb customers to more efficiently power flows like spend management on Orb (without needing to pay for dedicated provisioning).

If you're interested in joining this small but mighty infra crew, we'd love for you to apply (DM me, or look for the link in the comments).
Would love to learn more, Kshitij!
Co-Founder and CTO at Orb
Apply here! https://jobs.ashbyhq.com/orb/2b527d31-12c5-41e4-80fc-8b63e36143f7