After Hours On-Call Support

The company I work for is the antithesis of Google: where Google kills products almost religiously, my workplace has deprecated almost nothing in its thirty years of existence, yet still loves to ship new products and systems.

There are valid reasons for this operating strategy: our customers appreciate the long-term support and reliability. But couple this with an unprecedented shortage of software engineers in the industry, and you have a recipe for trouble.

I’m on the after-hours on-call roster, which is voluntary. There are not many others on the roster, and we monitor a large number of services, as well as a catch-all Jira service desk that is routed to my team.

With so many things that could fail, the likelihood of receiving a callout is very high. There can be massive gaps of several months between callouts, interspersed with so-called “hell weeks”, where there can be a callout almost every night.

Being called out this often is a problem, but I always come up empty when thinking of a solution. The business wants high availability for all services, but we don’t have the headcount to work on the reliability of those services.

Last modified on 2022-08-25