DORA metrics and beyond: Improving developer productivity and reducing burnout with API observability

Published on 30 Jul, 2024 - by Budhaditya Bhattacharya
Last updated: 16 Aug, 2024

Would you like to support your developers to be more productive? How about reducing their cognitive load and risk of burnout? The DevOps Research and Assessment (DORA) team’s four metrics can be a great starting point, but be ready to go beyond them for maximum results.

Tyk recently discussed this with Ari-Pekka Koponen, Head of Platform at Swarmia. Ari shone a spotlight on the intersection of API observability and engineering efficiency, covering key points including:

DORA metrics are a good starting point but not a silver bullet.
API observability can support you to achieve your DORA metric goals.
Going beyond DORA to look at developer productivity, business outcomes and developer experience is the key to taking things to the next level.
The right culture and structure are just as important as the right tools when it comes to improving productivity and reducing burnout.

Improving developer productivity and reducing burnout with API observability

API observability can make your engineering organisation more effective in achieving its goals. DORA metrics can help with this. The most well-known modern software delivery metrics, they are well-researched and have a proven relationship to business performance.

DORA metrics aren’t a silver bullet to solve all productivity challenges, but they are a good set of baseline metrics if you’re just getting started with improving your engineering effectiveness. They can help you understand if there are things you need to fix and ship high-quality software as fast as possible.

DORA metrics and API observability

How do DORA metrics intersect with API observability? Let’s look at the time to restore service metric as an example of this.

Time to restore service is the time it takes to restore a service in production after a change failure. A good goal is to achieve this in an hour or less on average. API observability can help here, as it leads to faster recovery.

Many incidents in complex systems start small and then snowball. With observability in place, alerts can kick in and you can revert changes before they ever turn into incidents.
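As a minimal sketch of how time to restore service could be tracked, the snippet below averages the gap between a failure being detected and service being restored. The incident timestamps and the function name are hypothetical, made up purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (failure detected, service restored).
incidents = [
    (datetime(2024, 7, 1, 10, 0), datetime(2024, 7, 1, 10, 40)),
    (datetime(2024, 7, 9, 14, 15), datetime(2024, 7, 9, 15, 45)),
    (datetime(2024, 7, 20, 8, 30), datetime(2024, 7, 20, 8, 50)),
]


def mean_time_to_restore(records):
    """Average duration between failure and restoration."""
    durations = [restored - failed for failed, restored in records]
    return sum(durations, timedelta()) / len(durations)


mttr = mean_time_to_restore(incidents)
# Compare against the "an hour or less on average" goal mentioned above.
meets_goal = mttr <= timedelta(hours=1)
print(f"Mean time to restore: {mttr}, within one hour: {meets_goal}")
```

In practice the timestamps would come from your alerting or incident tooling rather than a hard-coded list.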

Observability can help with the change failure rate metric, too. That’s the share of your deployments that lead to incidents, rollbacks or other failures. Your goal should be to keep this low – below ten percent.
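The change failure rate is simple arithmetic: failed deployments divided by all deployments. Here is a small sketch, assuming a hypothetical deployment log where each entry is already flagged as failed or not.

```python
# Hypothetical deployment log: 20 deployments, one of which (deploy-7)
# led to an incident, rollback or other failure.
deployments = [
    {"id": f"deploy-{i}", "failed": i == 7} for i in range(1, 21)
]


def change_failure_rate(deploys):
    """Share of deployments flagged as failed (0.0 if there are none)."""
    if not deploys:
        return 0.0
    failures = sum(1 for d in deploys if d["failed"])
    return failures / len(deploys)


rate = change_failure_rate(deployments)
below_target = rate < 0.10  # the "below ten percent" goal
print(f"Change failure rate: {rate:.0%}, below target: {below_target}")
```

The hard part is not the division but deciding what sets the `failed` flag, which is the definitional question the next paragraphs turn to.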

Tracking change failures starts with defining what counts as a change failure. That’s often not straightforward. Incidents are clearly change failures, but what about smaller bugs or performance regressions?

This is where service level objectives (SLOs) come into play. Defining good SLOs helps your teams track their software’s quality. Having API observability in place allows you to know when you’re not meeting your SLOs and which deployments caused change failures. It means you can learn much faster about the kinds of changes that cause the failures. And once you know the cause, you can implement a solution and ship fewer bugs to production.
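One way to picture this, as a sketch only: group observed API responses by the deployment that served them, then flag any deployment whose error rate exceeds the SLO. The 1% error budget, the deployment names and the status code samples are all assumptions for illustration.

```python
# Assumed objective: at most 1% of requests may return a server error.
SLO_MAX_ERROR_RATE = 0.01

# Hypothetical HTTP status codes observed while each deployment was live.
responses_by_deployment = {
    "deploy-41": [200] * 995 + [500] * 5,
    "deploy-42": [200] * 960 + [503] * 40,
}


def error_rate(status_codes):
    """Fraction of responses that were server errors (5xx)."""
    errors = sum(1 for code in status_codes if code >= 500)
    return errors / len(status_codes)


def deployments_breaching_slo(samples, max_error_rate):
    """Return the deployments whose error rate exceeds the objective."""
    return [
        deployment
        for deployment, codes in samples.items()
        if error_rate(codes) > max_error_rate
    ]


breaches = deployments_breaching_slo(responses_by_deployment, SLO_MAX_ERROR_RATE)
print("Deployments that count as change failures:", breaches)
```

With a rule like this, "what counts as a change failure" becomes an explicit, testable definition rather than a judgment call per incident.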

The third DORA metric is deployment frequency. This speed metric measures how often a software team pushes changes to production. The goal is to be able to deploy on demand as many times per day as you need to.
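Deployment frequency can be approximated as production deployments per day over a chosen window. The dates below are hypothetical, and the inclusive date window is just one reasonable convention.

```python
from datetime import date

# Hypothetical dates on which a team deployed to production.
deploy_dates = [
    date(2024, 7, 1), date(2024, 7, 1),
    date(2024, 7, 2),
    date(2024, 7, 4), date(2024, 7, 4), date(2024, 7, 4),
]


def deployments_per_day(dates, start, end):
    """Average deployments per calendar day in the inclusive window."""
    days_in_window = (end - start).days + 1
    in_window = [d for d in dates if start <= d <= end]
    return len(in_window) / days_in_window


frequency = deployments_per_day(deploy_dates, date(2024, 7, 1), date(2024, 7, 5))
print(f"Average deployments per day: {frequency:.1f}")
```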

Again, API observability can help. After all, when engineers feel confident that they can catch issues quickly, they’re more likely to deploy more often. This requires other DevOps capabilities too, of course, such as automated testing and the ability to roll back changes quickly. But wherever you are in your DevOps journey, every team will benefit from having more confidence in knowing when things don’t work as they should. Less risk equals more deployments.

The final DORA metric is change lead time – the time it takes to get committed code to run in production. Your goal should be to ship changes in less than a day.
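A sketch of measuring change lead time, assuming you can pair each commit timestamp with the time it reached production. The sample data is hypothetical, and using the median rather than the mean is a choice made here so that one slow change does not dominate the result.

```python
from datetime import datetime
from statistics import median

# Hypothetical changes: (commit time, time the change ran in production).
changes = [
    (datetime(2024, 7, 1, 9, 0), datetime(2024, 7, 1, 12, 0)),
    (datetime(2024, 7, 2, 10, 0), datetime(2024, 7, 3, 6, 0)),
    (datetime(2024, 7, 3, 8, 0), datetime(2024, 7, 4, 14, 0)),
]

# Lead time of each change, in hours.
lead_times_hours = [
    (deployed - committed).total_seconds() / 3600
    for committed, deployed in changes
]

median_lead_time = median(lead_times_hours)
under_a_day = median_lead_time < 24  # the "less than a day" goal
print(f"Median lead time: {median_lead_time:.1f} h, under a day: {under_a_day}")
```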

Once more, observability can have a positive impact. The more comfortable you are with catching bugs early and simply rolling back a change in production, the lighter you can make your QA and release processes. This has a huge impact on your ability to ship and iterate faster (use feature flags to do so efficiently).

What lies beyond DORA metrics?

There are three major areas of engineering effectiveness that you should consider when going beyond DORA metrics:

Developer productivity
Business outcomes
Developer experience

Developer productivity looks at whether your engineering organisation is making fast and consistent progress towards its goals. It encompasses everything from issue trackers and code review processes to work-in-progress limits, lead times, DORA metrics and other frameworks such as the SPACE framework (Satisfaction and wellbeing, Performance, Activity, Communication and collaboration, and Efficiency and flow).

Business outcomes is about checking that your organisation is focusing its investment on the right outcomes. It includes consideration of how you structure your teams and leadership, how you set priorities, how you balance investment, how you manage cross-team initiatives and which projects and features you’re investing time in.

Finally, developer experience (DX) includes things such as interruptions and flow, workload and meaningful work.

Improving DX and reducing burnout

Heavy workloads and a lack of meaningful work contribute to burnout. They result in too many interruptions and a lack of flow state, when really a developer should be spending as much time in flow as possible, happily getting deep work done.

Everything from Slack pings to meetings to code reviews can mess up a developer’s day. Even a single high priority customer request can derail things and lead to a long stop in a developer’s flow, costing hours of productive work.

API observability can help. It can ensure alerts ping the right person from the outset, avoiding unnecessary interruptions where issues are escalated repeatedly until they reach the right individual.
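As a rough illustration of routing an alert to the right person first time, the sketch below maps API path prefixes to owning on-call rotations and picks the most specific match. The ownership map, team names and fallback are hypothetical and not taken from any particular tool.

```python
# Hypothetical ownership map: API path prefix -> on-call rotation.
OWNERS = {
    "/payments": "payments-oncall",
    "/payments/refunds": "refunds-oncall",
    "/users": "identity-oncall",
}
DEFAULT_OWNER = "platform-oncall"  # assumed fallback when nothing matches


def route_alert(endpoint):
    """Return the owner with the longest path prefix matching the endpoint."""
    matches = [
        prefix
        for prefix in OWNERS
        if endpoint == prefix or endpoint.startswith(prefix + "/")
    ]
    if not matches:
        return DEFAULT_OWNER
    return OWNERS[max(matches, key=len)]


print(route_alert("/payments/refunds/123"))
print(route_alert("/payments/charge"))
print(route_alert("/search"))
```

Matching on the longest prefix means a refunds alert goes straight to the refunds rotation instead of being escalated through the wider payments team.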

Having API observability in place can help in other ways, too. Anxiety hinders flow, so if you’re anxious about possible firefighting, or incidents, or the results of your pull request being deployed, it becomes hard to focus on meaningful work. This contributes to burnout. On the flip side, confidence that there is sufficient observability in place that the developer will be notified of anything urgent makes getting to flow easier and burnout less likely.

What are the key metrics for defining developer productivity?

DORA metrics are a good starting point when you’re getting your deployment pipelines and DevOps practices into a decent shape. At that point, you might focus on pull request cycle time, which usually shows if you have issues with teamwork or if you’re not structuring work into small enough increments.
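Pull request cycle time can be sketched as the time from a pull request being opened to it being merged. The records and the 48-hour threshold for flagging slow pull requests are assumptions for illustration.

```python
from datetime import datetime

SLOW_THRESHOLD_HOURS = 48  # assumed cut-off for a "slow" pull request

# Hypothetical pull request records.
pull_requests = [
    {"id": 101, "opened": datetime(2024, 7, 1, 9, 0), "merged": datetime(2024, 7, 1, 13, 0)},
    {"id": 102, "opened": datetime(2024, 7, 2, 9, 0), "merged": datetime(2024, 7, 5, 9, 0)},
    {"id": 103, "opened": datetime(2024, 7, 8, 10, 0), "merged": datetime(2024, 7, 9, 6, 0)},
]


def cycle_time_hours(pr):
    """Hours between a pull request being opened and merged."""
    return (pr["merged"] - pr["opened"]).total_seconds() / 3600


cycle_times = [cycle_time_hours(pr) for pr in pull_requests]
average_cycle_time = sum(cycle_times) / len(cycle_times)
slow_prs = [pr["id"] for pr in pull_requests if cycle_time_hours(pr) > SLOW_THRESHOLD_HOURS]
print(f"Average cycle time: {average_cycle_time:.1f} h, slow PRs: {slow_prs}")
```

Outliers such as pull request 102 here are often the useful signal: they tend to point at oversized changes or review bottlenecks.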

Next, it’s time to see if you’re focusing enough on the right things and look at your investment balance, essentially seeing how much of your work is roadmap work. You can look at your key initiatives and expand visibility into where time is going. This usually leads to productive conversations about where time is being spent and what could reduce interruptions and bottlenecks.
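Investment balance boils down to the share of effort going to each category of work. A minimal sketch follows, with hypothetical categories and hours; how you categorise work is your own call.

```python
from collections import defaultdict

# Hypothetical work log: (category, hours spent).
work_items = [
    ("roadmap", 30),
    ("roadmap", 18),
    ("keeping the lights on", 20),
    ("unplanned", 12),
]


def investment_balance(items):
    """Return each category's share of the total hours."""
    totals = defaultdict(float)
    for category, hours in items:
        totals[category] += hours
    grand_total = sum(totals.values())
    return {category: hours / grand_total for category, hours in totals.items()}


balance = investment_balance(work_items)
for category, share in balance.items():
    print(f"{category}: {share:.0%}")
```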

How do you free up overloaded developers?

Thinking carefully about your team structures is key to avoiding developer cognitive overload. The book Build: Elements of an Effective Software Organization covers this in detail. Essentially, you have to be mindful about having the right teams with the right expertise and then look at their workload. Otherwise, the team ends up focused on fighting fires just to keep the lights on – and that means your investment balance is off.

Building a platform team that handles part of the stack is one solution. The platform team provides services for other teams, such as product teams that focus on building features. That can work well to reduce the load on product teams.

Communicating your metrics

Whether it’s your DORA metrics or those beyond DORA, thought needs to go into who’s responsible for communicating them. Who that is will, of course, depend on the size of your organisation. It can range from a single enthusiastic individual to an entire developer productivity team.

The gains can be huge. An organisation with hundreds or thousands of engineers, for example, might find that speeding up a particular build tool can save them tens of thousands of hours per year.
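To see how a figure in the tens of thousands of hours can arise, here is a back-of-the-envelope calculation. Every input is a hypothetical number chosen only to illustrate the arithmetic, not a measurement from any real organisation.

```python
# All inputs are hypothetical, chosen only to illustrate the arithmetic.
engineers = 1000
builds_per_engineer_per_day = 10
minutes_saved_per_build = 1
working_days_per_year = 230

minutes_saved_per_year = (
    engineers
    * builds_per_engineer_per_day
    * minutes_saved_per_build
    * working_days_per_year
)
hours_saved_per_year = minutes_saved_per_year / 60
print(f"Estimated hours saved per year: {hours_saved_per_year:,.0f}")
```

Under these assumptions, shaving a single minute off each build adds up to roughly 38,000 hours a year.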

Ultimately, communicating your metrics is about empowering people to do things themselves and developing a culture of mutual trust. When you don’t have trust, people stop communicating and then you lose crucial visibility. So, give people the tools and trust they need, have cross-team retrospectives and focus on building the right culture and fixing the right bottlenecks.

The final word on developer productivity and observability

Whether you start focusing on developer productivity, business outcomes or developer experience, your journey will deliver rewards. Get started wherever makes most sense for your organisation. Start small, start easy and then build, iterate and optimise.
