How to measure developer productivity?

The nineteenth-century retail innovator John Wanamaker is credited with the famous quote "Half the money I spend on advertising is wasted; the trouble is I don't know which half." Replace John Wanamaker with Jeff Bezos, however, and Bezos might lament that the same is true of software development.

Since Wanamaker's day, the Internet has revolutionized the ability of marketers to measure how advertising contributes to sales and growth, to the point where VCs expect scrappy startups to quantify these metrics. The same cannot be said for software development. Despite the widespread adoption of agile and lean practices, software development remains an expensive endeavor whose cost and revenue contributions are difficult to trace.

McKinsey recently caused a stir among developers with the article "Yes, you can measure software developer productivity" by claiming they could do just that. In a two-part series of articles, Kent Beck and Gergely Orosz address potential flaws in McKinsey's recommendations.

As realists, however, Beck and Orosz also know that CTOs and other executives will want to measure developer productivity anyway. With that in mind, how can we identify ways to improve developer productivity? The key to productivity may be to focus not on the outcomes that people want, but on the decisions that lead to those outcomes.

As a former Chief Decision Scientist at Google, Cassie Kozyrkov knows a lot more about decision-making than most people do. When she recently created the online course "Decision Intelligence" on LinkedIn, I was eager to take the course and gain some insights. After all, to be the best, one needs to learn from the best!

Within the first few minutes of the course, she warns against outcome bias:

Always evaluate decisions based only on what was known at the time that decision was made.

You don't know the outcome when you make a decision, so you can't use the outcome to determine whether the decision was good or not! Otherwise, you risk punishing people for information they could not have known when a decision was made.

To hold people accountable for decisions, keep a decision log and update it as soon as the decision is made. The more time passes after a decision is made, the less likely one is to remember the circumstances leading to the decision.
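
As a rough illustration, a decision log can be as simple as an append-only list of timestamped entries. The sketch below is only one possible shape, not a prescribed format: the names DecisionEntry and DecisionLog, and every field on them, are hypothetical and chosen for this example. The point is that each entry captures what was known when the decision was made and cannot be edited afterward.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: an entry cannot be altered once recorded
class DecisionEntry:
    """One decision, captured when it is made (all field names are illustrative)."""
    decision: str
    decided_by: str
    known_at_the_time: tuple[str, ...]   # facts available when deciding
    alternatives: tuple[str, ...] = ()   # options considered and rejected
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class DecisionLog:
    """Append-only log: entries are added, never rewritten after the outcome is known."""

    def __init__(self) -> None:
        self._entries: list[DecisionEntry] = []

    def record(self, entry: DecisionEntry) -> DecisionEntry:
        self._entries.append(entry)
        return entry

    def entries(self) -> tuple[DecisionEntry, ...]:
        # Return a tuple so callers cannot modify the log in place.
        return tuple(self._entries)


# Example usage with made-up data.
log = DecisionLog()
log.record(DecisionEntry(
    decision="Ship the checkout rewrite behind a feature flag",
    decided_by="payments team",
    known_at_the_time=("A rollback currently requires a full redeploy",),
    alternatives=("Long-lived release branch",),
))
```

Recording the timestamp automatically, rather than asking people to fill it in later, is a deliberate choice here: it keeps the entry tied to the moment of the decision, before anyone knows the outcome.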

However, make sure to reward people for making decisions, because doing nothing is also a decision. The military general and statesman Colin Powell penned a rule called the "40-70 rule," which has since been adopted across the US Marine Corps. In short, a decision made with less than 40 percent of the total information is likely a bad decision, and if you wait for more than 70 percent, events will unfold regardless of your decision.
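
Expressed as code, the rule is just a pair of thresholds. The sketch below is a hypothetical helper (the function name and return labels are mine), and it assumes you can put a number on how much of the total information you have, which in practice is a judgment call rather than a measurement.

```python
def powell_40_70(information_fraction: float) -> str:
    """Classify decision timing under the 40-70 rule as described above.

    information_fraction is a subjective estimate between 0 and 1 of how
    much of the total relevant information is currently available.
    """
    if not 0.0 <= information_fraction <= 1.0:
        raise ValueError("information_fraction must be between 0 and 1")
    if information_fraction < 0.40:
        return "too early"    # likely a bad decision: not enough information
    if information_fraction > 0.70:
        return "too late"     # events will unfold regardless of the decision
    return "decide now"       # inside the 40-70 window


print(powell_40_70(0.55))
```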

You want to reward a decision apart from its outcome. If it was the best decision at the time with the information available, then reward the decision, and as Amazon founder Jeff Bezos puts it, encourage "high-quality, high-velocity decisions." After all, many decisions are reversible, so if things go wrong, making a different decision with new information will help you change course toward the desired outcome.

In contrast, ignoring decisions can lead to perverse incentives. By rewarding only outcomes, companies may inadvertently incentivize decisions such as sandbagging (deliberate underperformance) and shirking (avoiding difficult work) that favor individuals at the expense of the whole.

For example, a well-meaning company might seek to maximize outcomes such as "story points completed in-sprint" and minimize "bugs released to production." Some people, however, will seek to control these outcomes by gaming the system. In this example, a developer might avoid complex stories that require new skills and have more potential for bugs in favor of several simple stories that are more likely to be completed in the same sprint with few bugs.

When a company rewards outcomes without considering the decisions behind them, selfish decisions such as deliberately delaying or avoiding difficult work may produce outcomes that look favorable at a cursory glance but are detrimental in the long run, such as no one on the team acquiring a new skill.

Your best developers, meanwhile, are probably making decisions that seek to maximize their impact for a given amount of effort. For example, developers who write utility functions, templates, and frameworks can improve the performance of their entire team, creating a larger impact than focusing solely on individual tasks. Training teammates also has a multiplying effect. For example, training on code styles can make code reviews more efficient for all team members, saving small amounts of effort initially that compound over time by making code maintenance easier.

How to measure developer productivity?

Measure decisions instead of outcomes.