Beyond Cards on a Board
Analyzing the flow of the invisible work of software development
Note: This is an updated version of a post we published almost a year ago when we first launched the Polaris Advisor program. It has been updated to reflect our learnings over the last year of working with clients to implement the program on the ground.
We’ve all seen them wherever software teams work: the whiteboards with sticky notes posted in neat little rows and columns, or more likely these days, the electronic Scrum or Kanban board filled with card-like things in even neater rows and columns.
People sit quietly at their desks tapping away, and every now and then someone moves a card from one column to the next. Occasionally there is a high-five when a card reaches the last column. If this were “bring your kids to the office” day, the little ones would not be out of line thinking their mum or dad "made" software simply by moving cards on a board.
However, everything interesting about making software happens between the time a card enters one column and moves to the next. Yet, almost all current techniques and tools for analyzing software development processes treat cards as the basic primitives and the movement of cards between columns in the board as the basic operation in a software process.
We count cards to measure throughput. We put limits on the number of cards in a column to manage bottlenecks. We measure stats on how long cards took to move between columns to analyze bottlenecks, usually after all the movements are done, and the bottlenecks have moved on as well.
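As a concrete sketch of this card-level analysis, here is a minimal way to compute cycle time and throughput from card column-transition events. The event records, card IDs, and column names are hypothetical stand-ins for whatever your board tool exports:

```python
from datetime import datetime

# Hypothetical card-movement events: (card_id, column, entered_at).
events = [
    ("CARD-1", "In Progress", datetime(2020, 3, 2)),
    ("CARD-1", "Done",        datetime(2020, 3, 6)),
    ("CARD-2", "In Progress", datetime(2020, 3, 3)),
    ("CARD-2", "Done",        datetime(2020, 3, 5)),
]

def cycle_times(events, start="In Progress", end="Done"):
    """Days each card spent between entering `start` and entering `end`."""
    starts, result = {}, {}
    for card, column, ts in events:
        if column == start:
            starts[card] = ts
        elif column == end and card in starts:
            result[card] = (ts - starts[card]).days
    return result

ct = cycle_times(events)
print(ct)            # {'CARD-1': 4, 'CARD-2': 2}
print(len(ct))       # throughput: cards completed in the window
```

Note that everything this computes happens *after* the fact: the numbers tell you how long a card took, not why.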
It is not surprising that we analyze software development as though it was a manufacturing process, since this is where these ideas originated. But this is a very approximate model for analyzing how software is made.
The core production activity in software is people making changes to a shared code base. Even though "making all work visible" is a key principle of Lean thinking, most of the actual value-creating work and work products remain hidden behind the cards on a Kanban board.
While there is huge value in knowing how long cards took to move between columns, if you want to know why a card took so long to move, or why it's been sitting in one place for so long, you need some way of looking at what was happening in between those movements, while people were sitting quietly at their desks tapping away.
The trouble is that the folks looking on from the outside have no visibility into the actual process of making software, and without this situational awareness, we enter the dreaded “negative feedback loop of distraction”.
Here's an example.
- The engineering manager wants to know why that card has not moved in two days, so
- He walks over to (or emails, or Slacks) the engineer to find out why, and in doing so
- Breaks the zen-like state in which she is cranking out code while typing quietly away.
Her answer is probably not going to be of much help to the manager, except to make him feel somewhat better (or worse). In the meantime, he has probably broken the flow state she was in, setting the card back an extra two hours, and we are back at step 1.
So, because we don't have visibility into the actual work, the rituals of engineering management end up reducing engineering productivity without necessarily adding value.
Developer focus and attention are among the most critical resources in software development, but we don't really build our engineering processes around the best ways for software teams to focus on making and shipping code. Even when we do, we have no way to quantify the impact of doing so on the rest of the business.
This is how most of us work today. And how can it not be?
The work and work product are invisible, so at best, we are managing software processes entirely via indirect representations and proxy metrics.
What is the work product of a software team?
Our thesis is that the set of code changes that make up a releasable, deployable code increment of the product is the actual work product from a software team.
To model how software process delivers value, we need to model how a team collaborates to define, make and ship a set of changes to a shared code base.
The cards on the board represent the value that the customer expects when we deliver the software, but in terms of analyzing a software delivery process, they merely represent demand signals on the shared capacity and developer attention needed to make and deliver code increments. If we are to use a manufacturing analogy, think of them as the work orders needed to create the actual work product.
You can go a long way toward improving flow in a software delivery process simply by analyzing the flow of these cards, which is what most current approaches do: identifying imbalances between delivery capacity and demand, setting the correct WIP levels, managing aging cards, and spotting capacity imbalances at a functional level across engineering, test, and delivery.
Card-level flow analysis also sheds light on the value-added and non-value-added work being performed, and identifies waste: work that gets started but never finishes, features that get built but never get used, and so on. These are all hugely valuable concerns in Lean software development and process optimization, so we are not proposing throwing out the valuable techniques we use today to optimize flow.
But if we want to understand why those capacity imbalances exist, or why software delays happen on the ground, we have to drop down to one more level of detail and analyze flow at the level of how code increments are made and delivered.
Card Flow vs Code Flow
The code flow of a software process is the engineering process by which the code changes for a feature get promoted from feature branches, through code reviews and merges, onto a releasable, deployable artifact that is shipped to the customer. Modeling and analyzing this code flow is just as critical to optimizing a software process: this engineering process is the actual manufacturing process of software development.
The two notions of flow, card flow and code flow, are intimately connected, and you need to analyze them in tandem to understand both your capacity to deliver code and how to engineer your process to use that capacity most effectively.
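To make the idea of code flow concrete, here is a minimal sketch that measures dwell time between the promotion stages of one feature's changes. The stage names and timestamps are hypothetical; in practice they would come from your version control and review systems:

```python
from datetime import datetime

# Hypothetical code-flow events for one feature's changes:
# each tuple is (stage, timestamp) as the change moves toward release.
code_flow = [
    ("first_commit",  datetime(2020, 3, 2, 9)),
    ("review_opened", datetime(2020, 3, 3, 14)),
    ("merged",        datetime(2020, 3, 5, 11)),
    ("released",      datetime(2020, 3, 9, 16)),
]

def stage_dwell_hours(flow):
    """Hours spent between consecutive code-flow stages."""
    return {
        f"{a[0]} -> {b[0]}": round((b[1] - a[1]).total_seconds() / 3600, 1)
        for a, b in zip(flow, flow[1:])
    }

print(stage_dwell_hours(code_flow))
# {'first_commit -> review_opened': 29.0,
#  'review_opened -> merged': 45.0,
#  'merged -> released': 101.0}
```

The same WIP, throughput, and cycle time questions you ask of cards can be asked of these stage transitions.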
We need to start looking at the cards on the Kanban board as demand signals for development and delivery capacity, and the work product as a deployable code increment that delivers the changes from a batch of cards.
A few things become apparent once you do this.
- Unlike manufacturing, the cards don't represent independent, isolated units of work that can be planned and delivered independently. When you analyze the code flow needed to deliver any batch of cards, it becomes very clear that there are many hidden dependencies between cards, in terms of the knowledge and skills that need to come together at the right place and time to build a single, stable code increment that incorporates the changes from all the cards in a batch.
- Planning and estimation processes typically never account for the complexity introduced by these hidden dependencies. How can they? These are completely invisible at the planning stage. Making the code flow explicit and tracking it alongside card flow gives us the levers to manage this more effectively.
- If you view engineering capacity in terms of the time and effort needed to make and ship code changes, the cards on the board reflect only a fraction of the work that draws on this capacity. A good fraction of the work engineers do is not deemed "business facing" and is most often not tracked on the board, yet it can take up anywhere from 30-50% of the effort engineers put in. If you define your flow policies without accounting for this missing capacity, you are vastly overestimating your ability to deliver, and it should not be a surprise when plans don't pan out.
- Flow concepts like work in progress, throughput, and cycle time apply just as directly to the flow of code through your code promotion process as they do to the flow of cards on the board, and the two are intimately connected. We can apply Lean principles from manufacturing to optimize code flow and reduce waste.
- Unshipped code, rather than incomplete cards, is the primary unit of inventory. Unshipped code is a perishable good: it has a very short shelf life, and when it is not merged and shipped in a relatively short period of time, the effort you put into it is much more likely to be abandoned, or to require significant additional effort to integrate and ship. This means the carrying costs of aging code are a huge hidden inefficiency in a software process unless they are made visible and managed.
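Treating unshipped code as inventory suggests a simple report: flag unmerged branches that have exceeded a shelf-life threshold. Here is a minimal sketch; the branch names, timestamps, and the 14-day threshold are hypothetical, and in practice the data would come from `git for-each-ref` or your VCS host's API:

```python
from datetime import datetime, timezone

# Hypothetical unmerged branches and the timestamp of their first commit.
branches = {
    "feature/login-flow":  datetime(2020, 2, 10, tzinfo=timezone.utc),
    "feature/new-billing": datetime(2020, 3, 1, tzinfo=timezone.utc),
}

def aging_inventory(branches, now, max_age_days=14):
    """Unshipped branches older than the shelf-life threshold, with age in days."""
    return {
        name: (now - started).days
        for name, started in branches.items()
        if (now - started).days > max_age_days
    }

now = datetime(2020, 3, 9, tzinfo=timezone.utc)
print(aging_inventory(branches, now))  # {'feature/login-flow': 28}
```

A branch surfaced by a report like this is a candidate for either shipping soon or writing off, before the integration cost grows further.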
These are the fundamental problems we seek to solve in our approach to crafting fast, predictable engineering processes in our Polaris Advisor program.
The Polaris Advisor Program
Our thesis is that the way we design engineering processes today is oblivious to the ergonomics of software delivery: the time and effort it takes to efficiently deliver high-quality software. As a result, the ROI on software investments is much lower than it could be if we optimized for this.
The basic idea to operationalize this is simple: version control systems have a detailed history of how a code base was changed to implement cards. By connecting each card to the actual code changes being made by engineers to implement the requirements in the card, in real time, we’d know how far along the card was in implementation at any time. This allows us to analyze work in progress in detail as it is happening - something that is very hard to do by just looking at cards on a board.
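One common way to make this connection, sketched below, is to scan commit messages for card IDs and group the commits under them. The commit records, the `CARD-42` identifier, and the ID pattern are all hypothetical illustrations, not the actual Polaris mechanism:

```python
import re

# Hypothetical commit log entries: (sha, message, author).
commits = [
    ("a1b2c3", "CARD-42: add validation to signup form", "ana"),
    ("d4e5f6", "Fix flaky test in payments module", "raj"),
    ("g7h8i9", "CARD-42: wire validation into API layer", "ana"),
]

# Matches Jira-style card IDs such as CARD-42.
CARD_ID = re.compile(r"\b([A-Z]+-\d+)\b")

def commits_by_card(commits):
    """Group commits under the card IDs mentioned in their messages."""
    mapping = {}
    for sha, message, author in commits:
        for card in CARD_ID.findall(message):
            mapping.setdefault(card, []).append((sha, author))
    return mapping

print(commits_by_card(commits))
# {'CARD-42': [('a1b2c3', 'ana'), ('g7h8i9', 'ana')]}
```

Even this crude mapping reveals which cards have code in flight, and which commits (like the untagged fix above) represent work the board never sees.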
Not only that, we’d know how the card was implemented: what code changes were made by whom, when they were made, how long they took, what areas of the code base were affected, and a whole lot more.
As a bonus, the card would tell you why the code changes were made, which provides a tremendous amount of business context around a particular set of code changes, that you cannot analyze by looking at code changes in isolation.
It gives you the tools to understand the actual process by which every team builds software, a process that is different for every team and potentially for every card that is implemented.
When combined with Continuous Measurement, a technique in which we aggregate real-time measurements over a low-level model mapping cards to code changes to build a true picture of engineering capacity, we not only get a lot of valuable data that engineers can use, but can also build tools on top of this data that give everyone on the team (engineers, tech leads, managers, product owners, and executives) the ability to visualize the detailed flow of work in engineering in real time.
We can answer questions ranging from an engineer asking “who broke my code just now?” to the CTO asking “why do features take so long to build?” or “are engineers working on the right priorities?”, to the product owner asking “was the customer value we got from this feature worth the engineering dollars we spent on it?”, and of course the engineering manager asking “why hasn't that card moved for the last two days?”
All from a single, consistent model of your end-to-end delivery process from requirements to release.
The Polaris Platform, which we use to run this program, is a modern measurement platform for Lean software development. It connects to standard work-tracking and DevOps platforms and automates the collection of the real-time data needed to analyze software product development flow at both the value and code increment granularity.
Our program is designed to help teams analyze how they work to deliver software and systematically execute on strategies that improve the ergonomics of software delivery.
We help you ship high-quality software faster, and manage engineering capacity to deliver maximum customer value effectively and efficiently.
Our approach is process-agnostic and is focused entirely on measuring and improving process outcomes such as speed, predictability, quality, and costs, while connecting them to the nitty-gritty details of changing, testing, and shipping code. So you can use it on top of whatever flavor of agile (or even waterfall, if that is still your thing) you may have in place. It works for teams of all sizes.
Give us a call if you might be interested in seeing what we can do for you.
Please Subscribe and Share
We'll be sending a lot more updates about these ideas in follow up posts on this blog. So if you think you might be interested in following along, please subscribe to our mailing list below.
And if you think there are ideas here that are worth spreading, please share on your social networks. We'd love to get the word out :)