Saturday, June 18, 2022

Prioritizing development decisions


Great startup stories tend to share the same mold: a great founder dreamed of an equally great vision and followed it fearlessly till the world was conquered. History is written by the victors, of course, but it is relatively common for people to have a rough idea of what they want to build before starting the work. That’s the easy part.

But constructing a concrete step-by-step plan to deliver not even the vision but a mere release is hard work. A good plan needs to take advantage of both business and development expertise without letting one overpowers the other. If the business makes all the calls, the development time might be painfully long and the product crashes when traffic starts to peak. If the development decides, we might have a technical wet dream of solving a non-existent problem. That’s where the planning game comes in.

I can’t decide if a game or a dance is a better metaphor for depicting the collaborative nature of planning. Business and development, each possesses knowledge unavailable to the other and is unable to produce the entire plan. The work can only be done by combining the strength of both sides. In economics, a “game” refers to a situation where players take their own actions but the payoff depends on the actions of all players. Game Theory suddenly sounds less conspiratorially adventurous, doesn’t it? Dance on the other hand isn’t used as much in research literature so I ended up siding with the economists. That’s a sidetrack.

In an Agile team, a planning game looks like this:

  1. The Product Owner decides the scope of the plan. Based on the purposes of the projects, the Product Owner prepares a set of use cases and explains why they are valuable problems to solve and why they should be done first.
  2. The whole team breaks each use case down into stories. The idea is usually that anything requiring the team to do something other than normal company overhead needs a story.
  3. The developers “size” the stories. They estimate the time each would take or its complexity. And then group stories that are too small, split ones too big, and decide what to do with stories they can’t estimate.
  4. The Product Owner prioritizes the stories. Some stories won’t be worth adding, either unimportant or too far in the future.
There are many things that can be said about the planning game, from Work In Progress should be minimized, the releases should be small and often, to the best answers for “why does it cost so much?”. Those are stories for another time.

In this piece, I want to discuss specific friction in step 4 of the game where one story is prioritized over another. The stories laid out in step 2 do not necessarily project the same values to different team members. Some are pretty straightforward, implement feature X to earn Y money, the contract was signed. While others are more tricky such as implementing plug-and-play UI components so that future web pages are built faster. The second category usually comes from the development team who is one layer away from the users and so perceives values differently from the Product Owner. That is the breeding ground for misalignment.

Product Owners want to release a solid, usable product. They also have to balance that with the desire to save money and meet market windows. As a result, they sometimes ask developers to skip important technical work. They do so because they aren’t aware of the nuances of development trade-offs in the same way the developers are.

Some developers note down all the development options like a shopping list, “outsource” Product Owners to choose, and then roll in agony at the wrong decisions. If such a strategy didn’t work for the guys at the Pentagon, it wouldn’t work anywhere. Just as Product Owners are the most qualified to decide the product direction, developers are the most qualified to make decisions on development issues. Don’t delegate the decision, take the matter into your own hand. If a development decision isn’t optional then it shouldn’t be prioritized either. Just do it.

Instead of:
Our notifications is crucial at informing customers the health of their business. To make the data pipeline behave transactionally, we have several options. Please let me know how should we prioritize them.

    • Experiment with Flink’s TwoPhaseCommit, this is new to us so it would take time and be hard to estimate.
    • Get Sentry to cover all the projects, this is a passive measure as we passively wait for exceptions.
    • Add a check at the end of the pipeline to make sure no duplicated notifications are generated, the check will have to handle its own state.
    • Move the final stage of the pipeline to Django, it is a web framework that supports transactional requests by default and we are familiar with it.

Try this:
Our notifications is crucial at informing customers the health of their business. The data pipeline is long and consists of multiple nodes, each needs to successfully finish its work to produce a notification. To achieve this notion of exactly-once delivery, we need the pipeline to behave transactionally and every exception to surface swiftly. That is done via Flink’s TwoPhaseCommit and Sentry integration. The work will be done at the beginning of the project as it is easier to handle when the code base is still small. TwoPhaseCommit in particular is new to us so we will have a couple of spike stories to understand the technology.


When there is a business choice to be made, don’t ask Product Owners to choose between technical options. Instead, interpret the technology and describe the options in terms of business impact. To continue our notification example, before any notification is sent, there is a need to make sure the data we have is the latest. The conversation can go like this:

We are thinking about adding another Kafka queue to request the latest data. We then need to join the request flow with the future trigger with some sort of sliding window, will also need to thinking about out of bound data. Our other option is to set not one but two future triggers so that one can request data and the other handles notifications. Which would you prefer?

Try this instead:

We have two choices for ensuring a notification always works on the latest data. We can use a deterministic approach or an empirical approach. The deterministic approach would add a new data request flow right before the notification is sent. The notification is processed after the data request flow so we always sure the latest data is used. But because technically data procession and future notifications are asynchronously independent from each other, it would require several more stories for us to join them together. The empirical approach won’t take any extra work. We observe that it usually takes less than 5 minutes for a data request, so we can set two future triggers instead of one, 10 minutes apart from each other. The first one request data, the second notification. But the margin of error is larger because sometime there can be delay in data request. Which would you prefer?


And finally, no software engineering discussion would be completed without a talk about code refactoring. In the context of the planning game, it is mostly about justifying the refactoring effort. While it is tempting to do a “spring cleaning” hoping to refactor the whole thing back into shape, the sad truth is halting the development of working software for refactoring is hardly justifiable. Refactoring effort deals with risk (the old code can implode at any time) and potential (the new code is easier to work on). Those values are intangible compared to the usual subjects of a business decision (new features lead to a new set of customers lead to greater revenue).

What do we do? Boy scout rule “always leave the campground cleaner than you found it.” Whenever you need to implement a new feature or fix a bug you see if that part needs improvement. Refactoring shouldn’t be a separate phase, it is part of everyday development. Once you nurture this culture of quality, there is nothing to justify.

None of the above suggests the easiest way to avoid friction is to keep the business side in the dark while going on waving the engineer's magic wand. Communication remains the key to any successful project. There is more to a project's success than just business decisions, and working out a way to be a (constructive) part of the conversation is more powerful than a baseless delegation.