Thursday, March 28, 2013

Software Estimation - meaning and sources of error

Do you remember when you made your first software estimate? I cannot. It must have come very naturally and unconsciously: perhaps when I was examining the requirements of my first assignment, looking for room for an extra feature, or when we divided the work in my first group project, or when I was asked "how long will it take?" during my internship. Nowadays, the projects I estimate have become so complicated that I can no longer depend on intuitive judgment for reliable estimates. My techniques and mindset obviously need to be upgraded to meet the new situation. Reliable software estimation is neither black magic nor fiction; it is a skill that can be improved through retrospection, discipline and practice. In this post, I will briefly capture what an accurate estimate means to an organization and identify the sources of estimation error. (Slides are available)

Estimation in software development

Estimation is a building block of many activities in the software development life cycle. These include schedule planning (detailed schedule, complete work breakdown structure), budget planning (prioritizing functionality, dividing iterations) and resource planning. The accuracy of an estimate affects these activities and, in general, the project's ability to hit its targets. So when a consulting company like ours is asked for an estimate, we are not being asked for a tentative judgement that we can change our minds about later. We are being asked for a commitment, or a plan to meet a target.

Given that responsibility, developers still pay too little attention to how they estimate. If I asked you how long it would take to finish your coming release, there is a huge chance that I would get a single-point number as an answer. In fact, project outcomes follow a probability distribution (McConnell, 2006). A project might be done sooner, or later. The outcomes in the middle of the distribution are the most likely. Moreover, there is always a limit on how well a project can go, so the left side of the distribution is truncated.

What we have now is a representation of the probability distribution of project outcomes. The single-point "estimate" we had earlier is actually a target.
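To make the difference between a target and a distribution concrete, here is a minimal sketch. It assumes, purely for illustration, that outcomes follow a shifted lognormal shape (a hard best case on the left, a long tail on the right). The function names and all the numbers are hypothetical and do not come from McConnell.

```python
import random


def simulate_outcomes(best_case_weeks, typical_overrun_weeks, spread,
                      runs=10_000, seed=42):
    """Simulate project durations as best case plus a lognormal overrun.

    The lognormal shape is an assumption chosen only because it is
    truncated on the left and has a long right tail.
    """
    rng = random.Random(seed)
    return sorted(
        best_case_weeks + typical_overrun_weeks * rng.lognormvariate(0.0, spread)
        for _ in range(runs)
    )


def chance_of_hitting(outcomes, target_weeks):
    """Fraction of simulated outcomes that finish within the target."""
    return sum(1 for weeks in outcomes if weeks <= target_weeks) / len(outcomes)


def percentile(sorted_outcomes, fraction):
    """Duration below which roughly `fraction` of the outcomes fall."""
    index = min(len(sorted_outcomes) - 1, int(fraction * len(sorted_outcomes)))
    return sorted_outcomes[index]


if __name__ == "__main__":
    outcomes = simulate_outcomes(best_case_weeks=8, typical_overrun_weeks=4, spread=0.5)
    print("chance of hitting a 12-week target:", chance_of_hitting(outcomes, 12))
    print("duration we are ~90% likely to meet:", percentile(outcomes, 0.9))
```

Under these assumptions, a single number such as "12 weeks" is only one point on the curve. The useful questions are how likely that point is, and which duration corresponds to the confidence level the business needs.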

What are estimates used for in your company? If you don't really care how software development works, estimation is all about charging the customer. But there is much more to it once you accept the probability view above. Taking the estimate as a means of visualization, project planners can see the gap between a project's business targets and its estimated schedule and cost.
"A good estimate is an estimate that provides a clear enough view of the project reality to allow the project leadership to make good decisions about how to control the project to hit its targets" - McConnell, 2006
As it is hard to get a 100% accurate estimate and error is inevitable, which is better, overestimating or underestimating? Overall, the penalty for underestimation is more severe than that for overestimation. However:
"The focus of the estimate should be on accuracy, not conservatism. Once you move the focus away from accuracy, bias can creep in from different sources and the value of the estimate will be reduced" - McConnell, 2006

Why your estimate is not accurate

The cone of uncertainty

The amount of uncertainty in a software development project varies over time and typically narrows as the project progresses. This implies that an estimate made at the beginning of a project is less accurate than estimates made at later stages. At the beginning of a project, the targets haven't been fully conceptualized yet. Decisions at this level are broad and subject to change. In contrast, most decisions in the development phase are significantly smaller and focus on implementation details. Do not expect the cone to narrow by itself, though. Unless decisions that remove uncertainty are made, the variation does not go away.

Organization structure that kills both productivity and predictability

An environment that
  • Makes room for multitasking to creep in
  • Employs people with inadequate technical skills
  • Applies incomplete or unskilled project management
is a great way to suppress our productivity. Productivity suffers and goes down, rarely linearly and usually exponentially. The key point is that the parameters of this exponential function depend heavily on human nature and context. In other words, they are a mystery. And with that, there goes our predictability.

Unstable requirements

Everyone in the industry knows how destructive requirement changes can be. They prevent us from narrowing the cone of uncertainty. They are also often poorly tracked, and the project isn't re-estimated when it should be. Project control strategies are outside the scope of this post. But in terms of estimation strategy, we can incorporate an allowance for requirement growth and changes (McConnell, 2006). Whether that allowance is 20%, 30% or 50% depends on your own judgement.
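As a small worked example of such an allowance, the sketch below applies the three percentages mentioned above to a base estimate. The function name and the 100 man-day base figure are hypothetical, and a flat multiplier is only one simple way to model requirement growth.

```python
def estimate_with_allowance(base_estimate, growth_allowance):
    """Inflate a base estimate by a fractional allowance for requirement growth.

    growth_allowance is a fraction, e.g. 0.3 for 30%.
    """
    if growth_allowance < 0:
        raise ValueError("growth_allowance must not be negative")
    return base_estimate * (1 + growth_allowance)


if __name__ == "__main__":
    base_man_days = 100  # hypothetical estimate for the currently known requirements
    for allowance in (0.2, 0.3, 0.5):
        total = estimate_with_allowance(base_man_days, allowance)
        print(f"{allowance:.0%} allowance -> {total:.0f} man-days")
```

Showing the base figure and the allowance separately keeps the assumption visible, so it can be revisited when the project is re-estimated.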

Omitted activities

Include time in your estimates for stated requirements, implied requirements and non-functional requirements. Nothing can be built for free, and your estimates shouldn't imply that it can.

Unfounded optimism

Developers are hopelessly optimistic creatures. Yes, we are. That's a fact we can't deny. This behavior is closely related to omitted activities. Our minds are not made for multitasking. Instead of juggling things simultaneously, people tend to focus on the one thing that interests them most, that they know best, or that they perceive as most important to the project, and discard the rest. The moment you fail to see the project as a whole, subjectivity and bias creep in.

Estimate Influences

Everything that helps you understand the nature of software development improves estimation accuracy.

Project size

Project size implies the number of potential variations in a project. The bigger the project, the harder it is to estimate. If building a 10-story building takes 20 man-months, how long would it take to build a 100-story one? Much more than 200 man-months. During construction, many issues would emerge: architecture to support the height and to withstand wind and natural disasters; logistics, so that materials are always ready and idle time is minimized; a power plan to keep the structure habitable. That's a perfect metaphor for software development. As a project gets bigger, there are more modules. These modules need to communicate with each other. The number of these communication paths corresponds to the effort required to complete the project. And it grows much faster than linearly, roughly with the square of the number of modules. This phenomenon is widely known as diseconomy of scale.

One of the biggest problems I have had in this area was failing to anticipate the rich interaction between components. The initial estimate for a component on its own was usually correct, but then the missing pieces of how it interacted with the others kicked in and blew the estimate.
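The growth in communication paths can be sketched with a few lines of arithmetic. The sketch assumes the worst case, in which every module may talk to every other module, so n modules have n(n-1)/2 possible paths. The function name is hypothetical, and the path count is only a rough proxy for effort.

```python
def communication_paths(modules):
    """Number of distinct pairs among `modules` fully connected modules."""
    if modules < 0:
        raise ValueError("modules must not be negative")
    return modules * (modules - 1) // 2


if __name__ == "__main__":
    for modules in (2, 5, 10, 20, 100):
        print(f"{modules:>3} modules -> {communication_paths(modules):>5} possible paths")
```

Going from 10 to 100 modules multiplies the module count by 10 but the number of possible paths by 110 (45 versus 4950). This is why a simple linear extrapolation from a small project underestimates a large one.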

Type of software

Different types of software have different focuses and difficulties. Developing an e-commerce website is generally easy and focuses on reusability to maximize profit. Developing a life-critical system is not only complicated in its own right but also constrained by industry or legal regulations. Although not all software development activities produce code, these differences are eventually reflected in LOC over time and show a strong difference in velocity across project types.


Researchers have found a 10-fold difference in productivity and quality between good and bad programmers with the same level of experience, and also between different teams in the same industry (McConnell, 2008). This degree of variation is not unique to the software industry; it is common among occupations, including professional writing, sports and police work. The top 20 percent of the people produce about 50 percent of the output. On the other hand, 10 percent of the people contribute negatively to the output, i.e. they can't fix the problems they create.

Steve McConnell. (2006). What Is an "Estimate"? In: Software Estimation: Demystifying the Black Art. Redmond: Microsoft Press.
Steve McConnell. (2006). Value of Accurate Estimates. In: Software Estimation: Demystifying the Black Art. Redmond: Microsoft Press.
Steve McConnell. (2006). Estimate Influences. In: Software Estimation: Demystifying the Black Art. Redmond: Microsoft Press.
Steve McConnell. (2008). 10x Software Development. Available: Last accessed 26th Mar 2013.

Monday, March 18, 2013

[Lightning talk] Group by time interval in SQL

The other day I needed to write a query that grouped entries by time intervals (week, month, etc.).

I thought of PostgreSQL's generate_series first, as it can create a series of values with a precise step. So my first attempt was to select the start and end dates of a series of one-week intervals, starting from the first case creation date.
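The idea of that first attempt can be sketched outside the database. This is not the original query. It is a minimal Python illustration of producing consecutive one-week intervals from a first creation date, with hypothetical names and half-open intervals (start inclusive, end exclusive) as an assumption.

```python
from datetime import date, timedelta


def weekly_intervals(first_date, last_date):
    """Return consecutive (start, end) one-week intervals covering the range.

    Intervals are half-open: an entry belongs to an interval when
    start <= created_date < end.
    """
    one_week = timedelta(days=7)
    intervals = []
    start = first_date
    while start <= last_date:
        intervals.append((start, start + one_week))
        start += one_week
    return intervals


if __name__ == "__main__":
    for start, end in weekly_intervals(date(2013, 3, 1), date(2013, 3, 20)):
        print(start, "to", end)
```

Note that these intervals start on whatever weekday the first date happens to fall on, which is exactly the property that turned out to be a problem below.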

It was not as easy as it seems. I went through a hell of type casting. After that, the easy step would be to group the entries whose creation date falls into these intervals.

In fact, I didn't even get to that step. I had gone in the wrong direction: given the context of the feature I was implementing, it made more sense to group entries by calendar week, not by an arbitrary 7-day interval.

In order to achieve that, the most critical step was being able to trace back to the beginning of the week from any given date. Thank god, that was easy.

The price I paid for going the wrong way was high, but in this case, I guess I will just eat it.
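Tracing a date back to the start of its calendar week is a one-liner in most environments. In PostgreSQL, date_trunc('week', ...) truncates a timestamp to the Monday of its week. The post does not show the query that was actually used, so the Python sketch below only illustrates the same idea, with hypothetical function names and the assumption that weeks start on Monday.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(day):
    """Return the Monday of the calendar week that contains `day`."""
    # date.weekday() is 0 for Monday and 6 for Sunday.
    return day - timedelta(days=day.weekday())


def count_by_week(created_dates):
    """Count entries per calendar week, keyed by that week's Monday."""
    return Counter(week_start(day) for day in created_dates)


if __name__ == "__main__":
    created = [date(2013, 3, 18), date(2013, 3, 20), date(2013, 3, 28)]
    for monday, count in sorted(count_by_week(created).items()):
        print(monday, count)
```

Because every date maps to exactly one week start, the grouping key can be computed per row, and no separate series of intervals has to be generated or joined.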

Sunday, March 10, 2013

3 Days of Lean and Kaizen

This week, from March 4th to 6th, the first Lean Mindset Workshop and Kaizen Camp Gathering in HCMC was held. The Agile Vietnam guys did a truly awesome job bringing top-notch people from around the globe to the event. That included the experts on Lean Software Development, Mary and Tom Poppendieck, as well as Jim Benson and Tonianne DeMaria Barry, founders of Personal Kanban.

Joining the event was both a last-minute decision and a bet for me. Three days before the event, the Saigon Service Design Jam was held. It was also the first of its kind in Vietnam, and it also lasted three days. I couldn't afford both events. Choosing the Lean and Kaizen one was truly a leap of faith, as I had stopped attending Agile Vietnam events a while back. Although I had to commute 30km in Saigon's heat for the event, it was the right choice.

What was there?

The Lean Mindset Workshop occupied the whole of March 4th.

Mary and Tom were an admirable 70-something couple who were spending their retirement travelling around the world, giving Lean Mindset workshops and writing books. Honestly, I was overwhelmed by the amount of information. The usually two-day workshop was compressed into one, so there was a lot of talking, and for a short period in the afternoon I was lost.

The workshop started by examining the death of companies that were driven in the name of productivity, developed protective manuals and policies that ironically prevented innovation, and finally failed to focus on what their customers valued. The periods of stability between economic crises are getting shorter, the market fluctuates more than ever, and yet disruptions remain extremely hard to forecast. There is a rising need for organizations to structure themselves so that they stay flexible in the face of change. The ideas of the Lean Mindset revolve around this concept.

We moved on to identify the seven flow disrupters in software development. In groups of 7-8, we discussed the sources, effects, and solutions of each. The next step was to give our organization a health check based on the basic disciplines of a healthy organization. The list had a lot to do with quality control and, as expected, Cogini failed hard on testing disciplines (though we got most of the rest right).

Due to the time limit, we could only wrap up Iteration, Kanban, and Continuous Delivery quickly. Kanban was still a pretty new concept in Vietnam's software industry, so it was important to stress that Kanban doesn't ask an organization to change anything that it currently does. Positioning itself as a supporting framework for visualizing processes and policies, Kanban can be integrated smoothly into an organization's current process. The three processes covered in the workshop are tools that serve the purpose of making organizations more Lean and agile.

To end the day, we had an interesting talk about improving productivity:
  • The legendary 10,000 hours
  • Balancing challenge and skill
  • The science of motivation

Kaizen Camp took over the last two days of the event. 

I found the way the workshop and the camp lined up and supported each other fascinating. If there had been only the workshop, most attendees would have come out like me, overwhelmed and confused by the massive amount of information. If there had been only the Kaizen Camp, there wouldn't have been enough topics to feed two days full of action. During the camp, we discussed work, life, and community. And in the spirit of kaizen, we also discussed improving all of them. The camp shared the same format as Barcamp and, even though there were facilitators, the topics were community-centric and highly diverse. Fascinating!

The topics gave me the opportunity to challenge and practice what I had been introduced to during the workshop. One of the reasons I stopped attending Agile Vietnam events was that even though the sharing there was interesting, I found that each organization was unique in certain ways. How I could apply something that had been successful elsewhere was always blurry. Facilitating a group of motivated people isn't about teaching them or telling them what to do but more about asking triggering questions to get the participants to open up and speak out about what is on their minds. And Mary, Tom, Jim and especially Toni made awesome facilitators. Many questions were answered specifically for Cogini's context and the fog just lifted. I ended up facilitating a session or two myself.

Nice people

I have attended a number of Barcamps, locally and internationally. Yet the attendees at this event managed to be the most diverse. I didn't meet as many managers as I would have liked to, but there was a gentleman from TMA who really knew what he was talking about. I met a girl from the North who seemed to be full of doubts in life yet had decided to join a startup. I made friends with two engineers with tremendous experience from KMS, and a head of department at a game company. There was also a startup guy who seemed to be making mistakes similar to the ones I made when getting to know Agile methodologies. And who could have known that I would run into a high-school friend there? Behind each attendee was a more sophisticated story. A startup with the bureaucracy problems of a Fortune 500 company. A company that sent only programmers and no managers to the event. A sponsor that didn't mind that its name couldn't show up on the printed media. Despite all of those differences, I was moved by their genuine desire to learn, share and make friends. Once again, my belief that engineers are the nicest people on earth was strengthened. Thank you.

* Special thanks to Huy Tran for allowing me to use his photos to illustrate the post