Sunday, October 23, 2011

Velocity Assumptions: Take them into account when using it.

Story points and velocity have been standard practices in Agile processes for some time now. Velocity is observed as:

(A) velocity = story points completed in an iteration.

By using velocity we calculate the remaining duration as:

(B) number of iterations left = remaining story points / velocity

If you look at the formula above, one thing should jump out at you: time is the only known factor. Story points were estimated and are not really known.
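As a quick illustration of (A) and (B), here is a tiny sketch with made-up numbers (a 12-point iteration and a 60-point backlog); the arithmetic is trivial, which is exactly why the quality of the inputs matters:

```python
# Made-up numbers, for illustration only.
points_completed_last_iteration = 12      # (A) observed velocity
remaining_story_points = 60               # sum of *estimated* backlog points

velocity = points_completed_last_iteration
iterations_left = remaining_story_points / velocity   # (B)

print(iterations_left)  # 5.0, but only as good as the estimates feeding it
```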

The formulas above come from simple physics, where velocity is distance divided by the time it took to traverse it.

To continue with the analogy, imagine a traveler trying to guess how long it will take to reach a destination. She doesn't have a map or other instruments to know the distance between where she is and her destination. She estimates the distance the best she can, establishes a few milestones, and measures the time it takes to travel from one to another. She doesn't have an accurate measurement of how far she traveled, and uses her initial estimate to calculate velocity. Would you trust her Estimated Time of Arrival? That's what we do in software.

To be fair, using velocity and story points might be the best we have to put forward, but be aware of the number you are using; it is still an estimate.

A few other assumptions and weaknesses you should take into account are below.

1. Estimates hold their proportionality.

The first thing we try to ensure when using story points and velocity for estimation is that a five-point story is a bit more than twice the size of a two-point story.

During a sprint or iteration you might find that, given the established velocity, a story that was estimated as a 2-pointer should take only 3 days, but it ends up taking 7. The advice is to not re-estimate the story but to look at the current backlog and see if the proportions still hold. In other words, if there's reason to believe that a 5-point story is no longer twice a 2-pointer, you should either raise the 2-pointers to the appropriate amount or reduce the 5-point ones [1]. If you follow this advice, your velocity will be skewed. The reasoning behind the advice is that the factors that made the story take so much longer will repeat themselves with other stories, so keeping them in the measurement leads to better estimation.

Unless we have more stories of the same type, we don't know if the proportions still hold. The registration story might be twice the size of the login one, but it is just a guess. In equation (B) above, we are effectively trying to find two unknowns at once: the duration and the remaining points.

Once again, Story Points are an estimate and not really known. We think we have the proportions right, but that might change with time.

Using our traveler example: after she has traveled a day, she looks at what she went through and decides whether the unknown terrain ahead will offer the same challenges she faced before. Unless she knows some of what lies ahead, she won't be able to predict whether she will make the same distance in the same amount of time. It is fair to expect that as a project progresses, the team will be better prepared to estimate the duration of the stories.

2. Distractions/Interruptions remain the same

One other assumption is that, over time, distractions and interruptions consume a relatively constant amount of time. This is why we calculate velocity at the end of the iteration, instead of story by story. How many times, though, did you have an iteration with an unforeseen two-day, whole-team production issue? How often does that happen? According to the law of large numbers [2], if you collect enough measurements (probably more than 20), your estimate will converge to the expected value. In other words, after you collect more than 20 velocity data points, your "interruptions" will average out and your measurement will converge to your team's real velocity.
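To see why the number of samples matters, here is a minimal simulation sketch (not from the original post); the "true" velocity of 20 points and the noise level are invented, but the convergence behavior is the point:

```python
import random

random.seed(1)
TRUE_VELOCITY = 20   # hypothetical "real" velocity, in points per iteration
NOISE = 8            # interruptions, production issues, vacations...

samples = [random.gauss(TRUE_VELOCITY, NOISE) for _ in range(30)]

def running_average(values):
    """Yield the average of the first n samples, for n = 1, 2, 3, ..."""
    total = 0.0
    for i, v in enumerate(values, start=1):
        total += v
        yield total / i

for n, avg in enumerate(running_average(samples), start=1):
    if n in (3, 5, 10, 20, 30):
        print(f"after {n:2d} iterations, average velocity = {avg:.1f}")

# With only 2-3 iterations the average can sit far from 20;
# it settles near the true value only as the sample count grows.
```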

We don’t have enough data points for the law of large numbers to apply. We start measuring and using velocity right away, with 2 or 3 iterations. Those data points are not enough to properly estimate velocity according to the law of large numbers. Most times we provide project cost projections without measuring velocity at all.

To make matters worse, we keep working on process improvements so that velocity goes up. Measuring these improvements is hard because we can't tell whether velocity went up because of them or because of flawed estimates.

3. Teams will remain constant

This assumption, coupled with the previous one, is what concerns me the most. Teams do not remain constant. Team members come and go, not because they are not fully assigned, but because people leave the company or feel they need a new challenge. Aside from that, people get sick, take vacations, etc.

4. Incomplete stories get no credit

It is common practice to grant no credit towards velocity for incomplete stories. The advice exists to prevent arguments about how much of the story was actually complete. Besides, you will get full credit for the story in the next iteration, allowing the velocity to "average out."

So after all of this, what do I think you should do?

What if instead of measuring velocity at the end of the iteration, we measured it at the end of the story?

You will get a better idea of whether you broke the proportionality rule. In other words, if you estimated a story at 2 points and it took 6 hours, and all the other 5-pointers also took 6 hours, you know that your 2-pointer was actually a 5-pointer and you can adjust it.

Another advantage of this is quicker convergence under the law of large numbers. Your project has only a limited number of iterations, but it is not uncommon for a project to have more than 20 stories.

The team and skills will remain constant for a single story, and the variation you will see from story to story will be taken care of by the law of large numbers as well, unless you bias the experiment on purpose by assigning one type of work to a single team.

If you are familiar with the concept of Cycle Time, you will notice that this is what I'm advocating.
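Here is a minimal sketch of what story-level measurement could look like; the story data, hours, and remaining-story count below are all hypothetical:

```python
from statistics import mean

# Hypothetical per-story records: (estimated points, actual hours to complete).
completed = [(2, 6), (5, 7), (2, 14), (5, 13), (2, 5), (5, 12)]

# Average cycle time per completed story, ignoring points entirely.
cycle_time = mean(hours for _, hours in completed)

remaining_stories = 25
print(f"projected remaining effort: {cycle_time * remaining_stories:.0f} hours")

# The same data also exposes broken proportionality: compare hours per point.
for points in (2, 5):
    hrs = mean(h for p, h in completed if p == points)
    print(f"{points}-pointers averaged {hrs:.1f}h ({hrs / points:.1f}h per point)")

# If 2-pointers cost nearly as much per story as 5-pointers,
# the 2s were probably under-estimated and the backlog should be revisited.
```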

Velocity measured at the end of an iteration is a powerful idea, but it doesn't converge to the expected value quickly enough; it takes longer than we can afford. Be aware of the assumptions behind it so you know how to use it well.

Using Cycle Time for estimating duration might provide you with an improved solution, but it won't give you a perfect estimate. If you really want to improve your estimates, work on stabilizing your system by reducing variations and noise (disruptions). It won't be quick, but it will be effective.

[1] Cohn, Mike. 2005. Agile Estimating and Planning. Prentice Hall. p. 61

[2] http://en.wikipedia.org/wiki/Law_of_large_numbers

Friday, June 24, 2011

Code Reviews and Quality

Are code reviews the best way of building in code quality? Although a good practice, code reviews do not assure quality; they inspect it. To build in quality you need to ensure your requirements are good, give your team members time, and make sure they have the knowledge they need to build the software.

Test/behavior-driven development increases code quality; it drives requirements to the point of specification like no other tool we have right now. If you write test cases that only cover the happy path, though, you will probably end up with the same quality you would get from producing only user stories, or nothing at all.

Of course, reviewing the acceptance tests with the engineers writing the code before they start is a good practice; it communicates the intent of the test cases and features.

Deadline pressure drives developers to compromise quality. Sometimes you have to compromise on elegance when the feature you are working on is two weeks, or more, late and you're stuck on a problem you can't seem to solve. Having mandatory code reviews will aggravate this problem, not address it: developers now have to set time aside to allow for code reviews, compressing the time they have to create elegant solutions.

Why not eliminate deadlines? (Just kidding.) Instead of estimating features, perhaps use SLAs based on past history. If small features in the past got done within two weeks 95% of the time, the chance of the next small feature being done in two weeks is 95%. This technique helps set the right expectations with your stakeholders.
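Here is a rough sketch of deriving such an SLA from historical data; the durations and the 95% threshold below are invented for illustration:

```python
# Hypothetical cycle times (in days) for past "small" features.
past_durations = [4, 6, 7, 7, 8, 9, 9, 10, 11, 14, 12, 8, 6, 9, 10, 13, 7, 9, 11, 10]

def percentile(values, pct):
    """Return the value at the given percentile (nearest-rank approximation)."""
    ordered = sorted(values)
    index = int(round(pct / 100 * (len(ordered) - 1)))
    return ordered[index]

sla = percentile(past_durations, 95)
print(f"95% of past small features finished within {sla} days")

# Quote that number as the SLA: "a small feature will be done in
# that many days or less, with roughly 95% confidence based on history."
```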

Another approach is to review the estimates; that is what Planning Poker accomplishes, for example.

One other reason for bad code is the developer's lack of knowledge about the domain, the tools used, etc. Code reviews help, but instead of reducing rework they increase it: precious resources were spent writing the code, and more will be spent reworking it to conform to the reviewer's suggestions.

Pair programming might be more appropriate for this problem. Have the new developer pair with a more knowledgeable developer. If the new developer is the one writing the code with the assistance of a more experienced one, learning will probably occur in a less adversarial environment and code will come out already reviewed. No rework necessary.

Having code inspections/reviews is good, but they won't build quality in; at most they will inspect it afterwards. Code reviews can also create an adversarial environment where people feel threatened and defensive -- not the environment most conducive to learning.

Saturday, August 7, 2010

The Most Important Role in Scrum: The Product Owner - Part II

I came across an interesting observation in a book I have been reading lately [1]. Alan Shalloway wrote "imagine having completed a software development project, only at the end to lose all of the source code".

Although a scary scenario, it illustrates well that the hard work in a software development project is not the writing of the code itself but figuring out what the software is really supposed to do.

Alan goes on to state that it would probably take less time to write the software the second time than it took the first. I have lost source code in the past (not a whole project), and rewriting the lost code was indeed faster than writing it the first time, and its design more elegant.

Nowadays losing code should not be a common occurrence because of all of the infrastructure we put in place for a project, but I'm sure that this simple observation will still resonate with developers.

So, if coding is not where the bulk of the time goes, then what is? Most of the time is spent on product management activities: discovering customer needs and finding ways to realize those needs, not necessarily writing the code. In other words, the time is spent on minimizing the biggest risk in any project: building what the customer doesn't need.

The product owner role is responsible not only for understanding and uncovering customer needs, but also for communicating them to the team (communication is the second biggest risk). This takes the shape of Minimum Marketable Features at the beginning of the project, User Stories during the project, and Acceptance Tests (hopefully with the collaboration of QA staff) closer to the development of the software.

After two blog posts about the importance of the product owner's role, I hope this oft-neglected role is a little more prominent in the reader's mind and, in fact, rises to the same level of attention as the Team and Scrum Master roles.

References:

[1] Lean-Agile Software Development, Alan Shalloway, Guy Beaver, James R. Trott

Tuesday, June 29, 2010

TDD for Operations

Software developers have enjoyed the benefits of Test Driven Development for a long time now. System Operations professionals have not yet been test-infected.

Test Driven Development (TDD) allows developers to refactor and add new features with the confidence that the impact of their changes is restricted to the intended components. System Operations professionals don't always have such a tool and rely on human knowledge to make sure all integrated systems will behave the same after a change like an OS upgrade.

In software development, teams often create more code to test the executable code. What could be used in the System Operations case? Monitors!

Monitors are a nice analogy to the red/green way of writing code. Instead of writing a test that doesn't pass, creating the code, and then seeing the test pass, operations professionals create a set of monitors that alert until a certain component is installed.

For example, before installing a new web application, a monitor is created to watch whether the web server is up. This monitor would alert until a web server is actually put in place, listening on the desired domain and port. Once that is done, another monitor would be created for, say, the number of database connections in the pool, and so on.
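Here is a minimal sketch of such a "red until installed" monitor, using only the Python standard library; the URL is a placeholder for the real domain and port:

```python
import urllib.request
import urllib.error

# Placeholder target -- replace with the real domain/port for your web server.
URL = "http://example.internal:8080/healthcheck"

def web_server_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the web server answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False

if not web_server_is_up(URL):
    # Before the application is installed this monitor stays "red" (alerting),
    # just like a failing test written before the code exists.
    print(f"ALERT: no web server answering at {URL}")
else:
    print("OK: web server is up")
```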

This approach allows for more frequent changes to infrastructure. If there's a solid deployment process with easy roll back of failed changes, software modifications can be pushed to production at any time at a low risk (Continuous Deployment).

Testing the application constantly in pre-production environments will ensure there are few to no bugs in the software; however, it doesn't ensure configuration issues are not present once it is moved to other environments. An option to mitigate this risk is to run a complete regression test suite against all environments.

There are tools, such as HP SiteScope, which can effectively use functional tests as transactional monitors. Transactional monitors based on functional tests are great, but they won't provide the more granular results that individual monitors do. As with regular functional tests, these monitors are great for detecting an issue; however, they don't help pinpoint the root cause quickly. If using functional monitors, make sure to include execution times, so the monitors go off if the system degrades beyond agreed service levels.
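A sketch of folding execution time into a functional monitor follows; the endpoint and the two-second service level are invented for illustration:

```python
import time
import urllib.request

# Invented endpoint and service level, for illustration only.
URL = "http://example.internal:8080/login"
MAX_SECONDS = 2.0

def timed_check(url: str, limit: float) -> None:
    """Run a functional check and alert if it fails or exceeds the service level."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=limit * 5) as response:
        body = response.read()
    elapsed = time.monotonic() - start

    if response.status != 200 or not body:
        print(f"ALERT: functional failure at {url}")
    elif elapsed > limit:
        # The feature still works, but slower than the agreed service level.
        print(f"ALERT: {url} took {elapsed:.2f}s (limit {limit}s)")
    else:
        print(f"OK: {url} responded in {elapsed:.2f}s")

timed_check(URL, MAX_SECONDS)
```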

The automation effort has slowly moved from development to QA. It is time for it to infect the operations teams as well. These teams will greatly benefit from deployment automation and integrated monitoring.

Tuesday, June 1, 2010

Component Teams as Open Source Projects

I share Mike Cohn's principle that component teams are not good and should be avoided [1]. One way I've been considering lately to avoid component teams is to create what I call private open source projects.

Component teams are attractive to software developers: they make the developers feel that their component is a software product. This sentiment would be a good thing if it weren't for the fact that the company is not in the software component business. Such an arrangement may lead to feature bloat and a lack of focus on the company's core business.

The private open source model has all the same benefits as the public one; the main difference is the size of the community. In a private model the community would be restricted to members of the company.

The private open source model would require internally the same type of collaboration infrastructure that public open source projects create externally. The component team could then be reduced to a few part-time committers, selected either by meritocracy or by management appointment.

The project that needs a new feature added to the component would either provide the feature itself, or fund a team to do it (Just like the commercial open source model). The committers would then review the proposed changes and commit them to the shared code base.

The private open source model ensures that features being added to the component are relevant to the business and not just developer-favored features.

Even when a component team exists, allowing others to contribute might be good. The next time the component team needs to expand, for example, it might consider hiring a contributor outside of the team.

Let's look at the downsides of such a model.

The committers might have a different idea of the component's technical direction than the rest of the community. This conflict could result in fragmentation of the community and forks. In the past, external communities experienced forks that eventually got resolved (e.g., the XFree86 project). In an internal community this possible fracture could mean that a component is no longer shared, as the community might not be big enough to sustain two projects. Hopefully, due to the homogeneity of the community, the fragmentation risk will be low.

There are times when the crowd is wrong in vetoing a specific idea [2]. The worst case would be a disruptive innovation being discarded because the community doesn't understand it well. A mitigating approach is to have an incubator area, as the Apache Foundation and others do.

There might not be a community large enough to sustain a successful open source project. In some cases the component in question might be shared across multiple projects, but not multiple teams.

As with public open source projects, a complex code base might serve as an inhibitor to contributions. Another common complaint about public open source projects is the lack of documentation. A private one might suffer from the same affliction.

Sharing a reusable component is important and we should strive to do so. Just be careful with the creation of component teams as they might be hard to undo.

[1] http://blog.mountaingoatsoftware.com/nine-questions-to-assess-team-structure

[2] http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds

Monday, May 3, 2010

The Most Important Role in Scrum: The Product Owner

The product owner has a lot of responsibilities; one of them is to address three of the five generally cited levels of planning. He or she is responsible for the Vision, the Roadmap, and Release Planning.

"The vision describes why a project is being undertaken and what the desired end state is (Schwaber 2004, p. 68 [2])." Without the vision, projects drift from release to release never fully achieving any significant ROI, and eventually being cancelled. I find it to be a smell when the team can't describe why a project is being undertaken.

Much has been written about the vision, and you can find more in this great article by Roman Pichler [1].

A product owner is also responsible for the product roadmap. This important artifact lists which high-level features will be available in each release. The roadmap also creates a cadence for customers, telling them how often to expect a new release and what will be in it (if/when it is made public). Knowledge of the roadmap gives customers a sense of security and can lead to better acquisition and retention rates.

The Release Plan lists all the minimum marketable features at a lower level of detail than the roadmap. The product owner balances the content of a release against the time it takes to produce it. Release externally too often and you might fail to generate excitement about it; release too rarely and you risk losing your customer base to your competition.

A product owner's job doesn't end at release planning; he or she will have to help the team break down the features into smaller stories so they can be estimated, define success criteria, and so on. In the projects I have seen where development teams were most productive (hyper-productive?), the success definition or acceptance criteria were so clear that the team was able to estimate the stories with accuracy. The design and testing were simple, and the number of defects low. It is important to note, though, that this type of backlog grooming sometimes requires the whole team; the product owner should be able to rely on other team members to help.

In summary, the product owner is the role that can make or break a product or project, and a team as a consequence. The product owner is not only responsible for making sure that the team is producing high ROI, but also instrumental in helping it achieve hyper-productivity.


 

References

[1] http://www.scrumalliance.org/articles/115-the-product-vision

[2] Schwaber, Ken. Agile Project Management with Scrum. Microsoft Press. 2004.

Pragmatic's Product Management Framework http://www.scribd.com/doc/28560062/null

http://www.scrumalliance.org/articles?tag=product+owner

Sunday, April 4, 2010

Quality is More than Absence of Defects

For years I insisted on automated unit tests, with the naive assumption that if you take care of the pennies, the dollars will take care of themselves. I even watched unit test coverage numbers closely to make sure they were high enough, but still, the perceived quality of the software was not good. It turned out that "I was penny-wise and pound-foolish."

I couldn't understand what the people complaining about our quality were talking about!

  • Our code coverage was above 85%! A test code audit ensured that good assertions were actually in place.
  • Static analysis tools we were using didn’t show anything important.
  • The number of open bugs in the QA system was below 100, and a number of them were deemed unimportant by the client.
  • Iteration demos to the user were successful with no serious issues detected.
  • There were no broken builds in our continuous integration process.
  • Quantitative analysis (cyclomatic complexity, Crap4j, etc.) pointed to code that was very good.

In other words, all of the standard industry metrics for quality signaled that the code was good.

As we started to ask more questions about why certain groups thought our quality was low, we discovered a few issues.

Our unit tests ran outside the container, so in some cases, when the application ran inside the container, it would not start up. When it did start up, links were not available, or clicking on them would take you to a page full of display errors, and so on. The unit tests were indeed verifying a lot of the functionality, but they were not verifying the UI logic or configuration settings; in our case, we were stopping at the controller level.

A second problem we uncovered was that demos were being done on individual developer machines and not on an "official server." Teams would spend a significant amount of time preparing for the demo, configuring the application on the individual machine by hand, and sometimes not even using a build out of the continuous integration process.

Another common complaint was that the application would not run the first time it was deployed to the test environment, and would require developers to get involved in troubleshooting and re-configuring the environment.

The quality issue was not related to software defects but to an unpolished product. Complaints didn't necessarily come from the end user, but from internal stakeholders.

To solve the problems, we changed our definition of done to require in-container manual tests in addition to unit tests. This resolved a lot of the small issues that were being found by QA.

One other important improvement we are now undertaking is to fully automate our deployment. We expect that, in the same way continuous integration improved the build process, this new process will improve our deployment scripts and instructions.

With the deployment process automated, the next step is to at least automate a few smoke tests. This should allow us to quickly identify if the deployment succeeded or not and what problems we might have.
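Here is a minimal sketch of such a post-deployment smoke test suite; the URLs and expected strings are placeholders for whatever your application actually serves:

```python
import sys
import urllib.request

# Placeholder checks: (url, text that must appear in the response body).
SMOKE_CHECKS = [
    ("http://test.example.internal/app/", "Welcome"),
    ("http://test.example.internal/app/login", "Username"),
]

failures = 0
for url, expected in SMOKE_CHECKS:
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
        if expected not in body:
            print(f"FAIL: {url} did not contain '{expected}'")
            failures += 1
        else:
            print(f"PASS: {url}")
    except OSError as exc:
        print(f"FAIL: {url} unreachable ({exc})")
        failures += 1

# A non-zero exit code lets the deployment pipeline flag a failed deployment.
sys.exit(1 if failures else 0)
```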

What did we learn from this?

  • Question/listen to all of your customers, including team members (Product Owner, Scrum Master and team) and some non-team members (e.g. Operations). Not all team members are comfortable speaking up during retrospectives.
  • Quality is a complete package, not only the absence of bugs. Your deployment process is part of the application as well.
  • Automated unit tests are not enough, even if they are not purely unit tests and bleed into the functional and integration test realm.
  • Question your definition of done. Is it complete enough?

Unit tests are a great tool; however, passing them should not be the extent of your definition of done. Keep your eyes on acceptance tests, load tests, and others; they will help you avoid other possible issues besides bugs. And above all, listen carefully to all stakeholders in your project; having them in the retrospective doesn't mean that they are speaking their minds.
