Tuesday, June 29, 2010

TDD for Operations

Software developers have enjoyed the benefits of Test Driven Development for a long time now. System Operations professionals have not yet become test-infected.

Test Driven Development (TDD) allows developers to refactor and add new features with confidence that the impact of their changes is restricted to the intended components. System Operations professionals don't always have such a tool and instead rely on human knowledge to make sure all integrated systems will behave the same after a change like an OS upgrade.

In software development, teams often create additional code to test the executable code. What could play that role in the System Operations case? Monitors!

Monitors are a nice analogy to the red/green way of writing code. Instead of writing a test that doesn't pass, writing the code, and then seeing the test pass, operations professionals create a set of monitors which alert until a certain component is installed.

For example, before installing a new web application, a monitor is created to watch whether the web server is up. This monitor would alert until a web server is actually put in place, listening on the desired domain and port. Once that passes, another monitor would be created, say for the number of database connections in the pool, and so on.
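As a minimal sketch of that red/green cycle, such a liveness monitor could be as small as a TCP check (the hostname and port below are placeholders, not from any real deployment):

```python
import socket

def web_server_is_up(host, port, timeout=5.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Red: nothing is deployed yet, so the monitor alerts on every run.
# Green: once the web server is installed and listening, the alert clears.
if not web_server_is_up("app.example.com", 443):
    print("ALERT: web server is not listening on app.example.com:443")
```

A real monitoring system would run this on a schedule and route the alert, but the red-then-green progression is the same.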

This approach allows for more frequent changes to infrastructure. If there's a solid deployment process with easy roll back of failed changes, software modifications can be pushed to production at any time at a low risk (Continuous Deployment).

Testing the application constantly in pre-production environments will ensure there are few to no bugs in the software; however, it doesn't ensure configuration issues are not present once it is moved to other environments. An option to mitigate this risk is to run a complete regression test suite against all environments.
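One way to sketch that mitigation is to run the same checks against every environment and report failures per environment, so configuration drift becomes visible. The environment names and the pool-size check below are invented for illustration:

```python
def run_suite(environments, checks):
    """Run every regression check against every environment,
    collecting (environment, check, message) for each failure."""
    failures = []
    for env_name, config in environments.items():
        for check in checks:
            try:
                check(config)
            except AssertionError as exc:
                failures.append((env_name, check.__name__, str(exc)))
    return failures

# Hypothetical environment configs; real ones would come from deployment data.
environments = {
    "staging":    {"db_pool_size": 20},
    "production": {"db_pool_size": 5},  # deliberately misconfigured
}

def check_db_pool(config):
    assert config["db_pool_size"] >= 10, "connection pool too small"

for failure in run_suite(environments, [check_db_pool]):
    print("FAILED:", failure)
```

Running the whole regression suite this way flags the one environment whose configuration drifted, rather than reporting a single pass/fail for the release.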

There are tools, such as HP SiteScope, which can effectively use functional tests as transactional monitors. Transactional monitors based on functional tests are great, but they won't provide the more granular results an individual monitor does. As with regular functional tests, these monitors are great for detecting an issue; however, they don't help pinpoint the root cause quickly. If using functional monitors, make sure to include execution times, so the monitors go off when the system degrades beyond the agreed service levels.
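A minimal sketch of that idea, assuming a monitor is just a callable: wrap the functional check so it also fails when execution time exceeds the agreed service level. The checkout transaction and the SLA value are made up for illustration:

```python
import time

def with_sla(check, sla_seconds):
    """Wrap a functional check so it also fails when it runs slower
    than the agreed service level, not only when it returns an error."""
    def monitor(*args, **kwargs):
        start = time.monotonic()
        result = check(*args, **kwargs)
        elapsed = time.monotonic() - start
        if elapsed > sla_seconds:
            raise RuntimeError(
                f"{check.__name__} took {elapsed:.2f}s; SLA is {sla_seconds}s"
            )
        return result
    return monitor

# Hypothetical checkout transaction used as a functional check.
def checkout_flow():
    time.sleep(0.01)  # stands in for driving the real transaction
    return "ok"

monitor = with_sla(checkout_flow, sla_seconds=2.0)
print(monitor())  # passes: functionally correct and within the SLA
```

The same wrapped check then alerts on slow degradation as well as outright failure, which a plain pass/fail functional monitor would miss.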

The automation effort has slowly moved from development to QA. It is time for it to infect the operations teams as well. These teams will greatly benefit from deployment automation and integrated monitoring.

Tuesday, June 1, 2010

Component Teams as Open Source Projects

I share Mike Cohn's principle that component teams are undesirable and should be avoided [1]. One way I've been considering lately to avoid them is to create what I call private open source projects.

Component teams are attractive to software developers: they make developers feel that their component is a software product in its own right. That sentiment would be fine if the company were actually in the software component business. Since it usually isn't, such an arrangement may lead to feature bloat and a lack of focus on the company's core business.

The private open source model has all the same benefits of the public one, the main difference being the size of the community. In a private model the community would be restricted to members of the company.

The private open source model would require internally the same kind of collaboration infrastructure that public open source projects build externally. The component team could then be reduced to a few part-time committers, selected either by meritocracy or by management appointment.

The project that needs a new feature added to the component would either build the feature itself or fund a team to do it (just like the commercial open source model). The committers would then review the proposed changes and commit them to the shared code base.

The private open source model ensures that features added to the component are relevant to the business rather than developer-favored pet features.

Even when a component team exists, allowing others to contribute might be good. The next time the component team needs to expand, for example, it might consider hiring a contributor outside of the team.

Let's look at the downsides of such a model.

The committers might have a different idea of the component's technical direction than the rest of the community. This conflict could result in fragmentation of the community and in forks. In the past, external communities have experienced forks that were eventually resolved (e.g., the XFree86 project). In an internal community, such a fracture could mean that a component is no longer shared, since the community might not be big enough to sustain two projects. Hopefully, given the homogeneity of the community, the fragmentation risk will be low.

There are times when the crowd is wrong to veto a specific idea [2]. The worst case would be a disruptive innovation being discarded because the community doesn't understand it well. A mitigating approach is to have an incubator area, as the Apache Software Foundation and others do.

There might not be a community large enough to sustain a successful open source project. In some cases the component in question might be shared across multiple projects, but not multiple teams.

As with public open source projects, a complex code base might serve as an inhibitor to contributions. Another common complaint about public open source projects is the lack of documentation. A private one might suffer from the same affliction.

Sharing a reusable component is important and we should strive to do so. Just be careful with the creation of component teams as they might be hard to undo.

[1] http://blog.mountaingoatsoftware.com/nine-questions-to-assess-team-structure

[2] http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds