Appropriate Checkin Comments

I read this post today with a list of funny checkin comments today. Some of them are funny simply because of the lacking description. Here are some comments I’ve seen in my personal experience:

  • many small changes
  • Microsoft IE sucks!
  • cleanup
  • oops
  • fix the bug

Worse, I’ve seen entirely empty changeset comments.

The above lists, along with those found on the funny checkin comments page provide some examples of inappropriate commit comments. Why? They are unprofessional and lacking in detail and meaning. Some projects are audited and reviewed by external third parties. As a project manager or architect, would you be embarrassed for  an auditor to see the comment “fix sucky code?” I would. Even worse than being embarrassed, there is a productivity problem that can arise from poor checkin comments.

What is an appropriate comment? An appropriate comment must (minimally) have a few things:

1. Appropriate level of detail about the change, including why the change was made, what impact there may be, etc.
2. Appropriate to the changeset. Along with this, a single changeset should, as much as possible, reflect a single ticket or change. Many lazy developers check in a large set of code with a number of intertwined changes that are unrelated. When it comes time to revert a particular change or track a defect this creates problems, and ultimately it defeats one major purpose of version control.
3. Details about the completeness of the change. Generally a changeset should complete a ticket or work item, but this is not always the case. If there is remaining work to be done, “TODO” items or further functional requirements that impact the changeset, this should be noted.
4. Finally, perhaps the most important part, the checkin comment should refer to a ticket. Not all changesets have tickets written, sure, but in general, if the ticket is a defect, enhancement or requirement implementation, there should be one or more tickets that are related. Any modern version control and ticket system will be able to tie these together.

One developer writes:

Many developers are sloppy about commenting their changes, and some may feel that commit messages are not needed. Either they consider the changes trivial, or they argue that you can just inspect the revision history to see what was changed. However, the revision history only shows what was actually changed, not what the programmer intended to do, or why the change was made. This can be even more problematic when people don’t do fine-grained commits, but rather submit a week’s worth of changes to multiple modules in one large pile. With a fine-grained revision history, comments can be useful to distinguish trivial from non-trivial changes in the repository. In my opinion, if the changes you made are not important enough to comment on, they probably are not worth committing either.

Getting developers to write good checkin comments seems to be an ongoing battle. In the business of writing software, its easy to convince oneself that checkin comments are a waste of time. The real waste of time is later, when trying to track the introduction of a defect or trace requirement implementation to code. There is simply no good excuse for lacking checkin comments.

Is the checkin comment “cleanup” appropriate? Yes, in some cases, as long as its true. If I am cleaning up formatting of code, including things like indents and spacing and correcting whitespace, then yes, “cleanup” is an appropriate changeset comment. Generally, however, real comments are required.

[Vistamix: The Humor of Code Checkin Comments]
[Loop Label: Best Practices for Version Control]

Handling Development in a Medical Device World (Part 8)

CONTINUOUS INTEGRATION BUILDS
To answer the first question, before it is even asked: Yes, a software project should have continuous integration (CI) builds. This goes for projects with a large team as well as projects with a small team. While the CI build is very useful for a large group to collaborate there is tremendous value even for the smallest team. Even a software engineer working alone can benefit from a continuous integration build.

What’s so great about CI builds (using Hudson)?

1. A build can be traced to a Subversion changeset, and therefore to our ticketing system (and
entire process).

Okay, so I’m starting to sound repetitive, but tracing is a good thing, and with this setup I can trace all over the place! With all of the process in place, from our standard operating procedures to work instructions to use cases and requirements to tickets to changests, how do we know that the team members working on a project are actually following procedure? The CI environment, at least with regard to the ongoing development of code, gives us a single point of overview for all other activities. From here we can see changesets, tickets, build status and test coverage. Using the proper add-ons, we can even gain insight into the quality of the code that is being developed.

To pull this of, of course, we need to use our ticketing system wisely. With Redmine, and just about any good ticketing system, we can capture elements of software requirements and software design as parent tickets. These parent tickets have one to many sub-tickets, which themselves can have sub-tickets. A parent ticket may not be closed or marked complete until a child ticket is completed.

2. Immediate feedback

At its most basic level, Hudson only does one thing: It runs whatever scripts we tell it to. The power of Hudson is that we can tell it to run whatever we want, log the outcome, keep build artifacts, run third party evaluation tools and report on results. With Subversion integration, Hudson will display the changeset relevant to a particular build. It can be configured to generate a build at whatever interval we want (nightly, hourly, every time there is a code commit, etc.).

Personally, every time I do any code commit of significance, one of the first things I do is check the CI build for success. If I’ve broken the build I get to work on correcting the problem (and if I cannot correct the problem quickly, I roll my changeset out so that the CI build continues to work until I’ve fixed the issue).

Hudson can be configured to email team members on build results, and it may be useful to set it up so that developers are emailed should a build break.

3.  A central location for builds

The build artifacts of your project, in general, should not be a part of the repository (there are exceptions to this rule). Build artifacts belong in your continue integration build tool, where they can be viewed and used by anyone on the team who needs them. These artifacts include test results, compiled libraries and executables. Too often a released build is created locally on some developer’s machine. This is a serious problem, because we have no good way of knowing what files were actually used to create that build. Was there a configuration change? A bug introduced? An incorrect file version?

While developers have good reason to generate and test build locally, a formal build used for testing (or, more importantly, release) much never be created locally. Never.

Build artifacts are not checked in to the source control repository for a number of reasons, but the biggest reason is because we never want to make assumptions about the environment in which those items were build. The build artifacts should instead remain in our CI environment, where we know the conditions within which the build was generated.

Also, because these builds remain easily accessible and labeled in the CI build environment, any team member can easily access any given build. In particular, it may become necessary to use a specific build to recreate an issue in a build that has been released for internal or external use. Because we know the label of the build (the version number given to it) as well as the repository changeset number of the build (because our build and install scripts include it), we know precisely which build to pull from the CI build server to recreate the necessary conditions.

4. Unit tests are run over and over (and over)

Developers should do whatever they can to keep the CI build from breaking. Of course, this doesn’t always work well. I’ve broken the CI build countless times for a number of reasons:
- I forgot to add a necessary library or new file
- I forgot to commit a configuration change
- I accidentally included a change in my changeset
- It worked locally but not when built on the CI build server (this is why the CI build server should, as much as possible, mimic the production environment)
- Unit tests worked locally but no on the CI build server because of some environmental difference

If not for a CI build environment with good unit tests, such problems would only be discovered at a later time or by some other frustrated team member. In this regard, the CI build saves us from many headaches.
The most difficult software defects to fix (much less, find) are the ones that do not happen consistently. Database locking issues, memory issues and race conditions can result in such defects. These defects are serious, but if we never detect them how can we fix them?

It’s a good idea to have unit tests that go above and beyond what we traditionally think of as “unit tests,” and go several steps further, automating functional testing). This is another one of those areas where team members often (incorrectly) feel that there is not sufficient time to do all the work.

With Hudson running all of these tests ever time a build is performed we sometimes encounter situations where a test fails suddenly for no apparent reason. It worked before, an hour ago, and there has not been any change to the code, so what did it fail this time? Pay attention, because this WILL happen.

5. Integration with other nice-to-haves, such as Findbugs, PMD and Cobertura

Without going into too much detail, there are great tools out there that can be used with Hudson to evaluate the code for potential bugs, bad coding practices and testing coverage. These tools definitely come in handy. Use them.

[Hudson]

[TestNG]

Handling Development in a Medical Device World (Part 3)

ONE APPROACH
After that long introduction, my desire is to write about one approach to handling design and development activities of a software project. This is an approach that works well for any software project, not just one in which a 510K is being pursued. Given the fact that the biggest needs are for good communication and tracing, perhaps we shouldn’t be surprised that the open source community has taught us a thing or two about team collaboration and communication on software project. It is this community that has learned a great deal about pulling together work from a community across the global to create high quality software.

When it comes to a regulated project we are very concerned with traceability. We want to know that use cases, business rules, requirements, documents, code revisions, hazards and tests can be traced backward and forward. Despite having some excellent technology available (for free!), many companies insist on manual documentation of such tracing. This creates a mess of documentation that is difficult (if not impossible) to maintain and likely to contain errors. You’re just begging an auditor to find a problem! If it is your full time job to edit and verify these documents, you may want to stop reading now.

Here goes: You don’t need to hire someone to work full time to shuffle through this mess of documentation! Sure, you may need a technical writer, but there is plenty of technology available to make this aspect of design and development easy. There are tools out there already that, when used properly, can make software project management from the highest level to the most detailed tracing downright fun.

WHY ISN’T EVERYONE DOING IT THIS WAY THEN?

(Well, in fact, many are…)

  1. There is a perception that using 3rd party tools is difficult within the confines of the CFR.
  2. “But how do we validate all this software?” you ask, “The CFR requires us to validate all the tools we use.” Show me where. What the CFR states is that we must show intended usage of our software and show that the tools we use work the way we think they should work for our needs. Simple.
  3. The CFR doesn’t exist so that we may engineer software poorly. Nor does it exist to tie our hands from using the best new tools available. It exists to tell us how to do it right.
  4. Open source software cannot be trusted.
  5. This objection can only be addressed in a separate article! All I can say to this is that we have learned repeatedly that widely-used open source software is often better and just as well supported as its closed-source counterpart. One need only look as far as GNU for proof (www.gnu.org).
  6. Technologies change. How do I know Subversion will be used in 10 years?
  7. You don’t, and it may not be. But for everyone out there using other legacy systems (Visual Sourcesafe anyone?), we can’t let this be a concern. The technology may change, but our servers can be maintained and archived for as long as necessary.
  8. We don’t have the time for such overhead or IT support.

At the risk of sounding cliché, you don’t have time to not do this. A little setup and thought up front streamlines the process and mitigates the risk of serious problems down the road. If you ever find yourself wondering how to update tracing long after the completion of a software requirement, document change or test, your procedures have failed you. And if you don’t have a practical approach to applying your procedures, this WILL happen at some point in the project. Sure, there is some upfront cost to be considered, but it comes with greater efficiency throughout the project lifecycle.
THE TOOLS

When it comes to a setup such as the one I am about to describe, there is no “one size fits all” approach. The tools used must be evaluated with consideration to the environment, project and corporate needs. So while the tools I am about to list will no doubt work for a wide range of needs, there are many other tools that are worthy alternatives (Trac, Git, Mercurial, etc.).
The main tools in my example are:

  • Subversion – http://subversion.tigris.org/ – Version control (of everything!)
  • Redmine – http://www.redmine.org/ – Ticketing and issue tracking system
  • Hudson – http://hudson-ci.org/ – Continuous Integration builds
  • Other tools that I will mention include:
  • TortoiseSVN – http://tortoisesvn.tigris.org/
  • PMD – http://pmd.sourceforge.net/
  • Cobertura – http://cobertura.sourceforge.net/
  • Findbugs – http://findbugs.sourceforge.net/

The tools I’ve listed above are ones that I am very familiar with and like. They serve the needs of a software project (regulated or non-regulated) very well when integrated, they are widely used and supported and they integrate very nicely. Best of all, being open-source, they are all free (and written by some of the best software developers in the world). For use in the typical corporate environment, each tool can use LDAP authentication. As far as environment restrictions go, each of the tools can be run on Windows, Linux, Solaris and MAC OS X.