Valuable Unit Tests in a Software Medical Device, Part 6

Many of the benefits of functional unit testing listed above are gained only when unit tests are written alongside design and development (test-driven methodologies aside). It is imperative that the development team develop and observe test results while design and development activities take place. This is of benefit to the quality assurance team as well, as Dean Leffingwell notes:

A comprehensive unit test strategy prevents QA and test personnel from spending most of their time finding and reporting on code-level bugs and allows the team to move its focus to more system-level testing challenges. Indeed, for many agile teams, the addition of a comprehensive unit test strategy is a key pivot point in their move toward true agility—and one that delivers “best bang for the buck” in determining overall system quality.

Also, it is probably becoming clear that a key benefit of functional unit tests is the real-time feedback offered to the development team. One author refers to the unit tests that are executed with each software change as “commit tests” [1].

Commit tests that run against every check-in provide us with timely feedback on problems with the latest build and on bugs in our application in the small. [8]

-Jez Humble, David Farley, Continuous Delivery

Project unit tests, which should offer a significant amount of coverage (at least 80%), provide the team with built-in acceptance criteria for each committed software change. If a developer causes the CI build to fail because of a code change, it is immediately known that the change involved does not meet the minimum acceptance criteria and requires urgent attention.

Humble and Farley continue,

Crucially, the development team must respond immediately to acceptance test breakages that occur as part of the normal development process. They must decide if the breakage is a result of a regression that has been introduced, an intentional change in the behavior of the application, or a problem with the test. Then they must take appropriate action to get the automated acceptance test suite passing again. [8]

-Jez Humble, David Farley, Continuous Delivery

Continuous Delivery, Jez Humble, David Farley. Addison-Wesley, Copyright © 2011, Pearson Education, Inc. Boston, MA
Agile Software Requirements, Dean Leffingwell. Addison-Wesley. Copyright © 2011, Pearson Education, Inc. Boston, MA

Valuable Unit Tests in a Software Medical Device, Part 5

What is the value of unit testing?

Immediate Feedback within Continuous Integration: Developer Confidence

Too often we view testing as an activity that occurs only at specific times during software development. At worst, software testing takes place upon completion of development (which is when it is learned that development is nowhere near complete). In other, more zealous environments, it may take place at the end of each iteration. We can do better! How about having complex unit tests perform validation continuously, with each code change? It is possible to perform full regression tests with every single code change. It sounds like a significant amount of overhead, but it is not. The real cost to a project does not come from attention to complex functional unit tests; the danger is that we put off testing until it is too late to react to a critical issue discovered during some predetermined testing phase.

The most effective way of killing a project is to organize it so that testing becomes an activity that is so critical to its success that we do not allow for the possibility that testing can do what it is supposed to do: Discover a defect prior to go-live.

At its most basic level, a continuous integration build environment does just one thing: It runs whatever scripts we tell it to. To that end, it is important that the CI build execute unit tests, and that a failure of any single unit test is considered a failure of the continuous integration build. The power of a tool such as Hudson or Jenkins-CI is that we can tell it to run whatever we want, log the outcome, keep build artifacts, run third-party evaluation tools and report on results. With integration of our software version control system (e.g., Subversion, Git, Mercurial, CVS, etc.), we know the changeset relevant to a particular build. The CI tool can be configured to generate a build at whatever interval we want (nightly, hourly, every time there is a code commit, etc.). When a test fails we know immediately what changeset was involved.
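
As a rough illustration of the rule that any single failed test fails the build, here is a minimal stand-alone Java sketch. It does not use TestNG or any CI tool's API; the names CommitGate, Check and runAll are hypothetical and exist only for this example. The point is simply that the CI server sees nothing more than the exit code of the script it was told to run.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a tiny runner that turns test failures into a failing exit code.
class CommitGate {

    /** A check signals failure by throwing an exception or an AssertionError. */
    interface Check {
        void run() throws Exception;
    }

    /** Runs every check, logs each outcome, and returns the number of failures. */
    static int runAll(Map<String, Check> checks) {
        int failures = 0;
        for (Map.Entry<String, Check> entry : checks.entrySet()) {
            try {
                entry.getValue().run();
                System.out.println("PASS " + entry.getKey());
            } catch (Exception | AssertionError e) {
                failures++;
                System.out.println("FAIL " + entry.getKey() + ": " + e.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, Check> checks = new LinkedHashMap<>();
        checks.put("additionWorks", () -> {
            if (1 + 2 != 3) throw new AssertionError("1 + 2 should be 3");
        });

        // A non-zero exit code is what a CI server would treat as a broken build.
        if (runAll(checks) > 0) {
            System.exit(1);
        }
    }
}
```

In a real project the test framework and build tool would normally take care of this; the sketch only makes the contract between the test run and the CI server explicit.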

Personally, every time I do any code commit of significance, one of the first things I do is check the CI build for success. If I’ve broken the build I get to work on correcting the problem (and if I cannot correct the problem quickly, I roll my changeset out so that the CI build continues to work until I’ve fixed the issue).

Easy Refactoring

For a developer, refactoring can be a scary thing. Refactoring is perhaps the most effective way of introducing a serious defect while doing something that seems innocuous. With thorough unit tests performing a full regression test with each and every committed software changeset, however, a developer can have confidence that his or her simple code changes have not introduced a defect. We have continuous integration builds running our tests for many reasons, not the least of which is to alert developers to the possibility that their changes have broken the build.

As a developer I strive to avoid breaking the continuous integration build. When I do break it, however, I am very pleased to know that what was done to cause a problem has been discovered immediately. Correction of a defect becomes much more costly when the defect is not discovered until the end of a development phase!

Regression Tests with Every Code Change

By “repeated” I mean something different than repeatable. The fundamental benefit with repeated tests is the fact that a test can be executed many more times by automation than by a human tester. Sometimes, even without a related code change, and much to our surprise, we see a test suddenly fail where it succeeded numerous times before. What happened?

The most difficult software defects to find (much less fix) are the ones that do not happen consistently. Database locking issues, deadlocks, memory leaks and race conditions can result in such defects. These defects are serious, but if we never detect them how can we fix them?

As stated previously, it is imperative that we have unit tests that go above and beyond what we traditionally think of as “unit tests” and go several steps further, automating functional testing. This is another one of those areas where team members often (incorrectly) feel that there is not sufficient time to deal with the creation of unit tests. Given a proper framework, however, creation of unit tests need not be overwhelming.

Another occasional issue has to do with misuse of the software version control system. Many developers know the frustration that can come with an accidental code change resulting from one developer stepping over the modifications of another. While this is a rare issue in a properly used version control environment, it does still happen, and unit tests can quickly reveal such a problem at build time.

Concurrency Tests

Concurrency tests are tricky, and it is in concurrency testing that the repeated and rapid nature of functional unit tests can shine where human testers cannot. I personally have witnessed many occasions in which a CI build suddenly fails for no obvious reason. There was no code commit related to the particular point of failure, and yet a unit test that once succeeded suddenly fails. Why?

This can happen (and it does happen) because concurrency problems, by their very nature, are hit or miss. Sometimes they are so unlikely to occur that we never witness them during the course of normal testing. When a continuous integration environment runs concurrency tests dozens of times a day, however, we increase the likelihood of finding a hidden and menacing problem. Additionally, unit tests can simulate many concurrent users and processes in a way that even a team of human testers cannot.
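
To make the idea of simulating many concurrent users a little more concrete, here is a minimal sketch in plain Java. All of the names (ConcurrencySketch, Counter, UnsafeCounter, SafeCounter, hammer) are hypothetical and chosen only for this example. It runs several simulated users against a shared counter and reports how many updates survived. The unsynchronized counter may or may not lose updates on any given run, which is exactly the hit-or-miss behavior described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a concurrency test; not tied to any test framework.
class ConcurrencySketch {

    interface Counter {
        void increment();
        int value();
    }

    /** Not thread-safe: count++ is a read-modify-write, so updates can be lost. */
    static class UnsafeCounter implements Counter {
        private int count;
        public void increment() { count++; }
        public int value() { return count; }
    }

    /** Thread-safe variant backed by an AtomicInteger. */
    static class SafeCounter implements Counter {
        private final AtomicInteger count = new AtomicInteger();
        public void increment() { count.incrementAndGet(); }
        public int value() { return count.get(); }
    }

    /** Simulates concurrent users, each calling increment() callsPerUser times. */
    static int hammer(Counter counter, int users, int callsPerUser) {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                tasks.add(() -> {
                    for (int i = 0; i < callsPerUser; i++) {
                        counter.increment();
                    }
                    return null;
                });
            }
            // invokeAll blocks until every simulated user has finished.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("concurrent run did not complete", e);
        } finally {
            pool.shutdown();
        }
        return counter.value();
    }

    public static void main(String[] args) {
        int expected = 8 * 10_000;
        System.out.println("safe:   " + hammer(new SafeCounter(), 8, 10_000) + " of " + expected);
        // The unsafe result can differ from run to run; that is the point.
        System.out.println("unsafe: " + hammer(new UnsafeCounter(), 8, 10_000) + " of " + expected);
    }
}
```

A test built on this idea would assert that the final value equals users × callsPerUser. Running it on every CI build gives an intermittent defect many more chances to show itself. In TestNG, the invocationCount and threadPoolSize attributes of @Test can serve a similar purpose.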

Repeatable and Traceable Test Results

This is the key to making our unit tests adhere to the standards we have set forth in our quality system so that we may use them as a part of our submission (see the following section on Regulated Environment Needs). If we are going to put forth the effort, and since we already know that unit tests result in a quality improvement to our software, why wouldn’t we want to include these test results?

Our continuous integration server can and should be used to store our unit test results right alongside each and every build that it performs.

This is a benefit not only in the world of an FDA-regulated environment, of course. In any software project it can be difficult to recreate conditions under which a defect was discovered. With a CI build executing our build and test scripts under a known environment with a known set of files (the CI build tool pulls from the version control system), it is possible to execute the tests under exact and specific circumstances.

Valuable Unit Tests in a Software Medical Device, Part 4

“The hardware system, software program, and general quality assurance system controls discussed below are essential in the automated manufacture of medical devices. The systematic validation of software and associated equipment will assure compliance with the QS regulation; and reduce confusion, increase employee morale, reduce costs, and improve quality. Further, proper validation will smooth the integration of automated production and quality assurance equipment into manufacturing operations.

Medical devices and the manufacturing processes used to produce them vary from the simple to the very complex. Thus, the QS regulation needs to be and is a flexible quality system. This flexibility is valuable as more device manufacturers move to automated production, test/inspection, and record-keeping systems.”

-Device Advice: Regulation and Guidance, Software Validation Guidelines
(http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance)

What is a GOOD Unit Test?

In his book Safe and Sound Software, Thomas H. Farris describes unit tests:

Software testing may occur on software modules or units as they are completed. Unit testing is effective for testing units as they are completed, when other units or components have not yet been completed. Testing still remains to be completed to ensure that the application will work as intended when all software units or components are executed together.

This is a start, but unit tests can achieve so much more! Farris goes on to break out a number of different categories of software testing:

    Black box test
    Unit test
    Integration test
    System test
    Load test
    Regression test
    Requirements-based test
    Code-based test
    Risk-based test
    Clinical test

Traditionally this may be a fair breakout. Used wisely, and with the proper framework, however, we can perform black box, integration, system, load, regression, requirements-based, code-based, risk-based and clinical tests with efficient unit tests that simulate a true production environment. The purpose of this article is not to go into the technical details of how (to explain unit test frameworks, fixtures, mock objects and simulations would require much more space). Rather, I simply want to point out the benefits that result. To achieve these benefits, your software team will need to develop a deep understanding of unit tests. It will take some time, but it will be time very well spent.

It’s a good idea to have unit tests that go above and beyond what we traditionally think of as “unit tests” and go several steps further, automating functional testing. This is another one of those areas where team members often (incorrectly) feel that there is not sufficient time to do all the work.

As Farris goes on to state:

Software testing and defect resolution are very time-consuming, often draining more than one-half of all effort undertaken by a software organization [3].

Testing need not wait until the entire product is completed; iteratively designed and developed code may be tested as each iteration of code is completed. Prior to beginning of verification or validation, the project plan or other test plan document should discuss the overall strategy, including types of tests to be performed, specific functional tests to be performed, and a designation of test objectives to determine when the product is sufficiently prepared for release and distribution.

Farris is touching on something that is very important in our FDA-regulated environment: we must document and describe our tests. For our unit tests to be useful we must provide documentation of what each test does (that is, what specifically it is testing) and what the results are. The beauty of unit tests and the tools available (incorporation into our continuous integration environment) is that this process is streamlined in a way that makes the traceability and re-creation of test conditions required for our 510(k) extremely easy!
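
One possible way to document what each test verifies is to attach that information to the test method itself and generate a traceability table from it. The sketch below assumes a hypothetical annotation named Verifies, a stand-in test class named MonitorTests and invented requirement IDs (REQ-101, REQ-102); none of these come from TestNG or from any regulation. It is only an illustration of the idea, not a prescribed format for a submission.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: linking test methods to requirements for traceability.
class TraceabilitySketch {

    /** Hypothetical annotation tying a test method to a requirement and a description. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Verifies {
        String requirement();
        String description();
    }

    /** Stand-in test class; the test bodies are omitted and the IDs are invented. */
    static class MonitorTests {
        @Verifies(requirement = "REQ-101", description = "Alarm is raised when a reading is out of range")
        public void testAlarmRaisedWhenOutOfRange() { }

        @Verifies(requirement = "REQ-102", description = "Readings are stored with a timestamp")
        public void testReadingStoredWithTimestamp() { }
    }

    /** Builds a sorted requirement-to-test table by reading the annotations via reflection. */
    static Map<String, String> traceabilityMatrix(Class<?> testClass) {
        Map<String, String> matrix = new TreeMap<>();
        for (Method method : testClass.getDeclaredMethods()) {
            Verifies verifies = method.getAnnotation(Verifies.class);
            if (verifies != null) {
                matrix.put(verifies.requirement(), method.getName() + ": " + verifies.description());
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        traceabilityMatrix(MonitorTests.class)
            .forEach((requirement, test) -> System.out.println(requirement + " -> " + test));
    }
}
```

A report like this could be generated by the CI build and stored next to the test results, so that each build records which requirement each test covers.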

To achieve all of this we will need to have a testing framework capable of application launch, simulations, mock objects, mock interfaces and temporary data persistence. This all sounds like much more overhead than it actually is, so fear not: The benefits far outweigh the costs.

Valuable Unit Tests in a Software Medical Device, Part 3

Automating Functional Tests Using Unit Test Framework

Most software projects, especially in any kind of Agile environment, undergo frequent changes and refactoring. If the traditional single-flow waterfall model worked, recorded test scripts such as those noted previously would probably work just fine as well, albeit with little benefit.

But it should be well known by now that the traditional single-flow waterfall model has failed, and we live in an iterative/Agile world. As such, our automated tests must be equally equipped for ongoing change. And because the functional unit tests are closely related to requirements at both a white-box and black-box level, developers, not testers, have an integral role in the creation of automated tests.

To achieve this level of unit testing, a test framework must be in place. This requires a bit of up-front effort, and the details of creating such a framework go well beyond the scope of this article. Additionally, the needs of a test framework will vary depending on the project.

Test fixtures become an important part of complex functional unit testing. A test fixture is a class that incorporates all of the setup necessary for running such unit tests. It provides methods that can create common objects (for example, test servers and mock interfaces). The details included in a test fixture are specific to each project, but some common methods include test setup, simulation and mock object creation and destruction, and declaration of any common functionality to be used across unit tests. To provide further detail on test fixture creation would require much more detail than can be provided here.
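
Without going into that level of detail, a very small sketch may still help show the shape of a fixture. Everything here is hypothetical: DeviceLink stands in for some external interface, MockDeviceLink is a hand-written mock, Controller is a placeholder for the code under test, and TestFixture bundles their creation and cleanup. A real fixture would be considerably larger and specific to the project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a test fixture with a hand-written mock interface.
class FixtureSketch {

    /** Hypothetical interface to an external device the application talks to. */
    interface DeviceLink {
        String send(String command);
    }

    /** Mock that records commands and returns a canned reply instead of touching hardware. */
    static class MockDeviceLink implements DeviceLink {
        final List<String> sent = new ArrayList<>();

        public String send(String command) {
            sent.add(command);
            return "ACK:" + command;
        }
    }

    /** Placeholder for the code under test. */
    static class Controller {
        private final DeviceLink link;

        Controller(DeviceLink link) { this.link = link; }

        boolean start() { return link.send("START").startsWith("ACK"); }
    }

    /** Fixture: owns creation and destruction of the shared test environment. */
    static class TestFixture implements AutoCloseable {
        final MockDeviceLink link = new MockDeviceLink();
        final Controller controller = new Controller(link);

        @Override
        public void close() { link.sent.clear(); }
    }

    public static void main(String[] args) {
        // try-with-resources guarantees the fixture is torn down after the test.
        try (TestFixture fixture = new TestFixture()) {
            if (!fixture.controller.start()) throw new AssertionError("controller should start");
            if (!fixture.link.sent.contains("START")) throw new AssertionError("START was not sent");
            System.out.println("fixture-based test passed");
        }
    }
}
```

In a TestNG project, the setup and teardown shown here would more typically live in methods annotated with @BeforeMethod and @AfterMethod, with the fixture shared across test classes.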

Given what may seem like extreme overhead in the creation of complex unit tests, we may begin to question the value. There is, no doubt, a significant up-front cost to the creation of a versatile and useful unit test framework (including a test fixture, which includes all the necessary objects and setup needed to simulate a running environment for the sake of testing). And given the fact that manual functional and user acceptance testing remains a project necessity, it seems like there may be an overlap of effort.

But this is not the case!

With a little up-front work on a solid unit test framework, we can make the creation of unit tests simple. We can even go as far as requiring a unit test for any functional requirement implementation prior to allowing that requirement (or ticket) to be considered complete. Furthermore, as we discover potential functionality problems, we have the opportunity to introduce a new test right then and there.

Valuable Unit Tests in a Software Medical Device, Part 2

While I personally have never been a fan of test-driven development (I feel that the assumptions required by test-driven development do not allow for a true iterative approach), I do believe that creation of unit tests in parallel with development leads to much higher-quality software. In the world of Agile, this means that no functional requirement (or user story) is considered fully implemented without a corresponding unit test. This strict view of unit tests may be a bit extreme, but it is not without merit.

The first unit test a developer ever writes is likely so simple that it is nearly useless. It may go something like this.

Given a method:

public int doSomething(int a, int b) {
    // Add the two inputs and return the sum
    int c = a + b;
    return c;
}

A simple unit test may look something like this:

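import static org.testng.Assert.assertEquals;
import org.testng.annotations.Test;

public class MyUnitTests {

    @Test
    public void testDoSomething() {
        // MyClass is a placeholder for whatever class declares doSomething
        int expectedResult = 3;
        // TestNG's assertEquals takes (actual, expected)
        assertEquals(new MyClass().doSomething(1, 2), expectedResult);
    }
}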

Given a very simple method the developer is able to assert that, essentially, a + b = c. This is easy to write, and there is little overhead involved, but it really isn’t a very useful unit test.

Early Attempts at Automated Functional Testing

Long ago I was involved with a project in which management invested a significant amount of time and training in an attempt to implement automated testing. The chosen tool was Rational Robot™ (now an IBM product). The idea behind tools such as Robot was that a test creator could record test macros, note points of verification and replay the macros later with test results noted. Tools such as Rational Robot and WinRunner attempted to replace the human tester with recorded scripts. These automated scripts could be written using a scripting language or, more commonly, by recording mouse movements, clicks and keyboard actions. In this regard, these tools of test automation allowed black-box testing through a user interface.

In this over-simplified view of automated testing, there were simply too many logistical problems with test implementation to make them practical. Any minor changes to the user interface could result in a broken test script. Those responsible for maintaining these automated scripts often found themselves spending more time maintaining the tests than using them for actual application testing.

Rational Robot and tools like it are alive and well, but I refer to them in the past tense because such tools, in my experience, have proven themselves to be a failure. I say this because I have personally spent significant amounts of time creating automated scripts in such tools, and I have been frustrated to learn later that they would not be used because of the substantial amount of interface code that changes as a project progresses. Such changes are absolutely expected, and yet, a recorded automated test does not lend itself well to an iterative development environment or an ongoing project.

Valuable Unit Tests in a Software Medical Device, Part 1

In computer programming, unit testing is a method by which individual units of source code are tested to determine if they are fit for use. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual function or procedure. In object-oriented programming a unit is usually a method. Unit tests are created by programmers or occasionally by white box testers during the development process.

-Wikipedia

Note: In the world of Java, we have a number of popular options for the implementation of unit tests, with JUnit and TestNG being, arguably, the most popular. Examples provided in this article will use TestNG syntax and annotations.

Traditionally (and by traditionally, I mean in their relatively brief history), unit tests have been thought of as very simple tests to validate basic inputs and outputs of a software method. While this can be true, and such simple tests can be of some value, it is possible to achieve much more with unit tests. In fact, it is not only possible, but recommended that we implement much of our user acceptance, functional and possibly even some non-functional tests within a unit test framework.

To further enhance quality, we can augment the acceptance with unit tests. [6]

-Dean Leffingwell, Agile Software Requirements

  • [1] Device Advice: Regulation and Guidance, Software Validation Guidelines, http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance
  • [2] Safe and Sound Software – Creating an Efficient and Effective Quality System for Software Medical Device Organizations, Thomas H. Farris. ASQ Quality Press, Milwaukee, Wisconsin, 2006, Figure 4.9, “Types of software testing” pg. 120
  • [3] Safe and Sound Software – Creating an Efficient and Effective Quality System for Software Medical Device Organizations, Thomas H. Farris. ASQ Quality Press, Milwaukee, Wisconsin, 2006, pg. 118
  • [4] Safe and Sound Software – Creating an Efficient and Effective Quality System for Software Medical Device Organizations, Thomas H. Farris. ASQ Quality Press, Milwaukee, Wisconsin, 2006, Figure 4.9, “Types of software testing” pg. 118
  • [5] CFR – Code of Federal Regulations Title 21. Subpart C – Design Controls, Section 820.30 Design Controls
  • [6] Agile Software Requirements, Dean Leffingwell. Addison-Wesley. Copyright © 2011, Pearson Education, Inc. Boston, MA, page 61
  • [7] Agile Software Requirements, Dean Leffingwell. Addison-Wesley. Copyright © 2011, Pearson Education, Inc. Boston, MA, page 196
  • [8] Continuous Delivery, Jez Humble, David Farley. Addison-Wesley, Copyright © 2011, Pearson Education, Inc. Boston, MA, page 124
  • [9] Safe and Sound Software – Creating an Efficient and Effective Quality System for Software Medical Device Organizations, Thomas H. Farris. ASQ Quality Press, Milwaukee, Wisconsin, 2006, Figure 4.9, “Types of software testing” pg. 123

Good Developers Understand Business Needs

In agile development, it is the developer’s job to speak the language of the user, not the user’s job to speak the language of the developers.

-Dean Leffingwell, Agile Software Requirements

Dean Leffingwell is discussing the need for developers to understand users when gathering user stories and features, of course, but this statement holds true for so much more. Over and over again I have observed that the best software developers I come across are those who insist on understanding user needs (and asking the user questions to understand his or her own needs) before jumping into design.

“We’ll refactor that later…”

Not that long ago a very wise coworker said this in a meeting:

The biggest lie in software is, “We’ll clean that code up later.”

Insightful. I admit, I’ve said this before, and, like others, I haven’t followed through. I suspect this happens often because, being busy developers, if a section of code works there seems to be no immediate value in refactoring it. It works, we have other things to do, and we move on…

I was thinking about the “biggest lie in software” again recently. This time the problem was a little different, but still something that I have come across numerous times.

I can’t recall the number of times I’ve been asked to put together a prototype or a very simple application for a temporary purpose only to find that the prototype/temporary application somehow morphs into a “real” application. (By “real,” I mean, people start using it daily in an ongoing manner that was never intended.)

This happened to me again somewhat recently, and I was very surprised when I was asked if there was a requirements document for the application (which I had spent all of half a day putting together).

What’s the solution here? I’m not sure. Things move fast and everyone has needs to be met. Sometimes, even in the year 2011, people don’t really understand what software can do for them, so when they see a simple prototype or temporary application, they are excited about the increase in productivity that it can offer and they run with it.

I have a few more thoughts on this subject, but for now I’ll end with a list.

  1. Set expectations clearly. If an application is a prototype or throwaway, make it very well-known that this is the case.
  2. This is a little more difficult: Never write code with the assumption that it will be thrown away after demonstrable use or serving its temporary purpose. For example, I’ve created a few GUI applications that grew into something bigger. I used no real patterns or MVC layout–I simply threw something together using AWT and Swing in NetBeans. It’s full of global variables and improper forward and backward class access to methods and values that should probably be private. If any real software developer were to look at the code I would probably blush.
  3. To pull off #2, write your own code with reuse in mind. That temporary application is likely to be useful when implementing our full-blown project.

I’m writing these rules as if I follow them myself. To be honest, I don’t always do this. Sometimes there are a number of groups tugging developers in different directions, and it’s difficult to engineer something both appropriately and quickly all at once. That said, we shouldn’t be surprised when that throwaway application we whipped up one afternoon a few weeks ago takes on a life far beyond anything we ever anticipated.