Multithreaded unit testing with ConTest


http://www.ibm.com/developerworks/java/library/j-contest/index.html

It's no secret that concurrent programs are prone to bugs. Writing such programs is a challenge, and the bugs that creep in during the process aren't easily discovered. Many concurrent bugs are found only at system test, functional test, or by the user. And by that time, they're expensive to fix -- if you can fix them at all -- because they're so hard to debug.

In this article, we introduce ConTest, a tool for testing, debugging, and measuring the coverage of concurrent programs. As you'll quickly see, ConTest isn't a replacement for unit testing but a complementary technology that addresses the failures of unit testing for concurrent programs.

Note that this article includes an examples package that you can experiment with on your own, once you understand the basics of how ConTest works.

Why unit testing isn't enough

Ask any Java™ developer and they'll tell you that unit testing is a good practice. Make the proper investment in a unit test, and it pays off later. With unit testing, you find bugs earlier and fix them more easily than you would without. But the common methodologies of unit testing, even when done thoroughly, aren't so good at finding concurrent bugs. That's why such bugs tend to escape to later stages of development.

Why do unit tests consistently miss concurrent bugs? It is often said that the problem with concurrent programs (and bugs) is that they are nondeterministic. But for unit testing purposes, paradoxically, the problem is that concurrent programs are too deterministic. The following two examples illustrate this point.

The naked name printer

Our first example is a class that does nothing more than print a two-part name. For educational purposes, we've split this task among three threads: one prints the first name, one prints a space, and one prints the surname and a new-line. A full-fledged synchronization protocol, including synchronizing on a lock and calling wait() and notifyAll(), ensures that everything happens in the right order. As you can see in Listing 1, main() serves as a unit test, invoking this class for the name "Washington Irving":

Listing 1. Name printer

public class NamePrinter {
   private final String firstName;
   private final String surName;
   private final Object lock = new Object();
   private boolean printedFirstName = false;
   private boolean spaceRequested = false;

   public NamePrinter(String firstName, String surName) {
      this.firstName = firstName;
      this.surName = surName;
   }

   public void print() {
      new FirstNamePrinter().start();
      new SpacePrinter().start();
      new SurnamePrinter().start();
   }

   private class FirstNamePrinter extends Thread {
      public void run() {
         try {
            synchronized (lock) {
               while (firstName == null) {
                  lock.wait();
               }
               System.out.print(firstName);
               printedFirstName = true;
               spaceRequested = true;
               lock.notifyAll();
            }
         } catch (InterruptedException e) {
            assert (false);
         }
      }
   }

   private class SpacePrinter extends Thread {
      public void run() {
         try {
            synchronized (lock) {
               while ( ! spaceRequested) {
                  lock.wait();
               }
               System.out.print(' ');
               spaceRequested = false;
               lock.notifyAll();
            }
         } catch (InterruptedException e) {
            assert (false);
         }
      }
   }

   private class SurnamePrinter extends Thread {
      public void run() {
         try {
            synchronized(lock) {
               while ( ! printedFirstName || spaceRequested || surName == null) {
                  lock.wait();
               }
               System.out.println(surName);
            }
         } catch (InterruptedException e) {
            assert (false);
         }
      }
   }

   public static void main(String[] args) {
      System.out.println();
      new NamePrinter("Washington", "Irving").print();
   }
}

If you want, you can compile and run this class and verify that the name is printed as expected. Next, let's remove all of the synchronization protocol, as shown in Listing 2:

Listing 2. The naked name printer

public class NakedNamePrinter {
   private final String firstName;
   private final String surName;

   public NakedNamePrinter(String firstName, String surName) {
      this.firstName = firstName;
      this.surName = surName;
      new FirstNamePrinter().start();
      new SpacePrinter().start();
      new SurnamePrinter().start();
   }

   private class FirstNamePrinter extends Thread {
      public void run() {
         System.out.print(firstName);
      }
   }

   private class SpacePrinter extends Thread {
      public void run() {
         System.out.print(' ');
      }
   }

   private class SurnamePrinter extends Thread {
      public void run() {
         System.out.println(surName);
      }
   }

   public static void main(String[] args) {
      System.out.println();
      new NakedNamePrinter("Washington", "Irving");
   }
}

This move renders our class totally incorrect: it no longer includes the instructions to ensure things happen in the right order. But what happens when we compile and run the class? Everything is exactly the same! "Washington Irving" is printed in perfect order.

What is the moral of this experiment? Imagine that the name printer, complete with its synchronization protocol, is your concurrent class. You ran your unit test -- many times, perhaps -- and it worked perfectly every time. Naturally, you think you can rest assured that it is correct. But as you've just seen, the output is equally correct with no synchronization protocol at all and, you may safely conclude, with many wrong implementations of the protocol. So, while you think you've tested your protocol, you didn't really test it.

Now let's look at another example.

The buggy work queue

The following class is a model of a common concurrent utility: a work queue. It has one method to enqueue tasks and another method to process them. Before removing a task from the queue, the work() method checks whether the queue is empty and, if so, waits. The enqueue() method notifies all waiting threads (if any). To make this example simple, the tasks are just strings and the work is to print them. Again, main() serves as a unit test. By the way, this class has a bug.

Listing 3. Print queue

import java.util.*;

public class PrintQueue {
   private LinkedList<String> queue = new LinkedList<String>();
   private final Object lock = new Object();

   public void enqueue(String str) {
      synchronized (lock) {
         queue.addLast(str);
         lock.notifyAll();
      }
   }

   public void work() {
      String current;
      synchronized(lock) {
         if (queue.isEmpty()) {
            try {
               lock.wait();
            } catch (InterruptedException e) {
               assert (false);
            }
         }
         current = queue.removeFirst();
      }
      System.out.println(current);
   }

   public static void main(String[] args) {
      final PrintQueue pq = new PrintQueue();
      Thread producer1 = new Thread() {
         public void run() {
            pq.enqueue("anemone");
            pq.enqueue("tulip");
            pq.enqueue("cyclamen");
         }
      };
      Thread producer2 = new Thread() {
         public void run() {
            pq.enqueue("iris");
            pq.enqueue("narcissus");
            pq.enqueue("daffodil");
         }
      };
      Thread consumer1 = new Thread() {
         public void run() {
            pq.work();
            pq.work();
            pq.work();
            pq.work();
         }
      };
      Thread consumer2 = new Thread() {
         public void run() {
            pq.work();
            pq.work();
         }
      };
      producer1.start();
      consumer1.start();
      consumer2.start();
      producer2.start();
   }
}

After running the test, everything seems all right. As the developer of the class, you would likely feel quite content: The test seemed nontrivial (two producers, two consumers, and an interesting order among them that would exercise the wait), and it worked correctly.

But there's that bug we mentioned. Did you see it? If not, just wait; we'll soon catch it.

Determinism in concurrency programming

Why didn't the two example unit tests expose our concurrency bugs? Although the thread scheduler, in principle, may switch threads in the middle and run them in a different order, it tends not to. Because concurrent tasks in unit tests are usually small and few, they usually run to completion before the scheduler switches the thread, unless it is forced to (say, by wait()). And when it does perform a thread switch, it tends to do it in the same place whenever you run the program.

As we said previously, the problem is that the program is too deterministic: You end up testing just one interleaving (the relative order of commands in different threads) out of the myriad possible ones. When will more interleavings be exercised? When there are more concurrent tasks and more complex interplay between concurrent classes and protocols. That is, when you run the system and function tests -- or when the whole product runs at the user's site. That's where all the bugs will surface.
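
To get a feel for how many interleavings are possible, here is a rough sketch that is not part of the original article. It counts the ways two threads can be interleaved, under the simplifying assumption that each thread runs a fixed sequence of atomic steps. The class name InterleavingCount is invented for this illustration. With m steps in one thread and n in the other, the count is the binomial coefficient C(m+n, m).

```java
import java.math.BigInteger;

// Illustrative sketch: counts the interleavings of two threads,
// assuming each thread runs a fixed sequence of atomic steps.
class InterleavingCount {

   // Returns C(m + n, m): the number of ways to merge m steps of one
   // thread with n steps of another while preserving each thread's order.
   static BigInteger interleavings(int m, int n) {
      BigInteger result = BigInteger.ONE;
      for (int i = 1; i <= m; i++) {
         // Multiply before dividing; every intermediate value is itself
         // a binomial coefficient, so the division is always exact.
         result = result.multiply(BigInteger.valueOf(n + i))
                        .divide(BigInteger.valueOf(i));
      }
      return result;
   }

   public static void main(String[] args) {
      System.out.println(interleavings(3, 3));    // prints 20
      System.out.println(interleavings(10, 10));  // prints 184756
   }
}
```

Even two threads of ten steps each admit well over a hundred thousand orderings, while an ordinary unit test run tends to exercise the same one every time.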

Unit testing with ConTest

What's needed is for the JVM to be less deterministic, more "fuzzy" when doing unit tests. This is where ConTest comes into play. If you run the NakedNamePrinter from Listing 2 with ConTest several times, you get all kinds of results, as shown in Listing 4:

Listing 4. Output of naked name printer with ConTest

>Washington Irving    (the expected result)
> WashingtonIrving    (the space was printed first)
>Irving
 Washington           (surname + new-line printed first)
> Irving
Washington            (space, surname, first name)

Note that you won't necessarily get these results in this order or one after the other; you're likely to see the first two several times before you see the last two. But quite soon, you'll see all of them. ConTest makes all kinds of interleavings happen; because the interleavings are chosen randomly, a different one is likely to result each time you run the same test. By comparison, if you run the NamePrinter shown in Listing 1 with ConTest, you'll always get the expected result. In that case, the synchronization protocol forces the correct order, so ConTest only generates legal interleavings.

If you run PrintQueue with ConTest, you'll get a different order of the flowers, which may be considered an acceptable result for a unit test. But run it several times and suddenly you'll get a NoSuchElementException thrown by LinkedList.removeFirst() in line 24. The bug lurks in the following scenario:

Two consumer threads are started, find the queue to be empty, and call wait().
A producer enqueues a task and notifies both consumers.
One consumer gets the lock, works the task, and leaves the queue empty. It then releases the lock.
The second consumer gets the lock (it can proceed because it was notified) and tries to work a task, but now the queue is empty.

While not an ordinary interleaving for this unit test, the above scenario is legal and could happen in more complex uses of the class. With ConTest, you can make it happen in the unit test. (By the way, do you know how to fix the bug? Careful: replacing the notifyAll() with notify() solves the problem in this scenario but will fail on others!)
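
One common way to repair this kind of bug, offered here as a sketch rather than as the authors' own solution, is to re-check the condition in a loop after every wakeup, so that a consumer that finds the queue empty again simply goes back to waiting. The class name GuardedPrintQueue, the decision to return the task from work() so a caller can check it, and the handling of InterruptedException are assumptions made for this example.

```java
import java.util.LinkedList;

// Sketch of a repaired work queue: the wait is guarded by a loop.
class GuardedPrintQueue {
   private final LinkedList<String> queue = new LinkedList<String>();
   private final Object lock = new Object();

   public void enqueue(String str) {
      synchronized (lock) {
         queue.addLast(str);
         lock.notifyAll();
      }
   }

   // Blocks until a task is available, prints it, and returns it.
   public String work() {
      String current;
      synchronized (lock) {
         // 'while', not 'if': a notified thread must re-check the condition,
         // because another consumer may have emptied the queue first.
         while (queue.isEmpty()) {
            try {
               lock.wait();
            } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
               throw new IllegalStateException("interrupted while waiting", e);
            }
         }
         current = queue.removeFirst();
      }
      System.out.println(current);
      return current;
   }

   public static void main(String[] args) {
      final GuardedPrintQueue pq = new GuardedPrintQueue();
      Runnable consumer = new Runnable() {
         public void run() {
            pq.work();
         }
      };
      // Two consumers start first and wait; each then takes exactly one task.
      new Thread(consumer).start();
      new Thread(consumer).start();
      pq.enqueue("anemone");
      pq.enqueue("tulip");
   }
}
```

With the loop in place, the second consumer in the scenario above wakes up, sees that the queue is still empty, and waits again instead of calling removeFirst() on an empty list.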

How ConTest works

The basic principle behind ConTest is quite simple. The instrumentation stage transforms the class files, injecting into chosen places calls to ConTest run-time functions. At run time, ConTest sometimes tries to cause context switches in these places. The chosen places are those whose relative order among the threads is likely to affect the result: entrance and exit from synchronized blocks, access to shared variables, etc. Context switches are attempted by calling methods such as yield() or sleep(). The decisions are random so that different interleavings are attempted at each run. Heuristics are used to try to reveal typical bugs.
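
The idea can be approximated by hand. The sketch below is not ConTest's implementation or API; the class NoiseSketch and its perturb() and runOnce() methods are hypothetical names invented for this illustration, and "_" stands in for the space so the output is easier to read. It places a random yield-or-sleep call before the action of each of three unsynchronized threads, roughly where instrumentation would inject one, and collects the distinct orders seen over repeated runs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of random noise injection; not ConTest code.
class NoiseSketch {
   private static final Random RANDOM = new Random();

   // Stand-in for an injected call: randomly does nothing, yields, or sleeps.
   static void perturb() {
      switch (RANDOM.nextInt(3)) {
         case 0:
            break;                 // no noise this time
         case 1:
            Thread.yield();
            break;
         default:
            try {
               Thread.sleep(RANDOM.nextInt(5));
            } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
            }
      }
   }

   // Runs three unsynchronized "printer" threads once and returns the
   // order in which their parts were emitted.
   static String runOnce() {
      final List<String> out =
            Collections.synchronizedList(new ArrayList<String>());
      String[] parts = { "Washington", "_", "Irving" };
      Thread[] threads = new Thread[parts.length];
      for (int i = 0; i < parts.length; i++) {
         final String part = parts[i];
         threads[i] = new Thread() {
            public void run() {
               perturb();          // the call an instrumenter would inject
               out.add(part);
            }
         };
         threads[i].start();
      }
      for (Thread t : threads) {
         try {
            t.join();
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
         }
      }
      return out.toString();
   }

   public static void main(String[] args) {
      Set<String> seen = new TreeSet<String>();
      for (int i = 0; i < 200; i++) {
         seen.add(runOnce());
      }
      // Typically shows several distinct orders; the exact set varies per run.
      System.out.println(seen);
   }
}
```

Because the choices are random, the set printed by main() differs from run to run, which is the point: the same test now samples many interleavings instead of one.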

Note that ConTest does not know whether a bug has actually been revealed -- it has no notion of how the program is expected to behave. You, the user, should have a test and know which test results would be considered correct and which would indicate a bug. ConTest just helps expose the bug. On the other hand, there are no false alarms: All interleavings that occur with ConTest are legal as far as the JVM rules are concerned.

As you've seen, you get an added value from running the same test many times. In fact, we recommend running it over and over again for a whole night. You can then be highly confident that all possible interleavings have been executed.

Features of ConTest

In addition to its basic methodology, ConTest brings several key features into play to expose concurrency bugs:

Synchronization coverage: Measuring code coverage is a highly recommended practice in unit testing, but when it comes to concurrent programs, code coverage is misleading. In our first two examples, the naked name printer and buggy print queue, the given unit tests would show full statement coverage (except for InterruptedException handling) without exposing the bugs. Synchronization coverage bridges this gap: It measures how much contention exists among synchronized blocks; that is, whether they did "interesting" things and, therefore, whether you covered interesting interleavings. See Resources for additional information.
Deadlock prevention: ConTest can analyze whether nested locks were taken in conflicting order, which indicates a danger of deadlock. This analysis is done offline after running the test(s).
Debug aids: ConTest can produce some run-time reports useful for concurrent debugging: a report on the status of locks (which threads hold which locks, which are in wait, etc.), a report of the current location of threads, and a report on the last values assigned to and read from variables. You can also do these queries remotely; for example, you can query the state of a server (running with ConTest) from a different machine. Another feature useful for debugging is approximate replay, which attempts to repeat the interleaving of a given run (not guaranteed, but with high probability).
UDP network perturbation: ConTest carries the idea of concurrent perturbation into the domain of network communication by UDP (datagram) sockets. UDP programs cannot rely on the reliability of the network; packets may be lost or reordered, and it's up to the application to handle these situations. Similar to multithreading, this presents a testing challenge: In a normal environment, the packets tend to arrive in perfect order, and the perturbation-handling functionality is not actually tested. ConTest can simulate bad network conditions, thus exercising this functionality and exposing its bugs.
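
As an illustration of the deadlock analysis described in the list above, the following sketch records which locks were held when another was acquired and reports pairs that were taken in both orders. It is a simplified stand-in, not ConTest's algorithm: the class LockOrderSketch, its method names, and the string lock identifiers are all assumptions, lock names are assumed not to contain "->", and only two-lock conflicts are detected rather than longer cycles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical offline lock-order check; not ConTest's implementation.
class LockOrderSketch {
   // Each entry "A->B" means: some thread acquired B while holding A.
   private final Set<String> edges = new HashSet<String>();

   // Records one thread's nested acquisitions, outermost lock first.
   void recordNesting(List<String> locksInOrder) {
      for (int i = 0; i < locksInOrder.size(); i++) {
         for (int j = i + 1; j < locksInOrder.size(); j++) {
            edges.add(locksInOrder.get(i) + "->" + locksInOrder.get(j));
         }
      }
   }

   // Returns the pairs of locks that were observed in both orders.
   List<String> conflicts() {
      List<String> result = new ArrayList<String>();
      for (String edge : edges) {
         String[] ab = edge.split("->");
         // Report each conflicting pair once, smaller name first.
         if (ab[0].compareTo(ab[1]) < 0
               && edges.contains(ab[1] + "->" + ab[0])) {
            result.add(ab[0] + " <-> " + ab[1]);
         }
      }
      return result;
   }

   public static void main(String[] args) {
      LockOrderSketch analysis = new LockOrderSketch();
      // One thread took accountLock then logLock; another did the reverse.
      analysis.recordNesting(Arrays.asList("accountLock", "logLock"));
      analysis.recordNesting(Arrays.asList("logLock", "accountLock"));
      System.out.println(analysis.conflicts()); // prints [accountLock <-> logLock]
   }
}
```

The check flags a danger of deadlock even if no deadlock occurred during the test, since the two nestings only need to overlap in time once for the threads to block each other.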

Challenges and future directions

ConTest was created for the Java platform. A version for C/C++, for the pthread library, is now in use internally at IBM but does not contain all the features of the Java version. Java code is much easier for ConTest to manipulate than C/C++ for two reasons: synchronization is part of the Java language, and the bytecode is very easy to instrument. We're working on ConTest for other libraries, such as MPI. If you would like to use ConTest for C/C++, please contact the authors. Hard real-time software is also a problem for ConTest, as the tool works by adding delays. We're investigating approaches similar to those used for monitoring hard real-time software, but at present, we're not sure how to surmount this problem.

As for future directions, we're currently working on publishing a listeners architecture, which will allow us to apply listener-based tools on top of ConTest. Using the listeners architecture would make it possible to create atomicity checkers, deadlock detectors, and other analyzers and to try new delay mechanisms without having to write the infrastructure involved.

In conclusion

ConTest is a tool for testing, debugging, and measuring the coverage of concurrent programs. Developed by researchers at the IBM Research laboratory in Haifa, Israel, ConTest is available from alphaWorks in a limited trial version. Please contact the authors if you have further questions about ConTest.