Counting Objects in C++


Scott Meyers shows the ins and outs of counting object instantiations accurately.

Sometimes easy things are easy, but they're still subtle. For example, suppose you have a class Widget, and you'd like to have a way to find out at run time how many Widget objects exist. An approach that's both easy to implement and that gives the right answer is to create a static counter in Widget, increment the counter each time a Widget constructor is called, and decrement it whenever the Widget destructor is called. You also need a static member function howMany to report how many Widgets currently exist. If Widget did nothing but track how many of its type exist, it would look more or less like this:

class Widget {
public:
    Widget() { ++count; }
    Widget(const Widget&) { ++count; }
    ~Widget() { --count; }
    static size_t howMany()
    { return count; }
private:
    static size_t count;
};

// obligatory definition of count. This
// goes in an implementation file
size_t Widget::count = 0;

This works fine. The only mildly tricky thing is to remember to implement the copy constructor, because a compiler-generated copy constructor for Widget wouldn't know to increment count.
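To make the copy-constructor point concrete, here is a minimal sketch of what goes wrong without it. The class LeakyWidget and the helper showUndercount are hypothetical names invented for this illustration, and the counter is declared signed (unlike the size_t above) only so that the undercount is visible as a negative number.

```cpp
#include <cassert>

// Hypothetical class: like Widget above, but with NO user-defined copy
// constructor, so the compiler-generated one copies without counting.
class LeakyWidget {
public:
    LeakyWidget() { ++count; }
    ~LeakyWidget() { --count; }
    static long howMany() { return count; }
private:
    static long count;   // signed (an assumption of this sketch) so an
                         // undercount shows up as a negative value
};
long LeakyWidget::count = 0;

// Hypothetical helper: builds one original and one copy, then lets both die.
void showUndercount() {
    {
        LeakyWidget original;                  // count becomes 1
        LeakyWidget copy(original);            // implicit copy ctor: count stays 1
        assert(LeakyWidget::howMany() == 1);   // two objects alive, one counted
    }                                          // two destructor calls
    assert(LeakyWidget::howMany() == -1);      // the count has drifted below zero
}
```

Every copy is destroyed but never counted, so each one pushes the total one lower than it should be.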

If you had to do this only for Widget, you'd be done, but counting objects is something you might want to implement for several classes. Doing the same thing over and over gets tedious, and tedium leads to errors. To forestall such tedium, it would be best to somehow package the above object-counting code so it could be reused in any class that wanted it. The ideal package would:

be easy to use: require minimal work on the part of class authors who want to use it. Ideally, they shouldn't have to do more than one thing, that is, more than basically say "I want to count the objects of this type."
be efficient: impose no unnecessary space or time penalties on client classes employing the package.
be foolproof: be next to impossible to accidentally yield a count that is incorrect. (We're not going to worry about malicious clients, ones who deliberately try to mess up the count. In C++, such clients can always find a way to do their dirty deeds.)

Stop for a moment and think about how you'd implement a reusable object-counting package that satisfies the goals above. It's probably harder than you expect. If it were as easy as it seems like it should be, you wouldn't be reading an article about it in this magazine.

new, delete, and Exceptions

While you're mulling over your solution to the object-counting problem, allow me to switch to what seems like an unrelated topic. That topic is the relationship between new and delete when constructors throw exceptions. When you ask C++ to dynamically allocate an object, you use a new expression, as in:

class ABCD { ... }; // ABCD = "A Big Complex Datatype"
ABCD *p = new ABCD; // a new expression

The new expression, whose meaning is built into the language and whose behavior you cannot change, does two things. First, it calls a memory allocation function called operator new. That function is responsible for finding enough memory to hold an ABCD object. If the call to operator new succeeds, the new expression then invokes an ABCD constructor on the memory that operator new found.

But suppose evaluating the new expression throws a std::bad_alloc exception. Exceptions of this type indicate that an attempt to dynamically allocate memory has failed. In the new expression above, there are two functions that might give rise to that exception. The first is the invocation of operator new that is supposed to find enough memory to hold an ABCD object. The second is the subsequent invocation of the ABCD constructor that is supposed to turn the raw memory into a valid ABCD object.

If the exception came from the call to operator new, no memory was allocated. However, if the call to operator new succeeded and the invocation of the ABCD constructor led to the exception, it is important that the memory allocated by operator new be deallocated. If it's not, the program has a memory leak. It's not possible for the client (the code requesting creation of the ABCD object) to determine which function gave rise to the exception.

For many years this was a hole in the draft C++ language specification, but in March 1995 the C++ Standards committee adopted the rule that if, during a new expression, the invocation of operator new succeeds and the subsequent constructor call throws an exception, the runtime system must automatically deallocate the memory that operator new allocated. This deallocation is performed by operator delete, the deallocation analogue of operator new. (For details, see "A Note About Placement new and Placement delete" at the end of this article.)
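The rule can be observed directly with a class that supplies its own operator new and operator delete. The following is a minimal sketch, not anything from the article itself: the class Fragile, its two static tallies, and the helper demonstrateCleanup are all hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical class whose constructor always fails.
struct Fragile {
    static int allocations;     // successful calls to operator new
    static int deallocations;   // calls to operator delete with a real pointer

    Fragile() { throw std::runtime_error("constructor failed"); }

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++allocations;
        return p;
    }
    static void operator delete(void* p) noexcept {
        if (p) ++deallocations;
        std::free(p);
    }
};
int Fragile::allocations = 0;
int Fragile::deallocations = 0;

// Hypothetical helper: runs one failing new expression.
void demonstrateCleanup() {
    try {
        Fragile* p = new Fragile;   // operator new succeeds, constructor throws
        delete p;                   // never reached
    } catch (const std::runtime_error&) {
        // By the time we get here, Fragile::operator delete has already
        // been called on the memory that operator new returned.
    }
    assert(Fragile::allocations == Fragile::deallocations);
}
```

The client code never calls delete, yet the allocation and deallocation tallies match after the exception is caught.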

It is this relationship between new expressions and operator delete that affects us in our attempt to automate the counting of object instantiations.

Counting Objects

In all likelihood, your solution to the object-counting problem involved the development of an object-counting class. Your class probably looks remarkably like, perhaps even exactly like, the Widget class I showed earlier:

// see below for a discussion of why
// this isn't quite right
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static size_t howMany()
    { return count; }
private:
    static size_t count;
};

// This still goes in an
// implementation file
size_t Counter::count = 0;

The idea here is that authors of classes that need to count the number of objects in existence simply use Counter to take care of the bookkeeping. There are two obvious ways to do this. One way is to define a Counter object as a class data member, as in:

// embed a Counter to count objects
class Widget {
public:
    .....  // all the usual public
           // Widget stuff
    static size_t howMany()
    { return Counter::howMany(); }
private:
    .....  // all the usual private
           // Widget stuff
    Counter c;
};

The other way is to declare Counter as a base class, as in:

// inherit from Counter to count objects
class Widget: public Counter {
    .....  // all the usual public
           // Widget stuff
private:
    .....  // all the usual private
           // Widget stuff
};

Both approaches have advantages and disadvantages. But before we examine them, we need to observe that neither approach will work in its current form. The problem has to do with the static object count inside Counter. There's only one such object, but we need one for each class using Counter. For example, if we want to count both Widgets and ABCDs, we need two static size_t objects, not one. Making Counter::count nonstatic doesn't solve the problem, because we need one counter per class, not one counter per object.
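The shared-counter problem is easy to reproduce. This sketch uses the non-template Counter exactly as described above, with empty Widget and ABCD classes standing in for real ones; the helper showSharedCount is a hypothetical name added for the demonstration.

```cpp
#include <cassert>
#include <cstddef>

// The non-template Counter: a single static count shared by every client.
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static std::size_t howMany() { return count; }
private:
    static std::size_t count;
};
std::size_t Counter::count = 0;

// Empty stand-ins for the two client classes.
class Widget : public Counter {};
class ABCD   : public Counter {};

// Hypothetical helper: two Widgets and one ABCD are alive at the same time.
void showSharedCount() {
    Widget w1, w2;
    ABCD a;
    assert(Widget::howMany() == 3);   // wrong: we wanted 2
    assert(ABCD::howMany() == 3);     // wrong: we wanted 1
}
```

Both calls resolve to the same Counter::howMany, so each class reports the combined total.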

We can get the behavior we want by employing one of the best-known but oddest-named tricks in all of C++: we turn Counter into a template, and each class using Counter instantiates the template with itself as the template argument.

Let me say that again. Counter becomes a template:

template<typename T>
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static size_t howMany()
    { return count; }
private:
    static size_t count;
};

template<typename T>
size_t Counter<T>::count = 0; // this now can go in header

The first Widget implementation choice now looks like:

// embed a Counter to count objects
class Widget {
public:
    .....
    static size_t howMany()
    { return Counter<Widget>::howMany(); }
private:
    .....
    Counter<Widget> c;
};

And the second choice now looks like:

// inherit from Counter to count objects
class Widget: public Counter<Widget> {
    .....
};

Notice how in both cases we replace Counter with Counter<Widget>. As I said earlier, each class using Counter instantiates the template with itself as the argument.

The tactic of a class instantiating a template for its own use by passing itself as the template argument was first publicized by Jim Coplien. He showed that it's used in many languages (not just C++) and he called it "a curiously recurring template pattern" [1]. I don't think Jim intended it, but his description of the pattern has pretty much become its name. That's too bad, because pattern names are important, and this one fails to convey information about what it does or how it's used.

The naming of patterns is as much art as anything else, and I'm not very good at it, but I'd probably call this pattern something like "Do It For Me." Basically, each class generated from Counter provides a service (it counts how many objects exist) for the class requesting the Counter instantiation. So the class Counter<Widget> counts Widgets, and the class Counter<ABCD> counts ABCDs.
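Here is a small, self-contained sketch of the pattern at work. The Counter template is the one shown above; the empty Widget and ABCD bodies and the helper showSeparateCounts are placeholders invented for the demonstration.

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static std::size_t howMany() { return count; }
private:
    static std::size_t count;
};
template<typename T>
std::size_t Counter<T>::count = 0;   // one definition per instantiation

// Each class passes itself as the template argument.
class Widget : public Counter<Widget> {};
class ABCD   : public Counter<ABCD>   {};

// Hypothetical helper: creates a few objects and checks each class's count.
void showSeparateCounts() {
    Widget w1, w2;
    Widget w3(w1);                    // copies are counted too
    ABCD a;
    assert(Widget::howMany() == 3);   // Counter<Widget>::count
    assert(ABCD::howMany() == 1);     // Counter<ABCD>::count
}
```

Because Counter<Widget> and Counter<ABCD> are distinct classes, each has its own static count.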

Now that Counter is a template, both the embedding design and the inheritance design will work, so we're in a position to evaluate their comparative strengths and weaknesses. One of our design criteria was that object-counting functionality should be easy for clients to obtain, and the code above makes clear that the inheritance-based design is easier than the embedding-based design. That's because the former requires only the mentioning of Counter as a base class, whereas the latter requires that a Counter data member be defined and that howMany be reimplemented by clients to invoke Counter's howMany [2]. That's not a lot of additional work (client howManys are simple inline functions), but having to do one thing is easier than having to do two. So let's first turn our attention to the design employing inheritance.

Using Public Inheritance

The design based on inheritance works because C++ guarantees that each time a derived class object is constructed or destroyed, its base class part will also be constructed first and destroyed last. Making Counter a base class thus ensures that a Counter constructor or destructor will be called each time a class inheriting from it has an object created or destroyed.

Any time the subject of base classes comes up, however, so does the subject of virtual destructors. Should Counter have one? Well-established principles of object-oriented design for C++ dictate that it should. If it has no virtual destructor, deletion of a derived class object via a base class pointer yields undefined (and typically undesirable) results:

class Widget: public Counter<Widget>
{ ... };

Counter<Widget> *pw =
    new Widget;  // get base class ptr
                 // to derived class object
......
delete pw; // yields undefined results
           // if the base class lacks
           // a virtual destructor

Such behavior would violate our criterion that our object-counting design be essentially foolproof, because there's nothing unreasonable about the code above. That's a powerful argument for giving Counter a virtual destructor.

Another criterion, however, was maximal efficiency (imposition of no unnecessary speed or space penalty for counting objects), and now we're in trouble. We're in trouble because the presence of a virtual destructor (or any virtual function) in Counter means each object of type Counter (or a class derived from Counter) will contain a (hidden) virtual table pointer, and this will increase the size of such objects if they don't already support virtual functions [3]. That is, if Widget itself contains no virtual functions, objects of type Widget would increase in size if Widget started inheriting from Counter<Widget>. We don't want that.
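The space cost is easy to measure. In this sketch, PlainCounter, VirtualCounter, SmallPlain, SmallVirtual, and virtualDestructorCostsSpace are hypothetical names, and the comparison relies on typical compiler behavior rather than on anything the Standard promises.

```cpp
// Hypothetical stand-ins for Counter without and with a virtual destructor.
struct PlainCounter   { ~PlainCounter() {} };
struct VirtualCounter { virtual ~VirtualCounter() {} };

// Two otherwise identical classes, each holding one byte of data.
struct SmallPlain   : PlainCounter   { char data; };
struct SmallVirtual : VirtualCounter { char data; };

// True when the virtual destructor made the derived object bigger.
// Not guaranteed by the Standard, but expected on mainstream compilers,
// which store a hidden pointer to the virtual table in each object.
constexpr bool virtualDestructorCostsSpace() {
    return sizeof(SmallVirtual) > sizeof(SmallPlain);
}
```

On a typical 64-bit platform the second class grows from one byte to a pointer-sized multiple, which is exactly the penalty the efficiency criterion rules out.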

The only way to avoid it is to find a way to prevent clients from deleting derived class objects via base class pointers. It seems that a reasonable way to achieve this is to declare operator delete private in Counter:

template<typename T>
class Counter {
public:
    .....
private:
    void operator delete(void*);
    .....
};

Now the delete expression won't compile:

class Widget: public Counter<Widget> { ... };
Counter<Widget> *pw = new Widget;
......
delete pw; // Error. Can't call private
           // operator delete

Unfortunately, and this is the really interesting part, the new expression shouldn't compile either!

Counter<Widget> *pw =
    new Widget;  // this should not
                 // compile because
                 // operator delete is
                 // private

Remember from my earlier discussion of new, delete, and exceptions that C++'s runtime system is responsible for deallocating memory allocated by operator new if the subsequent constructor invocation fails. Recall also that operator delete is the function called to perform the deallocation. But we've declared operator delete private in Counter, which makes it invalid to create objects on the heap via new!

Yes, this is counterintuitive, and don't be surprised if your compilers don't yet support this rule, but the behavior I've described is correct. Furthermore, there's no other obvious way to prevent deletion of derived class objects via Counter* pointers, and we've already rejected the notion of a virtual destructor in Counter. So I say we abandon this design and turn our attention to using a Counter data member instead.

Using a Data Member

We've already seen that the design based on a Counter data member has one drawback: clients must both define a Counter data member and write an inline version of howMany that calls the Counter's howMany function. That's marginally more work than we'd like to impose on clients, but it's hardly unmanageable. There is another drawback, however. The addition of a Counter data member to a class will often increase the size of objects of that class type.

At first blush, this is hardly a revelation. After all, how surprising is it that adding a data member to a class makes objects of that type bigger? But blush again. Look at the definition of Counter:

template<typename T>
class Counter {
public:
    Counter();
    Counter(const Counter&);
    ~Counter();
    static size_t howMany();
private:
    static size_t count;
};

Notice how it has no nonstatic data members. That means each object of type Counter contains nothing. Might we hope that objects of type Counter have size zero? We might, but it would do us no good. C++ is quite clear on this point. All objects have a size of at least one byte, even objects with no nonstatic data members. By definition, sizeof will yield some positive number for each class instantiated from the Counter template. So each client class containing a Counter object will contain more data than it would if it didn't contain the Counter.

(Interestingly, this does not imply that the size of a class containing a Counter will necessarily be bigger than the size of the same class without one. That's because alignment restrictions can enter into the matter. For example, if Widget is a class containing two bytes of data but that's required to be four-byte aligned, each object of type Widget will contain two bytes of padding, and sizeof(Widget) will return 4. If, as is common, compilers satisfy the requirement that no objects have zero size by inserting a char into Counter<Widget>, it's likely that sizeof(Widget) will still yield 4 even if Widget contains a Counter<Widget> object. That object will simply take the place of one of the bytes of padding that Widget already contained. This is not a terribly common scenario, however, and we certainly can't plan on it when designing a way to package object-counting capabilities.)

I'm writing this at the very beginning of the Christmas season. (It is in fact Thanksgiving Day, which gives you some idea of how I celebrate major holidays...) Already I'm in a Bah Humbug mood. All I want to do is count objects, and I don't want to haul along any extra baggage to do it. There has got to be a way.

Using Private Inheritance

Look again at the inheritance-based code that led to the need to consider a virtual destructor in Counter:

class Widget: public Counter<Widget>
{ ... };

Counter<Widget> *pw = new Widget;
......
delete pw;  // yields undefined results
            // if Counter lacks a virtual
            // destructor

Earlier we tried to prevent this sequence of operations by preventing the delete expression from compiling, but we discovered that that also prohibited the new expression from compiling. But there is something else we can prohibit. We can prohibit the implicit conversion from a Widget* pointer (which is what new returns) to a Counter<Widget>* pointer. In other words, we can prevent inheritance-based pointer conversions. All we have to do is replace the use of public inheritance with private inheritance:

class Widget: private Counter<Widget>
{ ... };

Counter<Widget> *pw =
    new Widget;  // error! no implicit
                 // conversion from
                 // Widget* to
                 // Counter<Widget>*
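The difference between the two kinds of inheritance can be checked at compile time without writing code that fails to build. This sketch uses std::is_convertible from the standard header <type_traits>, which performs access checks from a context unrelated to either type; the class names PublicWidget and PrivateWidget are hypothetical and exist only for the comparison.

```cpp
#include <cstddef>
#include <type_traits>

template<typename T>
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static std::size_t howMany() { return count; }
private:
    static std::size_t count;
};
template<typename T>
std::size_t Counter<T>::count = 0;

// Hypothetical clients differing only in the kind of inheritance.
class PublicWidget  : public  Counter<PublicWidget>  {};
class PrivateWidget : private Counter<PrivateWidget> {};

// With public inheritance, outside code may convert to the base pointer...
static_assert(std::is_convertible<PublicWidget*,
                                  Counter<PublicWidget>*>::value,
              "public base: conversion allowed");

// ...with private inheritance it may not, so a base-pointer delete
// can never be written by clients in the first place.
static_assert(!std::is_convertible<PrivateWidget*,
                                   Counter<PrivateWidget>*>::value,
              "private base: conversion rejected");
```

Both assertions hold, which is the compile-time form of the error shown in the listing above.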

Furthermore, we're likely to find that the use of Counter as a base class does not increase the size of Widget compared to Widget's stand-alone size. Yes, I know I just finished telling you that no class has zero size, but, well, that's not really what I said. What I said was that no objects have zero size. The C++ Standard makes clear that the base-class part of a more derived object may have zero size. In fact many compilers implement what has come to be known as the empty base optimization [4].

Thus, if a Widget contains a Counter, the size of the Widget must increase. The Counter data member is an object in its own right, hence it must have nonzero size. But if Widget inherits from Counter, compilers are allowed to keep Widget the same size it was before. This suggests an interesting rule of thumb for designs where space is tight and empty classes are involved: prefer private inheritance to containment when both will do.
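The size difference between containment and inheritance can be measured with sizeof. In this sketch the three client types (PlainWidget, EmbeddingWidget, InheritingWidget) and the two helper functions are hypothetical names, and the second check assumes a compiler that performs the empty base optimization, as mainstream compilers do.

```cpp
#include <cstddef>

template<typename T>
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static std::size_t howMany() { return count; }
private:
    static std::size_t count;   // static: adds nothing to each object
};
template<typename T>
std::size_t Counter<T>::count = 0;

// Hypothetical clients holding the same single int of real data.
struct PlainWidget { int data; };

struct EmbeddingWidget {
    int data;
    Counter<EmbeddingWidget> c;   // a complete object: at least one byte
};

struct InheritingWidget : private Counter<InheritingWidget> {
    int data;                     // the empty base may occupy no space
};

// A data member of an empty class type always enlarges the object.
constexpr bool memberMakesObjectBigger() {
    return sizeof(EmbeddingWidget) > sizeof(PlainWidget);
}

// Assumes the compiler applies the empty base optimization.
constexpr bool emptyBaseAddsNothing() {
    return sizeof(InheritingWidget) == sizeof(PlainWidget);
}
```

With a four-byte int, the embedding version typically comes out at eight bytes because of padding, while the inheriting version stays at four.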

This last design is nearly perfect. It fulfills the efficiency criterion, provided your compilers implement the empty base optimization, because inheriting from Counter adds no per-object data to the inheriting class, and all Counter member functions are inline. It fulfills the foolproof criterion, because count manipulations are handled automatically by Counter member functions, those functions are automatically called by C++, and the use of private inheritance prevents implicit conversions that would allow derived-class objects to be manipulated as if they were base-class objects. (Okay, it's not totally foolproof: Widget's author might foolishly instantiate Counter with a type other than Widget, i.e., Widget could be made to inherit from Counter<Gidget>. I choose to ignore this possibility.)

The design is certainly easy for clients to use, but some may grumble that it could be easier. The use of private inheritance means that howMany will become private in inheriting classes, so such classes must include a using declaration to make howMany public to their clients:

class Widget: private Counter<Widget> {
public:
    // make howMany public
    using Counter<Widget>::howMany;
    ..... // rest of Widget is unchanged
};

class ABCD: private Counter<ABCD> {
public:
    // make howMany public
    using Counter<ABCD>::howMany;
    ..... // rest of ABCD is unchanged
};

For compilers not supporting namespaces, the same thing is accomplished by replacing the using declaration with the older access declaration (deprecated at the time of writing, and removed from the language in C++11):

class Widget: private Counter<Widget> {
public:
    // make howMany public
    Counter<Widget>::howMany;
    .....  // rest of Widget is unchanged
};

Hence, clients who want to count objects and who want to make that count available (as part of their class's interface) to their clients must do two things: declare Counter as a base class and make howMany accessible [5].

The use of inheritance does, however, lead to two conditions that are worth noting. The first is ambiguity. Suppose we want to count Widgets, and we want to make the count available for general use. As shown above, we have Widget inherit from Counter<Widget> and we make howMany public in Widget. Now suppose we have a class SpecialWidget publicly inherit from Widget and we want to offer SpecialWidget clients the same functionality Widget clients enjoy. No problem, we just have SpecialWidget inherit from Counter<SpecialWidget>.

But here is the ambiguity problem. Which howMany should be made available by SpecialWidget, the one it inherits from Widget or the one it inherits from Counter<SpecialWidget>? The one we want, naturally, is the one from Counter<SpecialWidget>, but there's no way to say that without actually writing SpecialWidget::howMany. Fortunately, it's a simple inline function:

class SpecialWidget: public Widget,
    private Counter<SpecialWidget> {
public:
    .....
    static size_t howMany()
    { return Counter<SpecialWidget>::howMany(); }
    .....
};

The second observation about our use of inheritance to count objects is that the value returned from Widget::howMany includes not just the number of Widget objects, it includes also objects of classes derived from Widget. If the only class derived from Widget is SpecialWidget and there are five stand-alone Widget objects and three stand-alone SpecialWidgets, Widget::howMany will return eight. After all, construction of each SpecialWidget also entails construction of the base Widget part.
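The five-plus-three example can be run as written. The sketch below fills in empty class bodies and adds a hypothetical helper, showDerivedCounts. One departure from the listing above is deliberate and is my own precaution, not the article's: SpecialWidget::howMany names the template as ::Counter, so the name is looked up at global scope rather than through the base classes, one of which (Counter<Widget>) is a private base of Widget.

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class Counter {
public:
    Counter() { ++count; }
    Counter(const Counter&) { ++count; }
    ~Counter() { --count; }
    static std::size_t howMany() { return count; }
private:
    static std::size_t count;
};
template<typename T>
std::size_t Counter<T>::count = 0;

class Widget : private Counter<Widget> {
public:
    using Counter<Widget>::howMany;   // make howMany public
};

class SpecialWidget : public Widget, private Counter<SpecialWidget> {
public:
    // Hides both inherited howMany functions, resolving the ambiguity.
    // ::Counter (global lookup) is a precaution taken in this sketch.
    static std::size_t howMany()
    { return ::Counter<SpecialWidget>::howMany(); }
};

// Hypothetical helper: five stand-alone Widgets, three SpecialWidgets.
void showDerivedCounts() {
    Widget widgets[5];
    SpecialWidget specials[3];
    (void)widgets;
    (void)specials;
    assert(Widget::howMany() == 8);          // 5 + 3 base-class parts
    assert(SpecialWidget::howMany() == 3);
}
```

Widget's count includes the Widget part of every SpecialWidget, while SpecialWidget's own count covers only the three derived objects.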

Summary

The following points are really all you need to remember:

Automating the counting of objects isn't hard, but it's not completely straightforward, either. Use of the "Do It For Me" pattern (Coplien's "curiously recurring template pattern") makes it possible to generate the correct number of counters. The use of private inheritance makes it possible to offer object-counting capabilities without increasing object sizes.
When clients have a choice between inheriting from an empty class or containing an object of such a class as a data member, inheritance is preferable, because it allows for more compact objects.
Because C++ endeavors to avoid memory leaks when construction fails for heap objects, code that requires access to operator new generally requires access to the corresponding operator delete too.
The Counter class template doesn't care whether you inherit from it or you contain an object of its type. It looks the same regardless. Hence, clients can freely choose inheritance or containment, even using different strategies in different parts of a single application or library.

Notes and References

[1] James O. Coplien. "The Column Without a Name: A Curiously Recurring Template Pattern," C++ Report, February 1995.

[2] An alternative is to omit Widget::howMany and make clients call Counter<Widget>::howMany directly. For the purposes of this article, however, we'll assume we want howMany to be part of the Widget interface.

[3] Scott Meyers. More Effective C++ (Addison-Wesley, 1996), pp. 113-122.

[4] Nathan Myers. "The Empty Member C++ Optimization," Dr. Dobb's Journal, August 1997. Also available at http://www.cantrip.org/emptyopt.html.

[5] Simple variations on this design make it possible for Widget to use Counter<Widget> to count objects without making the count available to Widget clients, not even by calling Counter<Widget>::howMany directly. Exercise for the reader with too much free time: come up with one or more such variations.

Acknowledgments

Mark Rodgers, Damien Watkins, Marco Dalla Gasperina, and Bobby Schmidt provided comments on drafts of this article. Their insights and suggestions improved it in several ways.

A Note About Placement new and Placement delete

The C++ equivalent of malloc is operator new, and the C++ equivalent of free is operator delete. Unlike malloc and free, however, operator new and operator delete are overloadable functions, and as such they may take different numbers and types of parameters. This has always been the case for operator new, but until relatively recently, it wasn't valid to overload operator delete.

The "normal" signature for operator new is:

void * operator new(size_t) throw (std::bad_alloc);

(To simplify things from now on, I'll omit exception specifications. They're not germane to the points I want to make.) Overloaded versions of operator new are limited to adding additional parameters, so an overloaded version of operator new might look like:

void * operator new(size_t, void *whereToPutObject)
{ return whereToPutObject; }

This particular version of operator new (the one taking an extra void* argument specifying what pointer the function should return) is so commonly useful that it's in the Standard C++ library (declared in the header <new>) and it has a name, "placement new." The name indicates its purpose: to allow programmers to specify where in memory an object should be created (where it should be placed).
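As a brief illustration of the library version in use, here is a sketch that builds an object inside a local buffer. The struct Point and the helper buildPointInBuffer are hypothetical names; only the placement form of new from <new> comes from the Standard library.

```cpp
#include <cassert>
#include <new>

// Hypothetical type used only to have something to construct.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Hypothetical helper: constructs a Point in caller-supplied storage.
void buildPointInBuffer() {
    // Raw storage that is big enough and suitably aligned for a Point.
    alignas(Point) unsigned char buffer[sizeof(Point)];

    // Placement new: no memory is allocated; the object is constructed
    // at the address we pass in, and that same address is returned.
    Point* p = new (buffer) Point(3, 4);

    assert(static_cast<void*>(p) == static_cast<void*>(buffer));
    assert(p->x + p->y == 7);

    // Nothing was allocated, so nothing is deleted. We only run the
    // destructor by hand.
    p->~Point();
}
```

The returned pointer is the buffer's own address, which is all the one-line operator new shown above does.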

Over time, the term placement new has come to be applied to any version of operator new taking additional arguments. (This terminology is actually enshrined in the grammar for C++ in the forthcoming International Standard.) Hence, when C++ programmers talk about the placement new function, they mean the function above, the one taking the extra void* parameter specifying where an object should be placed. When they talk about a placement new function, however, they mean any version of operator new taking more than the mandatory size_t argument. That includes the function above, but it also includes a plethora of operator new functions that take more or different parameter types.

In other words, when the topic is memory allocation functions, "placement new" means "a version of operator new taking extra arguments." The term can mean still other things in other contexts, but we don't need to go down that road here, so we won't. For details, consult the suggested reading at the end of the article.

By analogy with placement new, the term "placement delete" means "a version of operator delete taking extra arguments." The "normal" signature for operator delete is:

void operator delete(void*);

So any version of operator delete taking one or more arguments beyond the mandatory void* parameter is a placement delete function.

Let us now revisit an issue discussed in the main article. What happens when an exception is thrown during construction of a heap object? Consider again this simple example:

class ABCD { ... };
ABCD *p = new ABCD;

Suppose the attempt to create the ABCD object yields an exception. The main article pointed out that if that exception came from the ABCD constructor, operator delete would automatically be called to deallocate the memory allocated by operator new. But what if operator new is overloaded, and what if different versions of operator new (quite reasonably) allocate memory in different ways? How can operator delete know how to correctly deallocate the memory? Furthermore, what if the ABCD object being created used placement new, as in:

void *objectBuffer = getPointerToStaticBuffer();
ABCD *p = new (objectBuffer) ABCD; // create an ABCD object in a static buffer

(The) placement new didn't actually allocate any memory. It just returned the pointer to the static buffer that was passed to it in the first place. So there's no need for any deallocation.

Clearly, the actions to be taken in operator delete to undo the actions of its corresponding operator new depend on the version of operator new that was invoked to allocate the memory.

To make it possible for programmers to indicate how the actions of particular versions of operator new can be undone, the C++ Standards committee extended C++ to allow operator delete to be overloaded, too. When an exception is thrown from the constructor for a heap object, then, the game is played a special way. The version of operator delete that's called is the one whose extra parameter types correspond to those of the version of operator new that was invoked.

If there's no placement delete whose extra parameters correspond to the extra parameters of the placement new that was called, no operator delete is invoked at all. So the effects of operator new are not undone. For functions like (the) placement version of operator new, this is fine, because they don't really allocate memory. In general, however, if you create a custom placement version of operator new, you should also create the corresponding custom placement version of operator delete to go with it.
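A matched pair might look like the following sketch. Everything here is hypothetical and exists only for illustration: the Arena type, its fixed 256-byte storage, the rollbacks tally, the Thrower class, and the helper constructInArenaAndFail. The allocator is deliberately naive and ignores per-object alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical bump allocator with a small fixed buffer.
struct Arena {
    alignas(std::max_align_t) unsigned char storage[256];
    std::size_t used = 0;    // bytes handed out so far
    int rollbacks = 0;       // times the placement delete below ran
};

// Custom placement new: carves memory out of the arena.
// (Naive: no attempt is made to align individual objects.)
void* operator new(std::size_t size, Arena& arena) {
    if (arena.used + size > sizeof(arena.storage)) throw std::bad_alloc();
    void* p = arena.storage + arena.used;
    arena.used += size;
    return p;
}

// Matching placement delete: same extra parameter type (Arena&).
// Called automatically only when a constructor invoked through the
// placement new above throws.
void operator delete(void* p, Arena& arena) noexcept {
    arena.used = static_cast<std::size_t>(
        static_cast<unsigned char*>(p) - arena.storage);
    ++arena.rollbacks;
}

// Hypothetical class whose constructor always fails.
struct Thrower {
    Thrower() { throw std::runtime_error("constructor failed"); }
};

// Hypothetical helper: one failing construction inside the arena.
void constructInArenaAndFail(Arena& arena) {
    try {
        Thrower* t = new (arena) Thrower;   // allocation succeeds, ctor throws
        (void)t;                            // never reached
    } catch (const std::runtime_error&) {
        // operator delete(void*, Arena&) has already undone the allocation
    }
    assert(arena.rollbacks == 1);
    assert(arena.used == 0);
}
```

Removing the placement delete from this sketch would leave the arena's used count stuck at the failed object's size, which is the leak described above.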

Alas, most compilers don't yet support placement delete. With code generated from such compilers, you almost always suffer a memory leak if an exception is thrown during construction of a heap object, because no attempt is made to deallocate the memory that was allocated before the constructor was invoked.