Effective Objective-C 2.0: Item 30: Use ARC to Make Reference Counting Easier

Source: Internet. Published by: 田众和时代网络. Editor: 程序博客网. Time: 2024/05/22 08:11

Item 30: Use ARC to Make Reference Counting Easier

Reference counting is a fairly easy concept to understand (see Item 29). The semantics of where retains and releases need to appear are easily expressed. So with the Clang compiler project came a static analyzer that is able to indicate the location of problems with the reference counting. For example, consider the following snippet written with manual reference counting:

if ([self shouldLogMessage]) {
    NSString *message = [[NSString alloc] initWithFormat:
                         @"I am object, %p", self];
    NSLog(@"message = %@", message);
}

This code has a memory leak because the message object is not released at the end of the if statement. Since it cannot be referenced outside the if statement, the object is leaked. The rules governing why this is a leak are straightforward. The call to NSString’s alloc method returns an object with a +1 retain count. But there is no balancing release. These rules are easy to express, and a computer could easily apply these rules and tell us that the object has been leaked. That’s exactly what the static analyzer does.

The static analyzer was taken one step further. Since it is able to tell you where there are memory-management problems, it should easily be able to go ahead and fix them by adding in the required retain or release, right? That is the idea from which Automatic Reference Counting (ARC) was born. ARC does exactly what it says in the name: makes reference counting automatic. So in the preceding code snippet, the message object would automatically have a release added in just before the end of the if statement scope, automatically turning the code into the following:

if ([self shouldLogMessage]) {
    NSString *message = [[NSString alloc] initWithFormat:
                         @"I am object, %p", self];
    NSLog(@"message = %@", message);
    [message release]; ///< Added by ARC
}

The important thing to remember with ARC is that reference counting is still being performed. But ARC adds in the retains and releases for you. ARC does more than apply memory-management semantics to methods that return objects, as you will see. But it is these core semantics, which have become standard throughout Objective-C, on which ARC is built.
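The scope-exit release can be observed from a small program. The following is a minimal sketch, assuming it is compiled with ARC enabled (for example with clang's -fobjc-arc flag); EOCTracer, gDeallocCount, and EOCDeallocCountAfterScope are hypothetical names introduced only for this illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical counter: how many EOCTracer instances have been deallocated.
static int gDeallocCount = 0;

// Hypothetical class whose only job is to record its own deallocation.
@interface EOCTracer : NSObject
@end

@implementation EOCTracer
- (void)dealloc {
    gDeallocCount++; // Runs once the retain count has dropped to zero
}
@end

// Returns the number of tracers deallocated by the time the if scope has ended.
static int EOCDeallocCountAfterScope(BOOL shouldLogMessage) {
    gDeallocCount = 0;
    if (shouldLogMessage) {
        // +1 retain count from alloc, owned by this scope
        EOCTracer *tracer = [[EOCTracer alloc] init];
        NSLog(@"tracer = %@", tracer);
    } // ARC has balanced the alloc with a release by this point
    return gDeallocCount;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"deallocated after scope: %d", EOCDeallocCountAfterScope(YES));
    }
    return 0;
}
```

No release appears anywhere in the source, yet the tracer's dealloc has run by the time the if statement is left, which is the behavior the manual version needed an explicit release for.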

Because ARC adds retains, releases, and autoreleases for you, calling memory-management methods directly under ARC is illegal. Specifically, you cannot call the following methods:

retain

release

autorelease

dealloc

Calling any of these methods directly will result in a compiler error because doing so would interfere with ARC’s being able to work out what memory-management calls are required. You have to put your trust in ARC to do the right thing, which can be daunting for developers used to manual reference counting.

In fact, ARC does not call these methods through the normal Objective-C message dispatch but instead calls lower-level C variants. This is optimal, since retains and releases are performed frequently, and so saving CPU cycles here is a big win. For example, the equivalent for `retain` is `objc_retain`. This is also why it is illegal to override `retain`, `release`, or `autorelease`, as these methods are never called directly. For the rest of this item, I will usually talk about the equivalent Objective-C method rather than the lower-level C variants. This should help if your background is with manual reference counting.

Method-Naming Rules Applied by ARC

The memory-management semantics dictated through method names have long been convention in Objective-C, but ARC has cemented them as hard rules. The rules are simple and relate to the method name. A method returning an object returns it owned by the caller if its method name begins with one of the following:

alloc

new

copy

mutableCopy

“Owned by the caller” means that the code calling any of the four methods listed is responsible for releasing the returned object. That is to say, the object will have a positive retain count, where exactly 1 needs to be balanced by the calling code. The retain count may be greater than 1 if the object has been retained additionally and autoreleased, which is one reason why the retainCount method is not useful (see Item 36).

Any other method name indicates that any returned object will be returned not owned by the calling code. In these cases, the object will be returned autoreleased, so that the value is alive across the method call boundary. If it wants to ensure that the object stays alive longer, the calling code must retain it.

ARC automatically handles all memory management required to maintain these rules, including the code for returning objects autoreleased, as illustrated in the following code:

+ (EOCPerson*)newPerson {
    EOCPerson *person = [[EOCPerson alloc] init];
    return person;
    /**
     * The method name begins with 'new', and since 'person'
     * already has an unbalanced +1 retain count from the
     * 'alloc', no retains, releases, or autoreleases are
     * required when returning.
     */
}

+ (EOCPerson*)somePerson {
    EOCPerson *person = [[EOCPerson alloc] init];
    return person;
    /**
     * The method name does not begin with one of the "owning"
     * prefixes, therefore ARC will add an autorelease when
     * returning 'person'.
     * The equivalent manual reference counting statement is:
     *   return [person autorelease];
     */
}

- (void)doSomething {
    EOCPerson *personOne = [EOCPerson newPerson];
    // ...

    EOCPerson *personTwo = [EOCPerson somePerson];
    // ...

    /**
     * At this point, 'personOne' and 'personTwo' go out of
     * scope, therefore ARC needs to clean them up as required.
     * - 'personOne' was returned as owned by this block of
     *   code, so it needs to be released.
     * - 'personTwo' was returned not owned by this block of
     *   code, so it does not need to be released.
     * The equivalent manual reference counting cleanup code
     * is:
     *    [personOne release];
     */
}

ARC standardizes the memory-management rules through naming conventions, something that newcomers to the language often see as unusual. Very few other languages put as much emphasis on naming as Objective-C does. Becoming comfortable with this concept is crucial to being a good Objective-C developer. ARC helps with the process because it does a lot of the work for you.

In addition to adding in retains and releases, ARC has other benefits. It is also able to perform optimizations that would be difficult or impossible to do by hand. For example, at compile time, ARC can collapse retains, releases, and autoreleases to cancel them out, if possible. If it sees that the same object is being retained multiple times and released multiple times, ARC can sometimes remove pairs of retains and releases.

ARC also includes a runtime component. The optimizations that occur here are even more interesting and should help prove why all future code should be written under ARC. Recall that some objects are returned from methods autoreleased. Sometimes, the calling code needs to retain the object straightaway, as in this scenario:

// From a class where _myPerson is a strong instance variable
_myPerson = [EOCPerson personWithName:@"Bob Smith"];

The call to personWithName: returns a new EOCPerson autoreleased. But the compiler also needs to add a retain when setting the instance variable, since it holds a strong reference. Therefore, the preceding code is equivalent to the following in a world of manual reference counting:

EOCPerson *tmp = [EOCPerson personWithName:@"Bob Smith"];
_myPerson = [tmp retain];

You would be correct to note here that the autorelease from the personWithName: method and the retain are extraneous. It would be beneficial for performance to remove both. But code compiled under ARC needs to be compatible with non-ARC code, for backward compatibility. ARC could have removed the concept of autorelease and dictated that all objects returned from methods be returned with a +1 retain count. However, that would break backward compatibility.

But ARC does in fact contain runtime behavior to detect the situation of extraneous autorelease plus immediate retain. It does this through a special function that is run when an object is returned autoreleased. Instead of a plain call to the object’s autorelease method, it calls objc_autoreleaseReturnValue. This function inspects the code that is going to be run immediately after returning from the current method. If it is detected that this is going to be a retain of the returned object, a flag is set within a global data structure (processor dependent) instead of performing the autorelease. Similarly, the calling code that retains an autoreleased object returned from a method uses a function called objc_retainAutoreleasedReturnValue instead of calling retain directly. This function checks the flag and, if set, doesn’t perform retain. This extra work to set and check flags is faster than performing autorelease and retain.

The following code illustrates this optimization by showing how ARC uses these special functions:

// Within EOCPerson class
+ (EOCPerson*)personWithName:(NSString*)name {
    EOCPerson *person = [[EOCPerson alloc] init];
    person.name = name;
    return objc_autoreleaseReturnValue(person);
}

// Code using EOCPerson class
EOCPerson *tmp = [EOCPerson personWithName:@"Matt Galloway"];
_myPerson = objc_retainAutoreleasedReturnValue(tmp);

These special functions have processor-specific implementations to make use of the most optimal solution. The following pseudocode implementations explain what happens:

id objc_autoreleaseReturnValue(id object) {
    if ( /* caller will retain object */ ) {
        set_flag(object);
        return object; ///< No autorelease
    } else {
        return [object autorelease];
    }
}

id objc_retainAutoreleasedReturnValue(id object) {
    if (get_flag(object)) {
        clear_flag(object);
        return object; ///< No retain
    } else {
        return [object retain];
    }
}

The way in which objc_autoreleaseReturnValue detects whether the calling code is going to immediately retain the object is processor specific. Only the author of the compiler can implement this, since it uses inspection of the raw machine-code instructions. The author of the compiler is the only person who can ensure that the code in the calling method is arranged in such a way that detection like this is possible.

This is just one such optimization that is made possible by putting memory management in the hands of the compiler and the runtime. It should help to illustrate why using ARC is such a good idea. As the compiler and runtime mature, I’m sure that other optimizations will be making an appearance.

Memory-Management Semantics of Variables

ARC also handles memory management of local variables and instance variables. By default, every variable is said to hold a strong reference to the object. This is important to understand, particularly with instance variables, since for certain code, the semantics can be different from manual reference counting. For example, consider the following code:

@interface EOCClass : NSObject {
    id _object;
}
@end

@implementation EOCClass
- (void)setup {
    _object = [EOCOtherClass new];
}
@end

The _object instance variable does not automatically retain its value under manual reference counting but does under ARC. Therefore, when the setup method is compiled under ARC, the method transforms into this:

- (void)setup {
    id tmp = [EOCOtherClass new];
    _object = [tmp retain];
    [tmp release];
}

Of course, in this situation, retain and release can be cancelled out. So ARC does this, leaving the same code as before. But this comes in handy when writing a setter. Before ARC, you may have written a setter like this:

- (void)setObject:(id)object {
    [_object release];
    _object = [object retain];
}

But this reveals a problem. What if the new value being set is the same as the one already held by the instance variable? If this object was the only thing holding a reference to it, the release in the setter would cause the retain count to drop to 0, and the object would be deallocated. The subsequent retain would cause the application to crash. ARC makes this sort of mistake impossible. The equivalent setter under ARC is this:

- (void)setObject:(id)object {
    _object = object;
}

ARC performs a safe setting of the instance variable by retaining the new value, then releasing the old one before finally setting the instance variable. You may have understood this under manual reference counting and written your setters correctly, but with ARC, you don’t have to worry about such edge cases.
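For comparison, a correctly written manual setter follows the same retain-then-release order. The following is a minimal sketch that must be compiled without ARC (for example with clang's -fno-objc-arc flag); EOCHolder and EOCResetToSameValueSurvives are hypothetical names used only for this illustration:

```objc
// Manual reference counting: compile this file WITHOUT ARC (-fno-objc-arc).
#import <Foundation/Foundation.h>

// Hypothetical class holding one owned object in an instance variable.
@interface EOCHolder : NSObject {
    id _object;
}
- (void)setObject:(id)object;
- (id)object;
@end

@implementation EOCHolder
- (void)setObject:(id)object {
    [object retain];   // Retain the new value first...
    [_object release]; // ...so releasing the old one is safe even if it is the same object
    _object = object;
}

- (id)object {
    return _object;
}

- (void)dealloc {
    [_object release];
    [super dealloc];
}
@end

// Sets the same object twice while the holder is its only owner.
static BOOL EOCResetToSameValueSurvives(void) {
    EOCHolder *holder = [[EOCHolder alloc] init];
    NSMutableString *value = [[NSMutableString alloc] initWithString:@"hello"];
    [holder setObject:value];
    [value release];                    // The holder is now the only owner
    [holder setObject:[holder object]]; // Same object again: must not be deallocated
    BOOL intact = [[holder object] isEqualToString:@"hello"];
    [holder release];
    return intact;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"value intact: %d", EOCResetToSameValueSurvives());
    }
    return 0;
}
```

Swapping the first two lines of the setter back to release-then-retain reproduces the bug described above, which is the ordering detail ARC takes care of on your behalf.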

The semantics of local and instance variables can be altered through the application of the following qualifiers:

__strong The default; the value is retained.

__unsafe_unretained The value is not retained and is potentially unsafe, as the object may have been deallocated already by the time the variable is used again.

__weak The value is not retained but is safe because it is automatically set to nil if the current object is ever deallocated.

__autoreleasing This special qualifier is used when an object is passed by reference to a method. The value is autoreleased on return.

For example, to make an instance variable behave the same as it does without ARC, you would apply the __weak or __unsafe_unretained qualifier:

@interface EOCClass : NSObject {
    id __weak _weakObject;
    id __unsafe_unretained _unsafeUnretainedObject;
}
@end

In either case, when setting the instance variable, the object will not be retained. Automatic nilling of weak references with the __weak qualifier requires runtime support that first shipped in Mac OS X 10.7 and iOS 5.0, so it is unavailable on earlier versions.
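The zeroing behavior of __weak can be checked directly. The following is a minimal sketch, assuming compilation with ARC enabled on a runtime that supports weak references; EOCWeakIsZeroedAfterRelease is a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Returns YES if the __weak variable reads as nil once the only strong
// reference to the object has gone out of scope.
static BOOL EOCWeakIsZeroedAfterRelease(void) {
    NSObject * __weak weakObject = nil;
    @autoreleasepool {
        NSObject *strongObject = [[NSObject alloc] init]; // __strong by default
        weakObject = strongObject; // Stored without retaining the object
    } // The strong reference is gone here, so the object is deallocated
    // An __unsafe_unretained variable would still hold the stale address at
    // this point; the __weak variable has been set to nil by the runtime.
    return weakObject == nil;
}

int main(void) {
    @autoreleasepool {
        NSLog(@"weak reference zeroed: %d", EOCWeakIsZeroedAfterRelease());
    }
    return 0;
}
```

The check is made only after the scope ends because ARC is allowed to release a strong local as early as its last use, so the state of the weak variable inside the scope is not something to rely on.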

When applied to local variables, the qualifiers are often used to break retain cycles that can be introduced with blocks (see Item 40). A block automatically retains all objects it captures, which can sometimes lead to a retain cycle if an object retaining a block is retained by the block. A `__weak` local variable can be used to break the retain cycle:

NSURL *url = [NSURL URLWithString:@"http://www.example.com/"];
EOCNetworkFetcher *fetcher =
    [[EOCNetworkFetcher alloc] initWithURL:url];
EOCNetworkFetcher * __weak weakFetcher = fetcher;
[fetcher startWithCompletion:^(BOOL success){
    NSLog(@"Finished fetching from %@", weakFetcher.url);
}];
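The effect of the weak capture can be made visible by counting deallocations. The following is a minimal sketch, assuming compilation with ARC enabled; EOCTask, gTaskDeallocs, and EOCRunTask are hypothetical stand-ins for a class that, like the fetcher, keeps hold of its completion block:

```objc
#import <Foundation/Foundation.h>

// Hypothetical counter of deallocated EOCTask instances.
static int gTaskDeallocs = 0;

// Hypothetical class that owns (copies) its completion block.
@interface EOCTask : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, copy) void (^completion)(void);
@end

@implementation EOCTask
- (void)dealloc {
    gTaskDeallocs++;
}
@end

// Returns how many tasks had been deallocated once the local variable
// holding the task went out of scope.
static int EOCRunTask(BOOL captureWeakly) {
    gTaskDeallocs = 0;
    @autoreleasepool {
        EOCTask *task = [[EOCTask alloc] init];
        task.name = @"demo";
        if (captureWeakly) {
            // The block holds only a weak reference, so no cycle is formed.
            EOCTask * __weak weakTask = task;
            task.completion = ^{
                NSLog(@"finished %@", weakTask.name);
            };
        } else {
            // Deliberate retain cycle: task -> block -> task.
            // The compiler may warn about this capture.
            task.completion = ^{
                NSLog(@"finished %@", task.name);
            };
        }
        task.completion();
    }
    return gTaskDeallocs;
}

int main(void) {
    NSLog(@"weak capture, deallocated: %d", EOCRunTask(YES));
    NSLog(@"strong capture, deallocated: %d", EOCRunTask(NO));
    return 0;
}
```

With the weak capture the task is deallocated as soon as the local reference goes away; with the strong capture the task and its block keep each other alive, so the sketch leaks that one instance on purpose.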

ARC Handling of Instance Variables

As explained, ARC also handles the memory management of instance variables. Doing so requires ARC to automatically generate the required cleanup code during deallocation. Any variables holding a strong reference need releasing, which ARC does by hooking into the dealloc method. With manual reference counting, you would have found yourself writing dealloc methods that look like this:

- (void)dealloc {
    [_foo release];
    [_bar release];
    [super dealloc];
}

With ARC, this sort of dealloc method is not required; the generated cleanup routine will perform these two releases for you by stealing a feature from Objective-C++. An Objective-C++ object has to call the destructors for all C++ objects held by the object during deallocation. When the compiler sees that an object contains C++ objects, it generates a method called .cxx_destruct. ARC piggybacks on this method and emits the required cleanup code within it.

However, you still need to clean up any non-Objective-C resources you hold, such as CoreFoundation objects or heap memory allocated with malloc(). But you do not need to call the superclass implementation of dealloc as you did before. Recall that calling dealloc under ARC explicitly is illegal. So ARC, along with generating and running the .cxx_destruct method for you, also automatically calls the superclass’s dealloc method. Under ARC, a dealloc method may end up looking like this:

- (void)dealloc {
    CFRelease(_coreFoundationObject);
    free(_heapAllocatedMemoryBlob);
}

The fact that ARC generates deallocation code means that usually, a dealloc method is not required. This often considerably reduces the size of a project’s source code and helps to reduce boilerplate code.
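Placed in a complete class, the pattern might look like the following. This is a minimal sketch, assuming compilation with ARC enabled; EOCResourceHolder, its instance variables, gHolderDeallocs, and EOCHolderDeallocCount are hypothetical names chosen for illustration:

```objc
#import <Foundation/Foundation.h>
#include <stdlib.h>

// Hypothetical counter so the cleanup can be observed from outside.
static int gHolderDeallocs = 0;

@interface EOCResourceHolder : NSObject
@end

@implementation EOCResourceHolder {
    NSString *_name;                         // Objective-C object: released by ARC
    CFMutableArrayRef _coreFoundationObject; // CoreFoundation object: not handled by ARC
    void *_heapAllocatedMemoryBlob;          // malloc'd memory: not handled by ARC
}

- (instancetype)init {
    self = [super init];
    if (self) {
        _name = @"holder";
        _coreFoundationObject =
            CFArrayCreateMutable(kCFAllocatorDefault, 0, &kCFTypeArrayCallBacks);
        _heapAllocatedMemoryBlob = malloc(64);
    }
    return self;
}

- (void)dealloc {
    // CFRelease must not be passed NULL, so guard in case creation failed.
    if (_coreFoundationObject != NULL) {
        CFRelease(_coreFoundationObject);
    }
    free(_heapAllocatedMemoryBlob); // free(NULL) is a harmless no-op
    gHolderDeallocs++;
    // No release of _name and no [super dealloc]: ARC supplies both.
}
@end

// Creates one holder, lets it go out of scope, and reports the dealloc count.
static int EOCHolderDeallocCount(void) {
    gHolderDeallocs = 0;
    @autoreleasepool {
        EOCResourceHolder *holder = [[EOCResourceHolder alloc] init];
        NSLog(@"created %@", holder);
    }
    return gHolderDeallocs;
}

int main(void) {
    NSLog(@"holders deallocated: %d", EOCHolderDeallocCount());
    return 0;
}
```

The NULL check before CFRelease is a defensive choice in this sketch; if the CoreFoundation object is guaranteed to exist, the shorter form shown above is sufficient.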

Overriding the Memory-Management Methods

Before ARC, it was possible to override the memory-management methods. For example, a singleton implementation often overrode release to be a no-op, as a singleton cannot be released. This is now illegal under ARC because doing so could interfere with ARC’s understanding of an object’s lifetime. Also, because the methods are illegal to call and override, ARC makes the optimization of not going through an Objective-C message dispatch (see Item 11) when it needs to perform a retain, release, or autorelease. Instead, the optimization is implemented with C functions deep in the runtime. This means that ARC is able to do optimizations such as the one described earlier when returning an autoreleased object that is immediately retained.

Things to Remember

Automatic Reference Counting (ARC) frees the developer from having to worry about most memory management. Using ARC reduces boilerplate code from classes.

ARC handles the object life cycle almost entirely by adding in retains and releases as it sees appropriate. Variable qualifiers can be used to indicate memory-management semantics; previously, retains and releases were manually arranged.

Method names have always been used to indicate memory-management semantics of returned objects. ARC has solidified these and made it impossible not to follow them.

ARC handles only Objective-C objects. In particular, this means that CoreFoundation objects are not handled, and the appropriate CFRetain/CFRelease calls must be applied.