Monday, December 26, 2011

Migrating your code to Objective-C ARC

Apple recently introduced a batch of new developer technology, including Xcode 4, ARC, the LLVM 3.0 compiler and iOS 5. Judging from the questions on Stack Overflow, most ARC-related confusion arises because developers don’t know whether “ABC” is a feature or restriction of LLVM 3.0, of iOS 5 or of ARC.

Retain cycles, autorelease pools, @autoreleasepool blocks, oh man! So many new things? What am I going to do? You are right. ARC, or Objective-C Automatic Reference Counting, is almost as magical as the iPad. No, really!

In this post, I’ve attempted to demystify all of this. Before starting, I have to warn you that this is a fairly long post. If you get bored, Instapaper this article and read it later. Hopefully, by the end of it, you will have a better understanding of how ARC works and be able to work around the innumerable errors it spits out when you convert your project.

Having said that, let’s get started.

What is ARC


ARC is a feature of the new LLVM 3.0 compiler that lets you write code without worrying much about memory management. Memory management models can be broadly classified into two kinds: garbage collected and reference counted. Before going into the details, let’s briefly discuss these two models and understand why ARC is even needed.

Problems with the current model.


The current memory model we use in Objective-C is manual reference counting on iOS and, optionally, garbage collection on the Mac.

There are certain problems with both of these memory models, which is probably why ARC was developed.

Garbage collection

Garbage collection is a higher-level language feature that dates back to Lisp, was popularized by Java (or, technically, the Java Virtual Machine) and has been implemented on a variety of other platforms, including Microsoft’s Common Language Runtime. While garbage collection works well for higher-level languages, it never really took off in Objective-C, which is still C under the hood. References in languages like Java are managed entirely by the runtime, which tracks which objects are still reachable and reclaims the rest automatically. One of the design goals of C, on the other hand, was to be optimized for performance and not for “ease of use”.

While managed references (read: smart pointers) are great object-oriented abstractions, they have an adverse effect on performance. Objective-C is intended primarily for native programming, where developers are used to raw pointers, pointers within a structure and pointers to pointers (for writing to an out parameter). It was just too difficult to introduce something like a smart pointer, which would require a big change of mindset from developers who prefer a deterministic memory management model (reference counting) over a non-deterministic one (garbage collection).

Nevertheless, a generational garbage collector was introduced with Objective-C 2.0 on the Mac. While a generational collector suffers less from “stop the world” pauses than a plain mark-and-sweep algorithm, it doesn’t collect every unreferenced object on each pass, and an occasional full mark-and-sweep collection is still needed.

Reference Counting

The memory management model used on iOS is called reference counting or, more precisely, manual reference counting.

In the manual reference counting model, you, the developer, have to release every object you own. When you don’t do this correctly, you either leak memory or over-release an object, causing a crash. While the counting sounds easy, most memory leaks happen when you transfer ownership of objects across scope boundaries, as with a creator method that allocates an object for you and expects the caller to release it. To circumvent this problem, Objective-C introduced a concept called autorelease. Autoreleased objects are added to the autorelease pool and are released when the pool is drained, typically at the end of the current run loop iteration. While this sounds too good to be true, autorelease pools do incur additional overhead.

For example, compare the following two code blocks,


{
NSDictionary *dict = [[NSDictionary alloc] init];
// do something with the dictionary here
// I'm done with the dictionary, I don't need it anymore
[dict release];
// more code
}


and


{
NSDictionary *dict = [NSDictionary dictionary]; // auto-released object
// do something with the dictionary here
// I'm done with the dictionary, I don't need it anymore

// more code
}


The first block is an example of optimized memory use, whereas the second depends on autorelease pools. While this particular block doesn’t incur significant memory overhead, code like this slowly adds up and makes your reference-counted code heavily dependent on autorelease pools. That is, objects that you know could be deallocated will still linger in the autorelease pool for a little longer.

Automatic Reference Counting

Say hello to ARC. ARC is a compiler feature that automatically inserts retain and release calls for you. So in the first code block of the above example, you no longer have to write the release call; ARC inserts it for you at compile time.


{
NSDictionary *dict = [[NSDictionary alloc] init];
// do something with the dictionary here
// I'm done with the dictionary, I don't need it anymore
[dict release]; // ARC inserted
// more code
}


When you create an autoreleased object, as in the second block of code, the ARC compiler is clever enough not to add a release call. Sounds great, so how should I go about doing this ARC thing? Just delete all the retain/release code and pray? Unfortunately, it isn’t that easy. ARC is not just an auto-insert or macro-expander kind of tool. It forces you to think in terms of object graphs instead of memory allocation, retain or release. Let’s delve a little deeper into ARC.

Compiler level feature


ARC is a compiler-level feature. I repeat. ARC IS A COMPILER LEVEL FEATURE. This means that when you use ARC, you don’t have to worry about raising your deployment target and so on. However, only the latest LLVM 3.0 compiler supports ARC. If you are still stuck with GCC, you are out of luck. (Meh!) Some more points that you should know about ARC:

  • ARC is backward compatible with libraries and frameworks compiled without ARC.
  • ARC can be used within your project on a file-by-file basis, so you can mix and match ARC code with non-ARC code.
  • You can also integrate ARC-compiled libraries into a project that doesn’t use ARC, and vice versa.
  • You use a compiler switch to turn ARC on and off. (The keyword here is compiler switch.)
  • You can also set the complete target to build with ARC by default (and use the non-ARC compiler only when instructed to). This is shown in the illustration below.

[Screenshot: the ARC build setting in Xcode]

The two compiler switches that you will use most often, for example when you build an ARC application against a third-party library that is not ARC compliant or vice versa, are

  • -fno-objc-arc
  • -fobjc-arc

-f is the switch, and no-objc-arc and objc-arc are the options that you are setting. As is evident from the names, the first turns ARC off and the second turns it on.

For example, if your application is ARC enabled but a third-party library is not, you use the first switch, -fno-objc-arc, to exclude the third-party library’s files. Conversely, if your application is not yet ARC enabled (gasp!) but the third-party library you are integrating is, you use the second switch, -fobjc-arc. You add these flags to the individual source files from the Build Phases tab, as shown below.

[Screenshot: per-file compiler flags in the Build Phases tab]

Also a run time feature


Wait! You just told me (and repeated) that ARC is a compiler-level feature? Now what? Sorry, I hear you, but unfortunately things aren’t that easy and it doesn’t just stop here. ARC also relies on a runtime feature called zeroing weak references. Oh, damn, another keyword! I should have introduced this before. But that’s ok. We will revisit the runtime dependency of ARC a little later in this post.

ARC Ownership qualifiers


As I showed you earlier, ARC automatically inserts releases and retains in your code at compile time. But for ARC to know when to release your objects and when to retain them, you need to somehow tell it the lifetime of your variables. You use ownership qualifiers for that. A strong understanding of ownership is vital to understanding and using ARC properly. Once you understand this concept, you will be thinking in terms of object graphs instead of retain/release. Secondly, when you use ARC, all object pointers, whether local variables or ivars, are initialized to nil automatically for you. This means there is little chance of having a dangling reference in your application. The four ownership qualifiers are:

  • __strong
  • __weak
  • __unsafe_unretained
  • __autoreleasing

The first qualifier, __strong, is the default, and you might never even use it explicitly. It is used to tell the ARC compiler that the declared variable “owns” the reference. The opposite of this is __weak, which tells the ARC compiler that the declared variable doesn’t own the reference.

__weak is the ARC counterpart of the “assign” modifier. You normally use the assign modifier for IBOutlets and delegates. Under ARC, this is replaced with __weak. However, there is a caveat. __weak requires you to deploy the app on a runtime that supports zeroing weak references. This includes iOS 5 and Lion. Snow Leopard and older, or iOS 4 and older, don’t support zeroing weak references. This obviously means you cannot use the __weak ownership qualifier if you plan to deploy to older operating systems. Fret not. ARC includes another ownership qualifier, __unsafe_unretained, that is similar to __weak, except that when the object it points to is deallocated, the pointer is not set to nil but remains dangling. Remember that I said something about zeroing weak references a while ago? When the runtime supports zeroing weak references, your __weak variables are automatically set to nil when the object they point to is deallocated. This is the only feature that requires a higher deployment target (iOS 5/Lion). Otherwise, you are good to deploy on iOS 4/Snow Leopard.

A couple of other important things to know about __weak vs. __unsafe_unretained: the compiler doesn’t allow you to use __weak when your deployment target is set to an operating system that doesn’t support zeroing weak references, and the Convert to Objective-C ARC wizard uses __weak only when your deployment target supports them. So if your deployment target is iOS 4, the conversion wizard will replace assign modifiers with __unsafe_unretained instead of __weak.
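To make the zeroing behavior concrete, here is a minimal sketch, assuming a file compiled with -fobjc-arc for a runtime that supports zeroing weak references. The function name is made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, for illustration only.
// Returns YES if the __weak variable was set to nil once the last
// strong reference to the object went away.
static BOOL WeakReferenceIsZeroedAfterRelease(void) {
    __weak NSObject *weakRef = nil;
    @autoreleasepool {
        NSObject *strongRef = [[NSObject alloc] init]; // __strong is the default
        weakRef = strongRef;                           // does not retain the object
        // strongRef goes out of scope here, so ARC releases the object
    }
    // With __unsafe_unretained instead of __weak, weakRef would still hold
    // the old address at this point and must not be dereferenced.
    return weakRef == nil;
}
```

Calling WeakReferenceIsZeroedAfterRelease() should return YES on iOS 5 or Lion. The same variable declared __unsafe_unretained would be left dangling instead.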

The last ownership qualifier, __autoreleasing, is used mostly when passing a reference to a function for writing out. You would use this in places where you normally use pointer indirection, like returning an NSError object via an out parameter.
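As a rough sketch of the out-parameter case, the hypothetical function below reports a failure through an NSError pointer. The function name, the error domain and the error code are all invented for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parses a positive integer and reports failure
// through an out parameter. Compile with -fobjc-arc.
static BOOL ParsePositiveInteger(NSString *text,
                                 NSInteger *outValue,
                                 NSError * __autoreleasing *outError) {
    NSInteger value = [text integerValue]; // 0 when text is not numeric
    if (value <= 0) {
        if (outError) {
            // The error is written out as an autoreleased object
            *outError = [NSError errorWithDomain:@"com.example.demo" // made-up domain
                                            code:1
                                        userInfo:nil];
        }
        return NO;
    }
    if (outValue) {
        *outValue = value;
    }
    return YES;
}
```

The caller declares a plain NSError *error = nil; and passes &error. ARC takes care of the temporary needed to bridge the caller’s strong variable and the __autoreleasing parameter.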

Properties in your header file can also have the above ownership qualifiers, except __autoreleasing. When they are applied to properties, ARC automatically generates the correct code in dealloc to release them when the object dies.

Lastly, and more importantly, all ARC-managed object pointers are initialized to nil when they are created. So, again, no more dangling pointers because you forgot an initialization statement. However, do note that this doesn’t initialize primitive data types. So a declaration like,
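On properties, ownership is spelled with the strong, weak and unsafe_unretained attributes rather than the double-underscore keywords. A small sketch, using a made-up Downloader class and assuming the deployment target supports weak:

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only to illustrate property ownership.
@interface Downloader : NSObject
@property (nonatomic, strong) NSMutableData *buffer; // owned by the downloader
@property (nonatomic, weak) id delegate;             // not owned, zeroed automatically
@property (nonatomic, copy) NSString *name;          // owned copy
@end

@implementation Downloader
@synthesize buffer = _buffer;
@synthesize delegate = _delegate;
@synthesize name = _name;
// No dealloc is written here: ARC releases buffer and name
// when a Downloader instance dies.
@end
```

If you deploy to iOS 4 or Snow Leopard, the delegate property would use unsafe_unretained instead of weak.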


int a;


might contain a garbage value for a.

Whew! That’s pretty taxing. Take a break. We just started.

ARC knows more Objective-C than you


ARC also taps into the Objective-C naming conventions and infers the ownership of the returned object.

In Objective-C, a method whose name starts with any one of the following prefixes,

  • init,
  • alloc,
  • copy,
  • mutableCopy and
  • new

is considered to transfer ownership of the returned object to the caller.

This means that when you create a method in your application, ARC automatically infers from your method name whether to return an autoreleased object or a +1 retained object. In fact, in most cases, instead of returning autoreleased objects, ARC just inserts a release in the calling code for you. However, there is a small caveat. Let’s assume that you have a method that starts with “copy”, as in


-(NSString*) copyRightString;


ARC assumes that it transfers the ownership of the returned string to the caller and inserts a release automatically. Everything works well if both the called method and the calling method are compiled using ARC.

But if your “copyRightString” method is in a third-party library that isn’t compiled with ARC and it returns an autoreleased string, you will over-release the returned string. This is because, in the calling code, the ARC compiler inserts a release to balance out the retain that a “copy” method is expected to have performed. Conversely, if the third-party library is compiled with ARC and your calling code isn’t, you will have a memory leak unless you release the returned string yourself. You can, however, override this behavior by adding one of the following attributes to your methods.

  • NS_RETURNS_NON_RETAINED
  • NS_RETURNS_RETAINED

So your method will now look like this.


-(NSString*) copyRightString NS_RETURNS_NON_RETAINED;


You can also rename the method to copyrightString (note the case) or getCopyRightString instead of adding an attribute. However, I wouldn’t recommend the latter, as it breaks the Cocoa naming conventions (prefixing a method with “get” or “set” is Java-ish).

You will see methods carrying the NS_RETURNS_* macros throughout the header files in Apple’s own UIKit.framework or the Foundation classes. Now that you know what happens behind the scenes and how the compiler treats these decorations, you can solve crazy memory issues, like a crash when you call a copyRightString method that lives in a third-party library.

With that, let’s get ready for climbing the next peak.

Toll-free bridging


ARC doesn’t manage Core Foundation objects. They say there is no free lunch. ARC takes it one step further: there is no free casting between Core Foundation objects and the equivalent Objective-C objects (NS* objects). Yes, that’s right. You cannot cast a Core Foundation object to an equivalent Objective-C object (NS* object) without telling ARC how to manage ownership.

Let’s now see how to specify ownership transfers when you cast a Core Foundation object.

One of the following ownership transfer modifiers should be provided when you cast between an Objective-C object and a Core Foundation object.

  • __bridge
  • __bridge_retained
  • __bridge_transfer

When you migrate a project to ARC, you will have seen error messages like the one below.

[Screenshot: ARC error because of a missing bridge attribute in toll-free bridging code]

You might also have proceeded by accepting the suggestions provided by the LLVM compiler. But now, let’s dig deeper and understand the “why” behind it.

The modifier __bridge tells the ARC compiler that it’s a plain, simple bridging cast. That means you ask the ARC compiler to do nothing extra when the cast is made. You might think that, if that is the case, Apple could have made this the default choice. It probably wasn’t because it’s too preposterous to make such an assumption. Making such a bold assumption means you would easily leak memory, as there isn’t an easy way to tell when you are actually releasing a Core Foundation object, unlike an Objective-C object.

The second modifier, __bridge_retained, is used to tell the ARC compiler that the Objective-C object should be transferred to Core Foundation by bumping the retain count by 1, and that it should be treated as if it were a newly created object (as opposed to an autoreleased object). You use this modifier if you are returning the object from a method named like a creation method (starting with init, copy, mutableCopy, etc.) or if you are going to release the Objective-C object on the Core Foundation side using functions like CFRelease.

The last modifier, __bridge_transfer, is used to tell the ARC compiler that the Core Foundation object is to be transferred to ARC with a retain count of 1. This is used if you created a Core Foundation object using one of the CF***Create functions and want the ARC compiler to handle the memory management for you. That is, you are transferring a Core Foundation object to ARC with a retain count of 1.

As a side note, avoid using __bridge_retained and __bridge_transfer to trick the compiler into adding retains and releases for you. Use them to improve your code’s readability and to minimize the number of manual memory management calls. (Move on if you don’t understand this line. You will start understanding it automatically when you start using these casts in your own code.)
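The three casts can be sketched side by side as follows. The helper function names are hypothetical, while the Core Foundation calls are the standard CFString and CFUUID functions. Compile with -fobjc-arc.

```objc
#import <Foundation/Foundation.h>

// __bridge: a plain cast, no ownership changes hands.
// ARC keeps managing `text`; nothing needs a CFRelease.
static NSUInteger LengthViaCoreFoundation(NSString *text) {
    CFStringRef cfText = (__bridge CFStringRef)text;
    return (NSUInteger)CFStringGetLength(cfText);
}

// __bridge_transfer: a +1 Core Foundation object is handed over to ARC,
// which becomes responsible for releasing it.
static NSString *MakeUUIDString(void) {
    CFUUIDRef uuid = CFUUIDCreate(kCFAllocatorDefault);
    CFStringRef cfString = CFUUIDCreateString(kCFAllocatorDefault, uuid);
    CFRelease(uuid); // still managed manually, ARC never saw it
    return (__bridge_transfer NSString *)cfString;
}

// __bridge_retained: ARC hands a +1 reference to Core Foundation.
// The caller must balance it with CFRelease.
static CFStringRef CopyAsCoreFoundation(NSString *text) {
    return (__bridge_retained CFStringRef)text;
}
```

A caller of CopyAsCoreFoundation owns the returned CFStringRef and has to call CFRelease on it. The other two helpers need no manual release in the calling code.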

How does ARC work internally?


ARC ain’t magic, if you know how it works. But a little knowledge is a dangerous thing. Knowing how the ARC compiler works will help you more in understanding the error messages and compiler warnings spat out by it.

The ARC compiler has two main parts, a front end compiler and an optimizer.

ARC front end


The ARC front end compiler checks for every “owned” Objective-C object and inserts releases appropriately. By owned object, I mean an object whose ownership qualifier has been set. For example, if the “owned” object is a local variable, the ARC front end compiler inserts a release at the end of the scope. This is because, by default, all local variables are “strong”ly owned. If the object is an instance variable, the ARC front end compiler inserts a release in the dealloc method if the ownership type is strong. For unsafe_unretained or weak ownership, ARC doesn’t insert a release. It also takes care of calling [super dealloc] for you and, in fact, the ARC compiler doesn’t allow you to call dealloc explicitly.

The ARC front end compiler also takes care of generating errors when it encounters a variable (local or instance) whose ownership qualifier is not set or when you explicitly call dealloc.

ARC optimizer


The function of the ARC optimizer is to optimize the retain and release calls by removing redundant ones inserted by the ARC front end compiler. It is this optimizer that ensures that performance is not affected by calling retain and release multiple times.

The actual Migration using Xcode 4.2


Xcode 4.2 has a wizard to automatically migrate your code for use with the ARC compiler. This means, the wizard rewrites some of your code, removes calls to retain/release and removes dealloc methods and calls to [super dealloc] for you.

The first step is to open your project, select Edit -> Refactor -> Convert to Objective-C ARC from the menu.

[Screenshot: migrating to Objective-C ARC using Xcode 4.2]

When you select this option, you will be asked to select a target. If you have only one target, it’s fine. If you have multiple targets in your application, you have to perform the ARC migration on every target. After you select a target, the wizard by default selects all source code files that belong to that project for ARC migration. If you are using third party libraries that are not yet ARC ready, you can uncheck those files in this step. This is illustrated in the screenshot below.

[Screenshot: selecting your files for ARC exclusion]

In the above project, since I know that ASIHTTPRequest is not yet ARC compatible, I’m selecting its files and command-clicking them to show the option to uncheck all of them. When you do this, the wizard automatically adds the -fno-objc-arc compiler flag to all these files.

The next step is to start the pre-checking process. The pre-checking process compiles the project and analyzes it for potential problems before performing the actual migration. You will almost always get an error message like this.

[Screenshot: “Cannot convert”, the dreaded error message!]

Of course, 58 errors in this screenshot is actually quite low. You should expect anywhere in the range of 300+ for a mid sized project. But fret not, they aren’t complicated at all to fix.

Common ARC migration errors


The number of errors that might prevent you from converting your project to ARC is usually high if your code is “old” or if it doesn’t adhere to Objective-C design patterns. For example, accessing an ivar directly. While it’s technically ok, you should almost always use properties to access ivars outside of the init and dealloc methods. If you have been using properties, ARC migration will be painless. If you were old skool, you have to feel the pain now. In this last section, I’ll show you the most commonly occurring errors when you migrate your project.

Cast of Objective-C pointer to C Pointer type


This error is generated because ARC doesn’t do toll-free bridging for you. As I explained before in the section on toll-free bridging, ARC requires the developer to explicitly specify ownership transfer qualifiers.

Use the various ownership transfer qualifiers I showed you before to fix this problem.

performSelector may cause a leak because its selector is unknown


We now know that the Objective-C ARC compiler knows more Objective-C than you. This warning is a consequence of that. The ARC compiler tries to identify the method family to determine whether to add a retain or release for the value returned to the calling code. This means that if your method starts with init, alloc, copy, mutableCopy or new, the ARC compiler will add a release to the calling code after the variable’s scope ends. Since you are using a selector to call a method dynamically at runtime, ARC doesn’t really know whether the method called returns a +1 retained object or an autoreleased object. As such, ARC cannot reliably insert a retain or release for the returned object after its scope ends. This warning is shown to warn you of potential memory leaks.

If you are sure that your code works fine without memory leaks, you can ignore this warning. To suppress it, you can turn off the -Warc-performSelector-leaks warning on a line-by-line basis like this.


#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warc-performSelector-leaks"
[self performSelector:self.mySel];
#pragma clang diagnostic pop


Unfortunately, you cannot annotate a dynamic selector using __attribute__ ((objc_method_family(*))).

Receiver type “*” doesn’t declare the method with selector “*”


ARC mandates that every method you call be declared properly. With GCC, this was a warning. But LLVM makes this step mandatory, since ARC needs to accurately identify the method family, and that is why you see an error like this.

[Screenshot: error that you see when you don’t declare a receiver type]

This error also stems from the fact that ARC needs to identify the method family to determine whether it has to add a retain or release for the returned object. For example, in the above code, the method returnMyGreatObject might be declared NS_RETURNS_RETAINED. In that case, the ARC compiler has to insert a release after the returned object goes out of scope. The ARC compiler can know this only when you declare the method formally. This is why method declarations are mandatory under ARC. If you have been declaring methods formally under GCC, even though the compiler didn’t enforce it (so that the code was aesthetically beautiful), you won’t see this error at all. As I said before, the number of ARC migration errors depends directly on the quality of the code you write. Fixing this error is fairly simple: all you have to do is declare every method formally, either in the header file or in a private class extension.
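For the class extension route, a minimal sketch looks like this. ReportBuilder and its methods are made-up names. The extension would normally sit at the top of the .m file.

```objc
#import <Foundation/Foundation.h>

// Public interface, normally in ReportBuilder.h (hypothetical class)
@interface ReportBuilder : NSObject
- (NSString *)buildReport;
@end

// Class extension, normally at the top of ReportBuilder.m.
// It formally declares the private method so that ARC can see
// its name and return type at every call site.
@interface ReportBuilder ()
- (NSString *)headerLine;
@end

@implementation ReportBuilder

- (NSString *)headerLine {
    return @"Report";
}

- (NSString *)buildReport {
    return [[self headerLine] stringByAppendingString:@": ok"];
}

@end
```

With the declaration in place, any call to headerLine inside the implementation compiles cleanly under ARC, regardless of the order in which the methods are defined.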

Common workarounds that you use in ARC on code that otherwise looks normal


In some cases, while using ARC, you might end up writing code that looks as if it’s written to “please” the ARC compiler rather than writing natural code. Unfortunately, nothing can be done about this and we all have to live with it. The next two sections explain when you might need to write unnecessary code like this to please the compiler.

Capturing “*” strongly is likely to lead to a retain cycle


[Screenshot: ARC and retain cycles]

The last category of warning message is shown when a retain cycle is detected in your code. An example is shown in the screenshot above.

This code was probably leaking the request object before ARC and increasing your memory footprint. But thanks to ARC, you now know that code like this causes retain cycles that cannot be released automatically. Circumventing a retain cycle almost always comes down to breaking the cycle with a weak reference.

Fixing this error is fairly simple. In this case, you can take a weak reference to the request object and capture that in the block. Within the block, convert it to a strong reference again. This is illustrated below.

[Screenshot: workaround for ARC and retain cycle issue]

In the above code block, you can also replace __unsafe_unretained with __weak if you are deploying to a runtime that supports zeroing weak references.
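Since the original screenshots are not reproduced here, the following is a rough sketch of the same workaround, not the code from the screenshots. The Request class is a stand-in invented for this example, and the sketch uses __weak, which assumes iOS 5 or Lion. Swap in __unsafe_unretained for older deployment targets.

```objc
#import <Foundation/Foundation.h>

// Hypothetical request class with a block-based completion callback.
@interface Request : NSObject
@property (nonatomic, copy) void (^completionBlock)(void);
@property (nonatomic, assign) BOOL finished;
- (void)finish;
@end

@implementation Request
@synthesize completionBlock = _completionBlock;
@synthesize finished = _finished;

- (void)finish {
    if (self.completionBlock) {
        self.completionBlock();
    }
}
@end

static Request *MakeRequestWithoutRetainCycle(void) {
    Request *request = [[Request alloc] init];
    // Capturing `request` directly would create a cycle:
    // request -> completionBlock -> request.
    __weak Request *weakRequest = request;
    request.completionBlock = ^{
        // Convert back to a strong reference for the duration of the block
        Request *strongRequest = weakRequest;
        strongRequest.finished = YES;
    };
    return request;
}
```

Because the block holds only a weak reference, the request is deallocated as soon as its last outside owner lets go of it.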

Avoiding retain cycles using __block


Sometimes, you need an object to live for as long as its completion handler can be called. For example, a block-based UIAlertView can call a completion handler after the user presses a button on the UIAlertView.

For example,


UIAlertView *alertView = [UIAlertView alertViewWithTitle:@"Test" buttons:[NSArray arrayWithObjects:@"Ok", @"Cancel", nil] completionHandler:^(int tappedButtonIndex)  {

// do something based on the button tapped on alertView
}];
[alertView show];


In the above case, the alertView gets deallocated by ARC as soon as it’s shown, and the completionHandler never gets executed (or the app even crashes).

To prevent this, you can use the __block qualifier on the UIAlertView declaration and capture it inside the block, like


__block UIAlertView *alertView = [UIAlertView alertViewWithTitle:@"Test" buttons:[NSArray arrayWithObjects:@"Ok", @"Cancel", nil] completionHandler:^(int tappedButtonIndex)  {

// do something based on the button tapped on alertView
alertView = nil;
}];
[alertView show];


ARC takes care of releasing it when you set it to nil inside the completionHandler. You will find this pattern used a lot when you work with completion handlers in TWTweetComposeViewController or even UIViewController’s presentViewController:animated:completion: method.

That last question


When should you migrate?


NOW

The performance benefits you get from using ARC are remarkable. Apple claims that @autoreleasepool is over 6 times faster than the NSAutoreleasePool objects used in your non-ARC code. This is because @autoreleasepool blocks don’t allocate objects; all they do is bump up the pointer retain counts. Similarly, NSObject’s retain and release are optimized so that you can expect a performance boost of around 2.5x. The third important performance benefit you will see is in methods that return an autoreleased object. Under ARC, the returned object is no longer transferred through the autorelease pool; what happens instead is an ownership transfer. Again, this is up to 20x faster.
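If you haven’t used the new syntax yet, here is a small sketch of an @autoreleasepool block inside a loop. The function is hypothetical and only shows the shape of the code. It makes no claim about the speed-ups quoted above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sums the lengths of the decimal strings
// for 0 ..< count, draining temporary objects on every iteration.
static NSUInteger TotalLengthOfNumberStrings(NSUInteger count) {
    NSUInteger total = 0;
    for (NSUInteger i = 0; i < count; i++) {
        @autoreleasepool {
            // stringWithFormat: returns an autoreleased string; it is
            // released when this block ends instead of piling up
            // until the end of the run loop iteration.
            NSString *text = [NSString stringWithFormat:@"%lu", (unsigned long)i];
            total += [text length];
        }
    }
    return total;
}
```

Under manual reference counting, the same loop would create an NSAutoreleasePool at the top of each iteration and drain it at the bottom.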

Hence, don’t wait till your dependent third party frameworks are migrated to ARC. You can always exclude them and go ahead and convert your code to ARC now.

Where to go from here?


  • WWDC 2011 – Session 323 – Introducing Automatic Reference Counting
  • WWDC 2011 – Session 322 – Objective-C Advancements In-Depth
  • Stop: You are warned. This link is only for hard-core geeks. http://clang.llvm.org/docs/AutomaticReferenceCounting.html
  • WWDC 2011 – Session 308 – Blocks and Grand Central Dispatch in Practice

One last word, treat this post as a living document. I’ll be updating the last few sections on new workarounds as and when I find a fix for them.



Mugunth
