Introduction to Objective-C Modules

Current State

Unless you’ve written C in the past, your closest encounter with the preprocessor and its code inclusion is probably the #import statement. But before #import, there was #include.

#include

The preprocessor command #include is a clever little trick. It basically tells the preprocessor to treat the contents the file included as if the entire file actually appeared at the point of the #include. That explanation may seem a little confusing, so let’s just look at an example. Let’s say we have a file, IncludeMe.h:


// IncludeMe.h

#define kMyConstantNumber 42

#define kMyConstantBoolean true

Now, we write a little C program that uses the constants defined in the IncludeMe header file:


// MyProgram.c

#include <stdio.h>

#include "IncludeMe.h"

printf("The constant is %d and the boolean is %d", kMyConstantNumber, kMyConstantBoolean);

How does MyProgram.c know what the values of kMyConstantNumber and kMyConstantBoolean are in order to print them out from printf()? Well, what’s really happening is the preprocessor is going in and injecting the contents of IncludeMe.h into MyProgram.c. So, what the compiler actually sees when it’s compiling your program is the following:


// MyProgram.c

#define kMyConstantNumber 42

#define kMyConstantBoolean true

printf("The constant is %d and the boolean is %d", kMyConstantNumber, kMyConstantBoolean);

Sure, this is a trivial example, but it should make it pretty obvious exactly what the preprocessor is doing. Actually, the above example isn’t entirely true. To see what the preprocessed file really looks like, fire up Xcode and create a new C/C++ file. In terminal, navigate to the directory containing that file (use the cd command to change directories and navigate to the file). Then type gcc -E filename.c and observe all of the code that gets spit out to the terminal window. By default, all new C files have stdio.h included. This is the header file for all of the basic IO functions available to all C programs (such as the printf() function you saw above). The preprocessor sees that your C file includes stdio.h, and so it includes all of the code in stdio and makes it available to your C program. If you scroll aaaall the way to the bottom of the output, you’ll see the code you actually wrote. #include placed all of the code from stdio in your file as if you had copy/pasted it there yourself.

As a side note, notice that we wrapped our included header file in double quotes (” “). This tells the preprocessor “look for IncludeMe.h in the same directory as MyProgram.c”. However, stdio.h is wrapped in angle brackets (< >). This tells the preprocessor to look for this header file in the directory with all of the system headers.

Recursive Includes

So, this preprocessor #include directive is great. It lets you modularize your code, include system headers, and fosters reusability. But what happens when you have a pair of files that look like this:


// FirstFile.h

#include "SecondFile.h"

/* Some code */

// SecondFile.h

#include "FirstFile.h"

/* Some other code */

Well, the preprocessor first goes out and sees that FirstFile.h wants to include SecondFile.h inside of it. But when it goes to do that, it sees that SecondFile.h also tries to include FirstFile.h, which includes SecondFile.h, which includes FirstFile.h, which includes….ok, well, you get the picture. This is called a Recursive Include.

#include vs. #import

The recursive include is the problem that Objective-C tried to solve with the introduction of the #import directive. Using #import, a file would be guarded against recursive includes by first checking to make sure the included file was not already defined. If it was not, the file would be included, otherwise it would be skipped. Traditional C headers also support this in the form of header guards:


#ifndef MyFile_h

#define MyFile_h

// Some code

#endif

The two essentially do the same thing, however Objective-C classes and frameworks should use #import and not #include.

Introducing @import

Back in November in 2012, Doug Gregor of Apple gave a presentation at the LLVM Developers Meeting requesting that modules, a solution to the problems inherent to preprocessor #imports and #includes, be introduced. Modules, Gregor argued, solve two problems the current preprocessor implementation faces:

  1. Fragility
  2. Scalability

With regards to fragility, a simple example exposes how ordering with preprocessor #includes and #imports matters greatly in the end. Let’s say we have the following Objective-C file:


// MyFile.h

#define strong @"this won't work"

#import <UIKit/UIKit.h>

@interface MyFile : NSObject

@property (nonatomic, strong) NSArray *anArray;

@end

What happens after the preprocessor is done doing its work? Your header file now looks like this:


// MyFile.h

#define strong @"this won't work"

// UIKit imports

@interface MyFile : NSObject

@property (nonatomic, @"this won't work") NSArray *anArray;

@end

Notice that we’ve overridden the definition of the strong keyword with something the compiler doesn’t know how to handle.

The other issue, scalability, should be apparent from the above description about #include. #include and #import are both textual inclusions — they are a glorified copy/paste transaction. The contents of the included file are simply pasted inline where the #include or #import statement was placed. Furthermore, any files included in that file also have their contents pasted into the original file, and so on and so forth until the entire #include/#import tree is traversed. This results in a multiplicative compile time between source files and headers. Now, you would think that for as long as C and Objective-C have been around, someone somewhere would have tried to tackle this problem. And you’d be right. Pre-compiled headers (.pch) have been in use for years to combat the scalability issue. Add an #include/#import statement here, headers are compiled into a single on-disk representation of all files in the .pch, and those headers are included in every source file in your project. However, even .pch files come with their own set of problems:

  1. Most developers don’t maintain their .pch files. They soon become unruly and unmanageable
  2. It’s difficult for developers new to a project to understand how files are related if everything is in the .pch file
  3. Sometimes you just don’t want a file included everywhere

Modules break away from this textual inclusion model and instead act as an encapsulation of what a framework is, as opposed to just shoving the headers into your source files. Think of it as making an API of the framework available to your source file. With modules, a framework is compiled once into an efficient, serialized representation that can be efficiently imported when the library is used. Additionally, it ignores preprocessor state within the source file, meaning that you can’t override the definition of a keyword in a module just because you #define something with the same name prior to the import.

But Apple, in typical fashion, has taken modules even further. Think about it: if you import the MapKit framework, why should you also have to tell Xcode to link against the MapKit framework? With modules, you get support for autolinking frameworks by default!

And how do we get all of this header-caching-auto-linking goodness? Say hello to @import. Simply using the @import declaration will kick off the new module parsing and caching. And you can still do selective imports through the use of the dot syntax. So, for instance, to import only the MKMapView classes and none of the other classes in the MapKit module, you simply say

@import MapKit.MKMapView

Done.

Well that’s great, but now I have to go through each of my #import statements and replace them with @import? No! Apple is taking care of this for you, as long as you opt in. To opt in, make sure you turn on the Enable Modules (C and Objective-C) build setting. Additionally, you can turn on/off the auto-linking of frameworks here, as well.

@import Build Setting

@import Build Setting

 

 

 

 

And that’s it! How does this work, you ask? Module Maps. Module Maps are a way for modules to, well, map back to their header counterparts. Then, a separate compiler instance is spawned and the headers from the Module Map are parsed. A module file is written, and then that module file is loaded at the import declaration. As before, that module file is cached for later re-use (where ever you may import the same module).

Defining a Module

Defining a module is a relatively easy process. From Apple’s own presentation to the November LLVM Meeting, here is an example of how to define the C stdio library as a module:

Module for stdio.c

Module for stdio.c

The export keyword specifies the module name. The dot (.) indicates a submodule, so, in this case, stdio is a submodule of the std module. The public: keyword denotes the access to the API. In other words, which variables, methods, etc. will be publicly available. Anything defined outside of this is private to the module and remains that way. And that’s about it!

NOTE: At this time, modules are only available for Apple’s frameworks. and have been implicitly disabled for C++.

Performance Improvements

All of this @import stuff is really cool, in theory. But how does it stack up in practice? Doug Gregor includes an example in his presentation of the difference in the number of lines of compiled code using the traditional preprocessor macro versus the new @import directive. For the ubiquitous “Hello World” C program, the original source code file is a mere 64 lines of code. After the preprocessor has done its include of the stdio.h header, that number jumps to 11,072 lines. That’s 173 times the number of lines in the actual source file. That’s huge! Now let’s say each of your source files imports the stdio.h header (as almost all C programs do). You’re talking about adding over 11,000 lines of code to each file you add. From a mathematical perspective, that leads to an M x N compile time, if you imagine M source files and N headers. Using modules, the stdio module is parsed only once and then cached, dramatically reducing the number of lines in your processed source code files. Now, for smaller projects, the difference in compile time is negligible, and you probably won’t notice much of a benefit (aside from the conveniences of things like the auto-linking of frameworks). For larger projects, however, you’re looking at potential compile time improvements of a couple of percentage points or more! Either way, this is a really interesting and welcome addition to the LLVM compiler, and as the community continues to expand on what a module is and does, the benefits of using modules will continue to grow.

17 thoughts on “Introduction to Objective-C Modules

  1. Pingback: Using Objective-C Modules in Xcode « [iOS developer:tips];

  2. Pingback: Introduction to Objective-C Modules | zdima.net

  3. It does not seems to correctly link to frameworks though. I am using XCode 5 pr5 and my app crashes on start if I do not manually link MapKit. I took a look at the ld command it does not link with MapKit Framework. Does it work for you? (I have both have linking and modules for the project turned on)

    • Hey Evgeni,

      That’s a great question. My advice to you, for now, would be to sit tight for a few weeks until Xcode 5 is officially released and the whole thing is out of private beta. Until then, things are constantly changing, with new features being added. While this post mentions and explains topics brought forth during the November LLVM meeting, it’s not to say that Apple has necessarily implemented everything in their Developer Preview releases of the new Xcode.

  4. An anonymous Apple developer claims that Modules will not be officially supported in XC5, but it is something they are actively working on. Take this with a grain of salt, or several though.

    My point being, don’t sit around waiting for them to be supported, use Modules when they are supported, but do it the “old fashioned” way until then.

    • You don’t need to add @import Foundation to any of your source files. Foundation is included (along with UIKit for iOS projects and AppKit for OS X projects) in your .pch file. That being said, it is done using the “old-school” #import macro, not the new module system. For any additional frameworks you want to add (CoreLocation, CoreBluetooth, QuartzCore, etc.), you can just @import that framework in the files you need it. When you use #import, you’re essentially just copy/pasting that file where ever you put the #import macro. Even in the case of a .pch file, including any headers here with #import copy/pastes that file into EVERY file in your project. In the case of, say, CoreLocation, you may only need that framework in two or three files. So, including it in the .pch with #import, while allowing you to only have to type the import once, includes the framework in every other file in your project, even though they may or may not need access to CoreLocation. However, with @import, you get the visual benefit of modularity (I want CoreLocation in this file specifically, because I have included it here) with the additional benefit of the @import declaration, which will only include the framework once, and then essentially cache the file for use anywhere else you use @import, as opposed to blindly copy/pasting it each time.

Leave a reply to jmstone617 Cancel reply