Archive for the ‘C’ Category

Compiler indirectives and metaphorical keypaths

Wednesday, July 2nd, 2008

I am, like, literally ROTFRTFMIMHOLOLYMMVIIRCFUBAROTOH!

[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(likeNoWay:) name:@"ROTFRTFMIMHOLOLYMMVIIRCFUBAROTOH" object:nil];

A string literal is a sequence of characters enclosed in quote-unquote ‘\”quotation marks\”‘ (wiggles index and middle fingers). In the C programming language — so-named because its inventors lacked imagination — a string literal represents an array of char terminated by a null (or by a comma when the feeling’s not that strong). In the Objective-C programming language — otherwise known as The O.C. — a string literal in a compiler directive (e.g., @"NSBirdJustFlewIntoMyWindowException") defines a constant NSString object.
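For the skeptics, here is a minimal C sketch of the "array of char terminated by a null" claim. The function names are invented for illustration; the only point is that sizeof counts the terminator and strlen does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes occupied by the literal "ROTFL".
   sizeof counts the terminating null. */
size_t literal_storage_size(void) {
    return sizeof("ROTFL");   /* 5 characters + 1 null terminator */
}

/* Hypothetical helper: characters before the terminating null. */
size_t literal_visible_length(void) {
    return strlen("ROTFL");   /* stops at the null terminator */
}
```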

Chris Hanson and Sanjay Samani have offered some excellent advice about avoiding the use of these NSString literals in your method calls. The problem is that the compiler will accept pretty much any directive: with Objective-C 1.0 the compiler only warns about non-ASCII characters, and with Objective-C 2.0 it doesn’t even do that (for better or worse, but that’s a subject for a different post). Thus, if you happen to misspell a notification name, you’ll never know until your app misbehaves at runtime. I’m sorry, Dave, I’m afraid I can’t do that.

I recommend replacing NSString literals with macros or constant variables (huh?) wherever spelling matters. (Spelling matters everywhere. I’ve seen some pretty bad method names.) Here’s a little trick for handling arbitrary keypaths:

#define KEY1 @"key1"
#define KEY2 @"key2"
#define KEY3 @"key3"
#define DOT @"."

[self valueForKeyPath:KEY1 DOT KEY2 DOT KEY3];
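The trick relies on the compiler joining adjacent string literals. Here is a rough sketch of the same idea in plain C, with made-up key names standing in for real ones:

```c
#include <string.h>

/* Hypothetical key names, mirroring KEY1/KEY2/KEY3 above. */
#define KEY_OWNER   "owner"
#define KEY_ADDRESS "address"
#define KEY_CITY    "city"
#define DOT         "."

/* Adjacent string literals are concatenated at compile time,
   so this returns a single constant string. */
const char *city_key_path(void) {
    return KEY_OWNER DOT KEY_ADDRESS DOT KEY_CITY;
}
```

The payoff is that a misspelled macro name such as KEY_ADRESS fails at compile time, whereas a misspelled literal sails through.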

Unfortunately, we’re still at the mercy of misspellings in nib bindings. Yet another reason to do without nibs. But that’s also a subject for a different post and horse of a different color (dried poop).

Logging in Leopard

Sunday, January 6th, 2008

The release of Leopard has given third-party developers a lot to do: attempting to restore features lost from Tiger, for instance. (By the way, where is the second party, and why am I never invited?) My friend Rainer Brockerhoff has provided a way, or Quay, to display hierarchical popup menus in the Dock again. One of my most missed features in Leopard is using NSLog to spew output exclusively to Xcode’s console log. When you debug or run your app in Xcode on Tiger, you can put NSLog calls everywhere without worrying about polluting console.log. In my opinion, console.log is only for important messages and errors. I frequently ask users to consult it if they’re experiencing a problem with an app. Either that or the Oracle at Delphi.

Leopard dispenses completely with console.log, though there is a “Console Messages” database query in Console. Whereas on Tiger stdout and stderr standardly go to console.log, on Leopard they boldly go to system.log (as well as to the “Console Messages” query). On either version of Mac OS X, Xcode redirects stdout and stderr to its own console log, so they don’t appear in Console at all.

According to the documentation, NSLog sends a message to stderr. This is true for Tiger, and it’s also true for Leopard, but Leopard’s NSLog has the additional behavior of sending a message to system.log regardless of whether stderr is redirected. Thus, when you debug or run your app in Xcode (these may amount to the same thing in Xcode 3), messages from NSLog appear both in Xcode’s console log and in system.log! Curiously, there is no duplication of NSLog messages in system.log when stderr is not redirected.

If you prefer to keep your debug output out of system.log, the workaround for this new NSLog behavior is to abandon NSLog for debugging purposes on Leopard. :-( After much experimentation with asl, I realized that our old faithful printf would work. Since printf writes to stdout, its output is redirected by Xcode. Plus, when you’re debugging your app in Xcode you don’t really need NSLog to tell you the name of your app, the date, or your shoe size.

A limitation of printf is that it doesn’t handle the format specifier %@ for an Objective-C object. With Cocoa, therefore, we want an Objective-C wrapper around printf (like, um, NSLog). If you add the following code to your target’s .pch file, you’ll have an Objective-C debug logging function JJLog available throughout your target’s code. To enable logging in your app’s debug build, just add JJLOGGING to the GCC_PREPROCESSOR_DEFINITIONS setting (AKA “Preprocessor Macros”) in the debug build configuration.


#ifdef __OBJC__
	#import <Cocoa/Cocoa.h>
	#ifdef JJLOGGING
		#define JJLog(...) (void)printf("%s:%i %s: %s\n", __FILE__, __LINE__, __PRETTY_FUNCTION__, [[NSString stringWithFormat:__VA_ARGS__] UTF8String])
	#else
		#define JJLog(...)
	#endif
#endif

In your app’s release build, JJLog expands to nothing, so the preprocessor strips every call, arguments included, before the compiler even sees it. This conditional code should not cause problems when using GCC_PRECOMPILE_PREFIX_HEADER, because Xcode already generates a separate precompiled prefix header for each build configuration. See the .pch.gch.hash-criteria files in /Library/Caches/com.apple.Xcode.###/SharedPrecompiledHeaders.
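The same switchable-macro pattern works without Objective-C. The following is only a sketch in plain C: DEMO_LOGGING, DEMO_LOG, and add_and_log are invented names, and the switch is defined inline here, whereas a real project would presumably set it in the build configuration as described above.

```c
#include <stdio.h>

/* Hypothetical switch, analogous to JJLOGGING. Normally this would come
   from the build settings rather than from the source file. */
#define DEMO_LOGGING

#ifdef DEMO_LOGGING
/* Prefix each message with file, line, and function, then write to stdout. */
#define DEMO_LOG(...) \
    do { \
        printf("%s:%d %s: ", __FILE__, __LINE__, __func__); \
        printf(__VA_ARGS__); \
        printf("\n"); \
    } while (0)
#else
/* Logging disabled: the call and its arguments vanish entirely. */
#define DEMO_LOG(...) ((void)0)
#endif

/* Returns the sum, logging it when DEMO_LOGGING is defined. */
int add_and_log(int a, int b) {
    int sum = a + b;
    DEMO_LOG("sum of %d and %d is %d", a, b, sum);
    return sum;
}
```

Because the disabled macro discards its arguments, nothing passed to it should have side effects that the program depends on.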

You can send gobs of gab to JJLog without repercussion or remorse. However, you’ll still want to use NSLog (sparingly, please) for runtime errors in your release build. Now to continue in the spirit of this post, I’ll redirect the epilogue to /dev/null.

How not to fix a build warning

Saturday, December 22nd, 2007

The Hollywood writers strike continues, and the desperation grows for alternative sources of entertainment. Fortunately, we programmers can find entertainment in our own sources. I’ve got some reality programming for you! The following snippet of code is taken from an actual CVS commit. (Yes, CVS. Don’t laugh. Do cry for me, Argentina.) This build warning ‘fix’ was made by some contractor for some project that I worked on at some point in time for some company. To protect the innocent and/or guilty, I won’t say who, what, when, or where. As for why, I wish I knew. Or maybe not.


NSEnumerator* fileEnum = [fileArray objectEnumerator];
NSDictionary* aDict = nil;
//Changed to Remove the Build Warnings
//while(aDict = [fileEnum nextObject])
while(aDict == [fileEnum nextObject])

Let this example serve as a lesson. Not for programmers — the one who wrote it is probably hopeless — but rather for managers. Please do not just hire the lowest bidder!

BOOLing for Dollars

Sunday, September 30th, 2007

While we’re all aiting-way or-fay eopard-Lay, I’d like to share a pointer that I picked up while mugging a C library. (I have no idea what that means. It seemed witty when I wrote it.) As you know, I’m always ahead of the curve, setting the trends, framing the public discourse. Thus, I should add my 1.2 cents — the dollar is weak, and I’m a little short this month — on a hot topic discussed on the Cocoa-dev mailing list recently (in geological time, anyway): the use of the Objective-C BOOL type.

If you look in the header file /usr/include/objc/objc.h, you can see how BOOL is defined:


	typedef signed char		BOOL;
	// BOOL is explicitly signed so @encode(BOOL) == "c" rather than "C"
	// even if -funsigned-char is used.

	#define YES             (BOOL)1
	#define NO              (BOOL)0

A char type — e.g., char, signed char, unsigned char — is always one byte, i.e., sizeof(signed char) == 1, whereas in most implementations an int type is more than one byte. A byte standardly consists of 8 bits, or 12 nibbles. What happens to the extra bits if you convert an int into a BOOL? According to the wacky rules of C type conversion, the result is implementation-dependent. Many implementations simply throw away the highest bits. (Other implementations recycle them into information superhighway speed bumps.) As a consequence, it’s possible that myIntVar != 0 && (BOOL)myIntVar == NO.

Usually we don’t have to worry about this, because ‘boolean’ operators in C, such as == and !, always return 1 or 0. When we use bitwise operators, on the other hand, the problem does come into play. Suppose, for example, that we’re testing whether the option key is down. The method -[NSEvent modifierFlags] returns a bit field indicating the modifier keys that are pressed, and bit masks can be used to test for specific keys. Consider the following code; there are situations where doSomethingAfterEvent: does something, yet doSomethingElseAfterEvent: does nothing.


	-(void) doSomethingAfterEvent:(NSEvent *)anEvent
	{
		if (anEvent)
		{
			if ([anEvent modifierFlags] & NSAlternateKeyMask)
			{
				[self doSomething];
			}
		}
	}

	-(void) doSomethingElseAfterEvent:(NSEvent *)anEvent
	{
		if (anEvent)
		{
			BOOL shouldDoSomethingElse = [anEvent modifierFlags] & NSAlternateKeyMask;
			if (shouldDoSomethingElse)
			{
				[self doSomethingElse];
			}
		}
	}
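The difference between the two methods above can be reproduced in plain C. This is a sketch under stated assumptions: DemoBool and DEMO_ALTERNATE_KEY_MASK are stand-ins for the Objective-C names, the mask is assumed to sit above the lowest eight bits, and the narrowing conversion is assumed to keep only the low byte (GCC documents that behavior; other compilers may differ).

```c
/* Stand-ins for the Objective-C names in the post (assumptions). */
typedef signed char DemoBool;                 /* same underlying type as BOOL */
#define DEMO_ALTERNATE_KEY_MASK (1u << 19)    /* a flag that lives above bit 7 */

/* Mirrors doSomethingAfterEvent: the masked value is tested directly. */
int option_down_direct(unsigned int flags) {
    return (flags & DEMO_ALTERNATE_KEY_MASK) ? 1 : 0;
}

/* Mirrors doSomethingElseAfterEvent: the masked value is squeezed into
   one byte first, which discards the only bit that was set. */
int option_down_via_bool(unsigned int flags) {
    DemoBool should = (DemoBool)(flags & DEMO_ALTERNATE_KEY_MASK);
    return should ? 1 : 0;
}
```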

It has been suggested on the mailing list that the type conversion could be handled by


	BOOL shouldDoSomethingElse = !!([anEvent modifierFlags] & NSAlternateKeyMask);

or


	BOOL shouldDoSomethingElse = ([anEvent modifierFlags] & NSAlternateKeyMask) != 0;

However, these approaches only report whether at least one bit of the mask is set, which is all you need for a single-bit mask. What if we wanted to test that both the option key and the shift key are down?

The point I wish to make here actually has little to do with the BOOL type. (Say what?!?) Bitwise operators are not boolean operators. A boolean operator only returns 1 or 0. A bitwise operator, in contrast, can return any bit field. The proper way to handle a bitmask is to test whether the resulting bit field has the desired value:


	unsigned int myMask = NSAlternateKeyMask | NSShiftKeyMask;
	BOOL isMyKeyComboPressed = ([anEvent modifierFlags] & myMask) == myMask;
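To make the "any bit" versus "every bit" distinction concrete, here is a small C sketch. The mask values and function names are stand-ins chosen for illustration.

```c
/* Stand-in mask values (assumptions for this sketch). */
#define DEMO_SHIFT_KEY_MASK     (1u << 17)
#define DEMO_ALTERNATE_KEY_MASK (1u << 19)

/* True when at least one bit of mask is set in flags. */
int any_flag_set(unsigned int flags, unsigned int mask) {
    return (flags & mask) != 0u;
}

/* True only when every bit of mask is set in flags. */
int all_flags_set(unsigned int flags, unsigned int mask) {
    return (flags & mask) == mask;
}
```

With shift alone held down, any_flag_set reports true for the shift-plus-option mask while all_flags_set reports false, which is exactly the case the != 0 idiom gets wrong.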

Yes, I know Robot Chicken already covered this subject a ha-while ago. I didn’t really care for it.

Invoking errors

Sunday, April 1st, 2007

If I had to name my specialty as a developer, it would be error code. I’ve reached a point in my career where I can invoke errors almost instinctively. This post is about NSErrors, however, which differ from regular errors in one crucial respect: the more NSErrors in your app, the better. Yes, even when Apple makes errors, they’re appealing. This post is also about NSInvocations, which sound arcane, and they are. I use them, but I’m afraid of them … kind of like microwave ovens. I have no idea what’s going on inside, but they make good Cocoa. (Cat’s breath, charm of sleep and bath, thy omen of scratching.)

There will come a time, maybe not today, maybe not tomorrow, maybe not soon, maybe not for the rest of your life, but someday anyway, when you’ll want to use an NSError as an argument to an NSInvocation. Or maybe never, but you’ll certainly want to use a pointer to a pointer to an NSError. That’s an error’s cousin, once removed; they can marry in selected states. The standard pattern for handling errors with Cocoa methods is to directly return a BOOL indicating success or failure and to return ‘by reference’ (look that up in your Funk & Wagnalls) an NSError that provides additional information about the failure, e.g., You’re a loser, and who does your hair, Floyd the barber? If you call the method by name — heyJoe — you can just pass the address (&) of your NSError object as an argument to the method. What if you want the method to vary at runtime, though?
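The by-reference pattern described above is not unique to Cocoa. As a rough illustration in plain C, with DemoError standing in for NSError and every name invented for the sketch:

```c
#include <stddef.h>

/* Hypothetical error record standing in for NSError. */
typedef struct {
    int code;
    const char *message;
} DemoError;

static const DemoError kDivideByZero = { 1, "division by zero" };

/* Returns 1 on success and 0 on failure. On failure, *error is set, but
   only if the caller supplied somewhere to put it. */
int demo_divide(int numerator, int denominator, int *quotient,
                const DemoError **error) {
    if (denominator == 0) {
        if (error != NULL) {
            *error = &kDivideByZero;
        }
        return 0;
    }
    *quotient = numerator / denominator;
    return 1;
}

/* Usage: pass the address (&) of a local error pointer.
   Returns 0 on success, or the error code on failure. */
int demo_error_code_for(int numerator, int denominator) {
    int quotient = 0;
    const DemoError *error = NULL;
    if (demo_divide(numerator, denominator, &quotient, &error)) {
        return 0;
    }
    return error->code;
}
```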

Suppose that your app needs to perform a series of disparate tasks, and each task requires error checking afterward to determine whether the series should continue. Using conventional methods, your code could become a winding mess. All roads lead to Rome, just not in a straight line. The more pleasant alternative, in my opinion, is to wrap up everything into NSInvocations at the beginning to store in an NSArray for iteration. And rather than having a monstrous error checking method with endless if else clauses, you could have an aptly named error checker corresponding to an individual task, just as in key-value coding you have an aptly named setter corresponding to the getter. For example, we have a sample error checking method:


-(BOOL) didSucceedAtJoe:(id)result error:(NSError **)error {
	BOOL didSucceed = result != nil;
	if (!didSucceed && error) {
		*error = [NSError errorWithDomain:@"MyErrorDomain" code:69 userInfo:nil];
	}
	return didSucceed;
}
	

Such a method would be invoked by a generic error checker:


-(BOOL) didSucceedAtTask:(NSDictionary *)taskInfo {
	BOOL didSucceed = NO;
	NSError * error = nil;
	NSInvocation * invocation = [self invocationForDidSucceedAtTaskWithName:[taskInfo objectForKey:@"name"]];
	if (invocation) {
		id result = [taskInfo objectForKey:@"result"];
		[invocation setArgument:&result atIndex:2];
		NSError ** errorPointer = &error;
		[invocation setArgument:&errorPointer atIndex:3];
		[invocation invokeWithTarget:self];
		[invocation getReturnValue:&didSucceed];
	}
	NSLog(@"error:%@", error);
	return didSucceed;
}
	

It’s crucial to the invocation that you declare, and thus allocate memory for, both NSError * error and NSError ** errorPointer. It’s not sufficient simply to declare the latter, and you definitely can’t use a construction such as &(&error), because &error basically just gives a number, which has no address itself. Finally, we have the method for creating the invocation:


-(NSInvocation *) invocationForDidSucceedAtTaskWithName:(NSString *)name {
	NSInvocation * invocation = nil;
	if (name) {
		SEL selector = NSSelectorFromString([NSString stringWithFormat:@"didSucceedAt%@:error:", name]);
		if (selector) {
			if (ivarInvocation) {
				invocation = [[ivarInvocation retain] autorelease];
				[invocation setSelector:selector];
			} else {
				NSMethodSignature * signature = [self methodSignatureForSelector:selector];
				if (signature) {
					invocation = [NSInvocation invocationWithMethodSignature:signature];
					if (invocation) {
						ivarInvocation = [invocation retain];
						[invocation setSelector:selector];
					}
				}
			}
		}
	}
	return invocation;
}
	

The invocation is stored as an instance variable because the method signature depends only on the argument and return types, so it will remain the same for all of our selectors. Now our apps can invoke errors at a rate matched only by analysts, economists, and pundits.
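For readers who think better in C, the name-to-checker lookup can be approximated with a table of function pointers. To be clear, this swaps in a different mechanism from NSInvocation; it only illustrates the idea that every checker shares one signature and is chosen by task name at runtime. All names and error codes here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Every checker shares one signature, just as every checker method in
   the post shares one method signature. */
typedef int (*DemoChecker)(const void *result, int *error_code);

static int check_joe(const void *result, int *error_code) {
    if (result == NULL) {
        if (error_code != NULL) { *error_code = 69; }
        return 0;
    }
    return 1;
}

static int check_download(const void *result, int *error_code) {
    if (result == NULL) {
        if (error_code != NULL) { *error_code = 70; }
        return 0;
    }
    return 1;
}

typedef struct {
    const char *name;
    DemoChecker checker;
} DemoCheckerEntry;

static const DemoCheckerEntry kCheckers[] = {
    { "Joe", check_joe },
    { "Download", check_download },
};

/* Returns the checker registered under name, or NULL if there is none. */
DemoChecker demo_checker_for_name(const char *name) {
    size_t i;
    if (name == NULL) {
        return NULL;
    }
    for (i = 0; i < sizeof(kCheckers) / sizeof(kCheckers[0]); ++i) {
        if (strcmp(kCheckers[i].name, name) == 0) {
            return kCheckers[i].checker;
        }
    }
    return NULL;
}

/* Runs the named checker; an unknown name counts as failure. */
int demo_did_succeed(const char *name, const void *result, int *error_code) {
    DemoChecker checker = demo_checker_for_name(name);
    return (checker != NULL) ? checker(result, error_code) : 0;
}
```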

On a personal note, I’d like to invite my faithful readers (all three, including myself) to my wedding today. Although the location is unusual — the Virgin Megastore in Orange County — it will nonetheless be a traditional ceremony. Pamela Anderson is signing.

WordPress Bug Fix!

Saturday, January 20th, 2007

This is the first post in what I hope is a series, which I’ll call “WordPress Bug Fix Near Saturday”. And while I’m naming things, I’ll call this first post “Episode IV: A New Hope” (to be followed, no doubt, by “Episode V: All Hope Dashed”). I’m very happy to report that the ETag parsing bug has been fixed in WordPress 2.0.7. Thus, if you’re running WordPress 2.0.7 on your site, you no longer have to comment out the following line in the file wp-includes/classes.php:

@header("ETag: $wp_etag");

My logs confirm that WordPress is now correctly parsing its own ETags and sending out HTTP 200 and 304 responses as appropriate. The Penultimate Warrior is victorious and undefeated! Note, however, that none of the bugs that I reported before the Warrior started Running Wild® (cp. Going Wild®) have been fixed yet.

Speaking of my web site logs, they contain a list of phrases that were used to find my site from internet search engines. I’d like to share a few of them that have caught my eye:

  1. instructions+for+using+the+thighmaster

    Squeeze, release, repeat. You’re welcome.

  2. how+do+u+declare+a+pointer+to+an+array+of+pointers+to+int%3f%3f+in+c+language

    int * (* ptr)[];

  3. talk+to+cat+software

    That doesn’t exist, Dr. Doolittle. Try meowing.

  4. cat+lederhosen

    I’ve given your IP to the SPCA.

  5. betamax+the+sausage+and+the+mouse

    These aren’t the droids we’re looking for.

  6. what+does+ns+stand+for+i+cocoa

    NeXTSTEP. Next!

  7. circle+k+employee+uniforms

    I’m so very sorry, dude.

  8. if+jeff+s+usual+is+a+hint+for+a+password+what+is+the+password

    Stop trying to hack into my account, you scoundrel!

Bring back STX and ETX

Saturday, December 30th, 2006

Some people think that XML is the greatest invention since sliced bread. (I won’t name names, to protect the non-existent.) In contrast, I think that it’s just a symptom of a disease, a terminal illness infecting the entire computing world. Programmers are supposed to be smart, but we’re actually the ones who are responsible for the spread of the disease. We started and promoted the practice of using printable text to delimit printable text. In my haughty opinion, this was one of the worst ideas in the history of computers (surpassed only by SMTP and MacAppADay). It was doomed to fail from the beginning, kind of like the paradox of the liar. The use of printable text to delimit printable text has been the cause of countless bugs — let’s say 500 billion — and indeed, countless security vulnerabilities. It continues to plague us today.

Having worked on a feed reader, I do know a thing or two about this issue. I can tell you that it’s a major pain to parse XML feeds. Parsing HTML is even worse, but thankfully we can leave the majority of that to WebKit. It’s hard enough when everything is perfect, but we inevitably run into issues where the text is improperly escaped or not properly escaped. This is no fun for anyone.

Since the beginning, ASCII contained a number of non-printing control characters, but for some reason they have fallen out of favor. Among the control characters are STX (0x2) and ETX (0x3). Their position in the list of character codes indicates their importance: they were used to delimit text. With character codes such as these, parsing data into strings becomes trivial:

  1. Start parsing a string when you see an STX code.
  2. Continue until you see an ETX code, you see a non-character code, you reach a preset maximum length, or you run out of data.
  3. If the last code was ETX, you’ve got a good string. Otherwise, you’ve encountered an error, and you can do whatever error handling you like.
  4. There are no more steps. The characters in the string are all literal, no unescaping necessary.
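The steps above can be sketched in a few lines of C. This is an illustration, not a finished parser: the function name is invented, and it assumes that a "non-character code" means any byte below 0x20 or equal to 0x7F.

```c
#include <stddef.h>

#define STX 0x02  /* start of text */
#define ETX 0x03  /* end of text */

/* Extracts the first STX...ETX delimited string from data into out.
   Returns the string length, or -1 on error: no STX, a control code
   inside the text, more than max_len characters, or data ending
   before ETX. out must have room for max_len + 1 bytes. */
long parse_delimited(const unsigned char *data, size_t data_len,
                     char *out, size_t max_len) {
    size_t i = 0;
    size_t n = 0;

    /* Step 1: skip ahead to the STX code. */
    while (i < data_len && data[i] != STX) {
        ++i;
    }
    if (i == data_len) {
        return -1;
    }
    ++i;

    /* Step 2: copy literal characters until something stops us. */
    for (; i < data_len; ++i) {
        unsigned char c = data[i];
        if (c == ETX) {
            out[n] = '\0';   /* Step 3: ended on ETX, so the string is good. */
            return (long)n;
        }
        if (c < 0x20 || c == 0x7F || n == max_len) {
            return -1;       /* control code or too long */
        }
        out[n++] = (char)c;
    }
    return -1;               /* ran out of data before ETX */
}
```

Note that characters such as < and & pass through untouched; there is nothing to unescape.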

Unicode has added some similar codes such as SOS and ST. I’d like to see even more control codes, to allow for fine-grained specification of the structure of the text. For example, we could have control codes to delimit words, sentences, paragraphs, etc. This would be similar to tags in HTML but without the use of printable characters to represent the tags.

Why don’t we do this now? One objection is that files containing control characters are not human readable. I think that this is a lame excuse, because no computer file is human readable. Although my hard drive is enclosed, preventing me from examining the files on there, I have burned text files to DVD, and no matter how long I stare and squint at the shiny bottom, all I can see is my own reflection. Anyway, a lot of markup is human readable only in the sense that Derrida is human readable: there is a series of legible text characters, but do you really want to wade through all the crap to make sense of it?

Perhaps the real point underlying this objection is that control-character delimited text would not be readable by simple (i.e., dumb) text editors. This is true, but why should we be ruled by the lowest common denominator? Many modern text editors are quite intelligent and could handle the new format easily. They can already parse various forms of syntax and highlight them for the user. Let’s not let backward compatibility hold us back. That’s certainly not the Apple Way. It’s not entirely the Microsoft Way either; after all, the Word file format makes no concession to simple text editors. Neither does the cross-platform Adobe PDF.

The most powerful objection to using control characters as text delimiters is that we shouldn’t force users to learn how to input control characters along with text. I agree, which is why I think the burden should be placed on computer programs — the text editors and command line interpreters — rather than on users. When taking text input from users, an app should do the following:

  1. Use the context to guess the user’s intention.
  2. Give a visual indication of the guess to the user. Syntax coloring is one example, but the possibilities are endless. Be creative.
  3. Make it easy for the user to correct bad guesses.

In command line interpreters, by the way, there’s no good reason why the space key needs to separate arguments, as opposed to a key for a non-printable character such as escape. It’s the 21st century, by Jove, and we should finally be able to use any printable character in a file name, including colons, quotes, slashes, and spaces, without having to do voodoo on the command line just to refer to it! (I won’t even mention hierarchical file systems, which are themselves a bad idea. Oops, I just did. Since I mentioned it, the ideal behavior when a user enters a file name is to quickly find the named file or files, which any decent file system should be able to do, and show a visual preview so that the user can verify or choose the correct file, if necessary.)

This rant has been brought to you by BBEdit. The makers of BBEdit, I assume, take no responsibility or credit for the content here, nor do they endorse the opinions I’ve expressed. (Or do they?)

(No, not as far as I know, which is nothing. In any case, I do endorse BBEdit.)

A matter of style

Tuesday, October 17th, 2006

I was dreaming when I wrote this. Forgive me if it goes astray. Like many programmers, I’ve experimented with several coding styles. (But I didn’t inhale.) The point of this experimentation was not simply aesthetic, although I do believe that eye-appeal is important when you’re staring at something for hours a day. Good coding style makes it easier for you and others to understand, debug, and modify your code. It can also prevent bugs from occurring in the first place.

In essence, coding styles offer alternative presentations of code that the compiler treats as the same. Compilers, it turns out, have no taste. A compiler wouldn’t care if your entire application appeared all smushed together on one line. That, by the way, is one of the reasons we prefer not to talk to compilers directly; the preprocessor is a much more civilized conversational partner.

In zoological order below, I express my current opinions on coding style, for whatever they’re worth. (Make me an offer.) I of course encourage a diversity of viewpoints, so in the comments to this post I look forward to hearing from both sides on these issues: those who agree with me and those who vehemently agree with me.

Making it explicit

I know what you’re thinking, but get your mind out of the gutter! The headline refers to giving clear indications in your code of things that would otherwise be handled automatically according to the specifications of the language. For example, in expressions with multiple operators, I put parentheses around the sub-expressions, 1 + (2 * 3), instead of relying on the implicit order of operations, although I do make an exception when the main operator is an assignment, sum = 1 + 2. When checking for 0 values, I put the 0 values in my code: if (pointer != NULL), if (pointer == nil), if ([array count] > 0), as opposed to if (pointer), if (!pointer), if ([array count]). I use casts, (int)count, and constant suffixes, 0u, instead of allowing implicit type conversions. This applies to function and method arguments and variable assignments too!

Even if you know the operator precedence and type conversion rules like the back of your hand — actually, I can’t say that I give much thought to the back of my hand — others who read your code might not. When these are left implicit, bugs are destined to arise. In fact, I recently fixed a bug in Vienna where some rows were too short because the methods took floats but the calculations were done as integers, thus truncating the fractional values. Besides, implicit rules can vary from language to language, so if you’re concerned about code portability, or perhaps the sheer effort of mastering multiple languages, why not just free your code from reliance on the linguistic eccentricities?
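Here is a hypothetical reconstruction of that kind of truncation bug in plain C. It is not the actual Vienna code; the function names and the row-height calculation are invented to show how the implicit version silently loses the fraction.

```c
/* Hypothetical calculation: split total_height evenly among rows.
   Both functions assume row_count is nonzero. */

/* Implicit version: the division happens in int, so the fraction is
   gone before the result is ever converted to float. */
float row_height_implicit(int total_height, int row_count) {
    return total_height / row_count;
}

/* Explicit version: the casts make the intended floating-point
   division visible to the reader and the compiler alike. */
float row_height_explicit(int total_height, int row_count) {
    return (float)total_height / (float)row_count;
}
```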

By the way, I really like the way Java handles logical expressions, using only boolean values. This is definitely a bias due to my background in logic. I want boolean expressions to be real booleans, not integers! My instinctive reaction when encountering if (expression) is to treat expression as boolean, so I don’t want to do a double-take every time just in case I should really be thinking non-boolean.

Goto jail. Do not pass Go. Do not collect $200.

On the one hand, it’s painful to read function and method implementations that consist entirely of nested if clauses. On the other hand, a single return at the end makes an implementation easier to understand and debug. I’ve come to the conclusion, then, that goto is the way to go. I admit that it seems very BASIC, but give it a chance.


-(BOOL)doSomethingWithArgument:(id)argument {
    BOOL success = NO;

    if (argument == nil) {
        goto end;
    }
    statements
    if (condition) {
        goto end;
    }
    statements

    success = YES;

    end:
    return success;
}
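For a version that actually compiles, here is a small C sketch of the same single-exit shape. The function and its checks are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Copies src, including its terminator, into dst (capacity dst_size).
   Returns 1 on success and 0 on failure; there is exactly one return. */
int demo_copy_checked(char *dst, size_t dst_size, const char *src) {
    int success = 0;
    size_t length;

    if (dst == NULL || src == NULL) {
        goto end;
    }
    length = strlen(src);
    if (length >= dst_size) {
        goto end;           /* no room for the text plus its terminator */
    }
    memcpy(dst, src, length + 1);

    success = 1;

end:
    return success;
}
```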

Space: The final frontier

I put a space before and after binary operators, 1 + 2. The only exceptions would be the operators for structure or union membership, a.b and a->b, which I don’t put space around, but you could argue that they are actually binary postfix operators, while other binary operators are infix.

I don’t have a strong opinion about whether to put space after an opening parenthesis and before a closing parenthesis, such as with expressions or function arguments, (1 + 2) vs. ( 1 + 2 ). I tend not to use space in these cases. However, I do have a strong opinion about whether to put space after the asterisk in a declaration: YES! In some cases, leaving out the space is highly misleading. For example, int *array[50] does not declare a pointer to an array of integers but rather an array of pointers to integers, so it’s better to use int * array[50]. The * character already has too many functions — pointer declaration, pointer dereferencing, multiplication — which is why I think it’s best to distinguish them as much as possible.
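If the two declarations still look alike, sizeof tells them apart. A short sketch, with variable and function names invented for the purpose:

```c
#include <stddef.h>

int * array_of_pointers[50];     /* 50 pointers to int */
int (* pointer_to_array)[50];    /* one pointer to an array of 50 ints */

/* Size of the array itself: fifty pointers. */
size_t size_of_array_of_pointers(void) {
    return sizeof(array_of_pointers);
}

/* Size of what the pointer points at: fifty ints. */
size_t size_of_pointed_to_array(void) {
    return sizeof(*pointer_to_array);
}
```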

In the past, I preferred that the opening brace of a code block appear on its own line.


while (condition)
{
    statements
}

I suppose that the logician in me enjoyed the symmetry of opening and closing braces at the same level. However, I’ve come to appreciate the virtues of the other standard.


while (condition) {
    statements
}

Using this style, it’s just as easy to determine the beginning and ending of the code block, and over the course of many blocks you fill a lot less vertical space, which means that you can see more of your function or method at once (which is a good thing). Moreover, there are instances where you want to put something after the closing brace, such as in a do while loop.


do {
    statements
} while (condition);

Therefore, if you want to be consistent, you shouldn’t be opposed to similar constructs elsewhere.


if (condition) {
    statements
} else {
    statements
}

Speaking of consistency, I like to use the ‘shorter’ style for function or method implementations (and even for Objective-C @interface ivar declarations).


type function(arguments) {
    statements
}

It may be true that functions and methods are ‘special’, but I believe that the same considerations apply to them as to other blocks. I don’t understand why developers would deviate here from the coding style used everywhere else.

Party’s over

Oops, out of time.

Filling an NSMutableArray

Tuesday, October 10th, 2006

It is estimated that there are over 6 billion people living on Earth. This staggering number raises many issues. For the Cocoa programmer, one of the issues is, can they all fit in an NSArray?

I’ve always wondered how many objects can fit in an NSArray. My first guess is one fewer than the number of grains in a heap of sand. According to the class reference for NSMutableArray, the method +[NSMutableArray arrayWithCapacity:] takes an unsigned int argument, so what’s the maximum value of an unsigned int? This should be defined in the standard headers by the macro UINT_MAX. We could easily learn the value of UINT_MAX by calling NSLog(@"UINT_MAX: %u", UINT_MAX); but instead let’s go on a wild goose chase! That would be much more fun. The natural place to start would be /usr/include/limits.h.

Nope, no dice. Ok, time to give up.

Wait! Near the top of the file we find #include <machine/limits.h> and #include <sys/syslimits.h>. The next stop on our goose chase is /usr/include/machine/limits.h. This file just tells you where to look depending on your architecture. If you have a PowerPC machine, it’s /usr/include/ppc/limits.h; I have an Intel machine, so I’m going to try /usr/include/i386/limits.h. Bingo!

#define UINT_MAX 0xffffffff /* max value for an unsigned int */

For those of you who don’t count in hexadecimal, that’s decimal 4,294,967,295. For those of you who don’t count in decimal, that’s 4 billion, give or take (actually, give). Unfortunately, an array with capacity UINT_MAX is not big enough to hold every person, or every Person, as the sample code usually goes. On the other hand, who pays attention to the stated capacity? Certainly not elevator riders. Maybe we can just keep stuffing an array with objects if we push really hard. After all, an NSMutableArray is supposed to be able to expand beyond its Capacity: argument.

Before we send our array to the all-you-can-eat object buffet, we should carefully consider the consequences. Will it have to go on an NSDiet afterward? Will we be able to find an object after it has been added to the array? The method -[NSArray indexOfObject:] returns an unsigned int. What index will the (UINT_MAX + 1)th object return? Another worry is that this method returns NSNotFound when the array does not contain the object. The file NSObjCRuntime.h defines NSNotFound:

enum {NSNotFound = 0x7fffffff};

That’s 2,147,483,647 for the hex-impaired. It turns out, then, that NSNotFound < UINT_MAX. (Note to self: link to the sound of a car slamming on its brakes and screeching to a halt.) If we add more than NSNotFound objects to an array, how will we know whether -[NSArray indexOfObject:] has found an object or not?
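The sentinel problem can be shown in plain C. This sketch assumes a 32-bit unsigned int, and DEMO_NOT_FOUND is a stand-in that copies the value quoted above; other headers may define the real constant differently.

```c
#include <limits.h>

/* Stand-in for NSNotFound as quoted in the post (an assumption). */
enum { DEMO_NOT_FOUND = 0x7fffffff };

/* Returns 1 if index looks like a real position rather than the sentinel.
   An object genuinely stored at position 0x7fffffff would be misreported. */
int demo_index_is_valid(unsigned int index) {
    return index != (unsigned int)DEMO_NOT_FOUND;
}

/* How many unsigned int values lie above the sentinel. */
unsigned int demo_indexes_above_sentinel(void) {
    return UINT_MAX - (unsigned int)DEMO_NOT_FOUND;
}
```

In other words, roughly half of the index range sits on the far side of the "not found" value.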

We’ve accumulated plenty of unanswered questions. It’s time to put up or shut up. (Note to self: stop talking to yourself.) Let’s test what actually happens when we add a large number of objects to an NSMutableArray. I decided to begin with NSNotFound before moving up to UINT_MAX. In the end, it didn’t make a difference.

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[]) {
	NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

	unsigned int size = NSNotFound;
	NSMutableArray * array = [NSMutableArray arrayWithCapacity:size];
	NSNumber * number;
	unsigned int index;
	for (index = 0u; index < size; ++index) {
		number = [[NSNumber alloc] initWithUnsignedInt:index];
		[array addObject:number];
		[number release];
		if ((index % 1000000u) == 0u) {
			NSLog(@"Current index: %u", index);
		}
	}

	NSString * string = [[NSString alloc] initWithString:@"MAD_MAX"];
	[array addObject:string];
	NSLog(@"MAD_MAX index: %u", [array indexOfObject:string]);
	[string release];

	[pool release];
	return 0;
}

Can you say “SIGBUS”, children? Good! I knew you could! Yes, my test program crashed hard somewhere between 115 and 116 million objects.

ArraySize(3780) malloc: *** vm_allocate(size=1069056) failed (error code=3)
ArraySize(3780) malloc: *** error: can't allocate region

Moral of the story: What’s the capacity of an NSMutableArray? I don’t know. Before it reaches any kind of limit, your computer will run out of memory. My iMac has 1 GB RAM; a Mac Pro has a maximum of 16 GB. Therefore, if you’re adding a completely unknown quantity of objects to an array, you might want to place your own limits. Or in other words, be excellent to each other, and party on dudes!
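What "placing your own limits" might look like, sketched in plain C rather than Cocoa. DemoArray and its functions are hypothetical, and the sketch assumes the chosen limit is small enough that the byte count cannot overflow.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of ints with a caller-chosen ceiling. */
typedef struct {
    int *items;
    size_t count;
    size_t capacity;
    size_t limit;      /* refuse to grow past this many items */
} DemoArray;

void demo_array_init(DemoArray *array, size_t limit) {
    array->items = NULL;
    array->count = 0;
    array->capacity = 0;
    array->limit = limit;
}

/* Appends value. Returns 1 on success, 0 if the limit is reached or
   memory cannot be allocated; the array stays valid either way. */
int demo_array_add(DemoArray *array, int value) {
    if (array->count >= array->limit) {
        return 0;
    }
    if (array->count == array->capacity) {
        size_t new_capacity = (array->capacity == 0) ? 16 : array->capacity * 2;
        int *new_items;
        if (new_capacity > array->limit) {
            new_capacity = array->limit;
        }
        new_items = realloc(array->items, new_capacity * sizeof *new_items);
        if (new_items == NULL) {
            return 0;
        }
        array->items = new_items;
        array->capacity = new_capacity;
    }
    array->items[array->count++] = value;
    return 1;
}

/* Usage: try to add `attempts` items and report how many were stored. */
size_t demo_fill(size_t limit, size_t attempts) {
    DemoArray array;
    size_t stored;
    size_t i;
    demo_array_init(&array, limit);
    for (i = 0; i < attempts; ++i) {
        if (!demo_array_add(&array, (int)i)) {
            break;
        }
    }
    stored = array.count;
    free(array.items);
    return stored;
}
```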