Archive for the ‘C’ Category

A matter of style

Tuesday, October 17th, 2006

I was dreaming when I wrote this. Forgive me if it goes astray. Like many programmers, I’ve experimented with several coding styles. (But I didn’t inhale.) The point of this experimentation was not simply aesthetic, although I do believe that eye-appeal is important when you’re staring at something for hours a day. Good coding style makes it easier for you and others to understand, debug, and modify your code. It can also prevent bugs from occurring in the first place.

In essence, coding styles offer alternative presentations of code that the compiler treats as the same. Compilers, it turns out, have no taste. A compiler wouldn’t care if your entire application appeared all smushed together on one line. That, by the way, is one of the reasons we prefer not to talk to compilers directly; the preprocessor is a much more civilized conversational partner.

In zoological order below, I express my current opinions on coding style, for whatever they’re worth. (Make me an offer.) I of course encourage a diversity of viewpoints, so in the comments to this post I look forward to hearing from both sides on these issues: those who agree with me and those who vehemently agree with me.

Making it explicit

I know what you’re thinking, but get your mind out of the gutter! The headline refers to giving clear indications in your code of things that would otherwise be handled automatically according to the specifications of the language. For example, in expressions with multiple operators, I put parentheses around the sub-expressions, 1 + (2 * 3), instead of relying on the implicit order of operations, although I do make an exception when the main operator is an assignment, sum = 1 + 2. When checking for 0 values, I put the 0 values in my code: if (pointer != NULL), if (pointer == nil), if ([array count] > 0), as opposed to if (pointer), if (!pointer), if ([array count]). I use casts, (int)count, and constant suffixes, 0u, instead of allowing implicit type conversions. This applies to function and method arguments and variable assignments too!

Even if you know the operator precedence and type conversion rules like the back of your hand — actually, I can’t say that I give much thought to the back of my hand — others who read your code might not. When these are left implicit, bugs are destined to arise. In fact, I recently fixed a bug in Vienna where some rows were too short because the methods took floats but the calculations were done as integers, thus truncating the fractional values. Besides, implicit rules can vary from language to language, so if you’re concerned about code portability, or perhaps the sheer effort of mastering multiple languages, why not just free your code from reliance on the linguistic eccentricities?
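To see that kind of truncation in isolation, here's a minimal sketch in plain C. It is not the Vienna code: the function names and the numbers are made up for illustration.

```c
/* Hypothetical helpers: the height of each row when total_height
   is split evenly across row_count rows. */

/* Implicit version: both operands are int, so the division truncates
   before the result is converted to float. */
float row_height_implicit(int total_height, int row_count) {
    return total_height / row_count;
}

/* Explicit version: the casts force floating-point division. */
float row_height_explicit(int total_height, int row_count) {
    return (float)total_height / (float)row_count;
}
```

With a total height of 10 and 4 rows, the implicit version yields 2.0 while the explicit version yields 2.5, so every row comes up half a unit short.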

By the way, I really like the way Java handles logical expressions, using only boolean values. This is definitely a bias due to my background in logic. I want boolean expressions to be real booleans, not integers! My instinctive reaction when encountering if (expression) is to treat expression as boolean, so I don’t want to do a double-take every time just in case I should really be thinking non-boolean.

Goto jail. Do not pass Go. Do not collect $200.

On the one hand, it’s painful to read function and method implementations that consist entirely of nested if clauses. On the other hand, a single return at the end makes an implementation easier to understand and debug. I’ve come to the conclusion, then, that goto is the way to go. I admit that it seems very BASIC, but give it a chance.


-(BOOL)doSomethingWithArgument:(id)argument {
    BOOL success = NO;

    if (argument == nil) {
        goto end;
    }
    statements
    if (condition) {
        goto end;
    }
    statements

    success = YES;

    end:
    return success;
}
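The same pattern works in plain C, where the label also gives you a single place to clean up. The following is only a sketch: process_text and its scratch buffer are hypothetical, invented to show the shape of the control flow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copy text into a scratch buffer and report
   whether the copy is non-empty. One exit point, no nested ifs. */
bool process_text(const char * text) {
    bool success = false;
    char * buffer = NULL;

    if (text == NULL) {
        goto end;
    }

    buffer = (char *)malloc(strlen(text) + 1u);
    if (buffer == NULL) {
        goto end;
    }
    strcpy(buffer, text);

    if (buffer[0] == '\0') {
        goto end;
    }

    success = true;

end:
    free(buffer); /* free(NULL) is a no-op, so every path can land here */
    return success;
}
```

Every failure path jumps to the same label, so the buffer is released exactly once no matter where the function bails out.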

Space: The final frontier

I put a space before and after binary operators, 1 + 2. The only exceptions would be the operators for structure or union membership, a.b and a->b, which I don’t put space around, but you could argue that they are actually binary postfix operators, while other binary operators are infix.

I don’t have a strong opinion about whether to put space after an opening parenthesis and before a closing parenthesis, such as with expressions or function arguments, (1 + 2) vs. ( 1 + 2 ). I tend not to use space in these cases. However, I do have a strong opinion about whether to put space after the asterisk in a declaration: YES! In some cases, leaving out the space is highly misleading. For example, int *array[50] does not declare a pointer to an array of integers but rather an array of pointers to integers, so it’s better to use int * array[50]. The * character already has too many functions — pointer declaration, pointer dereferencing, multiplication — which is why I think it’s best to distinguish them as much as possible.
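A small C sketch makes the difference between the two declarations concrete. The function names are invented for this example; each one just reports the size of what it declares.

```c
#include <stddef.h>

/* An array of 50 pointers to int. */
size_t size_of_array_of_pointers(void) {
    int * array[50];
    return sizeof(array);
}

/* A pointer to an array of 50 ints: the parentheses are required. */
size_t size_of_pointer_to_array(void) {
    int (* pointer)[50];
    return sizeof(pointer);
}
```

The first function returns the size of fifty pointers, while the second returns the size of a single pointer, which is one way to convince yourself which declaration is which.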

In the past, I preferred that the opening brace of a code block appear on its own line.


while (condition)
{
    statements
}

I suppose that the logician in me enjoyed the symmetry of opening and closing braces at the same level. However, I’ve come to appreciate the virtues of the other standard.


while (condition) {
    statements
}

Using this style, it’s just as easy to determine the beginning and ending of the code block, and over the course of many blocks you fill a lot less vertical space, which means that you can see more of your function or method at once (which is a good thing). Moreover, there are instances where you want to put something after the closing brace, such as in a do while loop.


do {
    statements
} while (condition);

Therefore, if you want to be consistent, you shouldn’t be opposed to similar constructs elsewhere.


if (condition) {
    statements
} else {
    statements
}

Speaking of consistency, I like to use the ‘shorter’ style for function or method implementations (and even for Objective-C @interface ivar declarations).


type function(arguments) {
    statements
}

It may be true that functions and methods are ‘special’, but I believe that the same considerations apply to them as to other blocks. I don’t understand why developers would deviate here from the coding style used everywhere else.

Party’s over

Oops, out of time.

Filling an NSMutableArray

Tuesday, October 10th, 2006

It is estimated that there are over 6 billion people living on Earth. This staggering number raises many issues. For the Cocoa programmer, one of the issues is, can they all fit in an NSArray?

I’ve always wondered how many objects can fit in an NSArray. My first guess is one fewer than the number of grains in a heap of sand. According to the class reference for NSMutableArray, the method +[NSMutableArray arrayWithCapacity:] takes an unsigned int argument, so what’s the maximum value of an unsigned int? This should be defined in the standard headers by the macro UINT_MAX. We could easily learn the value of UINT_MAX by calling NSLog(@"UINT_MAX: %u", UINT_MAX); but instead let’s go on a wild goose chase! That would be much more fun. The natural place to start would be /usr/include/limits.h.

Nope, no dice. Ok, time to give up.

Wait! Near the top of the file we find #include <machine/limits.h> and #include <sys/syslimits.h>. The next stop on our goose chase is /usr/include/machine/limits.h. This file just tells you where to look depending on your architecture. If you have a PowerPC machine, it’s /usr/include/ppc/limits.h; I have an Intel machine, so I’m going to try /usr/include/i386/limits.h. Bingo!

#define UINT_MAX 0xffffffff /* max value for an unsigned int */

For those of you who don’t count in hexadecimal, that’s decimal 4,294,967,295. For those of you who don’t count in decimal, that’s 4 billion, give or take (actually, give). Unfortunately, an array with capacity UINT_MAX is not big enough to hold every person, or every Person, as the sample code usually goes. On the other hand, who pays attention to the stated capacity? Certainly not elevator riders. Maybe we can just keep stuffing an array with objects if we push really hard. After all, an NSMutableArray is supposed to be able to expand beyond its Capacity: argument.
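For anyone who would rather skip the goose chase, here's a hedged sketch in plain C. The function names are invented for this example, and the comparison against 6 billion assumes a platform where unsigned int is 32 bits, as in the header quoted above.

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps around, so 0u - 1u is the largest
   unsigned int, whatever the header happens to say. */
unsigned int largest_unsigned_int(void) {
    return 0u - 1u;
}

/* Does count fit in an unsigned int? */
bool fits_in_unsigned_int(unsigned long long count) {
    return count <= (unsigned long long)UINT_MAX;
}
```

Under that 32-bit assumption, 4,294,967,295 fits and 6,000,000,000 does not.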

Before we send our array to the all-you-can-eat object buffet, we should carefully consider the consequences. Will it have to go on an NSDiet afterward? Will we be able to find an object after it has been added to the array? The method -[NSArray indexOfObject:] returns an unsigned int. What index will the (UINT_MAX + 1)th object return? Another worry is that this method returns NSNotFound when the array does not contain the object. The file NSObjCRuntime.h defines NSNotFound:

enum {NSNotFound = 0x7fffffff};

That’s 2,147,483,647 for the hex-impaired. It turns out, then, that NSNotFound < UINT_MAX. (Note to self: link to the sound of a car slamming on its brakes and screeching to a halt.) If we add more than NSNotFound objects to an array, how will we know whether -[NSArray indexOfObject:] has found an object or not?
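The overlap can be sketched in plain C. NOT_FOUND below is a stand-in for NSNotFound with the value quoted above, and index_is_unambiguous is a hypothetical helper, not part of Foundation. Assuming the result is compared as an unsigned int, the one index a caller cannot tell apart from "not found" is the index equal to the sentinel itself.

```c
#include <stdbool.h>

/* Stand-in for NSNotFound, using the value quoted above. */
enum { NOT_FOUND = 0x7fffffff };

/* Hypothetical check: can a caller distinguish this index
   from the "not found" sentinel? */
bool index_is_unambiguous(unsigned int index) {
    return index != (unsigned int)NOT_FOUND;
}
```

This says nothing about what a real array does at that size; it only shows where the sentinel and a legitimate index would collide.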

We’ve accumulated plenty of unanswered questions. It’s time to put up or shut up. (Note to self: stop talking to yourself.) Let’s test what actually happens when we add a large number of objects to an NSMutableArray. I decided to begin with NSNotFound before moving up to UINT_MAX. In the end, it didn’t make a difference.

#import <Foundation/Foundation.h>

int main(int argc, const char * argv[]) {
	NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

	unsigned int size = NSNotFound;
	NSMutableArray * array = [NSMutableArray arrayWithCapacity:size];
	NSNumber * number;
	unsigned int index;
	for (index = 0u; index < size; ++index) {
		number = [[NSNumber alloc] initWithUnsignedInt:index];
		[array addObject:number];
		[number release];
		if ((index % 1000000u) == 0u) {
			NSLog(@"Current index: %u", index);
		}
	}

	NSString * string = [[NSString alloc] initWithString:@"MAD_MAX"];
	[array addObject:string];
	NSLog(@"MAD_MAX index: %u", [array indexOfObject:string]);
	[string release];

	[pool release];
	return 0;
}

Can you say “SIGBUS”, children? Good! I knew you could! Yes, my test program crashed hard somewhere between 115 and 116 million objects.

ArraySize(3780) malloc: *** vm_allocate(size=1069056) failed (error code=3)
ArraySize(3780) malloc: *** error: can't allocate region

Moral of the story: What’s the capacity of an NSMutableArray? I don’t know. Before it reaches any kind of limit, your computer will run out of memory. My iMac has 1 GB RAM; a Mac Pro has a maximum of 16 GB. Therefore, if you’re adding a completely unknown quantity of objects to an array, you might want to place your own limits. Or in other words, be excellent to each other, and party on dudes!
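What might placing your own limits look like? Here's one possible sketch, in plain C rather than Cocoa. CappedArray and its functions are hypothetical names invented for this example; the idea is simply that adding an element fails politely once a caller-chosen ceiling is reached or memory runs out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of ints with a caller-chosen ceiling. */
typedef struct {
    int * items;
    size_t count;
    size_t capacity;
    size_t limit; /* hard cap on count */
} CappedArray;

void capped_array_init(CappedArray * array, size_t limit) {
    /* Clamp the limit so the size arithmetic below cannot overflow. */
    size_t max_limit = SIZE_MAX / sizeof(int);
    array->items = NULL;
    array->count = 0u;
    array->capacity = 0u;
    array->limit = (limit > max_limit) ? max_limit : limit;
}

/* Returns false, leaving the array untouched, if the cap has been
   reached or the allocation fails. */
bool capped_array_add(CappedArray * array, int value) {
    bool success = false;

    if (array->count >= array->limit) {
        goto end;
    }
    if (array->count == array->capacity) {
        size_t new_capacity = (array->capacity == 0u) ? 16u : (array->capacity * 2u);
        int * new_items = NULL;
        if (new_capacity > array->limit) {
            new_capacity = array->limit;
        }
        new_items = (int *)realloc(array->items, new_capacity * sizeof(int));
        if (new_items == NULL) {
            goto end;
        }
        array->items = new_items;
        array->capacity = new_capacity;
    }
    array->items[array->count] = value;
    array->count += 1u;
    success = true;

end:
    return success;
}

void capped_array_free(CappedArray * array) {
    free(array->items);
    array->items = NULL;
    array->count = 0u;
    array->capacity = 0u;
}
```

Initialize with a limit of 3, and the fourth call to capped_array_add returns false instead of growing the array. The right ceiling depends on your application and how much memory you're willing to give up.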