Archive for December, 2006

WordPress Bug Near Friday: HTML Injection Vulnerability

Saturday, December 30th, 2006

No sooner had I posted this than I noticed this. Case closed. I win by TKO.

Bring back STX and ETX

Saturday, December 30th, 2006

Some people think that XML is the greatest invention since sliced bread. (I won’t name names, to protect the non-existent.) In contrast, I think that it’s just a symptom of a disease, a terminal illness infecting the entire computing world. Programmers are supposed to be smart, but we’re actually the ones who are responsible for the spread of the disease. We started and promoted the practice of using printable text to delimit printable text. In my haughty opinion, this was one of the worst ideas in the history of computers (surpassed only by SMTP and MacAppADay). It was doomed to fail from the beginning, kind of like the paradox of the liar. The use of printable text to delimit printable text has been the cause of countless bugs — let’s say 500 billion — and indeed, countless security vulnerabilities. It continues to plague us today.

Having worked on a feed reader, I do know a thing or two about this issue. I can tell you that it’s a major pain to parse XML feeds. Parsing HTML is even worse, but thankfully we can leave the majority of that to WebKit. It’s hard enough when everything is perfect, but we inevitably run into issues where the text is improperly escaped or not properly escaped. This is no fun for anyone.

Since the beginning, ASCII contained a number of non-printing control characters, but for some reason they have fallen out of favor. Among the control characters are STX (0x2) and ETX (0x3). Their position in the list of character codes indicates their importance: they were used to delimit text. With character codes such as these, parsing data into strings becomes trivial:

  1. Start parsing a string when you see an STX code.
  2. Continue until you see an ETX code, you see a non-character code, you reach a preset maximum length, or you run out of data.
  3. If the last code was ETX, you’ve got a good string. Otherwise, you’ve encountered an error, and you can do whatever error handling you like.
  4. There are no more steps. The characters in the string are all literal, no unescaping necessary.
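For the skeptics, here is a minimal Python sketch of the steps above. It is an illustration under stated assumptions, not a reference implementation: the name parse_strings and the MAX_LEN value are made up for the example, and "non-character code" is taken to mean any ASCII control code, which also rules out tabs and newlines inside a string.

```python
STX = 0x02      # start of text
ETX = 0x03      # end of text
MAX_LEN = 1024  # hypothetical preset maximum string length


def parse_strings(data: bytes, max_len: int = MAX_LEN) -> list[bytes]:
    """Return every STX...ETX delimited string in data.

    Raises ValueError when a string is not terminated by ETX.
    Assumption: any control code (below 0x20, or 0x7F) ends a string.
    """
    strings = []
    i = 0
    while i < len(data):
        # Step 1: start parsing a string when we see an STX code.
        if data[i] != STX:
            i += 1
            continue
        i += 1
        start = i
        # Step 2: continue until ETX, another control code, the
        # maximum length, or the end of the data.
        while (
            i < len(data)
            and data[i] >= 0x20
            and data[i] != 0x7F
            and i - start < max_len
        ):
            i += 1
        # Step 3: only an ETX terminator gives us a good string.
        if i < len(data) and data[i] == ETX:
            strings.append(data[start:i])
            i += 1
        else:
            raise ValueError(f"malformed string starting at offset {start}")
    # Step 4: there are no more steps. Nothing to unescape.
    return strings
```

Note that angle brackets, ampersands, and quotes inside a string need no special treatment: parse_strings(b"\x02a<b>&c\x03") hands back the bytes a<b>&c exactly as they were written.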

Unicode has added some similar codes such as SOS and ST. I’d like to see even more control codes, to allow for fine-grained specification of the structure of the text. For example, we could have control codes to delimit words, sentences, paragraphs, etc. This would be similar to tags in HTML but without the use of printable characters to represent the tags.

Why don’t we do this now? One objection is that files containing control characters are not human readable. I think that this is a lame excuse, because no computer file is human readable. Although my hard drive is enclosed, preventing me from examining the files on there, I have burned text files to DVD, and no matter how long I stare and squint at the shiny bottom, all I can see is my own reflection. Anyway, a lot of markup is human readable only in the sense that Derrida is human readable: there is a series of legible text characters, but do you really want to wade through all the crap to make sense of it?

Perhaps the real point underlying this objection is that control-character delimited text would not be readable by simple (i.e., dumb) text editors. This is true, but why should we be ruled by the lowest common denominator? Many modern text editors are quite intelligent and could handle the new format easily. They can already parse various forms of syntax and highlight them for the user. Let’s not let backward compatibility hold us back. That’s certainly not the Apple Way. It’s not entirely the Microsoft Way either; after all, the Word file format makes no concession to simple text editors. Neither does the cross-platform Adobe PDF.

The most powerful objection to using control characters as text delimiters is that we shouldn’t force users to learn how to input control characters along with text. I agree, which is why I think the burden should be placed on computer programs — the text editors and command line interpreters — rather than on users. When taking text input from users, an app should do the following:

  1. Use the context to guess the user’s intention.
  2. Give a visual indication of the guess to the user. Syntax coloring is one example, but the possibilities are endless. Be creative.
  3. Make it easy for the user to correct bad guesses.

In command line interpreters, by the way, there’s no good reason why the space key needs to separate arguments, as opposed to a key for a non-printable character such as escape. It’s the 21st century, by Jove, and we should finally be able to use any printable character in a file name, including colons, quotes, slashes, and spaces, without having to do voodoo on the command line just to refer to it! (I won’t even mention hierarchical file systems, which are themselves a bad idea. Oops, I just did. Since I mentioned it, the ideal behavior when a user enters a file name is to quickly find the named file or files, which any decent file system should be able to do, and show a visual preview so that the user can verify or choose the correct file, if necessary.)
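To make the argument-separator idea concrete, here is a small Python sketch. It assumes a purely hypothetical interpreter that receives the escape character (0x1B) between arguments; split_args is an invented name, and no existing shell works this way. The standard shlex module is used only to show the quoting that today's space-separated convention demands.

```python
import shlex

ESC = "\x1b"  # escape character, standing in for the argument separator


def split_args(line: str, sep: str = ESC) -> list[str]:
    """Split a command line on a non-printable separator.

    Every printable character, including spaces, colons, and quotes,
    is taken literally, so no quoting or unescaping is needed.
    """
    return line.split(sep)


# A file name full of characters that normally require voodoo.
name = 'My File: "draft".txt'

# With a non-printable separator the name goes through untouched.
args = split_args(ESC.join(["cp", name, "backup"]))

# With space-separated arguments, the same name has to be quoted
# going in and unquoted coming out.
quoted_line = " ".join(["cp", shlex.quote(name), "backup"])
roundtrip = shlex.split(quoted_line)
```

Both args and roundtrip end up as the same three arguments, but only the second route needed an escaping step that a user (or a script) could get wrong.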

This rant has been brought to you by BBEdit. The makers of BBEdit, I assume, take no responsibility or credit for the content here, nor do they endorse the opinions I’ve expressed. (Or do they?)

(No, not as far as I know, which is nothing. In any case, I do endorse BBEdit.)

How to delete a master password in Tiger

Thursday, December 28th, 2006

Warning: Do not attempt if you’re using FileVault. Side effects could include Very Bad Things™ such as losing your entire home directory, headaches, vomiting, diarrhea, going to jail, not passing Go, not collecting $200, and ripping a hole in the space-time continuum. Do not attempt while under the influence of narcotics or while operating heavy machinery.

Warning #2: For goodness’ sake, back up your files before attempting. For safety’s sake, make multiple backups of your entire hard drive, and store them off-site, preferably off-planet.

Warning #3: My lawyer has just run away screaming, so I should offer the disclaimer that you proceed at your own risk. I assume no liability for any consequences, which may include but are not limited to lost data, broken marriages, seven years of bad luck, and the Apocalypse.

Given the above warnings, why would anyone in their right mind want to delete the master password from their computer? Answer: they wouldn’t. (My mind is going. I can feel it.) In my case, setting the master password was only an experiment, so I didn’t want to leave it on. Anyway, here’s what you do:

  1. Remove FileVaultMaster from the Keychain List in /Applications/Utilities/Keychain Access.app.
  2. Move the files FileVaultMaster.cer and FileVaultMaster.keychain in /Library/Keychains/ to the Trash. This requires an administrator password.
  3. If you set a password hint for the master password, enter the command below in /Applications/Utilities/Terminal.app. This also requires an administrator password.

    sudo defaults delete /Library/Preferences/com.apple.loginwindow MasterPasswordHint

  4. Restart. (This may not be necessary, but it won’t hurt.)

This conversation can serve no purpose anymore. Goodbye.

A real developer contest: Win a MacBook

Saturday, December 23rd, 2006

A Holiday Cocoa Duel and IronCoder are interesting, but what’s the incentive? Charity? The people’s ovation and fame forever? Please! Everyone knows that the true meaning of the Holiday Season™ is the hope of getting cool electronic stuff. In that spirit, check out the competition sponsored by Marko Karppinen & Co. Today’s theme ingredient: BaseTen, their new open source Cocoa database framework. Contestants have until January 31 to write an open source application based on BaseTen, and the developer of the app that best demonstrates the power of BaseTen will receive a new Core 2 Duo MacBook. How cool is that? Allez Cocoa!

IGNORE THIS FILE

Thursday, December 21st, 2006

/etc/fstab.hd: This file does nothing, contains no useful data, and might go away in future releases. Do not depend on this file or its contents.

Methinks the file system doth protest too much!

P.S. Pay no attention to the man behind the curtain.

WordPress Bug Friday: Wasting your bandwidth

Sunday, December 17th, 2006

I intended to post this on Friday. As they say when receiving a crappy Christmas gift, It’s the thought that counts. (They lie.) I should probably give myself a break and just change the official name to WordPress Bug Near Friday. Well, so be it. Make it so. Engage. Energize. Giddyup!

Usually I’m bemoaning the existence of HTTP 304 (Not Modified) responses, but this time the problem was the non-existence of them. (Can there be such a thing as non-existence? Where would you find it? And can you deduct it from your taxes?) I noticed when looking through my web site logs that feed requests from NetNewsWire always received an HTTP 200 (OK) response from WordPress, never 304, which means that NetNewsWire downloaded the entire content of a feed on every request. Since my web site gets more hits from NetNewsWire than from any other browser, that’s quite a lot of bandwidth used. (Relatively speaking, that is. In the grand scheme of things, my page ranking is right below the site for Grasshopper Enthusiasts of Eastern Ontario.)

Brent Simmons, the creator of NetNewsWire, was kind enough to talk to me about the problem, despite the fact that my app, Vienna, has undoubtedly taken away some sales from him. (In fact, I was all set to purchase NetNewsWire myself until I discovered Vienna.) I’m not worried about Brent, though: I heard that NewsGator paid him something like a trillion dollars for NetNewsWire, give or take. Plus he gets as many Café Lattes as he likes. Anyway, he explained that WordPress does not handle entity tags correctly.

In addition to Last-Modified headers, WordPress sends out ETag headers, which are basically gobbledygook strings that identify web content. Some web browsers, such as Vienna, only send conditional If-Modified-Since requests based on Last-Modified dates, but a browser can also store ETags and send them back on subsequent visits to the site as part of conditional If-None-Match requests. If the ETags don’t match the current content on the site, then there is new content that needs to be downloaded. NetNewsWire sends both kinds of conditional request. Unfortunately, WordPress does not parse its own ETags correctly on receiving If-None-Match requests — there seems to be a problem with quoting — so a match is never found, and it always sends a 200 response, along with the full feed content.
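Here is a rough Python sketch of what a server has to do with an If-None-Match header, and how a quoting slip could make every comparison fail. This is not WordPress's code, which I haven't reproduced here; etag_matches and feed_status are invented names, and splitting on commas is a simplification, since the HTTP spec technically allows a comma inside an entity tag.

```python
from typing import Optional


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Check an If-None-Match header value against the current ETag.

    current_etag is the bare tag, without the surrounding double
    quotes that appear on the wire.
    """
    if if_none_match.strip() == "*":
        return True
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        # A weak validator prefix is ignored for this comparison.
        if candidate.startswith("W/"):
            candidate = candidate[2:]
        # Clients echo the tag back with its quotes, so strip them
        # before comparing.
        if len(candidate) >= 2 and candidate[0] == '"' and candidate[-1] == '"':
            candidate = candidate[1:-1]
        if candidate == current_etag:
            return True
    return False


def feed_status(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 when the client's copy is current, otherwise 200."""
    if if_none_match is not None and etag_matches(if_none_match, current_etag):
        return 304
    return 200


# The kind of mistake that yields a 200 every time: comparing the
# quoted header value directly against the unquoted tag.
naive_match = '"abc123"' == "abc123"
```

With the quotes stripped, feed_status('"abc123"', "abc123") gives 304, while the naive comparison is always false, which would fit the all-200 pattern in my logs.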

Brent passed along a suggestion to remove or comment out the following line in the file wp-includes/classes.php:

@header("ETag: $wp_etag");

After that, WordPress no longer sends out ETags, so it relies totally on Last-Modified dates. I’ve been testing this modification for a week, and NetNewsWire now receives both 200 and 304 responses, as appropriate. Moreover, my bandwidth has been cut by more than half. Thanks, Brent! We should form a tag team wrestling duo to pin down the WordPress developers and make them fix their feed bugs. My wrestling name will be the Penultimate Warrior.

Leopard Tech Talk: Postlude

Thursday, December 14th, 2006

If you think that America is a free country, try driving through Illinois. Their state motto is Give us all of your spare change. As I mentioned before, one of the Mac OS X Leopard Tech Talks was held in Chicago yesterday. (Except that it wasn’t yesterday when I mentioned it before.) I was there, and I had a good time. As you would expect, security was tight: I was asked for my name before receiving a name tag. Apple was kind enough to provide food and drink for the free event, which was much appreciated, though the NDA prevents me from disclosing the items on the menu. I will, however, reveal the latest secret Leopard feature. I can confirm that seeded builds now include a Spotlight-searchable C++ GUI Programming Guide. Thanks, PC guy!

I believe that I can also reveal, since I discovered this information myself by testing, that Vienna is Leopard-ready. Or at least, nothing appeared broken when I played with it for 5 minutes. Yay!

I met a number of people at the event, including fellow blogger Geoffrey Schmit of Sugar Maple Software. The biggest celebrity was Sal Soghoian, AppleScript product manager for Apple. Indeed, he was so famous that he wasn’t required to wear the standard Apple employee uniform, or the 15 pieces of flair. To their credit, the ‘software evangelists’ who gave the tech talks were open and honest. They answered my questions, and they also convinced me to shave my head and dance with an iPod at airports. I would like to thank them, as well as Apple for bringing the show on the road to a location near me.

All I can really say about the content of the talks is that Leopard has a lot of cool stuff for developers. I’m tempted now to get an ADC Select Membership, so that I can receive seeds. (In case you’re wondering, Leopard seeds are not distributed to the attendees of Leopard Tech Talks.) Perhaps that was their insidious purpose. Anyway, I would recommend attending if there’s a Leopard Tech Talk in your area. Amen!

Leopard Tech Talk: See you in Chicago

Friday, December 1st, 2006

I’ve received a confirmation email from ADC that I can attend the Leopard Tech Talk in Chicago on December 13. Thanks, Apple! I’ll try to ensure that Vienna is Leopard-ready.

If you’d like to meet me there, let me know. I should be conspicuous as the only poor soul without a laptop.