Wednesday, 29 April 2009

Elegant XML parsing with Objective-C

[This blog has moved and this article can now be found at It still appears here for archival purposes]

If you write for the Mac you get two Objective-C XML APIs, one tree-based, DOM-like interface, and the other a SAX-like event-driven interface.

If you write for the iPhone you only get the SAX interface. For many purposes this should be all your need.

I was a little disappointed, though, when I first looked at the NSXMLParser class, that it didn't really take advantage of what Objective-C could offer. Probably this was a performance trade-off, as we'll discuss later, but as it is there are only two benefits I can see to using it over a lower level C API: (1) string conversions are done for you and (2) attributes are already set into dictionaries.

I feel you can do so much more, though. So I did!

In this post I'm going to walk through my own wrapper for NSXMLParser, which I developed while writing my iPhone game, vConqr, and at the end give you access to my complete source, which is not terribly large or complex but due to some simplifying assumptions may need some tweaking to meet your needs. I'll bring those assumptions out as we go through.

What's in a name?

To start with, what's wrong with NSXMLParser? Well nothing as far as it goes. It looks like most SAX-like APIs. You provide a delegate object with methods like:

-(void) parser: (NSXMLParser*) parser
didStartElement: (NSString*) elementName
namespaceURI: (NSString*) namespaceURI
qualifiedName: (NSString*) qName
attributes: (NSDictionary*) attributeDict

This is probably the most important method in the interface. This will be called every time the opening tag for a new element is found. As you can see you get passed the parser object itself (which seems a waste, if you wanted it you'd surely just hold a reference at the start), the name of the element, a couple of namespace strings and all the attributes as a dictionary.

Aside from the redundant parser object, I also usually find that for my own small scale, app-specific, xml formats I don't bother with namespaces. If we're wrapping this API we'll probably drop all of those (this is where the first simplifying assumption comes in. If you do need the namespace data it's trivial to extend my code to add it back in - and even make it optional, as we'll see).

That leaves the element name and attributes. It doesn't get much simpler than that, does it? Well think about this for a moment. What's the first thing you're going to do in your handler method? I suspect there's not much you can do that's not element specific (actually there are a few things, but we'll even pull those out shortly). So you're probably going to need to switch on the element name.
Of course, in Objective-C you can't write a switch statement on a string, so you'll end up with something like:

if( [elementName isEqualToString: @"elephant"] )
// Do something with elephant elements
else if( [elementName isEqualToString: @"giraffe"] )
// Do something with giraffe elements
else if ( /* next check */ )
// ...

did you remember not to compare strings with == ? (catches a lot of newer, and sometimes not so new, Objective-C programmers out). Using == would compile but silently fail at runtime (it compares the pointers, not the strings).

So we have needless, repetitive, boilerplate with at least one thing that's easy to get wrong. Furthermore, if this is any more than a couple of checks and a small amount of code in each if block, you'll almost certainly want to forward on to more specialised methods anyway, such as handleElephantElement).

Can we do better? It seems that what we want here is a way to do dynamic dispatch of methods based on names we don't know until runtime. If that's not what a Dynamic Programming Language gives us then I don't know what it is.

Objective-C, or at least NSObject, has a class method called performSelector: that we can use for dynamic dispatch. But performSelector: takes a selector (of type SEL) as it's argument. Can we get a SEL from a string? Yes we can. There's a function called NSSelectorFromString() which does just that! Now, if we build our selector dynamically we can call it - but what happens if we don't implement a handler for every element? We'll get a runtime error, which will usually result in terminate being called. That's a bit harsh. Fortunately the common idiom of calling respondsToSelector: first serves us well here. Putting this all together we get something like:

SEL sel = NSSelectorFromString( [NSString stringWithFormat:@"handleElement_%@:", elementName] );
if( [delegate respondsToSelector:sel] )
[delegate performSelector:sel withObject: attributeDict];

Don't forget to include the : at the end of the selector name as you build it (so we can pass the attributes as the, currently, sole argument).

So now, with all that boilerplate pushed to the generic wrapper, we'll be able to write handlers like this:

-(void) handleElement_elephant: (NSDictionary*) attributes;
-(void) handleElement_giraffe: (NSDictionary*) attributes;

which I think you'll agree is much cleaner. Clearly there is some overhead here (hence my comments about performance earlier), but for most cases this is not going to be an issue at all.

We can do the same for end tags, but the attributes are not needed. I called this method, handleElementEnd_{name}: (where {name} is the name of the element) and used the same techniques.

So, that's elements and attributes handled, but there's one more entity that we need before we can parse any useful documents: Text nodes.

Don't text me, I'll text you

There are a number of aspects of text nodes that make them tricky to deal with in a SAX-like interface.
The first problem is what to do with whitespace? Often, within a text node, whitespace should be preserved - but at the beginning and end you just want it stripped.

Text nodes occur at any point in a document, within the root document and outside of element tags. That means that even just the newlines and indentation spaces you probably have between adjacent element tags will be represented as text nodes.

A common solution to the first issue, which may solve the second too, is to trim whitespace at the extremities (beginning and end), or better still, make it an option. If you do trim, then you need to make sure you don't raise an event for an empty text node. With text nodes trimmed, and empty text nodes suppressed, no events will be raised for formatting whitespace between tags.

The second problem (which you also need to allow for before trimming), is that a single element may contain any number of text nodes. According to most SAX-like API specs (and NSXMLParser is no exception here), there is no guarantee how the text that belongs to an element will be split up, if at all. In practice you can usually count on getting a separate event for text before any child elements, each gap between child elements, and any more text after child elements. If any of those blocks of text are large (for some value of large) it's likely they will be broken down further for memory contraint reasons (imagine that the input processor probably has a buffer it's writing text nodes into. I've certainly implemented a parser exactly that way before).

Attaching any meaning to how text is broken up is probably misguided at best, so before you process your text nodes you will almost certainly want to collate the text nodes into a single string (unless you're expecting very large blocks of text. We'll make the simplifying assumption here that that's not the case).

So, to collect our text nodes we'll need to maintain some state as to the current aggregated text. This problem is complicated if you have a mixture of text and child elements, and the text may appear before or after (or between) child elements. To manage this you'll have to maintain some sort of stack of text blocks, mapped to elements - with unbounded memory requirements.

In practice, mixing child elements and non-whitespace text is uncommon, so one way to simplify this is to ignore the whole problem. If we just take the first, or last, text nodes (ie before or after any potential child elements) then at worst the text will be truncated. For my purposes, where I was completely in control of the schema this was the route I took and is reflected in the code here (any time we see a child node we reset our text buffer). If you want to implement the more general case you'll have to look at the stack-based idea (perhaps a future blog article).

Tracking text nodes this way becomes quite simple. I declare an instance variable to hold the current text value:

NSMutableString* currentTextString;

Now to collect the text we need to implement the method parser:foundCharacters: on our delegate. This will simply append the incoming string to our current string state, or create a new mutable string copy as necessary:

-(void) parser: (NSXMLParser*) parser
foundCharacters: (NSString*) string
if( string && [string length] > 0 )
if( !currentTextString )
currentTextString = [[NSMutableString alloc] initWithCapacity:4];
[currentTextString appendString:string];

The choice of 4 characters to preallocate was entirely arbitrary and could be tweaked to your needs. Also, you might want to check if this is the first text node being appended, and is entirely whitespace (and whitespace at the ends is being trimmed), then don't even bother creating a new mutable string here just to throw it away. My version is simple and works for my needs.

In order to implement my simplification of throwing away any text before a child node we need to add a bit to parser:didStartElement:namespaceURI:qualifiedName:attributes: to release the string object if it's non-empty:

if( currentTextString )
[currentTextString release];
currentTextString = nil;

Now how do we get the text we've accumulated to the delegate? One way would be to create a new event method, handleText: for example. But you'll almost always want to tie the text up to the current element, so you'll want to track that and pass it to. I decided that this already looks so much like our end element handler that I just made it look like an extended version -

-(void) handleElementEnd_tiger: (NSDictionary*) attributes withText:(NSString*) text;

Note that if an end tag is found and there is non-empty text, both events will be fired, if implementations exist, so you can get handleElementEnd_tiger: and handleElementEnd_tiger:withText:

You could use the same technique of looking for two versions of a method, one with more arguments, to optionally pass namespace data, if you like.

So now, with elements, attributes and text nodes sorted we can start doing some basic parsing. However we soon run up against another issue.

In my element

When we write an element (start or end) handler, with our without text nodes, we know our immediate context. We at least know the current element name.

However XML is a hierarchy. While it is certainly possible to write XML such that any given element name can occur at exactly one level of the hierarchy, this is a bit limiting to enforce all the time. For example, in vConqr I have an element called path that contains the vector coordinates of part of the border of a territory. But my borders are split up into three types - internal, external and continent (where continent means an internal border that separates two continents). There are two ways to represent that relationship in XML. One is to make the border type a property of the border element (type="internal" etc). The other is to build it into the name (internalBorder, externalBorder, etc).

I chose to go with the latter because maintaining a stack of element names is easier than maintaining a stack of elements and attributes. To properly implement the element and attributes stack I'd be going down the road of building partial DOM objects in memory - which in the general case is not a direction I wanted to go in.

But just keeping a stack of element names is much simpler, and so I added this to the wrapper class.

I just have an instance variable for the stack:

NSMutableArray* elementStack;

and I push and pop the element names on and off the stack in parser:didStartElement:... and parser:didEndElement:... respectively.

Access is by a simple method:

-(NSString*) ancestorElementAtIndex:(int) index
return [elementStack objectAtIndex: elementStack.count-(index+1)];

Now, in my handleElement_path: method. I can call [parser ancestorElementAtIndex:1] and get back the border element that the path belongs to and act appropriately.

Thanks a bundle

With the handler code nicely simplified all that remains is to kick the parser off. This also has a fair bit of boilerplate associated with it. Again, I've made a few assumptions that are appropriate to the way I tend to use this. In particular I'm assuming that I'm targeting the iPhone, and that the XML files in question are in my application bundle. In my next version I have to deal with XML that I download elsewhere too, so I have an alternative version of the parser launching method that handles that.

For now, though, I have loadFromBundle:, which starts with the following code that generates the filename, looks it up in the application bundle and initialises the NSXMLParser with it:

NSString* filename = [NSString stringWithFormat:@"%@.xml", name ];
NSString *path = [[[NSBundle mainBundle] bundlePath] stringByAppendingPathComponent:filename];

NSURL* url = [NSURL fileURLWithPath: path];
NSXMLParser* parser = [[NSXMLParser alloc] initWithContentsOfURL:url];

Then some more simplifying assumptions:

[parser setShouldProcessNamespaces:NO];
[parser setShouldReportNamespacePrefixes:NO];
[parser setShouldResolveExternalEntities:NO];

Remember, we're not dealing with namespaces. We're also not interested in referencing external entities.

The wrapper class subclasses NSXMLParser, so the delegate is self (the wrapper maintains it's own delegate).
And we can start the parsing:

parser.delegate = self;
parser parse];

The parser will call back the low level handlers on the subclass, which will translate those to the dynamic handler names we discussed earlier, maintaining the text nodes and element path stack, and keeping the application code simple, elegant and expressive.

The full source code for my XML wrapper can be found here:


Sunday, 26 April 2009

ACCU 2009 - after the math

[This blog has moved and this article can now be found at It still appears here for archival purposes]

It's 23:57. That can only mean it's time to blog about the ACCU conference again!

First I'd like to say that the #accu_conf hashtag for the conference was a great success, yielding 11 pages of tweets at the current count. I strongly believe that twitter is becoming more important every day, and that it's power to change the way we look at communication has yet to be fully assessed.

As for the conference programme itself - I can reaffirm my statement that this is the premiere developer's conference in the world! What really sets it apart is that the content is almost exclusively about programming, rather than about specific tools and libraries, and the delegates genuinely do subscribe to the "professionalism in programming" ethic of the ACCU.

To give you a flavour, here's a tweet from "Uncle Bob" (Robert Martin - of Agile Alliance and Object Mentor fame): "This is probably the geekiest conference I've been to. Lots of coding, lots of interesting discussions. Wow.". And later, "And now for the long trek home. I wish I could stay, this is a really fun conference."

Sadly, for a combination of reasons, I missed about half the sessions this year - but what I did see were at least as high quality as I have come to expect.

It was especially revealing hearing about the threading support being added to C++0x (already starting to be referred to as C++1x) - then shortly after hearing about the concurrency support being added to D 2.
In the former case the focus was on catching up in terms of the memory model and library primitives. Welcome additions to be sure. However D continues to move forward in ways that only a language not constrained by it's own legacy can do. It's contributions this year expanded on last year's functional support with transitive immutable and const modifiers, to add keywords that mark variables as shared - thus making explicit the communication paths that have been the bane of just about every imperative language before by their implicitness.

D's concurrency was presented to us by Walter Bright (who first designed the language). However, I found it amusing how Andrei Alexandrescu, in both his presentations, appeared to be talking about C++, but held D up as being the true answer to just about every tricky point he raised. More subtle than Russell Winder's continuing "C++ serves no useful purpose" theme, to be sure, but no less damning!

As usual, half of the real content of the conference took place after hours in the hotel bar (or restaurants in Oxford). John Lakos' absence from this ritual was, therefore, all the more noticeable. He arrived on friday, apparently jet-lagged, and had to rush through his 400-odd slides at such a rapidly increasing rate that it appeared his head would explode before he finished it! I can report that his head did survive to see another day, but it wasn't seen in the bar.

Perhaps he felt he wasn't required to put a new puzzle out this year, as there was a cryptography contest going on already in aid of raising funds for Bletchley Park Museum.

In all the message is clear. If you weren't at the conference you missed out on something rather special. Make every preparation now to be there next year.

Sunday, 19 April 2009

ACCU Conference 2009

Next week (or this week, depending on how you count it... oh it's 23:57, by the time I finish this post it will be this week either way) is the 2009 Annual Conference of the ACCU (which doesn't stand for the Association of C & C++ Users... anymore).

I'm just finishing my slides for my presentation on Thursday morning where I'll be talking about Objective-C. I expect a full room, of course, and anyone at the conference who doesn't come must make it up to me by buying me a drink.

It seems like every year the conference programme gets stronger (despite my contributions to some of them) - and when I first started going (six years ago) it was already the best developer-oriented conference around.

This year has a special track on patterns, and has some world class experts on the subject presenting. The usual world class experts on other subjects will also be presenting too, of course. I won't say more because I'll only be repeating what you can already read here.

If you use Twitter (and if not, why not?) you can follow all the twitter-chatter about the conference, including what interview puzzle John Lakos will be posing in the bar this year, by following the #accu_conf hashtag.

If you use a twitter client that supports it (and if not, why not?) I find it's best to subscribe to a search for the tag, rather than rely on hashtags itself picking up all the tweets (seems to have a hit rate of about 30% at the moment). Personally I use TweetDeck, which I have found pretty good - but am looking forward to Tweetie for Mac being released tomorrow (or later today). But that's getting off topic.

See you all there!