Monday 18 June 2007

Readable RAII in C#?

[This blog has moved and this article can now be found at http://www.levelofindirection.com/journal/2009/9/24/raii-and-readability-in-c.html. It still appears here for archival purposes]

In my last post I made the following comment:

"Of course, in C# there is the using keyword, along with the IDisposable interface, which gives you a little more C++-like scoped disposal. Even this is less clean and more awkward than the C++ model, and I believe also has scalability problems (caveat - I haven't really used this in real C# programs)."

Since that time I have been using C# almost exclusively and have had a chance to explore the using statement a bit more. In this post I'm going to expand a little on the theme of: "I believe [the using statement] also has scalability problems". How true is that statement, and what can we do to improve on things?

First - a quick review: what is the using statement?
Well, C#, like Java but unlike C++, has non-deterministic destructors. That is, you don't know when a destructor will run, because it is called by the garbage collector at some later time rather than at the point where the object goes out of scope.

In C++, having deterministic destructors is very useful for implementing RAII (Resource Acquisition Is Initialization) techniques, a big part of which is cleaning up resources in the destructors. This includes, but is not limited to, memory allocations. Other resources could be file handles, database connections, GUI drawing object handles, or even more abstract "resources" such as closing tags in an XML outputter!

The subtlety comes in the face of exceptions - which could occur at moments that are difficult to determine from the static code. I'm glossing here, of course, because many other articles have been written on the details of RAII and exception safety.

In Java, when you hold resources other than memory, the only real way to clean up reliably in the face of exceptions is the try-finally construct. The finally clause gives you a place to put common clean-up code. You can abstract this further using Execute-Around. All this is covered in my previous post.

In C# the story is much the same, with try-finally available. But C# goes one better by supplying the using statement. The using statement gives us back much of what we had in C++, by arranging for a method on the object (Dispose, the single method of the IDisposable interface, which your class must implement) to be called at the end of the scope - whether that scope is exited by the normal flow of execution, or by an exception.
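
As a rough sketch of what that means in practice, a using statement behaves much like a try-finally block that calls Dispose in the finally clause. The Tracked class and the UsingExpansion helper below are hypothetical names invented for this illustration, and the hand-written expansion is an approximation of what the compiler does rather than its exact output.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resource that records what happens to it in a shared log.
public sealed class Tracked : IDisposable
{
    private readonly string name;
    private readonly List<string> log;

    public Tracked(string name, List<string> log)
    {
        this.name = name;
        this.log = log;
        log.Add("acquire " + name);
    }

    public void Dispose()
    {
        log.Add("dispose " + name);
    }
}

// Hypothetical helper comparing the two forms.
public static class UsingExpansion
{
    // The using statement form.
    public static List<string> WithUsing(bool fail)
    {
        List<string> log = new List<string>();
        try
        {
            using (Tracked t = new Tracked("t", log))
            {
                if (fail) throw new InvalidOperationException("boom");
                log.Add("work");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }

    // Approximately the same thing written out by hand with try-finally.
    public static List<string> WithTryFinally(bool fail)
    {
        List<string> log = new List<string>();
        try
        {
            Tracked t = new Tracked("t", log);
            try
            {
                if (fail) throw new InvalidOperationException("boom");
                log.Add("work");
            }
            finally
            {
                if (t != null) t.Dispose();
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

Called from a Main method, both WithUsing and WithTryFinally should record the same sequence of events, with Dispose logged whether or not the exception is thrown.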

However, we still have to put something in the client code - the using statement itself - and if you follow the traditional patterns you do, indeed, reach a scalability problem in terms of readability.

To illustrate, imagine you have a class, Foo, which implements IDisposable. To use it you'd write code much like this:
  using (Foo foo = new Foo())
  {
      // do stuff with foo here

  } // <-- foo's Dispose method is called here

This looks nice and clean so far. But imagine if you need three Foos:
  using (Foo foo1 = new Foo())
  {
      using (Foo foo2 = new Foo())
      {
          using (Foo foo3 = new Foo())
          {
              // Do stuff with foos here
          }
      }
  }

Contrast that with what we'd need to do something similar in C++:
  {
      Foo foo1;
      Foo foo2;
      Foo foo3;

      // Do stuff with foos here
  }

In our C# version, the more disposable objects you have the more the readability suffers!
Imagine too, that the declarations are more complex (as they usually will be), and things start to get messy.

But how often do you really need so many objects with deterministic disposal in C#? After all, most things are going to be managed anyway, aren't they?

Well, one common example might be with database connections, along with transactions, command objects, etc - all needing to be composed together, then disposed of at the right times.
Another might be something that generates XML or HTML. Even when using such things as XmlWriter there are times it is helpful to represent the hierarchy with RAII.
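
To make the XML/HTML case a little more concrete, here is a minimal sketch in which each element is represented by a disposable object: the opening tag is written when the object is created and the closing tag when it is disposed. Element and HtmlSketch are hypothetical names made up for this illustration (they are not part of XmlWriter or any library), and the sketch does no escaping or validation.

```csharp
using System;
using System.Text;

// Hypothetical helper: writes the opening tag on construction
// and the matching closing tag on Dispose.
public sealed class Element : IDisposable
{
    private readonly StringBuilder output;
    private readonly string name;

    public Element(StringBuilder output, string name)
    {
        this.output = output;
        this.name = name;
        output.Append("<" + name + ">");
    }

    public void Dispose()
    {
        output.Append("</" + name + ">");
    }
}

public static class HtmlSketch
{
    public static string Build()
    {
        StringBuilder sb = new StringBuilder();
        using (new Element(sb, "html"))
        {
            using (new Element(sb, "body"))
            {
                using (new Element(sb, "p"))
                {
                    sb.Append("hello");
                }
            }
        }
        // Every closing tag has been written by the time we get here.
        return sb.ToString();
    }
}
```

Because each closing tag is emitted by Dispose, the tags stay balanced even if the code inside a block returns early or throws.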

So, can we improve on this situation?

Well, the first thing to note is that there is a way to declare multiple objects in one using statement in C#. However there is one important caveat: all the objects must be of the same type!
  using (Foo foo1 = new Foo(),
             foo2 = new Foo(),
             foo3 = new Foo())
  {
      // Do stuff with foos here
  }

This is certainly better. However, as the declarations get more complex it tends to suffer from readability problems along similar lines to passing anonymous delegates to methods - too much code between commas!
Also, that single type limitation is a cruel one. It won't help with our database connection, transaction, etc. problem.
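
For what it's worth, the multiple-declaration form behaves just like the nested version: the objects are created in the order written and disposed in reverse order, as C++ destructors would be. The sketch below assumes a hypothetical Foo that takes a name and a shared log so that the order can be observed; MultiDeclaration is likewise an invented name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Foo that records its construction and disposal.
public sealed class Foo : IDisposable
{
    private readonly string name;
    private readonly List<string> log;

    public Foo(string name, List<string> log)
    {
        this.name = name;
        this.log = log;
        log.Add("new " + name);
    }

    public void Dispose()
    {
        log.Add("dispose " + name);
    }
}

public static class MultiDeclaration
{
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        // One using statement, three resources, all of type Foo.
        using (Foo foo1 = new Foo("foo1", log),
                   foo2 = new Foo("foo2", log),
                   foo3 = new Foo("foo3", log))
        {
            log.Add("work");
        }
        return log;
    }
}
```

The log returned by Run should show foo3 disposed first and foo1 last.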

However there is another way! It allows different types to be declared. It can co-exist with the previous construct for multiple declarations of the same type, and it doesn't use any new language features! (thanks to George Moudry for suggesting this).
In fact, on first appearances it might even appear a bit of a hack - but bear with me here.
The secret is to go back to our first example - with the nested using statements. The trouble was all the redundant braces and indentation. So remove them! What? Yes, you don't actually need them! The braces just allow you to write multiple statements and have them appear to the compiler as a single block. However, if the body of a using statement is itself just another using statement, then it is already a single statement and needs no braces of its own.
Before you bring out the flame-throwers, take a look at this example:
  using (Foo foo1 = new Foo())
  using (Foo foo2 = new Foo())
  using (Bar bar = new Bar())
  {
      // do stuff with foos and bars here
  }
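
Since the stacked form is just nested using statements without the braces, it keeps the same guarantees. The sketch below tries to show two of them: resources are disposed in reverse order, and if a later constructor throws, the resources already acquired are still disposed. The Foo and Bar classes here are hypothetical stand-ins that write to a shared log, and the failOnCreate flag exists only to simulate a failure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Foo that records its construction and disposal.
public sealed class Foo : IDisposable
{
    private readonly string name;
    private readonly List<string> log;

    public Foo(string name, List<string> log)
    {
        this.name = name;
        this.log = log;
        log.Add("new " + name);
    }

    public void Dispose() { log.Add("dispose " + name); }
}

// Hypothetical Bar whose constructor can be made to fail.
public sealed class Bar : IDisposable
{
    private readonly string name;
    private readonly List<string> log;

    public Bar(string name, List<string> log, bool failOnCreate)
    {
        if (failOnCreate) throw new InvalidOperationException("bar failed");
        this.name = name;
        this.log = log;
        log.Add("new " + name);
    }

    public void Dispose() { log.Add("dispose " + name); }
}

public static class StackedUsings
{
    public static List<string> Run(bool failInBar)
    {
        List<string> log = new List<string>();
        try
        {
            using (Foo foo1 = new Foo("foo1", log))
            using (Foo foo2 = new Foo("foo2", log))
            using (Bar bar = new Bar("bar", log, failInBar))
            {
                log.Add("work");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

With failInBar set to true, the log should show foo2 and then foo1 being disposed before the exception reaches the catch block, even though bar was never created.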

It actually looks quite neat. How about something more realistic (thanks to Vishal Doshi for this example):
  using (new System.Transactions.TransactionScope())
  using (OracleConnection conn = m_DB.GetConnection())
  using (OracleTransaction tx = conn.BeginTransaction())
  {
      // use conn, tx within the transaction scope
  }

Also, remember I said you can mix it with C#'s construct for multiple declarations of the same type. Our foo-bar example could be written as:
  using (Foo foo4 = new Foo(),
             foo5 = new Foo())
  using (Bar bar = new Bar())
  {
      // Do stuff with foos and bars
  }

So, in conclusion: I still prefer the elegance of C++'s RAII capabilities - but creative use of C#'s using statement makes using using more usable!