Sunday, January 30, 2011

Safe IO on the Xbox 360 in F#

... or how computation expressions can help you write concise, clean and exception-safe code.

The XNA framework offers a number of APIs to access files. The Storage API, described on MSDN, is a popular one.

All seasoned XBLIG developers know that this API has a number of pitfalls that are not always easy to avoid. Here are those that come to my mind:
  1. Files can only be accessed after getting access to a storage device, which requires interaction with the user as soon as more than one storage device (hard disk, memory unit, USB stick...) is available. As these steps cannot all be performed within a single iteration of the update loop, the programmer is forced to spread them over multiple iterations.
  2. Attempting to get a storage device while the guide is open results in an exception. The guide is a graphical interface to the console's "operating system" which is available at any time.
  3. At most one storage container may be in use at any time. A storage container is a collection of files; it can be seen as a directory.
  4. The user may at any time remove a storage device, which results in exceptions being thrown while accessing the device.
  5. File access is quite slow. In order to keep the GUI responsive, games must perform IO asynchronously.
Attempting to use the XNA API as it is almost invariably leads to bugs. I would say storage-related crashes are among the top 3 reasons games fail peer review. EasyStorage is a popular C# component that simplifies safe IO in games. In this article, I describe an alternative component written in F#.

Let us look at each pitfall and consider ways to avoid them.

Problem 1: Getting a storage device asynchronously


A simple, popular solution consists of requesting the storage device, which is an asynchronous operation, and busy-waiting until the operation completes, which happens once the user has chosen a device.

let getUserStorageDevice player = task {
    let! async_result = doOnGuide(fun() -> StorageDevice.BeginShowSelector(player, null, null))
    do! waitUntil(fun() -> async_result.IsCompleted)
    let device = StorageDevice.EndShowSelector(async_result)
    return
        if device = null then
            None
        else
            Some device
}

This function takes a PlayerIndex and returns an Eventually computation expression (which I call a task). doOnGuide is another task, described in the next section. Busy-waiting occurs in "do! waitUntil" on the third line.
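To make the idea of a task more concrete, here is a minimal sketch of an Eventually type, a builder for it, and waitUntil. This is not the XNAUtils code: the names Eventually, EventuallyBuilder, bind, runFrames and frame are assumptions made for this illustration, and the real update loop is replaced by a plain recursive function that counts frames.

```fsharp
// A computation is either finished or needs at least one more step.
type Eventually<'T> =
    | Done of 'T
    | NotYetDone of (unit -> Eventually<'T>)

// Run e, then feed its result to k, preserving the pauses of e.
let rec bind (k: 'T -> Eventually<'U>) (e: Eventually<'T>) : Eventually<'U> =
    match e with
    | Done x -> k x
    | NotYetDone work -> NotYetDone(fun () -> bind k (work ()))

type EventuallyBuilder() =
    member this.Bind(e, k) = bind k e
    member this.Return(x) = Done x
    // Delay keeps a task inert until it is stepped for the first time.
    member this.Delay(f) = NotYetDone f

// Named "task" to follow the article; shadows any built-in builder of that name.
let task = EventuallyBuilder()

// Busy-wait: re-check cond once per step until it holds.
let rec waitUntil (cond: unit -> bool) : Eventually<unit> =
    NotYetDone(fun () -> if cond () then Done () else waitUntil cond)

// Stand-in for the game's update loop: one step per frame.
let frame = ref 0
let rec runFrames (e: Eventually<'T>) : 'T =
    match e with
    | Done x -> x
    | NotYetDone work ->
        frame.Value <- frame.Value + 1
        runFrames (work ())

// Pretend the user needs three frames to pick a device.
let demo = task {
    do! waitUntil (fun () -> frame.Value >= 3)
    return frame.Value * 10
}

let result = runFrames demo
```

Each call to work () corresponds to one iteration of the update loop, so the wait costs one cheap check per frame and never blocks the game.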

Problem 2: Avoiding GuideAlreadyVisible exceptions


Whenever you want to open the guide (to ask the user to choose a device, show a message box, send a message to a friend, or open the virtual keyboard), you must first check that Guide.IsVisible is false. Even if it is, you have to surround your call to the guide with a try...with block, as GuideAlreadyVisibleException may still be thrown. This may surprise beginners, but it does happen, as I experienced during peer review of Asteroid Sharpshooter.

let rec doOnGuide f = task {
    do! waitUntil(fun() -> not Guide.IsVisible)
    let! result = task {
        try
            return f()
        with
        | :? GuideAlreadyVisibleException ->
            do! wait(0.5f)
            let! eventually_some_bloody_result = doOnGuide f
            return eventually_some_bloody_result
    }
    return result
}

doOnGuide is a recursive function which repeatedly busy-waits until Guide.IsVisible is false. Then it tries to execute the provided function f. If GuideAlreadyVisibleException is thrown, it is caught and discarded, and doOnGuide calls itself again after waiting a short while. This additional half-second wait is not strictly necessary. I put it there mostly because the idea of raising one exception per update cycle disturbed me a bit.
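For try...with to work inside a task, the builder has to provide a TryWith method. The following is a rough sketch of how that could look, not the actual XNAUtils implementation. GuideBusy, showSelector, retryOnGuide and run are hypothetical names: GuideBusy stands in for GuideAlreadyVisibleException, and showSelector simulates a guide call that fails twice before succeeding. The waits of doOnGuide are left out to keep the sketch short.

```fsharp
// Hypothetical stand-in for XNA's GuideAlreadyVisibleException.
exception GuideBusy

type Eventually<'T> =
    | Done of 'T
    | NotYetDone of (unit -> Eventually<'T>)

// Step through e; if a step throws, continue with the handler's task instead.
let rec tryWith (e: Eventually<'T>) (handler: exn -> Eventually<'T>) : Eventually<'T> =
    match e with
    | Done _ -> e
    | NotYetDone work ->
        NotYetDone(fun () ->
            let outcome =
                try Choice1Of2(work ())
                with ex -> Choice2Of2 ex
            match outcome with
            | Choice1Of2 next -> tryWith next handler
            | Choice2Of2 ex -> handler ex)

type EventuallyBuilder() =
    member this.Return(x) = Done x
    member this.ReturnFrom(e: Eventually<'T>) = e
    member this.Delay(f) = NotYetDone f
    member this.TryWith(e, handler) = tryWith e handler

let task = EventuallyBuilder()

// Simulated guide call: raises GuideBusy twice, then succeeds.
let attempts = ref 0
let showSelector () =
    attempts.Value <- attempts.Value + 1
    if attempts.Value < 3 then raise GuideBusy
    "device"

// Retry f until it stops raising GuideBusy.
let rec retryOnGuide (f: unit -> 'T) : Eventually<'T> = task {
    try
        return f ()
    with
    | GuideBusy ->
        return! retryOnGuide f
}

// Drive a task to completion, one step at a time.
let rec run (e: Eventually<'T>) : 'T =
    match e with
    | Done x -> x
    | NotYetDone work -> run (work ())

let outcome = run (retryOnGuide showSelector)
```

Because Delay defers the body of the try block, the exception is raised while the task is being stepped, which is where tryWith is able to catch it.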

I don't find this repeated get-rejected-and-retry particularly pleasing to the eye, but if you have seen its "hacked-the-night-before-sending-to-peer-review" variant in C#, you'll probably find my version decently pretty.

Problem 3: At most one storage container opened at any time


The solution is again pretty simple in principle: keep track in a central place of whether a storage container is already open. If it is, busy-wait until it isn't.

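type Storage() =
    // True while some task is using a storage container.
    let is_busy = ref false

    member this.DoTitleStorage(container_name, f) = task {
        // Wait until no other task is using a container.
        do! waitUntil(fun () -> not !is_busy)
        try
            is_busy := true
            // device comes from a part of the class not shown here.
            let! result = doInContainer device container_name f
            return result
        finally
            // Runs even if f throws, so the flag is never left set.
            is_busy := false
    }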

Class Storage is the class that coordinates access to storage devices. Only parts of the class relevant to problem 3 are shown here.

The first line of method DoTitleStorage busy-waits until is_busy becomes false. When this happens, it goes ahead and immediately sets is_busy to true again. Readers concerned about race conditions and multiple waiters proceeding into the critical section unaware of each other may rest reassured: tasks are picked and executed one at a time using cooperative multi-tasking. True concurrency and its pitfalls are out of the picture.
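The interplay between the busy flag and cooperative scheduling can be tried out without a console. Below is a minimal sketch, assuming a simplified Eventually builder and a round-robin scheduler that are not the XNAUtils ones; isBusy, log, waitFrames, useContainer and runAll are names invented for this illustration. In this sketch the flag is set just before try, because this particular Delay introduces a pause at the start of the try block and the check and the assignment must happen within the same step.

```fsharp
type Eventually<'T> =
    | Done of 'T
    | NotYetDone of (unit -> Eventually<'T>)

let rec bind (k: 'T -> Eventually<'U>) (e: Eventually<'T>) : Eventually<'U> =
    match e with
    | Done x -> k x
    | NotYetDone work -> NotYetDone(fun () -> bind k (work ()))

// Run cleanup once, when e finishes or when one of its steps throws.
let rec tryFinally (e: Eventually<'T>) (cleanup: unit -> unit) : Eventually<'T> =
    match e with
    | Done _ ->
        cleanup ()
        e
    | NotYetDone work ->
        NotYetDone(fun () ->
            let next =
                try work ()
                with _ ->
                    cleanup ()
                    reraise ()
            tryFinally next cleanup)

type EventuallyBuilder() =
    member this.Bind(e, k) = bind k e
    member this.Return(x) = Done x
    member this.Delay(f) = NotYetDone f
    member this.TryFinally(e, cleanup) = tryFinally e cleanup

let task = EventuallyBuilder()

let rec waitUntil (cond: unit -> bool) : Eventually<unit> =
    NotYetDone(fun () -> if cond () then Done () else waitUntil cond)

// Simulated slow IO: pause for n steps.
let rec waitFrames (n: int) : Eventually<unit> =
    if n <= 0 then Done () else NotYetDone(fun () -> waitFrames (n - 1))

let isBusy = ref false
let log = System.Collections.Generic.List<string>()

let useContainer (name: string) (ioFrames: int) = task {
    do! waitUntil (fun () -> not isBusy.Value)
    // Same step as the successful check above: no other task runs in between.
    isBusy.Value <- true
    try
        log.Add(name + " enters")
        do! waitFrames ioFrames
        log.Add(name + " leaves")
        return name
    finally
        isBusy.Value <- false
}

let step (e: Eventually<'T>) =
    match e with
    | Done _ -> e
    | NotYetDone work -> work ()

let isDone (e: Eventually<'T>) =
    match e with
    | Done _ -> true
    | NotYetDone _ -> false

// Round-robin scheduler: each frame, advance every task by one step.
let rec runAll (tasks: Eventually<'T> list) : 'T list =
    if List.forall isDone tasks then
        tasks |> List.choose (fun t -> match t with Done x -> Some x | NotYetDone _ -> None)
    else
        runAll (List.map step tasks)

let results = runAll [ useContainer "A" 2; useContainer "B" 1 ]
```

Task B keeps polling the flag while A is inside the container and only enters once A has left, so the log never shows the two tasks interleaved.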

Note the finesse about using finally to reset is_busy. We are not quite sure of what f will do in the container. Should it do something nasty and get surprised by an uncaught exception, the storage component won't be left in an unusable state. Doing proper clean-up and recovery in traditional asynchronous programming using events and callbacks can be difficult. Actually, the difficult part is to remember to do it when the code is turned "inside out".

Problem 4: Uncaring users yanking storage devices at inappropriate times


The only solution here is to sprinkle your code with try...with and checks for StorageDevice.IsConnected. Again, the availability of try...with in computation expressions makes it relatively painless in F#. See problem 2 above for a code example.

Problem 5: Asynchronous file operations


I haven't tackled this problem yet, mostly because I have only dealt with very small files so far (score tables and user settings). I will leave this for another post, if the need arises.
The only point I wanted to mention is that programmers should be wary of simultaneous progression in the GUI and asynchronous IO. Typical tricky situations include users quitting the game while data is being saved, or triggering new IO while previous IO operations are still ongoing. For these reasons, it is advisable to limit the responsiveness of the GUI to showing a "please wait" animation, and to busy-wait until IO finishes.

Wrap-up

That's all for now. Complete code is available in XNAUtils. It's still a work in progress, but it's already usable. It is interesting to compare it with an earlier attempt of mine at dealing with the storage API, using state machines. The previous attempt is both longer in lines of code and harder to read. I think it's a lot easier to convince oneself that the code is correct, or to detect bugs, in the newer version using Eventually computation expressions and cooperative multi-tasking.
