Promises by example

Published 2016-01-20, updated 2016-01-20

I had previously described an experimental implementation of promises for Tcl. On re-reading my earlier post, I was dissatisfied with the treatment there: it got caught up in the details and did not fully reflect the value of the promise abstraction. This post takes a different approach, concentrating on examples and refraining from going into detail about each command or method. Here I am more interested in giving you a flavor of programming with promises and motivating you to explore further.

This post is based on the promise package (see documentation) which is a further developed version of the one in my earlier post.

Sequential versus async code

The benefit of code written in a sequential style is that it is easy (or easier) to read and write and therefore easier to reason about. The drawback is that sequential code often blocks, waiting for a resource or engaging in a long computation, and consequently does not allow other activity, such as user interaction, to occur in a timely manner.

On the other hand, asynchronous code permits multiple operations to proceed in parallel thereby providing better utilization and response. The cost is of course that async code is harder to write, understand and reason about. Situations where async actions have dependencies or multiple outcomes with dependencies become particularly tricky to manage.

Promises offer the best of both worlds in that they make it possible to write async code in a sequential style. The examples here are intended to illustrate their benefit.

Running non-blocking computations

The first example involves running a compute-intensive task. Say you want to calculate the 100,000th Fibonacci number and display the number of digits in it (don't ask me why).

A sequential version might look like this

package require math
tk_messageBox -message "[string length [math::fibonacci 100000]] digits"

That one-liner could not be easier to write, but the problem is that user interaction is blocked while the computation chugs along for several seconds.

One possibility is to break up the Fibonacci computation into chunks and schedule them using the event loop. This is awkward and unnatural and you would have to write your own Fibonacci code.
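To make the awkwardness concrete, here is a minimal sketch of the chunked approach in plain Tcl. The names fib_chunked and show_digits, the chunk size and the small value of n are all illustrative assumptions, not part of any package.

```tcl
# Hypothetical helper: computes fib(n) iteratively in slices of $chunk
# steps, returning to the event loop between slices via [after].
proc fib_chunked {n callback {chunk 1000} {i 0} {a 0} {b 1}} {
    set stop [expr {min($n, $i + $chunk)}]
    while {$i < $stop} {
        # Invariant: a holds fib(i) and b holds fib(i+1)
        lassign [list $b [expr {$a + $b}]] a b
        incr i
    }
    if {$i < $n} {
        # More work left: reschedule so pending events get serviced.
        after 0 [list fib_chunked $n $callback $chunk $i $a $b]
    } else {
        # Finished: pass fib(n) to the caller-supplied command prefix.
        after 0 [list {*}$callback $a]
    }
}

# Hypothetical completion handler; a Tk program would show a dialog here.
proc show_digits {value} {
    set ::digits [string length $value]
    puts "$::digits digits"
}

fib_chunked 1000 show_digits 100
# Only needed in tclsh; a Tk application is already in the event loop.
vwait ::digits
```

Note how the loop state (i, a, b) has to be threaded through every rescheduled call, which is exactly the kind of bookkeeping that makes this style unnatural.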

An alternative is to fire off a separate thread, let it run asynchronously, then collect and display the result when it's done. You can write the code yourself as an exercise using the Thread package.
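For the curious, one rough way to do that exercise is sketched here. It assumes the Thread package is installed, uses a hand-rolled Fibonacci loop with a small n so that it needs nothing else, and the variable and procedure names are made up for illustration.

```tcl
package require Thread

# Worker thread; it waits in its own event loop for scripts to run.
set tid [thread::create]

# Runs in the main thread once the result variable has been written.
proc show_result {args} {
    puts "$::fib_digits digits"
}
trace add variable ::fib_digits write show_result

# -async returns immediately; the script's result is stored in
# ::fib_digits in this thread when the worker finishes.
thread::send -async $tid {
    set a 0; set b 1
    for {set i 0} {$i < 1000} {incr i} {
        lassign [list $b [expr {$a + $b}]] a b
    }
    string length $a
} ::fib_digits

# Only needed in tclsh; a Tk application is already in the event loop.
vwait ::fib_digits
thread::release $tid
```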

Or you could just use promises. The code should be almost self-explanatory.

set task [ptask {
    package require math
    string length [math::fibonacci 100000]
}]
$task done [lambda value { tk_messageBox -message "$value digits" }]

Here ptask is a utility command from the promise package that runs a script in a separate Tcl thread. Essentially we create an asynchronous task to do the computation and display the value when it is done. That is not a whole lot more complicated than the sequential version and, being asynchronous, it does not tie up the user interface while the computation runs.

Now, folks who use the Thread package regularly are going to go Pffooh (or something similarly rude) and claim this is just a matter of wrapping the asynchronous form of the Thread send command. I assure you, dear reader, it is not so. As we go along, you will see you can compose and combine promises for asynchronous operations with the same ease as you do for results from synchronous operations, something you cannot do with plain old asynchronous library calls.

Parallel computations with threads

Let us extend the previous problem a bit. Instead of a single Fibonacci number, suppose we are given a list of integers and have to generate the corresponding Fibonacci numbers.

In sequential code, lmap makes this easy.

tk_messageBox -message [lmap n $numbers {
    math::fibonacci $n
}]

Spend a moment thinking about how you would do this asynchronously in parallel without promises. Implementing with promises is (almost) as easy as the sequential case.

set computation [all [lmap n $numbers {
    ptask "package require math; math::fibonacci $n"
}]]
$computation done [list tk_messageBox -message]

Or, if we don't mind being a bit more obscure, we can get rid of the computation variable:

[all [lmap n $numbers {
    ptask "package require math; math::fibonacci $n"
}]] done [list tk_messageBox -message]

Before I explain the above, consider that not only is the promise-based code non-blocking, it will also complete faster as the computations are run in separate threads that will run in parallel on multi-core systems.

The above code works similarly to the sequential version, except that the lmap in the sequential code collects values whereas in the promise fragment the lmap collects promises, essentially placeholders which will be filled in later. This is one reason that promises make async programming easier: the same patterns, lmap in this case, that are commonly used in sequential programming can be used with promises.

The example also illustrates one way in which promises can be combined. Here the all command creates a new promise that is fulfilled when the promises passed to it have all been fulfilled. This ability to combine promises (we will see other possible combinations later) is another reason promises make async programming more convenient.
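To appreciate what all is doing, here is a rough sketch of the bookkeeping it replaces, written with plain callbacks. Everything in it (async_square, collect, the global variables) is hypothetical, and the after command merely stands in for a real asynchronous operation.

```tcl
# Hypothetical stand-in for an async operation: delivers n*n to the
# callback after a delay proportional to n.
proc async_square {n callback} {
    after [expr {10 * $n}] [list {*}$callback [expr {$n * $n}]]
}

# Store each result by position; fire on_done once every slot is filled.
proc collect {index value} {
    global results pending on_done
    lset results $index $value
    if {[incr pending -1] == 0} {
        {*}$on_done $results
    }
}

set numbers {3 1 2}
set results [lrepeat [llength $numbers] {}]
set pending [llength $numbers]
set on_done [list set ::final]

set i 0
foreach n $numbers {
    async_square $n [list collect $i]
    incr i
}

# The callbacks complete out of order, but results stay in input order.
vwait ::final
puts $::final   ;# prints: 9 1 4
```

With promises, the counter, the result slots and the completion hook all disappear behind the single all command.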

Chaining asynchronous actions with dependencies

Lest the reader think multithreaded computations and calculating Fibonacci numbers are all promises are good for, I will move on to a different use case scenario. We would like to download some Web pages to a directory, and then zip the directory and then send it off via email.

The sequential version of the code looks something like this (ignoring all error handling, encodings, issues with file names etc. since they are not relevant for this discussion)

package require http
proc save_file {dir url content} {
    set fd [open [file join $dir [file tail $url]] w]
    puts -nonewline $fd $content
    close $fd
}
foreach url $urls {
    set tok [http::geturl $url]
    save_file $dir $url [http::data $tok]
    http::cleanup $tok
}
exec zip -r pages.zip $dir
exec blat pages.zip -to [email protected]; # blat is an SMTP sender

We would like this to take place behind the scenes so other activities are not blocked at any stage. Writing an asynchronous version involves firing off multiple async activities, in parallel where possible, and keeping track of their completions and dependency ordering.

Again, think for a moment how you would write this in asynchronous code before reading the promise-based implementation below.

This implementation is a bit more complicated than what we saw earlier, so we will break it down into steps. Even though all execution is happening asynchronously, we can write the steps in a sequential fashion.

First, given a url, we have to download it and then write it to a file. The code says almost exactly that.

set download [pgeturl $url]
$download then [lambda {dir url http_state} {
    save_file $dir $url [dict get $http_state body]
} $dir $url]

The pgeturl command in the first line returns a promise that will be fulfilled when the download completes. The second line states: download, then invoke the anonymous procedure to save the content. The value of the fulfilled promise is passed as the last argument to the procedure. In this case, it is the HTTP state dictionary with the body key containing the content of the URL.
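How the extra arguments and the promise's value line up can be seen with a stand-in for lambda. The my_lambda helper here is a hypothetical sketch built on Tcl's apply command; it is not taken from the promise package's source, and the directory and URL are made-up values.

```tcl
# Hypothetical stand-in: returns a command prefix with extra args bound.
proc my_lambda {params body args} {
    return [list ::apply [list $params $body] {*}$args]
}

# dir and url are bound now; value is supplied later by the caller.
set handler [my_lambda {dir url value} {
    return "saving [string length $value] bytes from $url into $dir"
} /tmp/pages http://example.com/index.html]

# Whoever completes the operation appends the value as the last argument.
set message [{*}$handler "<html></html>"]
puts $message
```

Binding dir and url when the handler is created is what lets the handler run later, long after the loop variable url has moved on to another value.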

That takes care of one URL download. What we need is a collection of them and how do we get that? Just use lmap or foreach of course. Just like for sequential code. At the risk of belaboring the point, you cannot do that with plain event or callback based code; you can however do so with promises.

Once we have a list of promises, we can invoke the all command to create a promise that represents their collective computation. The above code for a single URL then becomes

set downloads [all [lmap url $urls {
    [pgeturl $url] then [lambda {dir url http_state} {
        save_file $dir $url [dict get $http_state body]
    } $dir $url]
}]]

where we have gotten rid of the download variable as it is not really needed.

Continuing in the same fashion, download, then save, then zip, then email, and when finally done, inform the user.

set zip [$downloads then [lambda {dir dontcare} {
    then_chain [pexec zip -r pages.zip $dir]
} $dir]]
set email [$zip then [lambda dontcare {
    then_chain [pexec blat pages.zip -to [email protected]]
}]]
$email done [lambda {dontcare} {
    tk_messageBox -message "Zipped and sent!"
}]

I will explain the then_chain command in a bit but first let's rewrite the above sequence as a procedure.

proc zipnmail {dir urls} {
    set downloads [all [lmap url $urls {
        [pgeturl $url] then [lambda {dir url http_state} {
            save_file $dir $url [dict get $http_state body]
        } $dir $url]
    }]]
    set zip [$downloads then [lambda {dir dontcare} {
        then_chain [pexec zip -r pages.zip $dir]
    } $dir]]
    set email [$zip then [lambda dontcare {
        then_chain [pexec blat pages.zip -to [email protected]]
    }]]
    $email done [lambda {dontcare} {
        tk_messageBox -message "Zipped and sent!"
    }]
}

Comparing this with the sequential version, though this is clearly not as simple, it is not hugely complicated either. The style is still sequential and the flow of execution is quite simple to understand even though there are multiple asynchronous operations going on, some in parallel and some sequentially dependent on others. Not only will the operation complete faster, the user interface will not be blocked while it is ongoing.

In contrast, implementing the same functionality using async callbacks and events would have been much harder to write and get correct.

I'll now explain the then method and the then_chain command. Both the done method, which we saw earlier, and the then method register handlers, called reactions in the JavaScript/ECMA standards, to be called when the corresponding async operations complete. These reactions are passed the value of the completed operations so they can take the appropriate action. The difference between the two is that whereas the done method does not return a value, the then method returns another promise. This is what allows the async operations represented as promises to be chained or composed where one operation depends on the result of a prior one.

In the above example, once the downloads are done, the pexec command, which runs the zip program, returns another promise which will complete when the zipping is done. Remember all this is happening asynchronously.

The then_chain command is then used to chain the new promise (returned by pexec) to the one returned by the then method so that the latter completes when the pexec one completes. (Remember that all operations are asynchronous, so it would not have been possible to directly tie the pexec promise, which is created later, to the promise returned by then, which is created earlier.)
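The same chaining pattern can be tried in isolation with timers in place of downloads and external programs. This is only a sketch: it assumes the promise package is installed and loadable with package require promise, that its commands live in the promise namespace, and the ::seen and ::finished variables are invented for the illustration.

```tcl
package require promise

# First async step: fulfilled with the value "first" after 100ms.
set first [promise::ptimer 100 "first"]

# The reaction starts a second async step and ties the promise returned
# by [then] to it, so $second completes only when the new timer does.
set second [$first then [promise::lambda {value} {
    lappend ::seen $value
    promise::then_chain [promise::ptimer 100 "second"]
}]]

$second done [promise::lambda {value} {
    lappend ::seen $value
    set ::finished 1
}]

# Only needed in tclsh; a Tk application is already in the event loop.
vwait ::finished
puts $::seen
```

The reaction passed to done receives the value of the second timer rather than the first, which is the effect of then_chain.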

The use of chaining through then is the part of promises that I had the hardest time wrapping my head around. It took working through a couple of examples, at which point it became fairly straightforward.

Racing promises

We have seen two ways of combining promises: using all to construct a promise that is fulfilled when all promises in a set are fulfilled, and using then to compose or chain a sequence of promises.

A third way of combining promises is the race command, or its sibling race*. Like all, these commands return a promise that combines a list of promises. Unlike all though, the new promise is fulfilled when any of the combined promises is fulfilled.

Let us return to our dear Fibonacci number computation. Except now we would like to not wait beyond a certain time for the computation to complete. We accomplish this by creating a separate timer-based promise and waiting for either the computation to finish or the timer to fire.

set task [ptask {
    package require math
    string length [math::fibonacci 100000]
}]
set timer [ptimer 1000 "Can't wait any more"]
[race* $task $timer] done puts

Now depending on whether the task completes within 1000ms or not, we will see either the result of the computation or the message from the timer.

Note how in this last example we combined promises from different sources. That is another feature of the promise abstraction: once computations or operations are written to be promise-based, we can combine them irrespective of the underlying operation.

Looking ahead

The examples presented here have hopefully stirred your interest in promises as part of your Tcl programming arsenal. However, there are a couple of matters we have left out in our examples in this post.

First, using promises requires the underlying async operation to be wrapped with a promise interface. All the promises we used in our examples were created using wrappers, like ptask and pgeturl, that are built into the promise package. How does one then implement such a wrapper for some async operation that is not already provided by the package? This will be the topic of a future post but for starters you can look at the implementations of pgeturl, ptask and friends by clicking on the Show source link in the command documentation. Wrapping is usually fairly straightforward.

Second, we have brushed off any considerations of errors and how they are handled. This is one of the major complicating factors in async programming and another aspect that is greatly simplified through the use of promises. Again, that is a topic for a future post.

In the meantime, having seen some examples of promises in action, you may want to revisit my prior post, which went into a little more detail about promises, albeit with a slightly different API.