Transforming recursion to iteration via coroutines
This blog post is adapted from my book The Tcl Programming Language.
Many programming tasks are very simply expressed and implemented through recursive algorithms, traversing a tree data structure being just one example. The primary reason recursion simplifies implementation is that the state of the computation is implicitly maintained, freeing the programmer from the burden of explicitly tracking the computational state of the program. For example, in a recursive tree walking implementation, the "current location" in the tree is implicitly tracked.
However, there are situations where a recursive model does not fit the needs of an application. For example, the application may want to traverse a tree in iterative fashion, retrieving one node at a time, operating on it and then potentially doing some unrelated computation before retrieving the next node at some unknown point in the future.
Here is where coroutines can bridge the impedance mismatch, presenting an iterative interface to a naturally recursive algorithm.
This post assumes you know the basics of coroutines.
As a simple, but slightly contrived example, consider a collection of integers stored as a nested list of arbitrary depth.
% set nested {0 {1 {2 3}} {4 5}}
0 {1 {2 3}} {4 5}
Writing a recursive routine to invoke a command on each element of such a tree-like structure is very straightforward.
proc iterate {l cmd} {
set n [llength $l]
if {$n > 1} {
foreach e $l {
iterate $e $cmd
}
} elseif {[llength $l] == 1} {
{*}$cmd [lindex $l 0]
}
}
We can then apply some arbitrary operation to each element, for example output its square.
% iterate $nested [lambda i {puts [expr {$i*$i}]}]
0
1
4
...Additional lines omitted...
However, this is not what we are after. What we actually want is an iterator command that would return the successive integers from this collection on each call.
Converting recursive code to an iterator command is normally not possible because by its very nature the recursion encodes the state of the computation and will be lost when the iterator returns. On the other hand, writing a non-recursive implementation that explicitly keeps track of state involves quite a bit more bookkeeping.
Coroutines to the rescue. Because they can return values while preserving execution state, we can easily wrap a coroutine around the above to construct an iterator command.
proc iterator_wrapper {collection} {
yield
iterate $collection yield
}
coroutine iterator iterator_wrapper $nested
We can now retrieve one element at a time.
% iterator
0
% iterator
1
% iterator
2
To better appreciate the simplicity of the above iterator implementation, the reader might want to write one without using coroutines.
As another more real world example assume we want an iterator that will return each file or directory within a directory tree. Happily, we have fileutil::find
command from tcllib
that will do most of the hard work for us by returning a list of file paths under a specified directory. It so happens that the command is implemented using recursion.
% print_list [fileutil::find c:/temp]
c:/temp/fromDir
c:/temp/newDir
...Additional lines omitted...
However, we do not want this entire list of files at one shot for many reasons. It may consume too much memory, we may only be interested in the first few entries based on some criteria and not want to waste cycles with unnecessary disk retrievals, and so on. We would therefore like an iterator that will return a single entry at a time. Now it so happens that the fileutil::find
command takes a second optional argument that is the name of a filter command. This command is invoked for each file or directory. If the command returns a boolean true value, the file is included in the returned list. We will misappropriate this feature for our own use.
proc yield_one {fname} {
yield [file join [pwd] $fname]
return 0
}
proc file_iterator {dir} {
yield
fileutil::find $dir yield_one
}
We have cunningly fooled fileutil::find
to yield as part of its filter functionality. Let us try it out.
% coroutine nextfile file_iterator c:/temp
% nextfile
C:/temp/fromDir
% nextfile
C:/temp/newDir
% while {[set next [nextfile]] ne ""} { puts $next }
C:/temp/fromDir/fileA.txt
C:/temp/fromDir/subDir
C:/temp/newDir/fileA.txt
...Additional lines omitted...
As an aside, we could have simply used the filter callback to directly print the file path. There would be no need for a coroutine in that case. However, that is only because the "operation" on the file just consists of printing the path. In a more complex scenario, like an FTP server returning directory contents to a client one at a time, the callback approach would not be feasible.
The ability to return values from a computation while preserving its implicit state greatly simplifies programming. Once again, this is best appreciated by implementing the above example without the use of coroutines.
Of course, it goes without saying that the use of coroutines as illustrated in this post is only one of the many wide variety of applications of coroutines. Refer to The Tcler's Wiki, the aforementioned book or the article by Arjen Markus for other examples.