Introducing Tcl 8.7 Part 13: more list enhancements
This is the thirteenth in a series of posts about new features in the upcoming version 8.7 of Tcl. New commands for list manipulation in Tcl 8.7 were described in a previous post. Since then, some additional enhancements related to lists have been implemented. These are described in this post.
To take Tcl 8.7 for a spin, you can download a pre-alpha binary for your platform. Alternatively, you can build it yourself from the
core-8-branch
branch in the Tcl fossil repository.
The enhancements described here are not available in an officially released Tcl 8.7 build at the time of writing. You will need to build from source in the core-8-branch
in the Tcl fossil source repository. On Windows, you can download a binary distribution from magicsplat
The lseq
command
The lseq
command creates a list containing a sequence of numeric values. It can take one of several syntactic forms. The first of these generates a sequence starting at 0 and containing a specified number of values.
lseq COUNT ?by STEP?
The optional argument STEP
is the difference between successive values in the sequence and defaults to 1. It may be negative as well.
% lseq 5
0 1 2 3 4
% lseq 5 by 2
0 2 4 6 8
% lseq 5 by -1
0 -1 -2 -3 -4
The second form of the command is similar except that it permits the initial value to be specified.
lseq START count COUNT ??by? STEP??
The generated sequence will begin at START
instead of at 0
.
% lseq -2 count 5
-2 -1 0 1 2
% lseq 2 count 5 by -1
2 1 0 -1 -2
% lseq 2 count 5 -1
2 1 0 -1 -2
Note the by
keyword is optional as shown above, but the count
keyword must be specified to distinguish this form from the third form of the command described below.
The final form of the command differs from the first two in that it specifies the range covered by its end value as opposed to a count.
lseq START ?..|to? END ??by? STEP?
The sequence is generated by incrementing each prior element by STEP
until it crosses the END
value.
% lseq 1 to 5
1 2 3 4 5
% lseq 1 .. 5
1 2 3 4 5
% lseq 1 5
1 2 3 4 5
% lseq 1 to 5 by 2
1 3 5
% lseq 1 to 5 by 3
1 4
% lseq 1 to -5 by -2
1 -1 -3 -5
The ..
and to
keywords are synonyms and optional as is by
. I prefer to use them for readability. Note in the above examples that the END
value may or may not be present in the sequence depending on whether it differs from START
by an integral number of steps.
In this third form, if the STEP
argument is not specified, it defaults to an absolute value of 1
with the sign of the step determined by the relative values of START
and END
. For example,
% lseq -2 to 2
-2 -1 0 1 2
% lseq 2 to -2
2 1 0 -1 -2
In all the above examples, the numeric values passed to the command were constants. However, lseq
will also accept expressions in the same syntax as the Tcl expr
command. For example,
% set n 5
5
% lseq {$n-2} {$n+2}
3 4 5 6 7
The lseq
command is not restricted to integer values. It can be used with floating point values as well. However, you need to be aware of issues around floating point computations that are not specific to lseq
, or even Tcl, but computers in general. For example,
% lseq 0 2 by 0.5
0.0 0.5 1.0 1.5 2.0
% lseq 0 2 by 0.4
0.0 0.4 0.8 1.2000000000000002 1.6
Note in the second case how the last element is 1.6, and not 2.0. That's floating point math for you. Because of the imprecision of floating point computation, the value after 1.6
is calculated as greater than 2.0
albeit by a very minute amount.
Below are a couple of examples to illustrate how lseq
can make code more concise. You may want to implement the same functionality in Tcl 8.6 for a comparison.
In many cases, lseq
replaces uses of for
with more intuitive list based iterations. Generate a list of first 5 squares:
% set l [lmap i [lseq 1 count 5] { expr {$i*$i} }]
1 4 9 16 25
Remove elements at even indices from a list:
% set l {a b c d e f g}
a b c d e f g
% lremove $l {*}[lseq [llength $l] by 2]
b d f
lseq
can also be useful within expressions. Check if an integer is odd and within a specified range:
% expr {$i in [lseq 1 99 by 2]}
1
One final important point to keep in mind about lseq
is that it is very efficient in terms of memory as unlike "traditional" lists it does not need to actually store values of every element. Thus even [lseq 100000000]
takes up less than a hundred bytes of storage.
The ledit
command
The new ledit
command is very similar to the lreplace
command in Tcl 8.6 except that instead of operating directly on a list value, it operates on a list value contained in a variable and stores it back in that variable. The command syntax is
ledit LISTVAR FIRST LAST ?ELEM ...?
The FIRST
and LAST
arguments are indices that specify the range of elements to be deleted (inclusive). The optional ELEM
arguments specify the new elements to be inserted at index FIRST
.
% set l [list a b c d e]
a b c d e
% ledit l 2 3 X Y Z
a b X Y Z e
% set l
a b X Y Z e
Because of Tcl's reference counting mechanism, commands that operate on variables are generally more efficient than those operating on values. This is as true for ledit
as it is for other list commands like lappend
that operate on variables and is the reason for prefering it to lreplace
when the existing value does not need to be preserved. The following performance measurement samples illustrate the difference.
% timerate -calibrate {}
0.03035015489656804 µs/#-overhead 0.031086 µs/# 62792591 # 32168335 #/sec
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {set l [lreplace $l 499 500 X Y]}
2.956879 µs/# 334759 # 338194 #/sec 989.842 net-ms
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {ledit l 499 500 X Y}
0.240055 µs/# 3698156 # 4165711 #/sec 887.761 net-ms
Like lreplace
, the command can be used not just for replacing a range of elements but for pure insertions and deletes as well. The command can be used in lieu of linsert
by specifying LAST
as a value less than FIRST
. Moreover, for the same reasons as stated above it can be more efficient as well.
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {set l [linsert $l 500 X]}
44.4620 µs/# 22477 # 22491.1 #/sec 999.373 net-ms
% set l [lrepeat 1000 x]; llength $l
1000
% timerate {ledit l 500 499 X}
0.337232 µs/# 2720481 # 2965318 #/sec 917.433 net-ms
For large lists, the performance improvement is substantial as seen above. Note: similar performance can be achieved with linsert
as well with use of the K operator. That is however less obvious and clear from a readability perspective.
Similarly, ledit
can be used in place of lremove
when a contiguous range of elements are to be removed leaving out any ELEM
arguments.
% set l [list a b c d e]
a b c d e
% ledit l 2 3
a b e
% set l
a b e
Performance improvements
The internal representation of lists has changed in Tcl 8.7 resulting in significant improvements in performance even for existing commands in some common scenarios.
Repeated insertion at the front of a list is faster, order of magnitude as lists get longer.
Tcl 8.6:
% set l {}
% timerate {set l [linsert $l 0 X]}
79.5236 µs/# 40010 # 12574.9 #/sec 3181.741 net-ms
Tcl 8.7:
% set l {}
% timerate {set l [linsert $l 0 X]}
0.193749 µs/# 4447614 # 5161315 #/sec 861.721 net-ms
When writing for Tcl 8.6, appending to a list using lappend
was often favoured over prepending as the former was so much faster. This need no longer be the case in 8.7.
Range operations on large lists are also signficantly faster.
Tcl 8.6:
% set l [lrepeat 1000 X] ; llength $l
1000
% timerate {lrange $l 1 end-1}
3.350451 µs/# 295636 # 298467 #/sec 990.514 net-ms
Tcl 8.7:
% set l [lrepeat 1000 X] ; llength $l
1000
% timerate {lrange $l 1 end-1}
0.115875 µs/# 6804325 # 8630012 #/sec 788.449 net-ms
The difference is even more pronounced for larger lists. There are also benefits in terms of reduced memory usage.
The lassign
command also benefits from faster range operations. Other less pronounced improvements are listed in TIP 625.
References
TIP 629: Add a lseq (formally "range") command to the core of list commands
TIP 631: ledit - a generalized insert/delete command for list variables