summaryrefslogtreecommitdiffstats
path: root/tcllib/modules/generator/generator.man
diff options
context:
space:
mode:
Diffstat (limited to 'tcllib/modules/generator/generator.man')
-rw-r--r--tcllib/modules/generator/generator.man482
1 files changed, 482 insertions, 0 deletions
diff --git a/tcllib/modules/generator/generator.man b/tcllib/modules/generator/generator.man
new file mode 100644
index 0000000..1a37669
--- /dev/null
+++ b/tcllib/modules/generator/generator.man
@@ -0,0 +1,482 @@
+[manpage_begin generator n 0.1]
+[keywords {control structure}]
+[keywords coroutine]
+[keywords filter]
+[keywords foldl]
+[keywords foldr]
+[keywords foreach]
+[keywords generator]
+[keywords iterator]
+[keywords map]
+[keywords reduce]
+[keywords scanl]
+[moddesc {Tcl Generator Commands}]
+[titledesc {Procedures for creating and using generators.}]
+[require Tcl 8.6]
+[require generator [opt 0.1]]
+[description]
+[para]
+
+The [cmd generator] package provides commands to define and iterate over
+generator expressions. A [emph generator] is a command that returns a sequence
+of values. However, unlike an ordinary command that returns a list, a
+generator [emph yields] each value and then suspends, allowing subsequent
+values to be fetched on-demand. As such, generators can be used to efficiently
+iterate over a set of values, without having to generate all answers in-memory.
+Generators can be used to iterate over elements of a data structure, or rows
+in the result set of a database query, or to decouple producer/consumer software
+designs such as parsers and tokenizers, or to implement sophisticated custom
+control strategies such as backtracking search. Generators reduce the need to
+implement custom control structures, as many such structures can be recast as
+generators, leading to both a simpler implementation and a more standardised
+interface. The generator mechanism is built on top of the Tcl 8.6 coroutine
+mechanism.
+
+[para]
+
+The package exports a single ensemble command, [cmd generator]. All
+functionality is provided as subcommands of this command. The core subcommands
+of the package are [method define], [method yield], and [method foreach]. The
+[method define] command works like Tcl's [cmd proc] command, but creates a
+generator procedure; that is, a procedure that returns a generator when called.
+The generator itself is a command that can be called multiple times: each time
+it returns the next value in the generated series. When the
+series has been exhausted, the generator command returns an empty list and then
+destroys itself. Rather than manually call a generator, however, the package
+also provides a flexible [method foreach] command that loops through the values of
+one or more generators. This loop construct mimicks the functionality of the
+built-in Tcl [cmd foreach] command, including handling multiple return values
+and looping over multiple generators at once. Writing a generator is also a
+simple task, much like writing a normal procedure: simply use the [method define]
+command to define the generator, and then call [method yield] instead of [cmd return].
+For example, we can define a generator for looping through the integers
+in a particular range:
+
+[para]
+[example {
+ generator define range {n m} {
+ for {set i $n} {$i <= $m} {incr i} { generator yield $i }
+ }
+ generator foreach x [range 1 10] {
+ puts "x = $x"
+ }
+}]
+
+[para]
+
+The above example will print the numbers from 1 to 10 in sequence, as you would
+expect. The difference from a normal loop over a list is that the numbers are
+only generated as they are needed. If we insert a break into the loop then any
+remaining numbers in the sequence would never be generated. To illustrate, we
+can define a generator that produces the sequence of natural numbers: an
+infinite series. A normal procedure would never return trying to produce this
+series as a list. By using a generator we only have to generate those values
+which are actually used:
+
+[para]
+[example {
+ generator define nats {} {
+ while 1 { generator yield [incr nat] }
+ }
+ generator foreach n [nats] {
+ if {$n > 100} { break }
+ }
+}]
+
+[section COMMANDS]
+[list_begin definitions]
+
+[call [cmd generator] [method define] [arg name] [arg params] [arg body]]
+
+Creates a new generator procedure. The arguments to the command are identical to
+those for [cmd proc]: a [arg name], a list of parameters, and a body. The
+parameter list format is identical to a procedure. In particular, default values
+and the [opt args] syntax can be used as usual. Each time the resulting
+generator procedure is called it creates a new generator command (coroutine)
+that will yield a list of values on each call. Each result from a generator is
+guaranteed to be a non-empty list of values. When a generator is exhausted it
+returns an empty list and then destroys itself to free up resources. It is an
+error to attempt to call an exhausted generator as the command no longer exists.
+
+[call [cmd generator] [method yield] [arg arg] [opt [arg args..]]]
+
+Used in the definition of a generator, this command returns the next set of
+values to the consumer. Once the [method yield] command has been called the
+generator will suspend to allow the consumer to process that value. When the
+next value is requested, the generator will resume as if the yield command had
+just returned, and can continue processing to yield the next result. The
+[method yield] command must be called with at least one argument, but can be called with
+multiple arguments, in which case this is equivalent to calling [method yield]
+once for each argument.
+
+[call [cmd generator] [method foreach] [arg varList] [arg generator] [arg varList] \
+ [arg generator] [opt ...] [arg body]]
+
+Loops through one or more generators, assigning the next values to variables and
+then executing the loop body. Works much like the built-in [cmd foreach]
+command, but working with generators rather than lists. Multiple generators can
+be iterated over in parallel, and multiple results can be retrieved from a
+single generator at once. Like the built-in [cmd foreach], the loop will
+continue until all of the generators have been exhausted: variables for
+generators that are exhausted early will be set to the empty string.
+
+[para]
+
+The [method foreach] command will automatically clean-up all of the generators
+at the end of the loop, regardless of whether the loop terminated early or not.
+This behaviour is provided as a convenience to avoid having to explicitly
+clean up a generator in the usual cases. Generators can however be destroyed
+before the end of the loop, in which case the loop will continue as normal until
+all the other generators have been destroyed or exhausted.
+
+[para]
+
+The [method foreach] command does not take a snapshot of the generator. Any
+changes in the state of the generator made inside the loop or by other code will
+affect the state of the loop. In particular, if the code in the loop invokes the
+generator to manually retrieve the next element, this element will then be
+excluded from the loop, and the next iteration will continue from the element
+after that one. Care should be taken to avoid concurrent updates to generators
+unless this behaviour is required (e.g., in argument processing).
+
+[call [cmd generator] [method next] [arg generator] [opt [arg varName..]]]
+
+Manually retrieves the next values from a generator. One value is retrieved for
+each variable supplied and assigned to the corresponding variable. If the
+generator becomes exhausted at any time then any remaining variables are set to
+the empty string.
+
+[call [cmd generator] [method exists] [arg generator]]
+
+Returns 1 if the generator (still) exists, or 0 otherwise.
+
+[call [cmd generator] [method names]]
+
+Returns a list of all currently existing generator commands.
+
+[call [cmd generator] [method destroy] [opt [arg generator..]]]
+
+Destroys one or more generators, freeing any associated resources.
+
+[call [cmd generator] [method finally] [arg cmd] [opt [arg arg..]]]
+
+Used in the definition of a generator procedure, this command arranges for a
+resource to be cleaned up whenever the generator is destroyed, either explicitly
+or implicitly when the generator is exhausted. This command can be used like a
+[method finally] block in the [cmd try] command, except that it is tied to the
+life-cycle of the generator rather than to a particular scope. For example, if
+we create a generator to iterate over the lines in a text file, we can use
+[method finally] to ensure that the file is closed whenever the generator is
+destroyed:
+
+[para]
+[example {
+ generator define lines file {
+ set in [open $file]
+ # Ensure file is always closed
+ generator finally close $in
+ while {[gets $in line] >= 0} {
+ generator yield $line
+ }
+ }
+ generator foreach line [lines /etc/passwd] {
+ puts "[incr count]: $line"
+ if {$count > 10} { break }
+ }
+ # File will be closed even on early exit
+}]
+
+[para]
+
+If you create a generator that consumes another generator (such as the standard
+[method map] and [method filter] generators defined later), then you should use
+a [method finally] command to ensure that this generator is destroyed when its
+parent is. For example, the [method map] generator is defined as follows:
+
+[para]
+[example {
+ generator define map {f xs} {
+ generator finally generator destroy $xs
+ generator foreach x $xs { generator yield [{*}$f $x] }
+ }
+}]
+
+[call [cmd generator] [method from] [arg format] [arg value]]
+
+Creates a generator from a data structure. Currently, supported formats are
+[option list], [option dict], or [option string]. The list format yields each
+element in turn. For dictionaries, each key and value are yielded separately.
+Finally, strings are yielded a character at a time.
+
+[call [cmd generator] [method to] [arg format] [arg generator]]
+
+Converts a generator into a data structure. This is the reverse operation of the
+[method from] command, and supports the same data structures. The two operations
+obey the following identity laws (where [method =] is interpreted
+appropriately):
+
+[para]
+[example {
+ [generator to $fmt [generator from $fmt $value]] = $value
+ [generator from $fmt [generator to $fmt $gen]] = $gen
+
+}]
+
+[list_end]
+
+[section PRELUDE]
+[para]
+
+The following commands are provided as a standard library of generator
+combinators and functions that perform convenience operations on generators. The
+functions in this section are loosely modelled on the equivalent functions from
+the Haskell Prelude. [emph Warning:] most of the functions in this prelude
+destroy any generator arguments they are passed as a side-effect. If you want to
+have persistent generators, see the streams library.
+
+[list_begin definitions]
+
+[call [cmd generator] [method map] [arg function] [arg generator]]
+
+Apply a function to every element of a generator, returning a new generator of
+the results. This is the classic map function from functional programming,
+applied to generators. For example, we can generate all the square numbers using
+the following code (where [cmd nats] is defined as earlier):
+
+[para]
+[example {
+ proc square x { expr {$x * $x} }
+ generator foreach n [generator map square [nats]] {
+ puts "n = $n"
+ if {$n > 1000} { break }
+ }
+}]
+
+[call [cmd generator] [method filter] [arg predicate] [arg generator]]
+
+Another classic functional programming gem. This command returns a generator
+that yields only those items from the argument generator that satisfy the
+predicate (boolean function). For example, if we had a generator [var employees]
+that returned a stream of dictionaries representing people, we could filter all
+those whose salaries are above 100,000 dollars (or whichever currency you prefer)
+using a simple filter:
+
+[para]
+[example {
+ proc salary> {amount person} { expr {[dict get $person salary] > $amount} }
+ set fat-cats [generator filter {salary> 100000} $employees]
+}]
+
+[call [cmd generator] [method reduce] [arg function] [arg zero] [arg generator]]
+
+This is the classic left-fold operation. This command takes a function, an
+initial value, and a generator of values. For each element in the generator it
+applies the function to the current accumulator value (the [arg zero] argument
+initially) and that element, and then uses the result as the new accumulator
+value. This process is repeated through the entire generator (eagerly) and the
+final accumulator value is then returned. If we consider the function to be a
+binary operator, and the zero argument to be the left identity element of that
+operation, then we can consider the [method reduce] command as [emph folding]
+the operator between each successive pair of values in the generator in a
+left-associative fashion. For example, the sum of a sequence of numbers can be
+calculated by folding a [cmd +] operator between them, with 0 as the identity:
+
+[para]
+[example {
+ # sum xs = reduce + 0 xs
+ # sum [range 1 5] = reduce + 0 [range 1 5]
+ # = reduce + [+ 0 1] [range 2 5]
+ # = reduce + [+ 1 2] [range 3 5]
+ # = ...
+ # = reduce + [+ 10 5] <empty>
+ # = ((((0+1)+2)+3)+4)+5
+ # = 15
+ proc + {a b} { expr {$a + $b} }
+ proc sum gen { generator reduce + 0 $gen }
+ puts [sum [range 1 10]]
+}]
+
+[para]
+
+The [method reduce] operation is an extremely useful one, and a great variety of
+different operations can be defined using it. For example, we can define a
+factorial function as the product of a range using generators. This definition
+is both very clear and also quite efficient (in both memory and running time):
+
+[para]
+[example {
+ proc * {x y} { expr {$x * $y} }
+ proc prod gen { generator reduce * 0 $gen }
+ proc fac n { prod [range 1 $n] }
+}]
+
+[para]
+
+However, while the [method reduce] operation is efficient for finite generators,
+care should be taken not to apply it to an infinite generator, as this will
+result in an infinite loop:
+
+[para]
+[example {
+ sum [nats]; # Never returns
+}]
+
+[call [cmd generator] [method foldl] [arg function] [arg zero] [arg generator]]
+
+This is an alias for the [method reduce] command.
+
+[call [cmd generator] [method foldr] [arg function] [arg zero] [arg generator]]
+
+This is the right-associative version of [method reduce]. This operation is
+generally inefficient, as the entire generator needs to be evaluated into memory
+(as a list) before the reduction can commence. In an eagerly evaluated language
+like Tcl, this operation has limited use, and should be avoided if possible.
+
+[call [cmd generator] [method all] [arg predicate] [arg generator]]
+
+Returns true if all elements of the generator satisfy the given predicate.
+
+[call [cmd generator] [method and] [arg generator]]
+
+Returns true if all elements of the generator are true (i.e., takes the logical
+conjunction of the elements).
+
+[call [cmd generator] [method any] [arg generator]]
+
+Returns true if any of the elements of the generator are true (i.e., logical
+disjunction).
+
+[call [cmd generator] [method concat] [arg generator] [opt [arg generator..]]]
+
+Returns a generator which is the concatenation of each of the argument
+generators.
+
+[call [cmd generator] [method concatMap] [arg function] [arg generator]]
+
+Given a function which maps a value to a series of values, and a generator of
+values of that type, returns a generator of all of the results in one flat
+series. Equivalent to [method concat] applied to the result of [method map].
+
+[call [cmd generator] [method drop] [arg n] [arg generator]]
+
+Removes the given number of elements from the front of the generator and returns
+the resulting generator with those elements removed.
+
+[call [cmd generator] [method dropWhile] [arg predicate] [arg generator]]
+
+Removes all elements from the front of the generator that satisfy the predicate.
+
+[call [cmd generator] [method contains] [arg element] [arg generator]]
+
+Returns true if the generator contains the given element. Note that this will
+destroy the generator!
+
+[call [cmd generator] [method foldl1] [arg function] [arg generator]]
+
+A version of [method foldl] that takes the [arg zero] argument from the first
+element of the generator. Therefore this function is only valid on non-empty
+generators.
+
+[call [cmd generator] [method foldli] [arg function] [arg zero] [arg generator]]
+
+A version of [method foldl] that supplies the integer index of each element as
+the first argument to the function. The first element in the generator at this
+point is given index 0.
+
+[call [cmd generator] [method foldri] [arg function] [arg zero] [arg generator]]
+
+Right-associative version of [method foldli].
+
+[call [cmd generator] [method head] [arg generator]]
+
+Returns the first element of the generator.
+
+[call [cmd generator] [method tail] [arg generator]]
+
+Removes the first element of the generator, returning the rest.
+
+[call [cmd generator] [method init] [arg generator]]
+
+Returns a new generator consisting of all elements except the last of the
+argument generator.
+
+[call [cmd generator] [method takeList] [arg n] [arg generator]]
+
+Returns the next [arg n] elements of the generator as a list. If not enough
+elements are left in the generator, then just the remaining elements are
+returned.
+
+[call [cmd generator] [method take] [arg n] [arg generator]]
+
+Returns the next [arg n] elements of the generator as a new generator. The old
+generator is destroyed.
+
+[call [cmd generator] [method iterate] [arg function] [arg init]]
+
+Returns an infinite generator formed by repeatedly applying the function to the
+initial argument. For example, the Fibonacci numbers can be defined as follows:
+
+[para]
+[example {
+ proc fst pair { lindex $pair 0 }
+ proc snd pair { lindex $pair 1 }
+ proc nextFib ab { list [snd $ab] [expr {[fst $ab] + [snd $ab]}] }
+ proc fibs {} { generator map fst [generator iterate nextFib {0 1}] }
+}]
+
+[call [cmd generator] [method last] [arg generator]]
+
+Returns the last element of the generator (if it exists).
+
+[call [cmd generator] [method length] [arg generator]]
+
+Returns the length of the generator, destroying it in the process.
+
+[call [cmd generator] [method or] [arg predicate] [arg generator]]
+
+Returns 1 if any of the elements of the generator satisfy the predicate.
+
+[call [cmd generator] [method product] [arg generator]]
+
+Returns the product of the numbers in a generator.
+
+[call [cmd generator] [method repeat] [arg n] [arg value..]]
+
+Returns a generator that consists of [arg n] copies of the given elements. The
+special value [emph Inf] can be used to generate an infinite sequence.
+
+[call [cmd generator] [method sum] [arg generator]]
+
+Returns the sum of the values in the generator.
+
+[call [cmd generator] [method takeWhile] [arg predicate] [arg generator]]
+
+Returns a generator of the first elements in the argument generator that satisfy
+the predicate.
+
+[call [cmd generator] [method splitWhen] [arg predicate] [arg generator]]
+
+Splits the generator into lists of elements using the predicate to identify
+delimiters. The resulting lists are returned as a generator. Elements matching
+the delimiter predicate are discarded. For example, to split up a generator
+using the string "|" as a delimiter:
+
+[para]
+[example {
+ set xs [generator from list {a | b | c}]
+ generator split {string equal "|"} $xs ;# returns a then b then c
+}]
+
+[call [cmd generator] [method scanl] [arg function] [arg zero] [arg generator]]
+
+Similar to [method foldl], but returns a generator of all of the intermediate
+values for the accumulator argument. The final element of this generator is
+equivalent to [method foldl] called on the same arguments.
+
+[list_end]
+
+[section {BUGS, IDEAS, FEEDBACK}]
+
+Please report any errors in this document, or in the package it describes, to
+[uri {mailto:nem@cs.nott.ac.uk} {Neil Madden}].
+[manpage_end]