diff options
author | William Joye <wjoye@cfa.harvard.edu> | 2016-10-27 19:39:39 (GMT) |
---|---|---|
committer | William Joye <wjoye@cfa.harvard.edu> | 2016-10-27 19:39:39 (GMT) |
commit | ea28451286d3ea4a772fa174483f9a7a66bb1ab3 (patch) | |
tree | 6ee9d8a7848333a7ceeee3b13d492e40225f8b86 /tcllib/modules/grammar_me | |
parent | b5ca09bae0d6a1edce939eea03594dd56383f2c8 (diff) | |
parent | 7c621da28f07e449ad90c387344f07a453927569 (diff) | |
download | blt-ea28451286d3ea4a772fa174483f9a7a66bb1ab3.zip blt-ea28451286d3ea4a772fa174483f9a7a66bb1ab3.tar.gz blt-ea28451286d3ea4a772fa174483f9a7a66bb1ab3.tar.bz2 |
Merge commit '7c621da28f07e449ad90c387344f07a453927569' as 'tcllib'
Diffstat (limited to 'tcllib/modules/grammar_me')
26 files changed, 8610 insertions, 0 deletions
diff --git a/tcllib/modules/grammar_me/ChangeLog b/tcllib/modules/grammar_me/ChangeLog new file mode 100644 index 0000000..07b3df3 --- /dev/null +++ b/tcllib/modules/grammar_me/ChangeLog @@ -0,0 +1,211 @@ +2013-02-01 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.15 ======================== + * + +2013-01-08 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.test: (mevmtcl-ict_match_tokclass-1.2*): Fixed test + results for Tcl 8.6 and higher. + +2011-12-13 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.14 ======================== + * + +2011-01-24 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.13 ======================== + * + +2009-12-07 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.12 ======================== + * + +2008-12-12 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.11.1 ======================== + * + +2008-10-16 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.11 ======================== + * + +2008-05-15 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_cpucore.man: Updated to changes in doctools (sub)section + * me_cpu.man: reference handling. + * me_tcl.man: + +2008-03-20 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * gasm.man: Added documentation for package 'grammar::me::cpu::gasm'. + +2007-09-12 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.10 ======================== + * + +2007-08-01 Andreas Kupries <andreask@activestate.com> + + * me_util.test: Updated to use the TestAccel utility commands to + handle accelerators. + * me_tcl.test: Updated 8.5 specific test to extensions in 'string is'. + +2007-06-22 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_cpu.testsuite: Replaced deprecated {expand} syntax in + comments with {*}. + +2007-03-22 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_ast.man: Fixed all warnings due to use of now deprecated + * me_cpu.man: commands. Added a section about how to give + * me_cpucore.man: feedback. + * me_intro.man: + * me_tcl.man: + * me_util.man: + * me_vm.man: + +2006-10-03 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.9 ======================== + * + +2006-09-14 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_cpu.testsuite: Replaced hardwired snit error messages with + command constructing them based on the version of snit + used. Fixed test names as well. + + * me_cpucore.testsuite: Replaced hardwired error messages with + command constructing them based on the version of Tcl used. + +2006-06-28 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * grammar_me/me_cpucore.tests.semantics.txt: Extended to cover the + remaining untested instructions. This completes the behavioural + tests. + + * grammar_me/me_cpucore.man: Added documentation for the new + * grammar_me/me_cpucore.tcl: accessor commands. Fixed problems + * grammar_me/me_cpucore.test: uncovered by the last tests. + + * grammar_me/me_cpu.man: Lifted all changes to the cpucore into + * grammar_me/me_cpu.tcl: the cpu object (extended acessors, + * grammar_me/me_cpu.test: documentation, etc.). Created testsuite + * grammar_me/me_cpu.testsuite: using the core testsuite as + template, and sharing the instruction descriptions with it. + + * grammar_me/gasm.tcl: New package for the assembly of a ME + * grammar_me/pkgIndex.tcl: program, using a graph as internal + structure. Bumped the versions of the cpu::core and cpu packages + as well. + +2006-06-22 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_cpucore.tcl (asm): Extended the assembler to detect and + ignore comment pseudo instructions. + +2006-06-20 Andreas Kupries <andreask@activestate.com> + + * me_cpucore.tests.semantics.txt: Extended coverage of testsuite, + * me_cpucore.testsuite: more bugfixes. + * me_cpucore.tcl: + + * me_cpucore.tcl: Bug fixes, additional state accessor commands, + more argument checking, extended error messages from validator + used by disassembler and state creation. + + * me_cpucore.test: Added testsuite for the cpu, + * me_cpucore.testsuite: already semi-prepared for when + * me_cpucore.tests.asm-map.txt: we get a C impl. of the ME cpu. + * me_cpucore.tests.badasm-map.txt: Largely table-driven. + * me_cpucore.tests.badmach-map.txt: Incomplete. + * me_cpucore.tests.semantics.txt: + +2006-06-15 Andreas Kupries <andreask@activestate.com> + + * me_util.test: Split tests into separate file and added handling + of both regular and critcl tree. + * me_util.testsuite: New file. Actual tests. + +2006-01-28 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.test: Fixed use of duplicate test names. + +2006-01-22 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.test: More boilerplate simplified via use of test support. + * me_util.test: + +2006-01-19 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.test: Hooked into the new common test support code. + * me_util.test: + +2005-10-06 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * + * Released and tagged Tcllib 1.8 ======================== + * + +2005-10-03 Andreas Kupries <andreask@activestate.com> + + * me_tcl.test (ME_state): token ranks are stored in an array/dict, + used dictsort to generate a canonical representation we can + compare against. ... Also duplicate tokclass test depending on + error message by "string is", different results in 8.4 and 8.5. + +2005-09-30 Andreas Kupries <andreask@activestate.com> + + * me_cpucore.tcl: Fixed more typos. + * me_cpucore.tcl: Fixed namespace typo. + + * me_tcl.test: Added forgotten check to tests, the packages cannot + * me_utils.test: run against Tcl 8.3 and below. + +2005-09-08 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_vm.man: Removed the hardwired generation of error + * me_tcl.man: messages and made them explicit arguments of the + * me_tcl.tcl: instructions which can generate errors. More + * me_tcl.test: work for a generator, but ensures that + * me_cpucore.man: user-strings are not contaminated by Tcl code + * me_cpucore.tcl: (character representation). + +2005-09-01 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.tcl: Fixed bug uncovered by the testuite. + + * me_tcl.test: Completed the testsuite. + +2005-08-31 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * me_tcl.man: Added more accessor commands, to be able to inspect + the full state of the ME vm when running the testsuite. + + * me_tcl.tcl: Implemented the newly specified accessor commands. + + * me_tcl.test: **New file** Skeleton framework for the testsuite + of the ME vm. + + * me_util.tcl: Added argument sanity checking. + + * me_util.test: **New file** Added a testsuite for the commands + doing the conversions between AST representations. + +2005-04-15 Andreas Kupries <andreas_kupries@users.sourceforge.net> + + * New module: Virtual machine for parsing, various + implementations, utilities. diff --git a/tcllib/modules/grammar_me/gasm.man b/tcllib/modules/grammar_me/gasm.man new file mode 100644 index 0000000..10172fc --- /dev/null +++ b/tcllib/modules/grammar_me/gasm.man @@ -0,0 +1,439 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me::cpu::gasm n 0.1] +[keywords assembler] +[keywords grammar] +[keywords graph] +[keywords parsing] +[keywords tree] +[keywords {virtual machine}] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {ME assembler}] +[category {Grammars and finite automata}] +[require grammar::me::cpu::gasm [opt 0.1]] +[description] + +This package provides a simple in-memory assembler. Its origin is that +of a support package for use by packages converting PEG and other +grammars into a corresponding matcher based on the ME virtual machine, +like [package page::compiler::peg::mecpu]. Despite that it is actually +mostly agnostic regarding the instructions, users can choose any +instruction set they like. + +[para] + +The program under construction is held in a graph structure (See +package [package struct::graph]) during assembly and subsequent +manipulation, with instructions represented by nodes, and the flow of +execution between instructions explicitly encoded in the arcs between +them. + +[para] + +In this model jumps are not encoded explicitly, they are implicit in +the arcs. The generation of explicit jumps is left to any code +converting the graph structure into a more conventional +representation. The same goes for branches. They are implicitly +encoded by all instructions which have two outgoing arcs, whereas all +other instructions have only one outgoing arc. Their conditonality is +handled by tagging their outgoing arcs with information about the +conditions under which they are taken. + +[para] + +While the graph the assembler operates on is supplied from the +outside, i.e. external, it does manage some internal state, namely: + +[list_begin enumerated] +[enum] The handle of the graph node most assembler operations will +work on, the [term anchor]. + +[enum] A mapping from arbitrary strings to instructions. I.e. it is +possible to [term label] an instruction during assembly, and later +recall that instruction by its label. + +[enum] The condition code to use when creating arcs between +instructions, which is one of [const always], [const ok], and +[const fail]. + +[enum] The current operation mode, one of [const halt], +[const okfail], and [const !okfail]. + +[enum] The name of a node in a tree. This, and the operation mode +above are the parts most heavily influenced by the needs of a grammar +compiler, as they assume some basic program structures (selected +through the operation mode), and intertwine the graph with a tree, +like the AST for the grammar to be compiled. + +[list_end] + +[section DEFINITIONS] + +As the graph the assembler is operating on, and the tree it is +intertwined with, are supplied to the assembler from the outside it is +necessary to specify the API expected from them, and to describe the +structures expected and/or generated by the assembler in either. + +[para] + +[list_begin enumerated] + +[enum] Any graph object command used by the assembler has to provide +the API as specified in the documentation for the package +[package struct::graph]. + +[enum] Any tree object command used by the assembler has to provide +the API as specified in the documentation for the package +[package struct::tree]. + +[enum] Any instruction (node) generated by the assembler in a graph +will have at least two, and at most three attributes: + +[list_begin definitions] + +[def [const instruction]] The value of this attribute is the name of +the instruction. The only names currently defined by the assembler are +the three pseudo-instructions + +[comment {Fix nroff backend so that the put the proper . on the command name}] +[list_begin definitions] + +[def [const NOP]] This instruction does nothing. Useful for fixed +framework nodes, unchanging jump destinations, and the like. No +arguments. + +[def [const C]] A .NOP to allow the insertion of arbitrary comments +into the instruction stream, i.e. a comment node. One argument, the +text of the comment. + +[def [const BRA]] A .NOP serving as explicitly coded conditional +branch. No arguments. + +[list_end] + +However we reserve the space of all instructions whose names begin +with a "." (dot) for future use by the assembler. + +[def [const arguments]] The value of this attribute is a list of +strings, the arguments of the instruction. The contents are dependent +on the actual instruction and the assembler doesn't know or care about +them. This means for example that it has no builtin knowledge about +what instruction need which arguments and thus doesn't perform any +type of checking. + +[def [const expr]] This attribute is optional. When it is present its +value is the name of a node in the tree intertwined with the graph. + +[list_end] + +[enum] Any arc between two instructions will have one attribute: + +[list_begin definitions] + +[def [const condition]] The value of this attribute determines under which +condition execution will take this arc. It is one of [const always], +[const ok], and [const fail]. The first condition is used for all arcs +which are the single outgoing arc of an instruction. The other two are +used for the two outgoing arcs of an instruction which implicitly +encode a branch. + +[list_end] + +[enum] A tree node given to the assembler for cross-referencing will +be written to and given the following attributes, some fixed, some +dependent on the operation mode. All values will be references to +nodes in the instruction graph. Some of the instruction will expect +some or specific sets of these attributes. + +[list_begin definitions] +[def [const gas::entry]] Always written. +[def [const gas::exit]] Written for all modes but [const okfail]. +[def [const gas::exit::ok]] Written for mode [const okfail]. +[def [const gas::exit::fail]] Written for mode [const okfail]. +[list_end] + +[list_end] + +[section API] + +[list_begin definitions] + +[call [cmd ::grammar::me::cpu::gasm::begin] [arg g] [arg n] [opt [arg mode]] [opt [arg note]]] + +This command starts the assembly of an instruction sequence, and +(re)initializes the state of the assembler. After completion of the +instruction sequence use [cmd ::grammar::me::cpu::gasm::done] to +finalize the assembler. + +[para] + +It will operate on the graph [arg g] in the specified [arg mode] +(Default is [const okfail]). As part of the initialization it will +always create a standard .NOP instruction and label it "entry". The +creation of the remaining standard instructions is +[arg mode]-dependent: + +[list_begin definitions] + +[def [const halt]] An "icf_halt" instruction labeled "exit/return". + +[def [const !okfail]] An "icf_ntreturn" instruction labeled "exit/return". + +[def [const okfail]] Two .NOP instructions labeled "exit/ok" and +"exit/fail" respectively. + +[list_end] + +The [arg note], if specified (default is not), is given to the "entry" .NOP instruction. + +[para] + +The node reference [arg n] is simply stored for use by +[cmd ::grammar::me::cpu::gasm::done]. It has to refer to a node in the +tree [arg t] argument of that command. + +[para] + +After the initialization is done the "entry" instruction will be the +[term anchor], and the condition code will be set to [const always]. + +[para] + +The command returns the empy string as its result. + +[call [cmd ::grammar::me::cpu::gasm::done] [const -->] [arg t]] + +This command finalizes the creation of an instruction sequence and +then clears the state of the assembler. +[emph NOTE] that this [emph {does not}] delete any of the created +instructions. They can be made available to future begin/done cycles. +Further assembly will be possible only after reinitialization of the +system via [cmd ::grammar::me::cpu::gasm::begin]. + +[para] + +Before the state is cleared selected references to selected +instructions will be written to attributes of the node [arg n] in the +tree [arg t]. + +Which instructions are saved is [arg mode]-dependent. Both [arg mode] +and the destination node [arg n] were specified during invokation of +[cmd ::grammar::me::cpu::gasm::begin]. + +[para] + +Independent of the mode a reference to the instruction labeled "entry" +will be saved to the attribute [const gas::entry] of [arg n]. The +reference to the node [arg n] will further be saved into the attribute +"expr" of the "entry" instruction. Beyond that + +[list_begin definitions] + +[def [const halt]] A reference to the instruction labeled +"exit/return" will be saved to the attribute [const gas::exit] of +[arg n]. + +[def [const okfail]] See [const halt]. + +[def [const !okfail]] Reference to the two instructions labeled +"exit/ok" and "exit/fail" will be saved to the attributes +[const gas::exit::ok] and [const gas::exit::fail] of [arg n] +respectively. + +[list_end] + +[para] + +The command returns the empy string as its result. + +[call [cmd ::grammar::me::cpu::gasm::state]] + +This command returns the current state of the assembler. Its format is +not documented and considered to be internal to the package. + +[call [cmd ::grammar::me::cpu::gasm::state!] [arg s]] + +This command takes a serialized assembler state [arg s] as returned by +[cmd ::grammar::me::cpu::gasm::state] and makes it the current state +of the assembler. + +[para] + +[emph Note] that this may overwrite label definitions, however all +non-conflicting label definitions in the state before are not touched +and merged with [arg s]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::lift] [arg t] [arg dst] [const =] [arg src]] + +This command operates on the tree [arg t]. It copies the contents of +the attributes [const gas::entry], [const gas::exit::ok] and +[const gas::exit::fail] from the node [arg src] to the node [arg dst]. + +It returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::Inline] [arg t] [arg node] [arg label]] + +This command links an instruction sequence created by an earlier +begin/done pair into the current instruction sequence. + +[para] + +To this end it + +[list_begin enumerated] + +[enum] reads the instruction references from the attributes +[const gas::entry], [const gas::exit::ok], and [const gas::exit::fail] +from the node [arg n] of the tree [arg t] and makes them available to +assembler und the labels [arg label]/entry, [arg label]/exit::ok, and +[arg label]/exit::fail respectively. + +[enum] Creates an arc from the [term anchor] to the node labeled +[arg label]/entry, and tags it with the current condition code. + +[enum] Makes the node labeled [arg label]/exit/ok the new [term anchor]. + +[list_end] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::Cmd] [arg cmd] [opt [arg arg]...]] + +This is the basic command to add instructions to the graph. + +It creates a new instruction of type [arg cmd] with the given +arguments [arg arg]... + +If the [term anchor] was defined it will also create an arc from the +[term anchor] to the new instruction using the current condition code. + +After the call the new instruction will be the [term anchor] and the +current condition code will be set to [const always]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::Bra]] + +This is a convenience command to create a .BRA pseudo-instruction. It +uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the +instruction and inherits its behaviour. + +[call [cmd ::grammar::me::cpu::gasm::Nop] [arg text]] + +This is a convenience command to create a .NOP pseudo-instruction. It +uses [cmd ::grammar::me::cpu::gasm::Cmd] to actually create the +instruction and inherits its behaviour. + +The [arg text] will be saved as the first and only argument of the new +instruction. + +[call [cmd ::grammar::me::cpu::gasm::Note] [arg text]] + +This is a convenience command to create a .C pseudo-instruction, +i.e. a comment. It uses [cmd ::grammar::me::cpu::gasm::Cmd] to +actually create the instruction and inherits its behaviour. + +The [arg text] will be saved as the first and only argument of the new +instruction. + +[call [cmd ::grammar::me::cpu::gasm::Jmp] [arg label]] + +This command creates an arc from the [term anchor] to the instruction +labeled with [arg label], and tags with the the current condition +code. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::Exit]] + +This command creates an arc from the [term anchor] to one of the exit +instructions, based on the operation mode (see +[cmd ::grammar::me::cpu::gasm::begin]), and tags it with current +condition code. + +[para] + +For mode [const okfail] it links to the instruction labeled either +"exit/ok" or "exit/fail", depending on the current condition code, and +tagging it with the current condition code + +For the other two modes it links to the instruction labeled +"exit/return", tagging it condition code [const always], independent +the current condition code. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::Who] [arg label]] + +This command returns a reference to the instruction labeled with +[arg label]. + +[call [cmd ::grammar::me::cpu::gasm::/Label] [arg name]] + +This command labels the [term anchor] with [arg name]. + +[emph Note] that an instruction can have more than one label. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::/Clear]] + +This command clears the [term anchor], leaving it undefined, and +further resets the current condition code to [const always]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::/Ok]] + +This command sets the current condition code to [const ok]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::/Fail]] + +This command sets the current condition code to [const fail]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::/At] [arg name]] + +This command sets the [term anchor] to the instruction labeled with +[arg name], and further resets the current condition code to +[const always]. + +[para] + +The command returns the empty string as its result. + +[call [cmd ::grammar::me::cpu::gasm::/CloseLoop]] + +This command marks the [term anchor] as the last instruction in a loop +body, by creating the attribute [const LOOP]. + +[para] + +The command returns the empty string as its result. + +[list_end] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/gasm.tcl b/tcllib/modules/grammar_me/gasm.tcl new file mode 100644 index 0000000..a42fd40 --- /dev/null +++ b/tcllib/modules/grammar_me/gasm.tcl @@ -0,0 +1,207 @@ +# -*- tcl -*- +# ### ### ### ######### ######### ######### +## Package description + +## (struct::)Graph based ME Assembler, for use in grammar +## translations. + +# ### ### ### ######### ######### ######### +## Requisites + +namespace eval grammar::me::cpu::gasm {} + +# ### ### ### ######### ######### ######### +## Implementation + +proc ::grammar::me::cpu::gasm::begin {g n {mode okfail} {note {}}} { + variable gas + array unset gas * + + # (Re)initialize the assmebler state, create the framework nodes + # upon which we will hang all instructions on. + + set gas(mode) $mode + set gas(node) $n + set gas(grap) $g + array set gas {last {} cond always} + + Nop $note ; /Label entry ; /Clear + if {$mode eq "okfail"} { + Nop Exit'OK ; /Label exit/ok ; /Clear + Nop Exit'FAIL ; /Label exit/fail ; /Clear + } elseif {$mode eq "halt"} { + Cmd icf_halt ; /Label exit/return ; /Clear + } else { + Cmd icf_ntreturn ; /Label exit/return ; /Clear + } + + /At entry + return +} + +proc ::grammar::me::cpu::gasm::done {__ t} { + variable gas + + # Save the framework nodes in a grammar tree and shut the + # assembler down. + + $t set $gas(node) gas::entry $gas(_entry) + + if {$gas(mode) eq "okfail"} { + $t set $gas(node) gas::exit::ok $gas(_exit/ok) + $t set $gas(node) gas::exit::fail $gas(_exit/fail) + } else { + $t set $gas(node) gas::exit $gas(_exit/return) + } + + # Remember the node in the grammar tree which is responsible for + # this entry point. + + $gas(grap) node set $gas(_entry) expr $gas(node) + + array unset gas * + return +} + +proc ::grammar::me::cpu::gasm::lift {t dst __ src} { + + $t set $dst gas::entry [$t get $src gas::entry] + $t set $dst gas::exit::ok [$t get $src gas::exit::ok] + $t set $dst gas::exit::fail [$t get $src gas::exit::fail] + return +} + +proc ::grammar::me::cpu::gasm::state {} { + variable gas + return [array get gas] +} + +proc ::grammar::me::cpu::gasm::state! {s} { + variable gas + array set gas $s +} + +proc ::grammar::me::cpu::gasm::Inline {t node label} { + variable gas + + set gas(_${label}/entry) [$t get $node gas::entry] + set gas(_${label}/exit/ok) [$t get $node gas::exit::ok] + set gas(_${label}/exit/fail) [$t get $node gas::exit::fail] + + __Link $gas(_${label}/entry) $gas(cond) + /At ${label}/exit/ok + return +} + +proc ::grammar::me::cpu::gasm::Cmd {cmd args} { + variable gas + + # Add a new instruction, and link it to the anchor. The created + # instruction becomes the new anchor. + + upvar 0 gas(grap) g gas(last) anchor gas(cond) cond + + set node [$g node insert] + $g node set $node instruction $cmd + $g node set $node arguments $args + + if {$anchor ne ""} {__Link $node $cond} + + set anchor $node + set cond always + return +} + +proc ::grammar::me::cpu::gasm::Bra {} { + Cmd .BRA +} + +proc ::grammar::me::cpu::gasm::Nop {{text {}}} { + Cmd .NOP $text +} + +proc ::grammar::me::cpu::gasm::Note {text} { + Cmd .C $text +} + +proc ::grammar::me::cpu::gasm::Jmp {label} { + variable gas + __Link $gas(_$label) $gas(cond) + return +} + +proc ::grammar::me::cpu::gasm::Exit {} { + variable gas + if {$gas(mode) eq "okfail"} { + __Link $gas(_exit/$gas(cond)) $gas(cond) + } else { + __Link $gas(_exit/return) always + } + return +} + +proc ::grammar::me::cpu::gasm::Who {label} { + variable gas + return $gas(_$label) +} + +proc ::grammar::me::cpu::gasm::__Link {to cond} { + variable gas + upvar 0 gas(grap) g gas(last) anchor + + set arc [$g arc insert $anchor $to] + $g arc set $arc condition $cond + return +} + +proc ::grammar::me::cpu::gasm::/Label {name} { + variable gas + set gas(_$name) $gas(last) + return +} + +proc ::grammar::me::cpu::gasm::/Clear {} { + variable gas + set gas(last) {} + set gas(cond) always + return +} + +proc ::grammar::me::cpu::gasm::/Ok {} { + variable gas + set gas(cond) ok + return +} + +proc ::grammar::me::cpu::gasm::/Fail {} { + variable gas + set gas(cond) fail + return +} + +proc ::grammar::me::cpu::gasm::/At {name} { + variable gas + set gas(last) $gas(_$name) + set gas(cond) always + return +} + +proc ::grammar::me::cpu::gasm::/CloseLoop {} { + variable gas + $gas(grap) node set $gas(last) LOOP . + return +} + +# ### ### ### ######### ######### ######### +## Interfacing + +namespace eval grammar::me::cpu::gasm { + namespace export begin done lift state state! + namespace export Inline Cmd Bra Nop Note Jmp Exit Who + namespace export /Label /Clear /Ok /Fail /At /CloseLoop +} + +# ### ### ### ######### ######### ######### +## Ready + +package provide grammar::me::cpu::gasm 0.1 diff --git a/tcllib/modules/grammar_me/me_ast.man b/tcllib/modules/grammar_me/me_ast.man new file mode 100644 index 0000000..2768ffa --- /dev/null +++ b/tcllib/modules/grammar_me/me_ast.man @@ -0,0 +1,134 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me_ast n 0.1] +[keywords {abstract syntax tree}] +[keywords AST] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {Various representations of ASTs}] +[category {Grammars and finite automata}] +[description] + +This document specifies various representations for the + +[term {abstract syntax tree}]s (short [term AST]) generated by +instances of ME virtual machines, independent of variant. + +Please go and read the document [syscmd grammar::me_intro] first if +you do not know what a ME virtual machine is. + +[para] + +ASTs and all the representations we specify distinguish between two +types of nodes, namely: + +[para] +[list_begin definitions] + +[def Terminal] + +Terminal nodes refer to the terminal symbols found in the token +stream. They are always leaf nodes. I.e. terminal nodes never have +children. + +[def Nonterminal] + +Nonterminal nodes represent a nonterminal symbol of the grammar used +during parsing. They can occur as leaf and inner nodes of the +tree. + +[list_end] +[para] + +Both types of nodes carry basic range information telling a user which +parts of the input are covered by the node by providing the location +of the first and last tokens found within the range. Locations are +provided as non-negative integer offsets from the beginning of the +token stream, with the first token found in the stream located at +offset 0 (zero). + +[para] + +The root of an AS tree can be either a terminal or nonterminal node. + +[section {AST VALUES}] + +This representation of ASTs is a Tcl list. The main list represents +the root node of the tree, with the representations of the children +nested within. + +[para] + +Each node is represented by a single Tcl list containing three or more +elements. The first element is either the empty string or the name of +a nonterminal symbol (which is never the empty string). The second and +third elements are then the locations of the first and last tokens. + +Any additional elements after the third are then the representations +of the children, with the leftmost child first, i.e. as the fourth +element of the list representing the node. + +[section {AST OBJECTS}] + +In this representation an AST is represented by a Tcl object command +whose API is compatible to the tree objects provided by the package +[package struct::tree]. I.e it has to support at least all of the +methods described by that package, and may support more. + +[para] + +Because of this the remainder of the specifications is written using +the terms of [package struct::tree]. + +[para] + +Each node of the AST directly maps to a node in the tree object. All +data beyond the child nodes, i.e. node type and input locations, are +stored in attributes of the node in the tree object. They are: + +[list_begin definitions] +[def type] + +The type of the AST node. The recognized values are [const terminal] +and [const nonterminal]. + +[def range] + +The locations of the first and last token of the terminal data in the +input covered by the node. This is a list containing two locations. + +[def detail] + +This attribute is present only for nonterminal nodes. It contains the +name of the nonterminal symbol stored in the node. + +[list_end] + +[section {EXTENDED AST OBJECTS}] + +Extended AST objects are like AST objects, with additional +information. + +[list_begin definitions] + +[def detail] + +This attribute is now present at all nodes. Its contents are unchanged +for nonterminal nodes. For terminal nodes it contains a list +describing all tokens from the input which are covered by the node. + +[para] + +Each element of the list contains the token name, the associated +lexeme attribute, line number, and column index, in this order. + +[def range_lc] + +This new attribute is defined for all nodes, and contains the +locations from attribute [term range] translated into line number and +column index. Lines are counted from 1, columns are counted from 0. + +[list_end] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_cpu.man b/tcllib/modules/grammar_me/me_cpu.man new file mode 100644 index 0000000..5961cf3 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpu.man @@ -0,0 +1,289 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me::cpu n 0.2] +[keywords grammar] +[keywords parsing] +[keywords {virtual machine}] +[copyright {2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {Virtual machine implementation II for parsing token streams}] +[category {Grammars and finite automata}] +[require Tcl 8.4] +[require grammar::me::cpu [opt 0.2]] +[description] +[para] + +This package provides an implementation of the ME virtual machine. + +Please go and read the document [syscmd grammar::me_intro] first if +you do not know what a ME virtual machine is. + +[para] + +This implementation provides an object-based API and the machines are +not truly tied to Tcl. A C implementation of the same API is quite +possible. + +[para] + +Internally the package actually uses the value-based machine +manipulation commands as provided by the package + +[package grammar::me::cpu::core] to perform its duties. + +[section API] +[subsection {CLASS API}] + +The package directly provides only a single command for the +construction of ME virtual machines. + +[list_begin definitions] + +[call [cmd ::grammar::me::cpu] [arg meName] [arg matchcode]] + +The command creates a new ME machine object with an associated global +Tcl command whose name is [arg meName]. This command may be used to +invoke various operations on the machine. + +It has the following general form: + +[list_begin definitions] +[call [cmd meName] [method option] [opt [arg "arg arg ..."]]] + +[arg Option] and the [arg arg]s determine the exact behavior of the +command. + +[list_end] +[para] + +The argument [arg matchcode] contains the match instructions the +machine has to execute while parsing the input stream. Please read +section [sectref-external {MATCH CODE REPRESENTATION}] of the +documentation for the package [package grammar::me::cpu::core] for +the specification of the structure of this value. + +[para] + +The [arg tokmap] argument taken by the implementation provided by the +package [package grammar::me::tcl] is here hidden inside of the match +instructions and therefore not needed. + +[list_end] +[para] + +[subsection {OBJECT API}] + +All ME virtual machine objects created by the class command specified +in section [sectref {CLASS API}] support the methods listed below. + +[para] + +The machines provided by this package provide methods for operation in +both push- and pull-styles. Push-style means that tokens are pushed +into the machine state when they arrive, triggering further execution +until they are consumed. In other words, this allows the machine to be +suspended and resumed at will and an arbitrary number of times, the +quasi-parallel operation of several machines, and the operation as +part of the event loop. + +[list_begin definitions] +[call [arg meName] [method lc] [arg location]] + +This method converts the location of a token given as offset in the +input stream into the associated line number and column index. The +result of the command is a 2-element list containing the two values, +in the order mentioned in the previous sentence. + +This allows higher levels to convert the location information found in +the error status and the generated AST into more human readable data. + +[para] + +[emph Note] that the command is not able to convert locations which +have not been reached by the machine yet. In other words, if the +machine has read 7 tokens the command is able to convert the offsets +[const 0] to [const 6], but nothing beyond that. This also shows that +it is not possible to convert offsets which refer to locations before +the beginning of the stream. + +[call [arg meName] [method tok] [opt "[arg from] [opt [arg to]]"]] + +This method returns a Tcl list containing the part of the input stream +between the locations [arg from] and [arg to] (both inclusive). If +[arg to] is not specified it will default to the value of [arg from]. +If [arg from] is not specified either the whole input stream is returned. + +[para] + +Each element of the returned list is a list of four elements, the +token, its associated lexeme, line number, and column index, in this +order. + +This command places the same restrictions on its location arguments as +the method [method lc]. + +[call [arg meName] [method pc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current value of the stored program counter. + +[call [arg meName] [method iseof] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current value of the stored eof flag. + +[call [arg meName] [method at] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current location in the input stream. + +[call [arg meName] [method cc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current token. + +[call [arg meName] [method sv]] + +This command returns the current semantic value [term SV] stored in +the machine. This is an abstract syntax tree as specified in the +document [syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. + +[call [arg meName] [method ok]] + +This method returns the current match status [term OK]. + +[call [arg meName] [method error]] + +This method returns the current error status [term ER]. + +[call [arg meName] [method lstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the location stack. + +[call [arg meName] [method astk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the AST stack. + +[call [arg meName] [method mstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the AST marker stack. + +[call [arg meName] [method estk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the error stack. + +[call [arg meName] [method rstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the subroutine return stack. + +[call [arg meName] [method nc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the nonterminal match cache as a dictionary. + +[call [arg meName] [method ast]] + +This method returns the current top entry of the AST stack [term AS]. + +This is an abstract syntax tree as specified in the document +[syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. + +[call [arg meName] [method halted]] + +This method returns a boolean value telling the caller whether the +engine has halted execution or not. Halt means that no further +matching is possible, and the information retrieved via the other +method is final. Attempts to [method run] the engine will be ignored, +until a [method reset] is made. + +[call [arg meName] [method code]] + +This method returns the [arg code] information used to construct the +object. In other words, the match program executed by the machine. + +[call [arg meName] [method eof]] + +This method adds an end of file marker to the end of the input stream. +This signals the machine that the current contents of the input queue +are the final parts of the input and nothing will come after. Attempts +to put more characters into the queue will fail. + +[call [arg meName] [method put] [arg tok] [arg lex] [arg line] [arg col]] + +This method adds the token [arg tok] to the end of the input stream, +with associated lexeme data [arg lex] and [arg line]/[arg col]umn +information. + +[call [arg meName] [method putstring] [arg string] [arg lvar] [arg cvar]] + +This method adds each individual character in the [arg string] as a +token to the end of the input stream, from first to last. The lexemes +will be empty and the line/col information is computed based on the +characters encountered and the data in the variables [arg lvar] and +[arg cvar]. + +[call [arg meName] [method run] [opt [arg n]]] + +This methods causes the engine to execute match instructions until +either + +[list_begin itemized] +[item] [arg n] instructions have been executed, or +[item] a halt instruction was executed, or +[item] +the input queue is empty and the code is asking for more tokens to +process. +[list_end] +[para] + +If no limit [arg n] was set only the last two conditions are checked +for. + +[call [arg meName] [method pull] [arg nextcmd]] + +This method implements pull-style operation of the machine. It causes +it to execute match instructions until either a halt instruction is +reached, or the command prefix + +[arg nextcmd] ceases to deliver more tokens. + +[para] + +The command prefix [arg nextcmd] represents the input stream of +characters and is invoked by the machine whenever the a new character +from the stream is required. The instruction for handling this is +[term ict_advance]. + +The callback has to return either the empty list, or a list of 4 +elements containing the token, its lexeme attribute, and its location +as line number and column index, in this order. + +The empty list is the signal that the end of the input stream has been +reached. The lexeme attribute is stored in the terminal cache, but +otherwise not used by the machine. + +[para] + +The end of the input stream for this method does not imply that method +[method eof] is called for the machine as a whole. By avoiding this +and still asking for an explicit call of the method it is possible to +mix push- and pull-style operation during the lifetime of the machine. + +[call [arg meName] [method reset]] + +This method resets the machine to its initial state, discarding any +state it may have. + +[call [arg meName] [method destroy]] + +This method deletes the object and releases all resurces it claimed. + +[list_end] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_cpu.tcl b/tcllib/modules/grammar_me/me_cpu.tcl new file mode 100644 index 0000000..89d7eae --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpu.tcl @@ -0,0 +1,103 @@ +# -*- tcl -*- +# (C) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# ### ### ### ######### ######### ######### +## Package description + +## Implementation of ME virtual machines, object-based API to the +## state values provided by "grammar::me::cpu::core". + +# ### ### ### ######### ######### ######### +## Requisites + +package require snit +package require grammar::me::cpu::core + +# ### ### ### ######### ######### ######### +## Implementation + +snit::type ::grammar::me::cpu { + constructor {code_} { + # The 'core new' call validates the code as well. + + set state [core::new $code_] + return + } + + method lc {location} {return [core::lc $state $location]} + method tok {args} {return [eval [linsert $args 0 core::tok $state]]} + method pc {} {return [core::pc $state]} + method iseof {} {return [core::iseof $state]} + method at {} {return [core::at $state]} + method cc {} {return [core::cc $state]} + method sv {} {return [core::sv $state]} + method ok {} {return [core::ok $state]} + method error {} {return [core::error $state]} + method lstk {} {return [core::lstk $state]} + method astk {} {return [core::astk $state]} + method mstk {} {return [core::mstk $state]} + method estk {} {return [core::estk $state]} + method rstk {} {return [core::rstk $state]} + method nc {} {return [core::nc $state]} + method ast {} {return [core::ast $state]} + method halted {} {return [core::halted $state]} + method code {} {return [core::code $state]} + + method eof {} { + core::eof state + return + } + + method put {tok lex line col} { + core::put state $tok $lex $line $col + return + } + + method putstring {str lvar cvar} { + upvar 1 $lvar line $cvar col + foreach ch [split $str {}] { + core::put state $ch {} $line $col + if {$ch eq "\n"} { + incr line + set col 0 + } else { + incr col + } + } + return + } + + method run {{n -1}} { + core::run state $n + return + } + + method pull {next} { + while {1} { + core::run state + if {[core::halted $state]} break + + set tokdata [uplevel \#0 $next] + if {![llength $tokdata]} break + if {[llength $tokdata] != 4} { + return -code error "Bad callback result, expected 4 elements" + } + foreach {tok lex line col} $tokdata break + core::put state $tok $lex $line $col + } + } + + method reset {} { + set state [core::new [core::code $state]] + return + } + + # ### ### ### ######### ######### ######### + ## Data structures + + variable state ; # State of ME cpu handled here. +} + +# ### ### ### ######### ######### ######### +## Ready + +package provide grammar::me::cpu 0.2 diff --git a/tcllib/modules/grammar_me/me_cpu.test b/tcllib/modules/grammar_me/me_cpu.test new file mode 100644 index 0000000..3555ab2 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpu.test @@ -0,0 +1,162 @@ +# me_cpucore.test: Tests for the ME virtual machine -*- tcl -*- +# +# This file contains a collection of tests for one or more of the +# commands making up the ME virtual machine. Sourcing this file into +# Tcl runs the tests and generates output for errors. No output means +# no errors were found. +# +# Copyright (c) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# All rights reserved. +# +# RCS: @(#) $Id: me_cpu.test,v 1.3 2006/10/09 21:41:40 andreas_kupries Exp $ + +# ------------------------------------------------------------------------- + +source [file join \ + [file dirname [file dirname [file join [pwd] [info script]]]] \ + devtools testutilities.tcl] + +testsNeedTcl 8.4 +testsNeedTcltest 2.1 + +support { + use fileutil/fileutil.tcl fileutil + useLocal me_cpucore.tcl grammar::me::cpu::core +} +testing { + useLocalKeep me_cpu.tcl grammar::me::cpu +} + +# ------------------------------------------------------------------------- + +snitErrors + +proc cpustate {cpu} { + set vstate {} + lappend vstate cd [$cpu code ] + lappend vstate pc [$cpu pc ] + lappend vstate ht [$cpu halted] + lappend vstate eo [$cpu iseof ] + lappend vstate tc [$cpu tok ] + lappend vstate at [$cpu at ] + lappend vstate cc [$cpu cc ] + lappend vstate ok [$cpu ok ] + lappend vstate sv [$cpu sv ] + lappend vstate er [$cpu error ] + lappend vstate ls [$cpu lstk ] + lappend vstate as [$cpu astk ] + lappend vstate ms [$cpu mstk ] + lappend vstate es [$cpu estk ] + lappend vstate rs [$cpu rstk ] + lappend vstate nc [$cpu nc ] + return $vstate +} + +proc cpudelta {prev now} { + array set _ {} + foreach {k v} $prev { + set _($k) $v + } + set res {} + foreach {k v} $now { + if {[info exists _($k)] && ($_($k) eq $v)} continue + lappend res $k $v + } + return $res +} + +proc cpufstate {vstate} { + set res {} + foreach {k v} $vstate {lappend res [list $k $v]} + join $res \n +} + +proc cpusubst {vstate args} { + array set _ $vstate + foreach {k v} $args {set _($k) $v} + set res {} + foreach k {cd pc ht eo tc at cc ok sv er ls as ms es rs nc} { + if {![info exists _($k)]} continue + lappend res $k $_($k) + } + return $res +} + +proc cpufilter {vstate args} { + array set _ $vstate + set res {} + foreach k $args { lappend res $k $_($k) } + return $res +} + +proc canon_code {code} { + foreach {i p t} $code break + # Sorting the token map, canonical rep for direct comparison + return [list $i $p [dictsort $t]] +} + +# ------------------------------------------------------------------------- + +set asm_table [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.asm-map.txt]]] + +set badmach_table [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.badmach-map.txt]]] + +set semantics [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.semantics.txt]]] + +# ------------------------------------------------------------------------- +# In this section we run all the tests depending on a grammar::me::cpu::core, +# and thus have to test all the available implementations. + +set tests [file join [file dirname [info script]] me_cpu.testsuite] + +catch {memory validate on} + +set impl tcl +set usec [time {source $tests} 1] + +if 0 { + foreach impl [grammar::me::cpu::core::Implementations] { + grammar::me::cpu::core::SwitchTo $impl + + # The global variable 'impl' is part of the public API the + # testsuit (in htmlparse_tree.testsuite) can expect from the + # environment. + + namespace import -force grammar::me::cpu::core + + set usec [time {source $tests} 1] + + #puts "$impl:\t$usec" + } +} + +catch {memory validate off} + +unset usec +unset tests + +#puts "" + +# Reset system to fully inactive state. +# grammar::me::cpu::core::SwitchTo {} + +# ------------------------------------------------------------------------- + +# ### ### ### ######### ######### ######### +## Cleanup and statistics. + +rename cpustate {} +rename cpufstate {} +rename cpudelta {} +rename cpufilter {} +rename canon_code {} + +unset asm_table badmach_table semantics + +testsuiteCleanup diff --git a/tcllib/modules/grammar_me/me_cpu.testsuite b/tcllib/modules/grammar_me/me_cpu.testsuite new file mode 100644 index 0000000..9d4f8a7 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpu.testsuite @@ -0,0 +1,445 @@ +# -*- tcl -*- me_cpu.test +# ### ### ### ######### ######### ######### + +# ### ### ### ######### ######### ######### +## Cpu creation. + +test me-cpu-new-${impl}-1.0 {new, wrong args} -body { + grammar::me::cpu cpu +} -result {Error in constructor: wrong # args: should be "::grammar::me::cpu::Snit_constructor type selfns win self code_"} \ + -returnCodes error + +test me-cpu-new-${impl}-1.1 {new, wrong args} -body { + grammar::me::cpu cpu a b +} -result {Error in constructor: wrong # args: should be "::grammar::me::cpu::Snit_constructor type selfns win self code_"} \ + -returnCodes error + +test me-cpu-run-${impl}-2.0 run -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu run +} -cleanup { + cpu destroy +} -returnCodes error -result {No instructions to execute} + +set n -1 +foreach {cmd cargs expected} $asm_table { + set asm [list [linsert $cargs 0 branchlabel $cmd]] + incr n + + test me-cpu-new-${impl}-3.$n "new, $cmd, code" -body { + grammar::me::cpu cpu \ + [canon_code [grammar::me::cpu::core::asm $asm]] + cpu code + } -cleanup { + cpu destroy + } -result $expected + + test me-cpu-new-${impl}-4.$n "new, $cmd, state" -body { + grammar::me::cpu cpu \ + [canon_code [grammar::me::cpu::core::asm $asm]] + cpusubst [cpustate cpu] cd {} + } -cleanup { + cpu destroy + } -result {cd {} pc 0 ht 0 eo 0 tc {} at -1 cc {} ok 0 sv {} er {} ls {} as {} ms {} es {} rs {} nc {}} +} + +set n -1 +foreach {insns expected} $badmach_table { + incr n + + test me-cpu-new-${impl}-5.$n "new error" -body { + grammar::me::cpu cpu $insns + } -result "Error in constructor: $expected" -returnCodes error +} + +# ### ### ### ######### ######### ######### +## CPU manipulation - Add tokens I + +test me-cpu-put-${impl}-1.0 {put, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu put +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodput {tok lex line col} 0] + +test me-cpu-put-${impl}-1.1 {put, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu put a +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodput {tok lex line col} 1] + +test me-cpu-put-${impl}-1.2 {put, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu put a b +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodput {tok lex line col} 2] + +test me-cpu-put-${impl}-1.3 {put, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu put a b c +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodput {tok lex line col} 3] + +test me-cpu-put-${impl}-1.4 {put, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu put a b c d e +} -returnCodes error -cleanup { + cpu destroy +} -result [snitTooManyArgs cpu ::grammar::me::cpu::Snit_methodput {tok lex line col}] + +test me-cpu-put-${impl}-2.0 put -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + set base [cpustate cpu] +} -body { + cpu put ID ident 1 0 + cpu put NUM 12345 1 5 + cpudelta $base [cpustate cpu] +} -cleanup { + cpu destroy + unset base +} -result {tc {{ID ident 1 0} {NUM 12345 1 5}}} + +test me-cpu-put-${impl}-3.0 {put after eof} -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu eof +} -body { + cpu put ID ident 1 0 +} -returnCodes error -cleanup { + cpu destroy +} -result {Cannot add input data after eof} + + +# ### ### ### ######### ######### ######### +## CPU manipulation - Add tokens II + +test me-cpu-putstring-${impl}-1.0 {putstring, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu putstring +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodputstring {str lvar cvar} 0] + +test me-cpu-putstring-${impl}-1.1 {putstring, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu putstring a +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodputstring {str lvar cvar} 1] + +test me-cpu-putstring-${impl}-1.2 {putstring, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu putstring a b +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodputstring {str lvar cvar} 2] + +test me-cpu-putstring-${impl}-1.3 {putstring, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu putstring a b c d +} -returnCodes error -cleanup { + cpu destroy +} -result [snitTooManyArgs cpu ::grammar::me::cpu::Snit_methodputstring {str lvar cvar}] + +test me-cpu-putstring-${impl}-2.0 put -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + set base [cpustate cpu] + set l 0 + set c 0 +} -body { + cpu putstring ID l c + list $l $c [cpudelta $base [cpustate cpu]] +} -cleanup { + cpu destroy + unset base +} -result {0 2 {tc {{I {} 0 0} {D {} 0 1}}}} + +test me-cpu-putstring-${impl}-3.0 {put after eof} -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu eof + set l 0 + set c 0 +} -body { + cpu putstring ID l c +} -returnCodes error -cleanup { + cpu destroy +} -result {Cannot add input data after eof} + +# ### ### ### ######### ######### ######### +## State manipulation - Set eof + +test me-cpu-eof-${impl}-1.0 {eof, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu eof x +} -returnCodes error -cleanup { + cpu destroy +} -result [snitTooManyArgs cpu ::grammar::me::cpu::Snit_methodeof {}] + +test me-cpu-eof-${impl}-2.0 eof -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + set base [cpustate cpu] +} -body { + cpu eof + cpudelta $base [cpustate cpu] +} -cleanup { + cpu destroy +} -result {eo 1} + +# ### ### ### ######### ######### ######### +## State accessors - line/col retrieval + +test me-cpu-lc-${impl}-1.0 {lc, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu lc +} -returnCodes error -cleanup { + cpu destroy +} -result [snitWrongNumArgs cpu ::grammar::me::cpu::Snit_methodlc {location} 0] + +test me-cpu-lc-${impl}-1.1 {lc, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu lc a b +} -returnCodes error -cleanup { + cpu destroy +} -result [snitTooManyArgs cpu ::grammar::me::cpu::Snit_methodlc {location}] + +test me-cpu-lc-${impl}-2.0 lc -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu lc 0 +} -cleanup { + cpu destroy +} -result {1 5} + +test me-cpu-lc-${impl}-3.0 {lc, bad index} -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu lc -1 +} -returnCodes error -cleanup { + cpu destroy +} -result {Illegal location -1} + +test me-cpu-lc-${impl}-3.1 {lc, bad index} -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu lc 1 +} -returnCodes error -cleanup { + cpu destroy +} -result {Illegal location 1} + +test me-cpu-lc-${impl}-3.2 {lc, bad index} -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] +} -body { + cpu lc 0 +} -returnCodes error -cleanup { + cpu destroy +} -result {Illegal location 0} + +# ### ### ### ######### ######### ######### +## State accessors - Token retrieval + +test me-cpu-tok-${impl}-1.0 {tok, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu tok a b c +} -returnCodes error -cleanup { + cpu destroy +} -result {wrong # args: should be "grammar::me::cpu::core::tok state ?from ?to??"} + +test me-cpu-tok-${impl}-2.0 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] +} -body { + cpu tok +} -cleanup { + cpu destroy +} -result {} + +test me-cpu-tok-${impl}-2.1 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok +} -cleanup { + cpu destroy +} -result {{NUM 12345 1 5}} + +test me-cpu-tok-${impl}-2.2 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put ID lalal 0 0 + cpu put NUM 12345 1 5 +} -body { + cpu tok +} -cleanup { + cpu destroy +} -result {{ID lalal 0 0} {NUM 12345 1 5}} + +test me-cpu-tok-${impl}-3.0 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] +} -body { + cpu tok 0 +} -cleanup { + cpu destroy +} -result {Illegal location 0} -returnCodes error + +test me-cpu-tok-${impl}-3.1 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok -1 +} -cleanup { + cpu destroy +} -result {Illegal location -1} -returnCodes error + +test me-cpu-tok-${impl}-3.2 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok 1 +} -cleanup { + cpu destroy +} -result {Illegal location 1} -returnCodes error + +test me-cpu-tok-${impl}-3.3 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok 0 +} -cleanup { + cpu destroy +} -result {{NUM 12345 1 5}} + +test me-cpu-tok-${impl}-3.4 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put ID lalal 0 0 + cpu put NUM 12345 1 5 +} -body { + cpu tok 0 +} -cleanup { + cpu destroy +} -result {{ID lalal 0 0}} + +test me-cpu-tok-${impl}-4.0 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok -1 0 +} -cleanup { + cpu destroy +} -result {Illegal start location -1} -returnCodes error + +test me-cpu-tok-${impl}-4.1 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok 1 0 +} -cleanup { + cpu destroy +} -result {Illegal start location 1} -returnCodes error + +test me-cpu-tok-${impl}-4.2 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok 0 -1 +} -cleanup { + cpu destroy +} -result {Illegal end location -1} -returnCodes error + +test me-cpu-tok-${impl}-4.3 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put NUM 12345 1 5 +} -body { + cpu tok 0 1 +} -cleanup { + cpu destroy +} -result {Illegal end location 1} -returnCodes error + +test me-cpu-tok-${impl}-4.4 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put ID lalal 0 0 + cpu put NUM 12345 1 5 +} -body { + cpu tok 1 0 +} -cleanup { + cpu destroy +} -result {Illegal empty location range 1 .. 0} -returnCodes error + +test me-cpu-tok-${impl}-4.5 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put ID lalal 0 0 + cpu put NUM 12345 1 5 +} -body { + cpu tok 0 1 +} -cleanup { + cpu destroy +} -result {{ID lalal 0 0} {NUM 12345 1 5}} + +test me-cpu-tok-${impl}-4.6 tok -setup { + grammar::me::cpu cpu [canon_code [grammar::me::cpu::core::asm {}]] + cpu put ID lalal 0 0 + cpu put ID lalal 0 0 +} -body { + cpu tok 0 0 +} -cleanup { + cpu destroy +} -result {{ID lalal 0 0}} + +# ### ### ### ######### ######### ######### +## Checking the instruction semantics + +test me-cpu-run-${impl}-1.0 {run, wrong args} -setup { + grammar::me::cpu cpu {{} {} {}} +} -body { + cpu run a b +} -returnCodes error -cleanup { + cpu destroy +} -result [snitTooManyArgs cpu ::grammar::me::cpu::Snit_methodrun {?n?}] + +set n -1 +foreach {description input eof stepsSetup steps code expectedDelta} $semantics { + incr n + + if 0 { + puts ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + puts $description + puts "INPUT $input" + puts "EOF $eof" + puts "CODE $stepsSetup $steps $code" + puts $expectedDelta + } + + test me-cpu-run-${impl}-2.$n "run $description" -setup { + grammar::me::cpu cpu $code + foreach token $input { + eval [linsert $token 0 cpu put] + # cpu put {*}$token + } + if {$eof} {cpu eof} + if {$stepsSetup} {cpu run $stepsSetup} + set save [cpustate cpu] + } -body { + cpu run $steps + cpudelta $save [cpustate cpu] + } -cleanup { + cpu destroy + } -result $expectedDelta +} + +return diff --git a/tcllib/modules/grammar_me/me_cpucore.man b/tcllib/modules/grammar_me/me_cpucore.man new file mode 100644 index 0000000..b81c046 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.man @@ -0,0 +1,374 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me::cpu::core n 0.2] +[keywords grammar] +[keywords parsing] +[keywords {virtual machine}] +[copyright {2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {ME virtual machine state manipulation}] +[category {Grammars and finite automata}] +[require Tcl 8.4] +[require grammar::me::cpu::core [opt 0.2]] +[description] +[para] + +This package provides an implementation of the ME virtual machine. + +Please go and read the document [syscmd grammar::me_intro] first if +you do not know what a ME virtual machine is. + +[para] + +This implementation represents each ME virtual machine as a Tcl value +and provides commands to manipulate and query such values to show the +effects of executing instructions, adding tokens, retrieving state, +etc. + +[para] + +The values fully follow the paradigm of Tcl that every value is a +string and while also allowing C implementations for a proper +Tcl_ObjType to keep all the important data in native data structures. + +Because of the latter it is recommended to access the state values +[emph only] through the commands of this package to ensure that +internal representation is not shimmered away. + +[para] + +The actual structure used by all state values is described in section +[sectref {CPU STATE}]. + +[section API] + +The package directly provides only a single command, and all the +functionality is made available through its methods. + +[list_begin definitions] + +[call [cmd ::grammar::me::cpu::core] [method disasm] [arg asm]] + +This method returns a list containing a disassembly of the match +instructions in [arg asm]. The format of [arg asm] is specified in the +section [sectref {MATCH PROGRAM REPRESENTATION}]. + +[para] + +Each element of the result contains instruction label, instruction +name, and the instruction arguments, in this order. The label can be +the empty string. Jump destinations are shown as labels, strings and +tokens unencoded. Token names are prefixed with their numeric id, if, +and only if a tokmap is defined. The two components are separated by a +colon. + +[call [cmd ::grammar::me::cpu::core] [method asm] [arg asm]] + +This method returns code in the format as specified in section +[sectref {MATCH PROGRAM REPRESENTATION}] generated from ME assembly +code [arg asm], which is in the format as returned by the method +[method disasm]. + +[call [cmd ::grammar::me::cpu::core] [method new] [arg asm]] + +This method creates state value for a ME virtual machine in its +initial state and returns it as its result. + +[para] + +The argument [arg matchcode] contains a Tcl representation of the +match instructions the machine has to execute while parsing the input +stream. Its format is specified in the section +[sectref {MATCH PROGRAM REPRESENTATION}]. + +[para] + +The [arg tokmap] argument taken by the implementation provided by the +package [package grammar::me::tcl] is here hidden inside of the match +instructions and therefore not needed. + +[call [cmd ::grammar::me::cpu::core] [method lc] [arg state] [arg location]] + +This method takes the state value of a ME virtual machine and uses it +to convert a location in the input stream (as offset) into a line +number and column index. The result of the method is a 2-element list +containing the two pieces in the order mentioned in the previous +sentence. + +[para] + +[emph Note] that the method cannot convert locations which the machine +has not yet read from the input stream. In other words, if the machine +has read 7 characters so far it is possible to convert the offsets +[const 0] to [const 6], but nothing beyond that. This also shows that +it is not possible to convert offsets which refer to locations before +the beginning of the stream. + +[para] + +This utility allows higher levels to convert the location offsets +found in the error status and the AST into more human readable data. + +[call [cmd ::grammar::me::cpu::core] [method tok] [arg state] [opt "[arg from] [opt [arg to]]"]] + +This method takes the state value of a ME virtual machine and returns +a Tcl list containing the part of the input stream between the +locations [arg from] and [arg to] (both inclusive). If [arg to] is not +specified it will default to the value of [arg from]. If [arg from] is +not specified either the whole input stream is returned. + +[para] + +This method places the same restrictions on its location arguments as +the method [method lc]. + +[call [cmd ::grammar::me::cpu::core] [method pc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current value of the stored program counter. + +[call [cmd ::grammar::me::cpu::core] [method iseof] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current value of the stored eof flag. + +[call [cmd ::grammar::me::cpu::core] [method at] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current location in the input stream. + +[call [cmd ::grammar::me::cpu::core] [method cc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current token. + +[call [cmd ::grammar::me::cpu::core] [method sv] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current semantic value stored in it. + +This is an abstract syntax tree as specified in the document +[syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. + +[call [cmd ::grammar::me::cpu::core] [method ok] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the match status stored in it. + +[call [cmd ::grammar::me::cpu::core] [method error] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current error status stored in it. + +[call [cmd ::grammar::me::cpu::core] [method lstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the location stack. + +[call [cmd ::grammar::me::cpu::core] [method astk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the AST stack. + +[call [cmd ::grammar::me::cpu::core] [method mstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the AST marker stack. + +[call [cmd ::grammar::me::cpu::core] [method estk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the error stack. + +[call [cmd ::grammar::me::cpu::core] [method rstk] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the subroutine return stack. + +[call [cmd ::grammar::me::cpu::core] [method nc] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the nonterminal match cache as a dictionary. + +[call [cmd ::grammar::me::cpu::core] [method ast] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the abstract syntax tree currently at the top of the AST stack stored +in it. + +This is an abstract syntax tree as specified in the document +[syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. + +[call [cmd ::grammar::me::cpu::core] [method halted] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the current halt status stored in it, i.e. if the machine has stopped +or not. + +[call [cmd ::grammar::me::cpu::core] [method code] [arg state]] + +This method takes the state value of a ME virtual machine and returns +the code stored in it, i.e. the instructions executed by the machine. + +[call [cmd ::grammar::me::cpu::core] [method eof] [arg statevar]] + +This method takes the state value of a ME virtual machine as stored in +the variable named by [arg statevar] and modifies it so that the eof +flag inside is set. This signals to the machine that whatever token +are in the input queue are the last to be processed. There will be no +more. + +[call [cmd ::grammar::me::cpu::core] [method put] [arg statevar] [arg tok] [arg lex] [arg line] [arg col]] + +This method takes the state value of a ME virtual machine as stored in +the variable named by [arg statevar] and modifies it so that the token +[arg tok] is added to the end of the input queue, with associated +lexeme data [arg lex] and [arg line]/[arg col]umn information. + +[para] + +The operation will fail with an error if the eof flag of the machine +has been set through the method [method eof]. + +[call [cmd ::grammar::me::cpu::core] [method run] [arg statevar] [opt [arg n]]] + +This method takes the state value of a ME virtual machine as stored in +the variable named by [arg statevar], executes a number of +instructions and stores the state resulting from their modifications +back into the variable. + +[para] + +The execution loop will run until either + +[list_begin itemized] +[item] [arg n] instructions have been executed, or +[item] a halt instruction was executed, or +[item] +the input queue is empty and the code is asking for more tokens to +process. +[list_end] +[para] + +If no limit [arg n] was set only the last two conditions are checked +for. + +[list_end] + +[subsection {MATCH PROGRAM REPRESENTATION}] + +A match program is represented by nested Tcl list. The first element, +[term asm], is a list of integer numbers, the instructions to execute, +and their arguments. The second element, [term pool], is a list of +strings, referenced by the instructions, for error messages, token +names, etc. The third element, [term tokmap], provides ordering +information for the tokens, mapping their names to their numerical +rank. This element can be empty, forcing lexicographic comparison when +matching ranges. + +[para] + +All ME instructions are encoded as integer numbers, with the mapping +given below. A number of the instructions, those which handle error +messages, have been given an additional argument to supply that +message explicitly instead of having it constructed from token names, +etc. This allows the machine state to store only the message ids +instead of the full strings. + +[para] + +Jump destination arguments are absolute indices into the [term asm] +element, refering to the instruction to jump to. Any string arguments +are absolute indices into the [term pool] element. Tokens, characters, +messages, and token (actually character) classes to match are coded as +references into the [term pool] as well. + +[para] +[list_begin enumerated] + +[enum] "[cmd ict_advance] [arg message]" +[enum] "[cmd ict_match_token] [arg tok] [arg message]" +[enum] "[cmd ict_match_tokrange] [arg tokbegin] [arg tokend] [arg message]" +[enum] "[cmd ict_match_tokclass] [arg code] [arg message]" +[enum] "[cmd inc_restore] [arg branchlabel] [arg nt]" +[enum] "[cmd inc_save] [arg nt]" +[enum] "[cmd icf_ntcall] [arg branchlabel]" +[enum] "[cmd icf_ntreturn]" +[enum] "[cmd iok_ok]" +[enum] "[cmd iok_fail]" +[enum] "[cmd iok_negate]" +[enum] "[cmd icf_jalways] [arg branchlabel]" +[enum] "[cmd icf_jok] [arg branchlabel]" +[enum] "[cmd icf_jfail] [arg branchlabel]" +[enum] "[cmd icf_halt]" +[enum] "[cmd icl_push]" +[enum] "[cmd icl_rewind]" +[enum] "[cmd icl_pop]" +[enum] "[cmd ier_push]" +[enum] "[cmd ier_clear]" +[enum] "[cmd ier_nonterminal] [arg message]" +[enum] "[cmd ier_merge]" +[enum] "[cmd isv_clear]" +[enum] "[cmd isv_terminal]" +[enum] "[cmd isv_nonterminal_leaf] [arg nt]" +[enum] "[cmd isv_nonterminal_range] [arg nt]" +[enum] "[cmd isv_nonterminal_reduce] [arg nt]" +[enum] "[cmd ias_push]" +[enum] "[cmd ias_mark]" +[enum] "[cmd ias_mrewind]" +[enum] "[cmd ias_mpop]" +[list_end] + +[section {CPU STATE}] + +A state value is a list containing the following elements, in the order listed below: + +[list_begin enumerated] +[enum] [term code]: Match instructions, see [sectref {MATCH PROGRAM REPRESENTATION}]. +[enum] [term pc]: Program counter, [term int]. +[enum] [term halt]: Halt flag, [term boolean]. +[enum] [term eof]: Eof flag, [term boolean] +[enum] [term tc]: Terminal cache, and input queue. Structure see below. +[enum] [term cl]: Current location, [term int]. +[enum] [term ct]: Current token, [term string]. +[enum] [term ok]: Match status, [term boolean]. +[enum] [term sv]: Semantic value, [term list]. +[enum] [term er]: Error status, [term list]. +[enum] [term ls]: Location stack, [term list]. +[enum] [term as]: AST stack, [term list]. +[enum] [term ms]: AST marker stack, [term list]. +[enum] [term es]: Error stack, [term list]. +[enum] [term rs]: Return stack, [term list]. +[enum] [term nc]: Nonterminal cache, [term dictionary]. +[list_end] +[para] + +[term tc], the input queue of tokens waiting for processing and the +terminal cache containing the tokens already processing are one +unified data structure simply holding all tokens and their +information, with the current location separating that which has been +processed from that which is waiting. + +Each element of the queue/cache is a list containing the token, its +lexeme information, line number, and column index, in this order. + +[para] + +All stacks have their top element aat the end, i.e. pushing an item is +equivalent to appending to the list representing the stack, and +popping it removes the last element. + +[para] + +[term er], the error status is either empty or a list of two elements, +a location in the input, and a list of messages, encoded as references +into the [term pool] element of the [term code]. + +[para] + +[term nc], the nonterminal cache is keyed by nonterminal name and +location, each value a four-element list containing current location, +match status, semantic value, and error status, in this order. + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_cpucore.tcl b/tcllib/modules/grammar_me/me_cpucore.tcl new file mode 100644 index 0000000..9a3a402 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.tcl @@ -0,0 +1,1156 @@ +# -*- tcl -*- +# (C) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# ### ### ### ######### ######### ######### +## Package description + +## Implementation of ME virtual machines based on state values +## manipulated by the commands according to the match +## instructions. Allows for implementation in C. + +# ### ### ### ######### ######### ######### +## Requisites + +namespace eval ::grammar::me::cpu::core {} + +# ### ### ### ######### ######### ######### +## Implementation, API. Ensemble command. + +proc ::grammar::me::cpu::core {cmd args} { + # Dispatcher for the ensemble command. + variable core::cmds + return [uplevel 1 [linsert $args 0 $cmds($cmd)]] +} + +namespace eval grammar::me::cpu::core { + variable cmds + + # Mapping from cmd names to procedures for quick dispatch. The + # objects will shimmer into resolved command references. + + array set cmds { + disasm ::grammar::me::cpu::core::disasm + asm ::grammar::me::cpu::core::asm + new ::grammar::me::cpu::core::new + lc ::grammar::me::cpu::core::lc + tok ::grammar::me::cpu::core::tok + pc ::grammar::me::cpu::core::pc + iseof ::grammar::me::cpu::core::iseof + at ::grammar::me::cpu::core::at + cc ::grammar::me::cpu::core::cc + sv ::grammar::me::cpu::core::sv + ok ::grammar::me::cpu::core::ok + error ::grammar::me::cpu::core::error + lstk ::grammar::me::cpu::core::lstk + astk ::grammar::me::cpu::core::astk + mstk ::grammar::me::cpu::core::mstk + estk ::grammar::me::cpu::core::estk + rstk ::grammar::me::cpu::core::rstk + nc ::grammar::me::cpu::core::nc + ast ::grammar::me::cpu::core::ast + halted ::grammar::me::cpu::core::halted + code ::grammar::me::cpu::core::code + eof ::grammar::me::cpu::core::eof + put ::grammar::me::cpu::core::put + run ::grammar::me::cpu::core::run + } +} + +# ### ### ### ######### ######### ######### +## Ensemble implementation + +proc ::grammar::me::cpu::core::disasm {code} { + variable iname + variable tclass + variable anum + + Validate $code ord dst jmp + + set label 0 + foreach k [array names jmp] { + set jmp($k) bra$label + incr label + } + foreach k [array names dst] { + if {![info exists jmp($k)]} { + set jmp($k) {} + } + } + + set result {} + foreach {asm pool tokmap} $code break + + set pc 0 + set pcend [llength $asm] + + while {$pc < $pcend} { + set base $pc + set insn [lindex $asm $pc] ; incr pc + set an [lindex $anum $insn] + + if {$an == 1} { + set a [lindex $asm $pc] ; incr pc + } elseif {$an == 2} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + } elseif {$an == 3} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + set c [lindex $asm $pc] ; incr pc + } + + set instruction {} + lappend instruction $jmp($base) + lappend instruction $iname($insn) + + switch -exact $insn { + 0 - 5 - 20 - 24 - 25 - 26 - + a/string { + lappend instruction [lindex $pool $a] + } + 1 { + # a/tok b/string + if {![llength $tokmap]} { + lappend instruction [lindex $pool $a] + } else { + lappend instruction ${a}:$ord($a) + } + lappend instruction [lindex $pool $b] + } + 2 { + # a/tokstart b/tokend c/string + if {![llength $tokmap]} { + lappend instruction [lindex $pool $a] + lappend instruction [lindex $pool $b] + } else { + # tokmap defined: a = b = order rank. + lappend instruction ${a}:$ord($a) + lappend instruction ${b}:$ord($b) + } + lappend instruction [lindex $pool $c] + } + 3 { + # a/class(0-5) b/string + lappend instruction [lindex $tclass $a] + lappend instruction [lindex $pool $b] + } + 4 { + # a/branch b/string + lappend instruction $jmp($a) + lappend instruction [lindex $pool $b] + } + 6 - 11 - 12 - 13 - + a/branch { + lappend instruction $jmp($a) + } + default {} + } + + lappend result $instruction + } + + return $result +} + +proc ::grammar::me::cpu::core::asm {code} { + variable iname + variable anum + variable tccode + + # code = list(insn), insn = list (label insn-name ...) + + # I. Indices for the labels, based on instruction sizes. + + array set jmp {} + set off 0 + foreach insn $code { + foreach {label name} $insn break + # Ignore embedded comments, except for labels + if {$label ne ""} { + set jmp($label) $off + } + if {$name eq ".C"} continue + if {![info exists iname($name)]} { + return -code error "Bad instruction \"$insn\", unknown command \"$name\"" + } + set an [lindex $anum $iname($name)] + if {[llength $insn] != ($an+2)} { + return -code error "Bad instruction \"$insn\", expected $an argument[expr {$an == 1 ? "" : "s"}]" + } + incr off + incr off [lindex $anum $iname($name)] + } + + set asm {} + set pool {} + array set poolh {} + array set tokmap {} + array set ord {} + set plain 0 + + foreach insn $code { + foreach {label name} $insn break + # Ignore embedded comments + if {$name eq ".C"} continue + set an [lindex $anum $iname($name)] + + # Instruction code to assembly ... + lappend asm $iname($name) + + # Encode arguments ... + switch -exact -- $name { + ict_advance - + inc_save - + ier_nonterminal - + isv_nonterminal_leaf - + isv_nonterminal_range - + isv_nonterminal_reduce { + lappend asm [Str [lindex $insn 2]] + } + ict_match_token { + lappend asm [Tok [lindex $insn 2]] + lappend asm [Str [lindex $insn 3]] + } + ict_match_tokrange { + lappend asm [Tok [lindex $insn 2]] + lappend asm [Tok [lindex $insn 3]] + lappend asm [Str [lindex $insn 4]] + } + ict_match_tokclass { + set ccode [lindex $insn 2] + if {![info exists tccode($ccode)]} { + return -code error "Bad instruction \"$insn\", unknown class code \"$ccode\"" + } + lappend asm $tccode($ccode) + lappend asm [Str [lindex $insn 3]] + + } + inc_restore { + set jmpto [lindex $insn 2] + if {![info exists jmp($jmpto)]} { + return -code error "Bad instruction \"$insn\", unknown branch destination \"$jmpto\"" + } + lappend asm $jmp($jmpto) + lappend asm [Str [lindex $insn 3]] + } + icf_ntcall - + icf_jalways - + icf_jok - + icf_jfail { + set jmpto [lindex $insn 2] + if {![info exists jmp($jmpto)]} { + return -code error "Bad instruction \"$insn\", unknown branch destination \"$jmpto\"" + } + lappend asm $jmp($jmpto) + } + } + } + + return [list $asm $pool [array get tokmap]] +} + +proc ::grammar::me::cpu::core::new {code} { + # The code generating the state is drawn out to integrate a + # specification of how the machine state is mapped to Tcl as well. + + Validate $code + + set state {} ; # The state is representend as a Tcl list. + # ### ### ### ######### ######### ######### + lappend state $code ; # [_0] code - list - code to run (-) + lappend state 0 ; # [_1] pc - int - Program counter + lappend state 0 ; # [_2] halt - bool - Flag, set (internal) when machine was halted (icf_halt). + lappend state 0 ; # [_3] eof - bool - Flag, set (external) when where will be no more input. + lappend state {} ; # [_4] tc - list - Terminal cache, pending and processed tokens. + lappend state -1 ; # [_5] cl - int - Current Location + lappend state {} ; # [_6] ct - token - Current Character + lappend state 0 ; # [_7] ok - bool - Match Status + lappend state {} ; # [_8] sv - any - Semantic Value + lappend state {} ; # [_9] er - list - Error status (*) + lappend state {} ; # [10] ls - list - Location Stack (x) + lappend state {} ; # [11] as - list - Ast Stack + lappend state {} ; # [12] ms - list - Ast Marker Stack + lappend state {} ; # [13] es - list - Error Stack + lappend state {} ; # [14] rs - list - Return Stack + lappend state {} ; # [15] nc - dict - Nonterminal Cache (backtracking) + # ### ### ### ######### ######### ######### + + # tc = list(token) + # token = list(str lexeme line col) + + + # (-) See manpage of this package for the representation. + + # (*) 2 elements, first is error location, second is list of + # ... strings, the error messages. The strings are actually + # ... represented by references into the pool element of the code. + + # (x) Regarding the various stacks maintained in the state, their + # top element is always at the right end, i.e. the last + # element in the list representing it. + + return $state +} + +proc ::grammar::me::cpu::core::ntok {state} { + return [llength [lindex $state 4]] +} + +proc ::grammar::me::cpu::core::lc {state loc} { + set tc [lindex $state 4] + set loc [INDEX $tc $loc "Illegal location"] + return [lrange [lindex $tc $loc] 2 3] + # result = list(line col) +} + +proc ::grammar::me::cpu::core::tok {state args} { + if {[llength $args] > 2} { + return -code error {wrong # args: should be "grammar::me::cpu::core::tok state ?from ?to??"} + } + set tc [lindex $state 4] + if {[llength $args] == 0} { + return $tc + } elseif {[llength $args] == 1} { + set at [INDEX $tc [lindex $args 0] "Illegal location"] + return [lrange $tc $at $at] + } else { + set from [INDEX $tc [lindex $args 0] "Illegal start location"] + set to [INDEX $tc [lindex $args 1] "Illegal end location"] + if {$from > $to} { + return -code error "Illegal empty location range $from .. $to" + } + return [lrange $tc $from $to] + } + # result = list(token), token = list(str lex line col) +} + +proc ::grammar::me::cpu::core::pc {state} { + return [lindex $state 1] +} + +proc ::grammar::me::cpu::core::iseof {state} { + return [lindex $state 3] +} + +proc ::grammar::me::cpu::core::at {state} { + return [lindex $state 5] +} + +proc ::grammar::me::cpu::core::cc {state} { + return [lindex $state 6] +} + +proc ::grammar::me::cpu::core::sv {state} { + return [lindex $state 8] +} + +proc ::grammar::me::cpu::core::ok {state} { + return [lindex $state 7] +} + +proc ::grammar::me::cpu::core::error {state} { + set er [lindex $state 9] + if {[llength $er]} { + foreach {l m} $er break + + set pool [lindex $state 0 1] ; # state ->/0 code ->/1 pool + set mx {} + foreach id $m { + lappend mx [lindex $pool $id] + } + set er [list $l $mx] + } + return $er +} + +proc ::grammar::me::cpu::core::lstk {state} { + return [lindex $state 10] +} + +proc ::grammar::me::cpu::core::astk {state} { + return [lindex $state 11] +} + +proc ::grammar::me::cpu::core::mstk {state} { + return [lindex $state 12] +} + +proc ::grammar::me::cpu::core::estk {state} { + return [lindex $state 13] +} + +proc ::grammar::me::cpu::core::rstk {state} { + return [lindex $state 14] +} + +proc ::grammar::me::cpu::core::nc {state} { + return [lindex $state 15] +} + +proc ::grammar::me::cpu::core::ast {state} { + return [lindex $state 11 end] +} + +proc ::grammar::me::cpu::core::halted {state} { + return [lindex $state 2] +} + +proc ::grammar::me::cpu::core::code {state} { + return [lindex $state 0] +} + +proc ::grammar::me::cpu::core::eof {statevar} { + upvar 1 $statevar state + lset state 3 1 + return +} + +proc ::grammar::me::cpu::core::put {statevar tok lex line col} { + upvar 1 $statevar state + if {[lindex $state 3]} { + return -code error "Cannot add input data after eof" + } + set tc [K [lindex $state 4] [lset state 4 {}]] + lappend tc [list $tok $lex $line $col] + lset state 4 $tc + return +} + +proc ::grammar::me::cpu::core::run {statevar {steps -1}} { + # Execution loop. Should be instrumented for statistics about + # dynamic instruction frequency. I.e. which instructions are + # executed the most => put them at the front of the if/switch for + # quicker selection. I.e. frequency coding of the branches for + # speed. + + # A C implementation can shimmer the state into a directly + # accessible data structure. And the asm instructions can shimmer + # into an integer index upon which we can switch fast. + + variable anum + variable tclass + upvar 1 $statevar state + variable iname ; # For debug output + + # Do nothing for a stopped machine (halt flag set). + if {[lindex $state 2]} {return $state} + + # Fail if there are no instruction to execute + if {![llength [lindex $state 0 0]]} { + # No instructions to execute + return -code error "No instructions to execute" + } + + # Unpack state into locally accessible variables + # 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 + foreach {code pc halt eof tc cl ct ok sv er ls as ms es rs nc} $state break + + # Unpack match program for easy access as well. + # 0 1 2 + foreach {asm pool tokmap} $code break + + if 0 { + puts ________________________ + puts [join [disasm $code] \n] + puts ________________________ + } + + # Ensure that the unpacked information is not shared + unset state + + # Internal flags for optimal handling of the nonterminal + # cache. Avoid multiple unpacking of the dictionary, and avoid + # repacking if it was not modified. + + set ncunpacked 0 + set ncmodified 0 + set tmunpacked 0 + + while {1} { + # Stop execution if the specified number of instructions have + # been executed. Ignore if infinity was specified. + if {$steps == 0} break + if {$steps > 0} {incr steps -1} + + # Get current instruction ... + + if 0 {puts .$pc:\t$iname([lindex $asm $pc])} + if 0 {puts -nonewline .$pc:\t$iname([lindex $asm $pc])} + + set insn [lindex $asm $pc] ; incr pc + + # And its arguments ... + + set an [lindex $anum $insn] + if {$an == 1} { + set a [lindex $asm $pc] ; incr pc + if 0 {puts \t<$a>} + } elseif {$an == 2} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + if 0 {puts \t<$a|$b>} + } elseif {$an == 3} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + set c [lindex $asm $pc] ; incr pc + if 0 {puts \t<$a|$b|$c>} + } ;# else {puts ""} + + # Dispatch to implementation of the instruction ... + + # Separate if commands are used for easier ordering of the + # dispatch. The order of the branches should be frequency + # coded to have the most frequently used instructions first. + + # ict_advance <a:message> + if {$insn == 0} { + if 0 {puts \t\[$cl|[llength $tc]|$eof\]} + incr cl + if {$cl < [llength $tc]} { + if 0 {puts \tConsume} + + set ct [lindex $tc $cl 0] + set ok 1 + set er {} + } elseif {$eof} { + if 0 {puts \tFail<Eof>} + + # We have no input, and there won't be more coming in + # either. Fail the advance. We do _not_ stop the match + # loop, the program has to complete. The failure might + # be no such, revealed during backtracking. The current + # location is not rewound automatically, this is the + # responsibility of any backtracking. + + set er [list $cl [list $a]] + set ok 0 + } else { + if 0 {puts \tSuspend&Wait} + + # We have no input, stop matching and wait for + # more. We reset the machine into a state + # which will restart this instruction when + # execution resumes. + + incr cl -1 + incr pc -2 ; # code and message argument + break + } + if 0 {puts .Next} + continue + } + + # ict_match_token <a:token> <b:message> + if {$insn == 1} { + if {[llength $tokmap]} { + if {!$tmunpacked} { + array set tm $tokmap + set tmunpacked 1 + } + set ok [expr {$a == $tm($ct)}] + } else { + set xch [lindex $pool $a] + set ok [expr {$xch eq $ct}] + } + if {!$ok} { + set er [list $cl [list $b]] + } else { + set er {} + } + continue + } + + # ict_match_tokrange <a:tokstart> <b:tokend> <c:message> + if {$insn == 2} { + if {[llength $tokmap]} { + if {!$tmunpacked} { + array set tm $tokmap + set tmunpacked 1 + } + set x $tm($ct) + set ok [expr {($a <= $x) && ($x <= $b)}] + } else { + set a [lindex $pool $a] + set b [lindex $pool $b] + set ok [expr { + ([string compare $a $ct] <= 0) && + ([string compare $ct $b] <= 0) + }] ; # {} + } + if {!$ok} { + set er [list $cl [list $c]] + } else { + set er {} + } + continue + } + + # ict_match_tokclass <a:code> <b:message> + if {$insn == 3} { + set strcode [lindex $tclass $a] + set ok [string is $strcode -strict $ct] + if {!$ok} { + set er [list $cl [list $b]] + } else { + set er {} + } + continue + } + + # inc_restore <a:branchtarget> <b:nonterminal> + if {$insn == 4} { + set sym [lindex $pool $b] + + # Unpack the cache dict, only here. + # 8.5 - Use dict operations instead. + + if {!$ncunpacked} { + array set ncc $nc + set ncunpacked 1 + } + + if {[info exists ncc($cl,$sym)]} { + foreach {go ok error sv} $ncc($cl,$sym) break + + # Go forward, as the nonterminal matches (or not). + set cl $go + set pc $a + } + continue + } + + # inc_save <a:nonterminal> + if {$insn == 5} { + set sym [lindex $pool $a] + set at [lindex $ls end] + set ls [lrange $ls 0 end-1] + + # Unpack, modify, only here. + # 8.5 - Use dict operations instead. + + if {!$ncunpacked} { + array set ncc $nc + set ncunpacked 1 + } + + set ncc($at,$sym) [list $cl $ok $er $sv] + set ncmodified 1 + continue + } + + # icf_ntcall <a:branchtarget> + if {$insn == 6} { + lappend rs $pc + set pc $a + continue + } + + # icf_ntreturn + if {$insn == 7} { + set pc [lindex $rs end] + set rs [lrange $rs 0 end-1] + continue + } + + # iok_ok + if {$insn == 8} { + set ok 1 + continue + } + + # iok_fail + if {$insn == 9} { + set ok 0 + continue + } + + # iok_negate + if {$insn == 10} { + set ok [expr {!$ok}] + continue + } + + # icf_jalways <a:branchtarget> + if {$insn == 11} { + set pc $a + continue + } + + # icf_jok <a:branchtarget> + if {$insn == 12} { + if {$ok} {set pc $a} + # !ok => pc is already on next instruction. + continue + } + + # icf_jfail <a:branchtarget> + if {$insn == 13} { + if {!$ok} {set pc $a} + # ok => pc is already on next instruction. + continue + } + + # icf_halt + if {$insn == 14} { + set halt 1 + break + } + + # icl_push + if {$insn == 15} { + lappend ls $cl + continue + } + + # icl_rewind + if {$insn == 16} { + set cl [lindex $ls end] + set ls [lrange $ls 0 end-1] + continue + } + + # icl_pop + if {$insn == 17} { + set ls [lrange $ls 0 end-1] + continue + } + + # ier_push + if {$insn == 18} { + lappend es $er + continue + } + + # ier_clear + if {$insn == 19} { + set er {} + continue + } + + # ier_nonterminal <a:nonterminal> + if {$insn == 20} { + if {[llength $er]} { + set pos [lindex $ls end] + incr pos + set eloc [lindex $er 0] + if {$eloc == $pos} { + set er [list $eloc [list $a]] + } + } + continue + } + + # ier_merge + if {$insn == 21} { + set old [lindex $es end] + set es [lrange $es 0 end-1] + + # We have either old or current error data, keep it. + + if {![llength $er]} { + # No current data, keep old + set er $old + } elseif {[llength $old]} { + # If one of the errors is further on in the input + # choose that as the information to propagate. + + foreach {loe msgse} $er break + foreach {lon msgsn} $old break + + if {$lon > $loe} { + set er $old + } elseif {$loe == $lon} { + # Equal locations, merge the message lists. + + foreach m $msgsn {lappend msgse $m} + set er [list $loe [lsort -uniq $msgse]] + } + # else lon < loe - er is better - nothing + } + # else - !old, but er - nothing + + continue + } + + # isv_clear + if {$insn == 22} { + set sv {} + continue + } + + # isv_terminal (implied ias_push) + if {$insn == 23} { + set sv [list {} $cl $cl] + lappend as $sv + continue + } + + # isv_nonterminal_leaf <a:nonterminal> + if {$insn == 24} { + set pos [lindex $ls end] + set sv [list $a $pos $cl] + continue + } + + # isv_nonterminal_range <a:nonterminal> + if {$insn == 25} { + set pos [lindex $ls end] + set sv [list $a $pos $cl [list {} $pos $cl]] + continue + } + + # isv_nonterminal_reduce <a:nonterminal> + if {$insn == 26} { + set pos [lindex $ls end] + if {[llength $ms]} { + set mrk [lindex $ms end] + incr mrk + } else { + set mrk 0 + } + set sv [lrange $as $mrk end] + set sv [linsert $sv 0 $a $pos $cl] + continue + } + + # ias_push + if {$insn == 27} { + lappend as $sv + continue + } + + # ias_mark + if {$insn == 28} { + set mark [llength $as] + incr mark -1 + lappend ms $mark + continue + } + + # ias_mrewind + if {$insn == 29} { + set mark [lindex $ms end] + set ms [lrange $ms 0 end-1] + set as [lrange $as 0 $mark] + continue + } + + # ias_mpop + if {$insn == 30} { + set ms [lrange $ms 0 end-1] + continue + } + + return -code error "Illegal instruction $insn" + } + + # Repack a modified cache dictionary, then repack and store the + # updated state value. + + if 0 {puts .Repackage\ state} + + if {$ncmodified} {set nc [array get ncc]} + set state [list $code $pc $halt $eof $tc $cl $ct $ok $sv $er $ls $as $ms $es $rs $nc] + return +} + +namespace eval grammar::me::cpu::core { + # Map between class codes and names + variable tclass {} + variable tccode + + foreach {x code} { + 0 alnum + 1 alpha + 2 digit + 3 xdigit + 4 punct + 5 space + } { + lappend tclass $code + set tccode($code) $x + } + + # Number of arguments per ME instruction. + # Indexed by instruction code. + variable anum {} + + # Mapping between instruction codes and names. + variable iname + + foreach {z insn x notes} { + 0 ict_advance 1 {-- TESTED} + 1 ict_match_token 2 {-- TESTED} + 2 ict_match_tokrange 3 {-- TESTED} + 3 ict_match_tokclass 2 {-- TESTED} + 4 inc_restore 2 {-- TESTED} + 5 inc_save 1 {-- TESTED} + 6 icf_ntcall 1 {-- TESTED} + 7 icf_ntreturn 0 {-- TESTED} + 8 iok_ok 0 {-- TESTED} + 9 iok_fail 0 {-- TESTED} + 10 iok_negate 0 {-- TESTED} + 11 icf_jalways 1 {-- TESTED} + 12 icf_jok 1 {-- TESTED} + 13 icf_jfail 1 {-- TESTED} + 14 icf_halt 0 {-- TESTED} + 15 icl_push 0 {-- TESTED} + 16 icl_rewind 0 {-- TESTED} + 17 icl_pop 0 {-- TESTED} + 18 ier_push 0 {-- TESTED} + 19 ier_clear 0 {-- TESTED} + 20 ier_nonterminal 1 {-- TESTED} + 21 ier_merge 0 {-- TESTED} + 22 isv_clear 0 {-- TESTED} + 23 isv_terminal 0 {-- TESTED} + 24 isv_nonterminal_leaf 1 {-- TESTED} + 25 isv_nonterminal_range 1 {-- TESTED} + 26 isv_nonterminal_reduce 1 {-- TESTED} + 27 ias_push 0 {-- TESTED} + 28 ias_mark 0 {-- TESTED} + 29 ias_mrewind 0 {-- TESTED} + 30 ias_mpop 0 {-- TESTED} + } { + lappend anum $x + set iname($z) $insn + set iname($insn) $z + } +} + +# ### ### ### ######### ######### ######### +## Helper commands ((Dis)Assembler, runtime). + +proc ::grammar::me::cpu::core::INDEX {list i label} { + if {$i eq "end"} { + set i [expr {[llength $list] - 1}] + } elseif {[regexp {^end-([0-9]+)$} $i -> n]} { + set i [expr {[llength $list] - $n -1}] + } + if { + ![string is integer -strict $i] || + ($i < 0) || + ($i >= [llength $list]) + } { + return -code error "$label $i" + } + return $i +} + +proc ::grammar::me::cpu::core::K {x y} {set x} + +proc ::grammar::me::cpu::core::Str {str} { + upvar 1 pool pool poolh poolh + if {![info exists poolh($str)]} { + set poolh($str) [llength $pool] + lappend pool $str + } + return $poolh($str) +} + +proc ::grammar::me::cpu::core::Tok {str} { + upvar 1 tokmap tokmap ord ord plain plain + + if {[regexp {^([^:]+):(.+)$} $str -> id name]} { + if {$plain} { + return -code error "Bad assembly, mixing plain and ranked tokens" + } + if {[info exists ord($id)]} { + return -code error "Bad assembly, non-total ordering for $name and $ord($id), at rank $id" + } + set ord($id) $name + set tokmap($name) $id + + return $id + } else { + if {[array size ord]} { + return -code error "Bad assembly, mixing plain and ranked tokens" + } + set plain 1 + return [uplevel 1 [list Str $str]] + } +} + +proc ::grammar::me::cpu::core::Validate {code {ovar {}} {tvar {}} {jvar {}}} { + variable anum + variable iname + + # Basic validation of structure ... + + if {[llength $code] != 3} { + return -code error "Bad length" + } + + foreach {asm pool tokmap} $code break + + if {[llength $tokmap] % 2 == 1} { + return -code error "Bad tokmap, expected a dictionary" + } + + array set ord {} + if {[llength $tokmap] > 0} { + foreach {tok rank} $tokmap { + if {[info exists ord($rank)]} { + return -code error "Bad tokmap, non-total ordering for $tok and $ord($rank), at rank $rank" + } + set ord($rank) $tok + } + } + + # Basic validation of ME code: Valid instructions, collect valid + # branch target indices + + array set target {} + + set pc 0 + set pcend [llength $asm] + set poolend [llength $pool] + + while {$pc < $pcend} { + set target($pc) . + + set insn [lindex $asm $pc] + if {($insn < 0) || ($insn > 30)} { + return -code error "Invalid instruction $insn at PC $pc" + } + + incr pc + incr pc [lindex $anum $insn] + } + + if {$pc > $pcend} { + return -code error "Bad program, last instruction $insn ($iname($insn)) is truncated" + } + + # Validation of ME instruction arguments (pool references, branch + # targets, ...) + + if {$jvar ne ""} { + upvar 1 $jvar jmp + } + array set jmp {} + + set pc 0 + while {$pc < $pcend} { + set base $pc + set insn [lindex $asm $pc] ; incr pc + set an [lindex $anum $insn] + + if {$an == 1} { + set a [lindex $asm $pc] ; incr pc + } elseif {$an == 2} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + } elseif {$an == 3} { + set a [lindex $asm $pc] ; incr pc + set b [lindex $asm $pc] ; incr pc + set c [lindex $asm $pc] ; incr pc + } + + switch -exact $insn { + 0 - 5 - 20 - 24 - 25 - 26 - + a/string { + if {($a < 0) || ($a >= $poolend)} { + return -code error "Invalid string reference $a for instruction $insn ($iname($insn)) at $base" + } + } + 1 { + # a/tok b/string + if {![llength $tokmap]} { + if {($a < 0) || ($a >= $poolend)} { + return -code error "Invalid string reference $a for instruction $insn ($iname($insn)) at $base" + } + } else { + if {![info exists ord($a)]} { + return -code error "Invalid token rank $a for instruction $insn ($iname($insn)) at $base" + } + } + if {($b < 0) || ($b >= $poolend)} { + return -code error "Invalid string reference $b for instruction $insn ($iname($insn)) at $base" + } + } + 2 { + # a/tokstart b/tokend c/string + + if {![llength $tokmap]} { + # a = b = string references. + if {($a < 0) || ($a >= $poolend)} { + return -code error "Invalid string reference $a for instruction $insn ($iname($insn)) at $base" + } + if {($b < 0) || ($b >= $poolend)} { + return -code error "Invalid string reference $b for instruction $insn ($iname($insn)) at $base" + } + if {$a == $b} { + return -code error "Invalid single-token range for instruction $insn ($iname($insn)) at $base" + } + if {[string compare [lindex $pool $a] [lindex $pool $b]] > 0} { + return -code error "Invalid empty range for instruction $insn ($iname($insn)) at $base" + } + } else { + # tokmap defined: a = b = order rank. + if {![info exists ord($a)]} { + return -code error "Invalid token rank $a for instruction $insn ($iname($insn)) at $base" + } + if {![info exists ord($b)]} { + return -code error "Invalid token rank $b for instruction $insn ($iname($insn)) at $base" + } + if {$a == $b} { + return -code error "Invalid single-token range for instruction $insn ($iname($insn)) at $base" + } + if {$a > $b} { + return -code error "Invalid empty range for instruction $insn ($iname($insn)) at $base" + } + } + if {($c < 0) || ($c >= $poolend)} { + return -code error "Invalid string reference $c for instruction $insn ($iname($insn)) at $base" + } + } + 3 { + # a/class(0-5) b/string + if {($a < 0) || ($a > 5)} { + return -code error "Invalid token-class $a for instruction $insn ($iname($insn)) at $base" + } + if {($b < 0) || ($b >= $poolend)} { + return -code error "Invalid string reference $b for instruction $insn ($iname($insn)) at $base" + } + } + 4 { + # a/branch b/string + if {![info exists target($a)]} { + return -code error "Invalid branch target $a for instruction $insn ($iname($insn)) at $base" + } else { + set jmp($a) . + } + if {($b < 0) || ($b >= $poolend)} { + return -code error "Invalid string reference $b for instruction $insn ($iname($insn)) at $base" + } + } + 6 - 11 - 12 - 13 - + a/branch { + if {![info exists target($a)]} { + return -code error "Invalid branch target $a for instruction $insn ($iname($insn)) at $base" + } else { + set jmp($base) $a + } + } + default {} + } + } + + # All checks passed, code is deemed good enough. + # Caller may have asked for some of the collected + # information. + + if {$ovar ne ""} { + upvar 1 $ovar o + array set o [array get ord] + } + if {$tvar ne ""} { + upvar 1 $tvar t + array set t [array get target] + } + return +} + +# ### ### ### ######### ######### ######### +## Ready + +package provide grammar::me::cpu::core 0.2 diff --git a/tcllib/modules/grammar_me/me_cpucore.test b/tcllib/modules/grammar_me/me_cpucore.test new file mode 100644 index 0000000..b163e19 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.test @@ -0,0 +1,163 @@ +# me_cpucore.test: Tests for the ME virtual machine -*- tcl -*- +# +# This file contains a collection of tests for one or more of the +# commands making up the ME virtual machine. Sourcing this file into +# Tcl runs the tests and generates output for errors. No output means +# no errors were found. +# +# Copyright (c) 2005-2006 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# All rights reserved. +# +# RCS: @(#) $Id: me_cpucore.test,v 1.3 2006/10/09 21:41:40 andreas_kupries Exp $ + +# ------------------------------------------------------------------------- + +source [file join \ + [file dirname [file dirname [file join [pwd] [info script]]]] \ + devtools testutilities.tcl] + +testsNeedTcl 8.4 +testsNeedTcltest 2.1 + +support { + use fileutil/fileutil.tcl fileutil +} +testing { + useLocal me_cpucore.tcl grammar::me::cpu::core +} + +# ------------------------------------------------------------------------- + +proc cpustate {state} { + set vstate {} + lappend vstate cd [::grammar::me::cpu::core::code $state] + lappend vstate pc [::grammar::me::cpu::core::pc $state] + lappend vstate ht [::grammar::me::cpu::core::halted $state] + lappend vstate eo [::grammar::me::cpu::core::iseof $state] + lappend vstate tc [::grammar::me::cpu::core::tok $state] + lappend vstate at [::grammar::me::cpu::core::at $state] + lappend vstate cc [::grammar::me::cpu::core::cc $state] + lappend vstate ok [::grammar::me::cpu::core::ok $state] + lappend vstate sv [::grammar::me::cpu::core::sv $state] + lappend vstate er [::grammar::me::cpu::core::error $state] + lappend vstate ls [::grammar::me::cpu::core::lstk $state] + lappend vstate as [::grammar::me::cpu::core::astk $state] + lappend vstate ms [::grammar::me::cpu::core::mstk $state] + lappend vstate es [::grammar::me::cpu::core::estk $state] + lappend vstate rs [::grammar::me::cpu::core::rstk $state] + lappend vstate nc [::grammar::me::cpu::core::nc $state] + return $vstate +} + +proc cpudelta {prev now} { + array set _ {} + foreach {k v} [cpustate $prev] { + set _($k) $v + } + set res {} + foreach {k v} [cpustate $now] { + if {[info exists _($k)] && ($_($k) eq $v)} continue + lappend res $k $v + } + return $res +} + +proc cpufstate {vstate} { + set res {} + foreach {k v} $vstate {lappend res [list $k $v]} + join $res \n +} + +proc cpusubst {vstate args} { + array set _ $vstate + foreach {k v} $args {set _($k) $v} + set res {} + foreach k {cd pc ht eo tc at cc ok sv er ls as ms es rs nc} { + if {![info exists _($k)]} continue + lappend res $k $_($k) + } + return $res +} + +proc cpufilter {vstate args} { + array set _ $vstate + set res {} + foreach k $args { lappend res $k $_($k) } + return $res +} + +proc canon_code {code} { + foreach {i p t} $code break + # Sorting the token map, canonical rep for direct comparison + return [list $i $p [dictsort $t]] +} + +# ------------------------------------------------------------------------- + +set asm_table [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.asm-map.txt]]] + +set badasm_table [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.badasm-map.txt]]] + +set badmach_table [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.badmach-map.txt]]] + +set semantics [string trimright \ + [fileutil::cat \ + [localPath me_cpucore.tests.semantics.txt]]] + +# ------------------------------------------------------------------------- +# In this section we run all the tests depending on a grammar::me::cpu::core, +# and thus have to test all the available implementations. + +set tests [file join [file dirname [info script]] me_cpucore.testsuite] + +catch {memory validate on} + +set impl tcl +set usec [time {source $tests} 1] + +if 0 { + foreach impl [grammar::me::cpu::core::Implementations] { + grammar::me::cpu::core::SwitchTo $impl + + # The global variable 'impl' is part of the public API the + # testsuit (in htmlparse_tree.testsuite) can expect from the + # environment. + + namespace import -force grammar::me::cpu::core + + set usec [time {source $tests} 1] + + #puts "$impl:\t$usec" + } +} + +catch {memory validate off} + +unset usec +unset tests + +#puts "" + +# Reset system to fully inactive state. +# grammar::me::cpu::core::SwitchTo {} + +# ------------------------------------------------------------------------- + +# ### ### ### ######### ######### ######### +## Cleanup and statistics. + +rename cpustate {} +rename cpufstate {} +rename cpudelta {} +rename cpufilter {} +rename canon_code {} + +unset asm_table badmach_table semantics + +testsuiteCleanup diff --git a/tcllib/modules/grammar_me/me_cpucore.tests.asm-map.txt b/tcllib/modules/grammar_me/me_cpucore.tests.asm-map.txt new file mode 100644 index 0000000..b9690be --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.tests.asm-map.txt @@ -0,0 +1,38 @@ +ict_advance message {{0 0} message {}} +ict_match_token {tok message} {{1 0 1} {tok message} {}} +ict_match_token {5:tok message} {{1 5 0} message {tok 5}} +ict_match_tokrange {tokbegin tokend message} {{2 0 1 2} {tokbegin tokend message} {}} +ict_match_tokrange {5:tokbegin 6:tokend message} {{2 5 6 0} message {tokbegin 5 tokend 6}} +ict_match_tokclass {alnum message} {{3 0 0} message {}} +ict_match_tokclass {alpha message} {{3 1 0} message {}} +ict_match_tokclass {digit message} {{3 2 0} message {}} +ict_match_tokclass {xdigit message} {{3 3 0} message {}} +ict_match_tokclass {punct message} {{3 4 0} message {}} +ict_match_tokclass {space message} {{3 5 0} message {}} +inc_restore {branchlabel nt} {{4 0 0} nt {}} +inc_save nt {{5 0} nt {}} +icf_ntcall branchlabel {{6 0} {} {}} +icf_ntreturn {} {7 {} {}} +iok_ok {} {8 {} {}} +iok_fail {} {9 {} {}} +iok_negate {} {10 {} {}} +icf_jalways branchlabel {{11 0} {} {}} +icf_jok branchlabel {{12 0} {} {}} +icf_jfail branchlabel {{13 0} {} {}} +icf_halt {} {14 {} {}} +icl_push {} {15 {} {}} +icl_rewind {} {16 {} {}} +icl_pop {} {17 {} {}} +ier_push {} {18 {} {}} +ier_clear {} {19 {} {}} +ier_nonterminal message {{20 0} message {}} +ier_merge {} {21 {} {}} +isv_clear {} {22 {} {}} +isv_terminal {} {23 {} {}} +isv_nonterminal_leaf nt {{24 0} nt {}} +isv_nonterminal_range nt {{25 0} nt {}} +isv_nonterminal_reduce nt {{26 0} nt {}} +ias_push {} {27 {} {}} +ias_mark {} {28 {} {}} +ias_mrewind {} {29 {} {}} +ias_mpop {} {30 {} {}} diff --git a/tcllib/modules/grammar_me/me_cpucore.tests.badasm-map.txt b/tcllib/modules/grammar_me/me_cpucore.tests.badasm-map.txt new file mode 100644 index 0000000..5cfdc82 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.tests.badasm-map.txt @@ -0,0 +1,58 @@ +barf {} {Bad instruction "branchlabel barf", unknown command "barf"} +ict_advance {} {Bad instruction "branchlabel ict_advance", expected 1 argument} +ict_advance {goob message} {Bad instruction "branchlabel ict_advance goob message", expected 1 argument} +ict_match_token {} {Bad instruction "branchlabel ict_match_token", expected 2 arguments} +ict_match_token {message} {Bad instruction "branchlabel ict_match_token message", expected 2 arguments} +ict_match_token {foo tok message} {Bad instruction "branchlabel ict_match_token foo tok message", expected 2 arguments} +ict_match_tokrange {} {Bad instruction "branchlabel ict_match_tokrange", expected 3 arguments} +ict_match_tokrange {foo} {Bad instruction "branchlabel ict_match_tokrange foo", expected 3 arguments} +ict_match_tokrange {foo bar} {Bad instruction "branchlabel ict_match_tokrange foo bar", expected 3 arguments} +ict_match_tokrange {tokb:1 tokend message} {Bad assembly, mixing plain and ranked tokens} +ict_match_tokrange {feh tokbegin tokend message} {Bad instruction "branchlabel ict_match_tokrange feh tokbegin tokend message", expected 3 arguments} +ict_match_tokclass {} {Bad instruction "branchlabel ict_match_tokclass", expected 2 arguments} +ict_match_tokclass {alnum} {Bad instruction "branchlabel ict_match_tokclass alnum", expected 2 arguments} +ict_match_tokclass {bogus message} {Bad instruction "branchlabel ict_match_tokclass bogus message", unknown class code "bogus"} +ict_match_tokclass {fee alnum message} {Bad instruction "branchlabel ict_match_tokclass fee alnum message", expected 2 arguments} +inc_restore {} {Bad instruction "branchlabel inc_restore", expected 2 arguments} +inc_restore {branchlabel} {Bad instruction "branchlabel inc_restore branchlabel", expected 2 arguments} +inc_restore {badlabel nt} {Bad instruction "branchlabel inc_restore badlabel nt", unknown branch destination "badlabel"} +inc_restore {branchlabel nt foo} {Bad instruction "branchlabel inc_restore branchlabel nt foo", expected 2 arguments} +inc_save {} {Bad instruction "branchlabel inc_save", expected 1 argument} +inc_save {foo nt} {Bad instruction "branchlabel inc_save foo nt", expected 1 argument} +icf_ntcall {} {Bad instruction "branchlabel icf_ntcall", expected 1 argument} +icf_ntcall badlabel {Bad instruction "branchlabel icf_ntcall badlabel", unknown branch destination "badlabel"} +icf_ntcall {foo branchlabel} {Bad instruction "branchlabel icf_ntcall foo branchlabel", expected 1 argument} +icf_ntreturn {bogus} {Bad instruction "branchlabel icf_ntreturn bogus", expected 0 arguments} +iok_ok {bogus} {Bad instruction "branchlabel iok_ok bogus", expected 0 arguments} +iok_fail {bogus} {Bad instruction "branchlabel iok_fail bogus", expected 0 arguments} +iok_negate {bogus} {Bad instruction "branchlabel iok_negate bogus", expected 0 arguments} +icf_jalways {} {Bad instruction "branchlabel icf_jalways", expected 1 argument} +icf_jalways badlabel {Bad instruction "branchlabel icf_jalways badlabel", unknown branch destination "badlabel"} +icf_jalways {foo branchlabel} {Bad instruction "branchlabel icf_jalways foo branchlabel", expected 1 argument} +icf_jok {} {Bad instruction "branchlabel icf_jok", expected 1 argument} +icf_jok badlabel {Bad instruction "branchlabel icf_jok badlabel", unknown branch destination "badlabel"} +icf_jok {foo branchlabel} {Bad instruction "branchlabel icf_jok foo branchlabel", expected 1 argument} +icf_jfail {} {Bad instruction "branchlabel icf_jfail", expected 1 argument} +icf_jfail badlabel {Bad instruction "branchlabel icf_jfail badlabel", unknown branch destination "badlabel"} +icf_jfail {foo branchlabel} {Bad instruction "branchlabel icf_jfail foo branchlabel", expected 1 argument} +icf_halt {bogus} {Bad instruction "branchlabel icf_halt bogus", expected 0 arguments} +icl_push {bogus} {Bad instruction "branchlabel icl_push bogus", expected 0 arguments} +icl_rewind {bogus} {Bad instruction "branchlabel icl_rewind bogus", expected 0 arguments} +icl_pop {bogus} {Bad instruction "branchlabel icl_pop bogus", expected 0 arguments} +ier_push {bogus} {Bad instruction "branchlabel ier_push bogus", expected 0 arguments} +ier_clear {bogus} {Bad instruction "branchlabel ier_clear bogus", expected 0 arguments} +ier_nonterminal {} {Bad instruction "branchlabel ier_nonterminal", expected 1 argument} +ier_nonterminal {foo message} {Bad instruction "branchlabel ier_nonterminal foo message", expected 1 argument} +ier_merge {bogus} {Bad instruction "branchlabel ier_merge bogus", expected 0 arguments} +isv_clear {bogus} {Bad instruction "branchlabel isv_clear bogus", expected 0 arguments} +isv_terminal {bogus} {Bad instruction "branchlabel isv_terminal bogus", expected 0 arguments} +isv_nonterminal_leaf {} {Bad instruction "branchlabel isv_nonterminal_leaf", expected 1 argument} +isv_nonterminal_leaf {foo nt} {Bad instruction "branchlabel isv_nonterminal_leaf foo nt", expected 1 argument} +isv_nonterminal_range {} {Bad instruction "branchlabel isv_nonterminal_range", expected 1 argument} +isv_nonterminal_range {foo nt} {Bad instruction "branchlabel isv_nonterminal_range foo nt", expected 1 argument} +isv_nonterminal_reduce {} {Bad instruction "branchlabel isv_nonterminal_reduce", expected 1 argument} +isv_nonterminal_reduce {foo nt} {Bad instruction "branchlabel isv_nonterminal_reduce foo nt", expected 1 argument} +ias_push {bogus} {Bad instruction "branchlabel ias_push bogus", expected 0 arguments} +ias_mark {bogus} {Bad instruction "branchlabel ias_mark bogus", expected 0 arguments} +ias_mrewind {bogus} {Bad instruction "branchlabel ias_mrewind bogus", expected 0 arguments} +ias_mpop {bogus} {Bad instruction "branchlabel ias_mpop bogus", expected 0 arguments} diff --git a/tcllib/modules/grammar_me/me_cpucore.tests.badmach-map.txt b/tcllib/modules/grammar_me/me_cpucore.tests.badmach-map.txt new file mode 100644 index 0000000..9980cb6 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.tests.badmach-map.txt @@ -0,0 +1,67 @@ +{} {Bad length} +{{}} {Bad length} +{{} {}} {Bad length} +{{} {} {} {}} {Bad length} +{{} {} x} {Bad tokmap, expected a dictionary} +{{} {} {x 3 y 3}} {Bad tokmap, non-total ordering for y and x, at rank 3} +{-1 {} {}} {Invalid instruction -1 at PC 0} +{31 {} {}} {Invalid instruction 31 at PC 0} +{0 {} {}} {Bad program, last instruction 0 (ict_advance) is truncated} +{1 {} {}} {Bad program, last instruction 1 (ict_match_token) is truncated} +{{1 0} {} {}} {Bad program, last instruction 1 (ict_match_token) is truncated} +{2 {} {}} {Bad program, last instruction 2 (ict_match_tokrange) is truncated} +{{2 0} {} {}} {Bad program, last instruction 2 (ict_match_tokrange) is truncated} +{{2 0 0} {} {}} {Bad program, last instruction 2 (ict_match_tokrange) is truncated} +{3 {} {}} {Bad program, last instruction 3 (ict_match_tokclass) is truncated} +{{3 0} {} {}} {Bad program, last instruction 3 (ict_match_tokclass) is truncated} +{4 {} {}} {Bad program, last instruction 4 (inc_restore) is truncated} +{{4 0} {} {}} {Bad program, last instruction 4 (inc_restore) is truncated} +{5 {} {}} {Bad program, last instruction 5 (inc_save) is truncated} +{6 {} {}} {Bad program, last instruction 6 (icf_ntcall) is truncated} +{11 {} {}} {Bad program, last instruction 11 (icf_jalways) is truncated} +{12 {} {}} {Bad program, last instruction 12 (icf_jok) is truncated} +{13 {} {}} {Bad program, last instruction 13 (icf_jfail) is truncated} +{20 {} {}} {Bad program, last instruction 20 (ier_nonterminal) is truncated} +{24 {} {}} {Bad program, last instruction 24 (isv_nonterminal_leaf) is truncated} +{25 {} {}} {Bad program, last instruction 25 (isv_nonterminal_range) is truncated} +{26 {} {}} {Bad program, last instruction 26 (isv_nonterminal_reduce) is truncated} +{{0 -1} {} {}} {Invalid string reference -1 for instruction 0 (ict_advance) at 0} +{{0 0} {} {}} {Invalid string reference 0 for instruction 0 (ict_advance) at 0} +{{1 0 -1} {x} {}} {Invalid string reference -1 for instruction 1 (ict_match_token) at 0} +{{1 0 1} {x} {}} {Invalid string reference 1 for instruction 1 (ict_match_token) at 0} +{{2 0 1 -1} {x y} {}} {Invalid string reference -1 for instruction 2 (ict_match_tokrange) at 0} +{{2 0 1 2} {x y} {}} {Invalid string reference 2 for instruction 2 (ict_match_tokrange) at 0} +{{3 0 -1} {} {}} {Invalid string reference -1 for instruction 3 (ict_match_tokclass) at 0} +{{3 0 0} {} {}} {Invalid string reference 0 for instruction 3 (ict_match_tokclass) at 0} +{{4 0 -1} {} {}} {Invalid string reference -1 for instruction 4 (inc_restore) at 0} +{{4 0 0} {} {}} {Invalid string reference 0 for instruction 4 (inc_restore) at 0} +{{5 -1} {} {}} {Invalid string reference -1 for instruction 5 (inc_save) at 0} +{{5 0} {} {}} {Invalid string reference 0 for instruction 5 (inc_save) at 0} +{{20 -1} {} {}} {Invalid string reference -1 for instruction 20 (ier_nonterminal) at 0} +{{20 0} {} {}} {Invalid string reference 0 for instruction 20 (ier_nonterminal) at 0} +{{24 -1} {} {}} {Invalid string reference -1 for instruction 24 (isv_nonterminal_leaf) at 0} +{{24 0} {} {}} {Invalid string reference 0 for instruction 24 (isv_nonterminal_leaf) at 0} +{{25 -1} {} {}} {Invalid string reference -1 for instruction 25 (isv_nonterminal_range) at 0} +{{25 0} {} {}} {Invalid string reference 0 for instruction 25 (isv_nonterminal_range) at 0} +{{26 -1} {} {}} {Invalid string reference -1 for instruction 26 (isv_nonterminal_reduce) at 0} +{{26 0} {} {}} {Invalid string reference 0 for instruction 26 (isv_nonterminal_reduce) at 0} +{{4 -1 0} x {}} {Invalid branch target -1 for instruction 4 (inc_restore) at 0} +{{6 -1} {} {}} {Invalid branch target -1 for instruction 6 (icf_ntcall) at 0} +{{11 -1} {} {}} {Invalid branch target -1 for instruction 11 (icf_jalways) at 0} +{{12 -1} {} {}} {Invalid branch target -1 for instruction 12 (icf_jok) at 0} +{{13 -1} {} {}} {Invalid branch target -1 for instruction 13 (icf_jfail) at 0} +{{3 -1 0} x {}} {Invalid token-class -1 for instruction 3 (ict_match_tokclass) at 0} +{{3 6 0} x {}} {Invalid token-class 6 for instruction 3 (ict_match_tokclass) at 0} +{{1 -1 0} x {}} {Invalid string reference -1 for instruction 1 (ict_match_token) at 0} +{{1 1 0} x {}} {Invalid string reference 1 for instruction 1 (ict_match_token) at 0} +{{1 0 0} x {z 1}} {Invalid token rank 0 for instruction 1 (ict_match_token) at 0} +{{2 -1 0 0} x {}} {Invalid string reference -1 for instruction 2 (ict_match_tokrange) at 0} +{{2 1 0 0} x {}} {Invalid string reference 1 for instruction 2 (ict_match_tokrange) at 0} +{{2 0 -1 0} x {}} {Invalid string reference -1 for instruction 2 (ict_match_tokrange) at 0} +{{2 0 1 0} x {}} {Invalid string reference 1 for instruction 2 (ict_match_tokrange) at 0} +{{2 0 0 0} x {}} {Invalid single-token range for instruction 2 (ict_match_tokrange) at 0} +{{2 0 1 0} {y x} {}} {Invalid empty range for instruction 2 (ict_match_tokrange) at 0} +{{2 0 1 0} x {z 1}} {Invalid token rank 0 for instruction 2 (ict_match_tokrange) at 0} +{{2 1 0 0} x {z 1}} {Invalid token rank 0 for instruction 2 (ict_match_tokrange) at 0} +{{2 0 0 0} x {z 0}} {Invalid single-token range for instruction 2 (ict_match_tokrange) at 0} +{{2 1 0 0} x {a 1 b 0}} {Invalid empty range for instruction 2 (ict_match_tokrange) at 0} diff --git a/tcllib/modules/grammar_me/me_cpucore.tests.semantics.txt b/tcllib/modules/grammar_me/me_cpucore.tests.semantics.txt new file mode 100644 index 0000000..90a0a43 --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.tests.semantics.txt @@ -0,0 +1,279 @@ +{ict_advance, no input, no eof -- suspend&wait} +{} 0 +0 1 {{0 0} x {}} +{} + +{ict_advance, no input, at eof -- fail} +{} 1 +0 1 {{0 0} x {}} +{pc 2 at 0 er {0 x}} + +{ict_advance, has input, no eof -- consume} +{{NUM 1 0 0}} 1 +0 1 {{0 0} x {}} +{pc 2 at 0 cc NUM ok 1} + +{ict_advance, has input, at eof -- consume} +{{NUM 1 0 0}} 1 +0 1 {{0 0} x {}} +{pc 2 at 0 cc NUM ok 1} + +{icf_halt -- stop engine} +{} 0 +0 2 {{14 14 14} {} {}} +{pc 1 ht 1} + +{icf_halt -- stopped engine, no instruction advance} +{} 0 +1 2 {{14 14 14} {} {}} +{} + +{iok_ok -- always match} +{} 0 +0 1 {8 {} {}} +{pc 1 ok 1} + +{iok_fail -- never match} +{} 0 +1 1 {{8 9} {} {}} +{pc 2 ok 0} + +{iok_negate -- match negation} +{} 0 +1 1 {{8 10} {} {}} +{pc 2 ok 0} + +{iok_negate -- match negation} +{} 0 +1 1 {{9 10} {} {}} +{pc 2 ok 1} + +{icf_jalways -- jump always} +{} 0 +0 1 {{11 5 8 8 8 8} {} {}} +{pc 5} + +{icf_jok -- jump on ok, !ok -- no jump} +{} 0 +0 1 {{12 5 8 8 8 8} {} {}} +{pc 2} + +{icf_jok -- jump on ok, ok -- jump} +{} 0 +1 1 {{8 12 5 8 8 8 8} {} {}} +{pc 5} + +{icf_jfail -- jump on !ok, !ok -- jump} +{} 0 +0 1 {{13 5 8 8 8 8} {} {}} +{pc 5} + +{icf_jfail -- jump on !ok, ok -- no jump} +{} 0 +1 1 {{8 13 5 8 8 8 8} {} {}} +{pc 3} + +{icl_push, save current location} +{{ID lalal 0 0} {NUM 1 0 5}} 0 +2 2 {{0 0 0 0 15 15} x {}} +{pc 6 ls {1 1}} + +{icl_rewind, reset to last saved location} +{{ID lalal 0 0} {NUM 1 0 5}} 0 +3 1 {{0 0 15 0 0 16} x {}} +{pc 6 at 0 ls {}} + +{icl_pop, discard last saved location} +{{ID lalal 0 0} {NUM 1 0 5}} 0 +4 1 {{0 0 15 0 0 15 17} x {}} +{pc 7 ls 0} + +{isv_terminal -- semantic value, set for terminal} +{{ID lalal 0 0} {NUM 1 0 5}} 0 +2 1 {{0 0 0 0 23} x {}} +{pc 5 sv {{} 1 1} as {{{} 1 1}}} + +{isv_clear -- semantic value, clear} +{{ID lalal 0 0} {NUM 1 0 5}} 0 +3 1 {{0 0 0 0 23 22} x {}} +{pc 6 sv {}} + +{isv_nonterminal_leaf -- semantic value, set for leaf nonterminal} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +4 1 {{0 0 0 0 15 0 0 24 0} x {}} +{pc 9 sv {0 1 2}} + +{isv_nonterminal_range -- semantic value, set for range nonterminal} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +4 1 {{0 0 0 0 15 0 0 25 0} x {}} +{pc 9 sv {0 1 2 {{} 1 2}}} + +{ias_push -- save semantic value on ast stack} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +5 1 {{0 0 0 0 15 0 0 24 0 27} x {}} +{pc 10 as {{0 1 2}}} + +{ias_mark -- remember location on ast stack} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +6 1 {{0 0 0 0 15 0 0 24 0 27 28} x {}} +{pc 11 ms 0} + +{ias_mark -- remember location on ast stack, empty ast stack} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +0 1 {28 {} {}} +{pc 1 ms -1} + +{ias_mrewind -- discard ast stack to last saved location} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +10 1 {{0 0 0 0 15 0 0 24 0 27 28 27 27 27 29} x {}} +{pc 15 as {{0 1 2}} ms {}} + +{ias_mpop -- discard last saved ast location} +{{ID lalal 0 0} {NUM 1 0 5} {NUM 1 0 5}} 0 +10 1 {{0 0 0 0 15 0 0 24 0 27 28 27 27 27 30} x {}} +{pc 15 ms {}} + +{ict_match_token, no match} +{{NUM 1 0 0}} 1 +1 1 {{0 0 1 0 1} {x bad} {}} +{pc 5 ok 0 er {0 bad}} + +{ict_match_token, match} +{{NUM 1 0 0}} 1 +1 1 {{0 0 1 0 1} {NUM bad} {}} +{pc 5} + +{ict_match_token, no match, token map} +{{NUM 1 0 0}} 1 +1 1 {{0 0 1 5 1} {x bad} {x 5 NUM 7}} +{pc 5 ok 0 er {0 bad}} + +{ict_match_token, match, token map} +{{NUM 1 0 0}} 1 +1 1 {{0 0 1 7 1} {NUM bad} {NUM 7}} +{pc 5} + +{ict_match_tokrange, no match} +{{NUM 1 0 0}} 1 +1 1 {{0 0 2 0 1 2} {a z bad} {}} +{pc 6 ok 0 er {0 bad}} + +{ict_match_tokrange, match} +{{f 1 0 0}} 1 +1 1 {{0 0 2 0 1 2} {a z bad} {}} +{pc 6} + +{ict_match_tokrange, no match, token map} +{{k 1 0 0}} 1 +1 1 {{0 0 2 5 7 0} {bad} {a 5 x 6 z 7 k 0}} +{pc 6 ok 0 er {0 bad}} + +{ict_match_tokrange, match, token map} +{{x 1 0 0}} 1 +1 1 {{0 0 2 5 7 0} {bad} {a 5 x 6 z 7 k 0}} +{pc 6} + +{ict_match_tokclass, no match} +{{NUM 1 0 0}} 1 +1 1 {{0 0 3 2 0} bad {}} +{pc 5 ok 0 er {0 bad}} + +{ict_match_tokclass, match} +{{8 1 0 0}} 1 +1 1 {{0 0 3 2 0} bad {}} +{pc 5} + +{icf_ntcall -- subroutine invokation} +{} 0 +0 1 {{6 5 8 8 8 8} {} {}} +{pc 5 rs 2} + +{icf_ntcall -- nested subroutine invokation} +{} 0 +0 2 {{6 3 8 6 8 8 8 8 8} {} {}} +{pc 8 rs {2 5}} + +{icf_ntreturn -- subroutine return} +{} 0 +1 1 {{6 5 8 8 8 7} {} {}} +{pc 2 rs {}} + +{icf_ntreturn -- nested subroutine return} +{} 0 +2 2 {{6 3 7 6 6 7 7} {} {}} +{pc 2 rs {}} + +{inc_save - save match status for nonterminal} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +6 1 {{0 0 0 0 15 15 0 0 23 5 0} {NT} {}} +{pc 11 ls 1 nc {1,NT {2 1 {} {{} 2 2}}}} + +{inc_restore - restore match status for nonterminal - wrong location for restore} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 1 +10 1 {{0 0 0 0 15 15 0 0 23 5 0 0 0 9 22 4 0 0} {NT} {}} +{pc 18} + +{inc_restore - restore match status for nonterminal} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 1 +10 2 {{0 0 0 0 15 15 0 0 23 5 0 0 0 9 22 16 4 0 0} {NT} {}} +{pc 0 at 2 ok 1 sv {{} 2 2} ls {}} + +{ier_push -- save error information} +{} 0 +0 1 {18 {} {}} +{pc 1 es {{}}} + +{ier_clear -- clear error information, nothing to clear} +{} 0 +0 1 {19 {} {}} +{pc 1} + +{ier_merge - merge current and pushed error - keep current, pushed is empty} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +3 1 {{18 0 0 1 0 0 21} {XX} {}} +{pc 7 es {}} + +{ier_merge - merge current and pushed error - current is empty, keep pushed} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +5 1 {{18 0 0 1 0 0 18 19 21} {XX} {}} +{pc 9 er {0 XX} es {{}}} + +{ier_merge - merge current and pushed error - old/new identical} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +4 1 {{18 0 0 1 0 0 18 21} {XX} {}} +{pc 8 es {{}}} + +{ier_merge - merge current and pushed error - old/new sam location, merge messages} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +5 1 {{18 0 0 1 0 0 18 1 0 1 21} {XX YY} {}} +{pc 11 er {0 {XX YY}} es {{}}} + +{ier_merge - merge current and pushed error - current further than pushed} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +5 1 {{0 0 1 0 0 18 0 0 1 0 0 21} {XX} {}} +{pc 12 es {}} + +{ier_merge - merge current and pushed error - pushed further than current} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +7 1 {{0 0 15 0 0 1 0 0 18 16 1 0 0 21} {XX} {}} +{pc 14 er {1 XX} es {}} + +{ier_nonterminal - replace token error against nt error} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +4 1 {{0 0 15 0 0 1 0 0 20 1} {X NT} {}} +{pc 10 er {1 NT}} + +{isv_nonterminal_reduce - reduce to bottom} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +6 1 {{0 0 15 0 0 23 23 23 26 2} {NT XX YY} {}} +{pc 10 sv {2 0 1 {{} 1 1} {{} 1 1} {{} 1 1}}} + +{isv_nonterminal_reduce - reduce to mark} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +7 1 {{0 0 15 0 0 23 28 23 23 26 2} {NT XX YY} {}} +{pc 11 sv {2 0 1 {{} 1 1} {{} 1 1}}} + +{isv_nonterminal_reduce - reduce to mark and rewind} +{{NUM 1 0 0} {SPACE { } 0 1} {ID ident 0 2}} 0 +7 2 {{0 0 15 0 0 23 28 23 23 26 2 29} {NT XX YY} {}} +{pc 12 sv {2 0 1 {{} 1 1} {{} 1 1}} as {{{} 1 1}} ms {}} diff --git a/tcllib/modules/grammar_me/me_cpucore.testsuite b/tcllib/modules/grammar_me/me_cpucore.testsuite new file mode 100644 index 0000000..786b63c --- /dev/null +++ b/tcllib/modules/grammar_me/me_cpucore.testsuite @@ -0,0 +1,419 @@ +# -*- tcl -*- me_cpucore.test +# ### ### ### ######### ######### ######### + +# ### ### ### ######### ######### ######### +## Assembler + +test me-cpucore-asm-${impl}-1.0 {asm, wrong args} -body { + grammar::me::cpu::core::asm +} -result {wrong # args: should be "grammar::me::cpu::core::asm code"} \ + -returnCodes error + +test me-cpucore-asm-${impl}-1.1 {asm, wrong args} -body { + grammar::me::cpu::core::asm a b +} -result {wrong # args: should be "grammar::me::cpu::core::asm code"} \ + -returnCodes error + +test me-cpucore-asm-${impl}-2.0 {asm, empty} -body { + grammar::me::cpu::core::asm {} + # No instructions, empty string pool, no token map +} -result {{} {} {}} + +set n -1 +foreach {cmd cargs expected} $asm_table { + set asm [list [linsert $cargs 0 branchlabel $cmd]] + incr n + + test me-cpucore-asm-${impl}-3.$n "asm, $cmd" -body { + canon_code [grammar::me::cpu::core::asm $asm] + } -result $expected +} + +set n -1 +foreach {cmd cargs expected} $badasm_table { + set asm [list [linsert $cargs 0 branchlabel $cmd]] + incr n + + test me-cpucore-asm-${impl}-4.$n "asm, bad $cmd" -body { + grammar::me::cpu::core::asm $asm + } -result $expected -returnCodes error +} + +# ### ### ### ######### ######### ######### +## Disassembler + +test me-cpucore-disasm-${impl}-1.0 {disasm, wrong args} -body { + grammar::me::cpu::core::disasm +} -result {wrong # args: should be "grammar::me::cpu::core::disasm code"} \ + -returnCodes error + +test me-cpucore-disasm-${impl}-1.1 {disasm, wrong args} -body { + grammar::me::cpu::core::disasm a b +} -result {wrong # args: should be "grammar::me::cpu::core::disasm code"} \ + -returnCodes error + +test me-cpucore-disasm-${impl}-2.0 {disasm, empty} -body { + # No instructions, empty string pool, no token map + grammar::me::cpu::core::disasm {{} {} {}} +} -result {} + +set n -1 +foreach {cmd cargs code} $asm_table { + set asm [list [linsert $cargs 0 branchlabel $cmd]] + incr n + + # We have to distinguish between regular instructions and + # instruction jumping somewhere. For the latter we have to perform + # a bit of fixup to get our expections of the branch labeling + # right. + + set pos [lsearch -exact $cargs branchlabel] + if {$pos >= 0} { + set expected [list [linsert [lreplace $cargs $pos $pos bra0] 0 bra0 $cmd]] + } else { + set expected [list [linsert $cargs 0 {} $cmd]] + } + + test me-cpucore-disasm-${impl}-3.$n "disasm, $cmd" -body { + grammar::me::cpu::core::disasm $code + } -result $expected +} + +set n -1 +foreach {insns expected} $badmach_table { + incr n + + test me-cpucore-disasm-${impl}-4.$n "disasm, error" -body { + grammar::me::cpu::core::disasm $insns + } -result $expected -returnCodes error +} + +# ### ### ### ######### ######### ######### +## State creation. + +test me-cpucore-new-${impl}-1.0 {new, wrong args} -body { + grammar::me::cpu::core::new +} -result {wrong # args: should be "grammar::me::cpu::core::new code"} \ + -returnCodes error + +test me-cpucore-new-${impl}-1.1 {new, wrong args} -body { + grammar::me::cpu::core::new a b +} -result {wrong # args: should be "grammar::me::cpu::core::new code"} \ + -returnCodes error + +test me-cpucore-run-${impl}-2.0 run -setup { + set state [grammar::me::cpu::core::new {{} {} {}}] +} -returnCodes error -body { + grammar::me::cpu::core::run state +} -result {No instructions to execute} + +set n -1 +foreach {cmd cargs expected} $asm_table { + set asm [list [linsert $cargs 0 branchlabel $cmd]] + incr n + + test me-cpucore-new-${impl}-3.$n "new, $cmd, code" -body { + grammar::me::cpu::core::code \ + [grammar::me::cpu::core::new \ + [canon_code [grammar::me::cpu::core::asm $asm]]] + } -result $expected + + test me-cpucore-new-${impl}-4.$n "new, $cmd, state" -body { + cpusubst [cpustate \ + [grammar::me::cpu::core::new \ + [canon_code [grammar::me::cpu::core::asm $asm]]]] \ + cd {} + } -result {cd {} pc 0 ht 0 eo 0 tc {} at -1 cc {} ok 0 sv {} er {} ls {} as {} ms {} es {} rs {} nc {}} +} + +set n -1 +foreach {insns expected} $badmach_table { + incr n + + test me-cpucore-new-${impl}-5.$n "new error" -body { + grammar::me::cpu::core::new $insns + } -result $expected -returnCodes error +} + +# ### ### ### ######### ######### ######### +## State manipulation - Add tokens + +test me-cpucore-put-${impl}-1.0 {put, wrong args} -body { + grammar::me::cpu::core::put +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-1.1 {put, wrong args} -body { + grammar::me::cpu::core::put a +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-1.2 {put, wrong args} -body { + grammar::me::cpu::core::put a b +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-1.3 {put, wrong args} -body { + grammar::me::cpu::core::put a b c +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-1.4 {put, wrong args} -body { + grammar::me::cpu::core::put a b c d +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-1.5 {put, wrong args} -body { + grammar::me::cpu::core::put a b c d e f +} -result {wrong # args: should be "grammar::me::cpu::core::put statevar tok lex line col"} \ + -returnCodes error + +test me-cpucore-put-${impl}-2.0 put -setup { + set base [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + set next $base +} -body { + grammar::me::cpu::core::put next ID ident 1 0 + grammar::me::cpu::core::put next NUM 12345 1 5 + + cpudelta $base $next +} -result {tc {{ID ident 1 0} {NUM 12345 1 5}}} + +test me-cpucore-put-${impl}-3.0 {put after eof} -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::eof state +} -returnCodes error -body { + grammar::me::cpu::core::put state ID ident 1 0 +} -result {Cannot add input data after eof} + +# ### ### ### ######### ######### ######### +## State manipulation - Set eof + +test me-cpucore-eof-${impl}-1.0 {eof, wrong args} -body { + grammar::me::cpu::core::eof +} -result {wrong # args: should be "grammar::me::cpu::core::eof statevar"} \ + -returnCodes error + +test me-cpucore-eof-${impl}-1.1 {eof, wrong args} -body { + grammar::me::cpu::core::eof a b +} -result {wrong # args: should be "grammar::me::cpu::core::eof statevar"} \ + -returnCodes error + +test me-cpucore-eof-${impl}-2.0 eof -setup { + set base [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + set next $base +} -body { + grammar::me::cpu::core::eof next + cpudelta $base $next +} -result {eo 1} + +# ### ### ### ######### ######### ######### +## State accessors - line/col retrieval + +test me-cpucore-lc-${impl}-1.0 {lc, wrong args} -body { + grammar::me::cpu::core::lc +} -result {wrong # args: should be "grammar::me::cpu::core::lc state loc"} \ + -returnCodes error + +test me-cpucore-lc-${impl}-1.1 {lc, wrong args} -body { + grammar::me::cpu::core::lc a +} -result {wrong # args: should be "grammar::me::cpu::core::lc state loc"} \ + -returnCodes error + +test me-cpucore-lc-${impl}-1.2 {lc, wrong args} -body { + grammar::me::cpu::core::lc a b c +} -result {wrong # args: should be "grammar::me::cpu::core::lc state loc"} \ + -returnCodes error + +test me-cpucore-lc-${impl}-2.0 lc -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::lc $state 0 +} -result {1 5} + +test me-cpucore-lc-${impl}-3.0 {lc, bad index} -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::lc $state -1 +} -result {Illegal location -1} -returnCodes error + +test me-cpucore-lc-${impl}-3.1 {lc, bad index} -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::lc $state 1 +} -result {Illegal location 1} -returnCodes error + +test me-cpucore-lc-${impl}-3.2 {lc, bad index} -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] +} -body { + grammar::me::cpu::core::lc $state 0 +} -result {Illegal location 0} -returnCodes error + +# ### ### ### ######### ######### ######### +## State accessors - Token retrieval + +test me-cpucore-tok-${impl}-1.0 {tok, wrong args} -body { + grammar::me::cpu::core::tok +} -result [tcltest::wrongNumArgs grammar::me::cpu::core::tok {state args} 0] \ + -returnCodes error + +test me-cpucore-tok-${impl}-1.1 {tok, wrong args} -body { + grammar::me::cpu::core::tok a b c d +} -result {wrong # args: should be "grammar::me::cpu::core::tok state ?from ?to??"} \ + -returnCodes error + +test me-cpucore-tok-${impl}-2.0 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] +} -body { + grammar::me::cpu::core::tok $state +} -result {} + +test me-cpucore-tok-${impl}-2.1 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state +} -result {{NUM 12345 1 5}} + +test me-cpucore-tok-${impl}-2.2 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state ID lalal 0 0 + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state +} -result {{ID lalal 0 0} {NUM 12345 1 5}} + +test me-cpucore-tok-${impl}-3.0 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] +} -body { + grammar::me::cpu::core::tok $state 0 +} -result {Illegal location 0} -returnCodes error + +test me-cpucore-tok-${impl}-3.1 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state -1 +} -result {Illegal location -1} -returnCodes error + +test me-cpucore-tok-${impl}-3.2 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 1 +} -result {Illegal location 1} -returnCodes error + +test me-cpucore-tok-${impl}-3.3 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 0 +} -result {{NUM 12345 1 5}} + +test me-cpucore-tok-${impl}-3.4 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state ID lalal 0 0 + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 0 +} -result {{ID lalal 0 0}} + +test me-cpucore-tok-${impl}-4.0 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state -1 0 +} -result {Illegal start location -1} -returnCodes error + +test me-cpucore-tok-${impl}-4.1 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 1 0 +} -result {Illegal start location 1} -returnCodes error + +test me-cpucore-tok-${impl}-4.2 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 0 -1 +} -result {Illegal end location -1} -returnCodes error + +test me-cpucore-tok-${impl}-4.3 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 0 1 +} -result {Illegal end location 1} -returnCodes error + +test me-cpucore-tok-${impl}-4.4 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state ID lalal 0 0 + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 1 0 +} -result {Illegal empty location range 1 .. 0} -returnCodes error + +test me-cpucore-tok-${impl}-4.5 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state ID lalal 0 0 + grammar::me::cpu::core::put state NUM 12345 1 5 +} -body { + grammar::me::cpu::core::tok $state 0 1 +} -result {{ID lalal 0 0} {NUM 12345 1 5}} + +test me-cpucore-tok-${impl}-4.6 tok -setup { + set state [grammar::me::cpu::core::new [canon_code [grammar::me::cpu::core::asm {}]]] + grammar::me::cpu::core::put state ID lalal 0 0 + grammar::me::cpu::core::put state ID lalal 0 0 +} -body { + grammar::me::cpu::core::tok $state 0 0 +} -result {{ID lalal 0 0}} + +# ### ### ### ######### ######### ######### +## Checking the instruction semantics + +test me-cpucore-run-${impl}-1.0 {run, wrong args} -body { + grammar::me::cpu::core::run +} -result {wrong # args: should be "grammar::me::cpu::core::run statevar ?steps?"} \ + -returnCodes error + +test me-cpucore-run-${impl}-1.1 {run, wrong args} -body { + grammar::me::cpu::core::run a b c +} -result {wrong # args: should be "grammar::me::cpu::core::run statevar ?steps?"} \ + -returnCodes error + +set n -1 +foreach {description input eof stepsSetup steps code expectedDelta} $semantics { + incr n + + if 0 { + puts ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + puts $description + puts "INPUT $input" + puts "EOF $eof" + puts "CODE $stepsSetup $steps $code" + puts $expectedDelta + } + + test me-cpucore-run-${impl}-2.$n "run $description" -setup { + set state [grammar::me::cpu::core::new $code] + foreach token $input { + eval [linsert $token 0 grammar::me::cpu::core::put state] + } + if {$eof} { + grammar::me::cpu::core::eof state + } + if {$stepsSetup} { + grammar::me::cpu::core::run state $stepsSetup + } + set save $state + } -body { + grammar::me::cpu::core::run state $steps + cpudelta $save $state + } -result $expectedDelta +} + +return diff --git a/tcllib/modules/grammar_me/me_intro.man b/tcllib/modules/grammar_me/me_intro.man new file mode 100644 index 0000000..ee59bc1 --- /dev/null +++ b/tcllib/modules/grammar_me/me_intro.man @@ -0,0 +1,94 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me_intro n 0.1] +[keywords CFG] +[keywords CFL] +[keywords {context-free grammar}] +[keywords {context-free languages}] +[keywords expression] +[keywords grammar] +[keywords LL(k)] +[keywords matching] +[keywords parsing] +[keywords {parsing expression grammar}] +[keywords PEG] +[keywords {push down automaton}] +[keywords {recursive descent}] +[keywords {top-down parsing languages}] +[keywords TPDL] +[keywords transducer] +[keywords {virtual machine}] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {Introduction to virtual machines for parsing token streams}] +[category {Grammars and finite automata}] +[description] + +This document is an introduction to and overview of the basic +facilities for the parsing and/or matching of [term token] +streams. One possibility often used for the token domain are +characters. + +[para] + +The packages themselves all provide variants of one + +[term {virtual machine}], called a [term {match engine}] (short + +[term ME]), which has all the facilities needed for the matching and +parsing of a stream, and which are either controlled directly, or are +customized with a match program. The virtual machine is basically a +pushdown automaton, with additional elements for backtracking and/or +handling of semantic data and construction of abstract syntax trees +([term AST]). + +[para] + +Because of the high degree of similarity in the actual implementations +of the aforementioned virtual machine and the data structures they +receive and generate these common parts are specified in a separate +document which will be referenced by the documentation for packages +actually implementing it. + +[para] + +The relevant documents are: + +[para] +[list_begin definitions] + +[def [package grammar::me_vm]] + +Virtual machine specification. + +[def [package grammar::me_ast]] + +Specification of various representations used for abstract syntax +trees. + +[def [package grammar::me::util]] + +Utility commands. + +[def [package grammar::me::tcl]] + +Singleton ME virtual machine implementation tied to Tcl for control +flow and stacks. Hardwired for pull operation. Uninteruptible during +processing. + +[def [package grammar::me::cpu]] + +Object-based ME virtual machine implementation with explicit control +flow, and stacks, using bytecodes. Suspend/Resumable. Push/pull +operation. + +[def [package grammar::me::cpu::core]] + +Core functionality for state manipulation and stepping used in the +bytecode based implementation of ME virtual machines. + +[list_end] +[para] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_tcl.man b/tcllib/modules/grammar_me/me_tcl.man new file mode 100644 index 0000000..da64b98 --- /dev/null +++ b/tcllib/modules/grammar_me/me_tcl.man @@ -0,0 +1,343 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me::tcl n 0.1] +[keywords grammar] +[keywords parsing] +[keywords {virtual machine}] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {Virtual machine implementation I for parsing token streams}] +[category {Grammars and finite automata}] +[require Tcl 8.4] +[require grammar::me::tcl [opt 0.1]] +[description] +[para] + +This package provides an implementation of the ME virtual machine. +Please go and read the document [syscmd grammar::me_intro] first if +you do not know what a ME virtual machine is. + +[para] + +This implementation is tied very strongly to Tcl. All the stacks in +the machine state are handled through the Tcl stack, all control flow +is handled by Tcl commands, and the remaining machine instructions are +directly mapped to Tcl commands. Especially the matching of +nonterminal symbols is handled by Tcl procedures as well, essentially +extending the machine implementation with custom instructions. + +[para] + +Further on the implementation handles only a single machine which is +uninteruptible during execution and hardwired for pull operation. I.e. +it explicitly requests each new token through a callback, pulling them +into its state. + +[para] + +A related package is [package grammar::peg::interp] which provides a +generic interpreter / parser for parsing expression grammars (PEGs), +implemented on top of this implementation of the ME virtual machine. + +[section {API}] + +The commands documented in this section do not implement any of the +instructions of the ME virtual machine. They provide the facilities +for the initialization of the machine and the retrieval of important +information. + +[list_begin definitions] + +[call [cmd ::grammar::me::tcl] [method cmd] [arg ...]] + +This is an ensemble command providing access to the commands listed in +this section. See the methods themselves for detailed specifications. + +[call [cmd ::grammar::me::tcl] [method init] [arg nextcmd] [opt [arg tokmap]]] + +This command (re)initializes the machine. It returns the empty +string. This command has to be invoked before any other command of +this package. + +[para] + +The command prefix [arg nextcmd] represents the input stream of +characters and is invoked by the machine whenever the a new character +from the stream is required. The instruction for handling this is +[term ict_advance]. + +The callback has to return either the empty list, or a list of 4 +elements containing the token, its lexeme attribute, and its location +as line number and column index, in this order. + +The empty list is the signal that the end of the input stream has been +reached. The lexeme attribute is stored in the terminal cache, but +otherwise not used by the machine. + +[para] + +The optional dictionary [arg tokmap] maps from tokens to integer +numbers. If present the numbers impose an order on the tokens, which +is subsequently used by [term ict_match_tokrange] to determine if a +token is in the specified range or not. If no token map is specified +the lexicographic order of th token names will be used instead. This +choice is especially asensible when using characters as tokens. + +[call [cmd ::grammar::me::tcl] [method lc] [arg location]] + +This command converts the location of a token given as offset in the +input stream into the associated line number and column index. The +result of the command is a 2-element list containing the two values, +in the order mentioned in the previous sentence. + +This allows higher levels to convert the location information found in +the error status and the generated AST into more human readable data. + +[para] + +[emph Note] that the command is not able to convert locations which +have not been reached by the machine yet. In other words, if the +machine has read 7 tokens the command is able to convert the offsets +[const 0] to [const 6], but nothing beyond that. This also shows that +it is not possible to convert offsets which refer to locations before +the beginning of the stream. + +[para] + +After a call of [method init] the state used for the conversion is +cleared, making further conversions impossible until the machine has +read tokens again. + +[call [cmd ::grammar::me::tcl] [method tok] [arg from] [opt [arg to]]] + +This command returns a Tcl list containing the part of the input +stream between the locations [arg from] and [arg to] (both +inclusive). If [arg to] is not specified it will default to the value +of [arg from]. + +[para] + +Each element of the returned list is a list of four elements, the +token, its associated lexeme, line number, and column index, in this +order. + +In other words, each element has the same structure as the result of +the [arg nextcmd] callback given to [cmd ::grammar::me::tcl::init] + +[para] + +This command places the same restrictions on its location arguments as +[cmd ::grammar::me::tcl::lc]. + +[call [cmd ::grammar::me::tcl] [method tokens]] + +This command returns the number of tokens currently known to the ME +virtual machine. + +[call [cmd ::grammar::me::tcl] [method sv]] + +This command returns the current semantic value [term SV] stored in +the machine. This is an abstract syntax tree as specified in the +document [syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. + +[call [cmd ::grammar::me::tcl] [method ast]] + +This method returns the abstract syntax tree currently at the top of +the AST stack of the ME virtual machine. This is an abstract syntax +tree as specified in the document [syscmd grammar::me_ast], section +[sectref-external {AST VALUES}]. + +[call [cmd ::grammar::me::tcl] [method astall]] + +This method returns the whole stack of abstract syntax trees currently +known to the ME virtual machine. Each element of the returned list is +an abstract syntax tree as specified in the document + +[syscmd grammar::me_ast], section [sectref-external {AST VALUES}]. +The top of the stack resides at the end of the list. + +[call [cmd ::grammar::me::tcl] [method ctok]] + +This method returns the current token considered by the ME virtual +machine. + +[call [cmd ::grammar::me::tcl] [method nc]] + +This method returns the contents of the nonterminal cache as a +dictionary mapping from "[var symbol],[var location]" to match +information. + +[call [cmd ::grammar::me::tcl] [method next]] + +This method returns the next token callback as specified during +initialization of the ME virtual machine. + +[call [cmd ::grammar::me::tcl] [method ord]] + +This method returns a dictionary containing the [arg tokmap] specified +during initialization of the ME virtual machine. + +[var [cmd ::grammar::me::tcl::ok]] + +This variable contains the current match status [term OK]. It is +provided as variable instead of a command because that makes access to +this information faster, and the speed of access is considered very +important here as this information is used constantly to determine the +control flow. + +[list_end] +[para] + +[section {MACHINE STATE}] + +Please go and read the document [syscmd grammar::me_vm] first for a +specification of the basic ME virtual machine and its state. + +[para] + +This implementation manages the state described in that document, +except for the stacks minus the AST stack. In other words, location +stack, error stack, return stack, and ast marker stack are implicitly +managed through standard Tcl scoping, i.e. Tcl variables in +procedures, outside of this implementation. + +[section {MACHINE INSTRUCTIONS}] + +Please go and read the document [syscmd grammar::me_vm] first for a +specification of the basic ME virtual machine and its instruction set. + +[para] + +This implementation maps all instructions to Tcl commands in the +namespace "::grammar::me::tcl", except for the stack related commands, +nonterminal symbols and control flow. + +Here we simply list the commands and explain the differences to the +specified instructions, if there are any. For their semantics see the +aforementioned specification. The machine commands are [emph not] +reachable through the ensemble command [cmd ::grammar::me::tcl]. + +[para] +[list_begin definitions] + +[call [cmd ::grammar::me::tcl::ict_advance] [arg message]] +No changes. + +[call [cmd ::grammar::me::tcl::ict_match_token] [arg tok] [arg message]] +No changes. + +[call [cmd ::grammar::me::tcl::ict_match_tokrange] [arg tokbegin] [arg tokend] [arg message]] + +If, and only if a token map was specified during initialization then +the arguments are the numeric representations of the smallest and +largest tokens in the range. Otherwise they are the relevant tokens +themselves and lexicographic comparison is used. + +[call [cmd ::grammar::me::tcl::ict_match_tokclass] [arg code] [arg message]] +No changes. + +[call [cmd ::grammar::me::tcl::inc_restore] [arg nt]] + +Instead of taking a branchlabel the command returns a boolean value. +The result will be [const true] if and only if cached information was +found. The caller has to perform the appropriate branching. + +[call [cmd ::grammar::me::tcl::inc_save] [arg nt] [arg startlocation]] + +The command takes the start location as additional argument, as it is +managed on the Tcl stack, and not in the machine state. + +[def "[cmd icf_ntcall] [arg branchlabel]"] +[def [cmd icf_ntreturn]] + +These two instructions are not mapped to commands. They are control +flow instructions and handled in Tcl. + +[call [cmd ::grammar::me::tcl::iok_ok]] +No changes. + +[call [cmd ::grammar::me::tcl::iok_fail]] +No changes. + +[call [cmd ::grammar::me::tcl::iok_negate]] +No changes. + +[def "[cmd icf_jalways] [arg branchlabel]"] +[def "[cmd icf_jok] [arg branchlabel]"] +[def "[cmd icf_jfail] [arg branchlabel]"] +[def [cmd icf_halt]] + +These four instructions are not mapped to commands. They are control +flow instructions and handled in Tcl. + +[call [cmd ::grammar::me::tcl::icl_get]] + +This command returns the current location [term CL] in the input. +It replaces [term icl_push]. + +[call [cmd ::grammar::me::tcl::icl_rewind] [arg oldlocation]] + +The command takes the location as argument as it comes from the +Tcl stack, not the machine state. + +[def [cmd icl_pop]] + +Not mapped, the stacks are not managed by the package. + +[call [cmd ::grammar::me::tcl::ier_get]] + +This command returns the current error state [term ER]. +It replaces [term ier_push]. + +[call [cmd ::grammar::me::tcl::ier_clear]] +No changes. + +[call [cmd ::grammar::me::tcl::ier_nonterminal] [arg message] [arg location]] + +The command takes the location as argument as it comes from the +Tcl stack, not the machine state. + +[call [cmd ::grammar::me::tcl::ier_merge] [arg olderror]] + +The command takes the second error state to merge as argument as it +comes from the Tcl stack, not the machine state. + +[call [cmd ::grammar::me::tcl::isv_clear]] +No changes. + +[call [cmd ::grammar::me::tcl::isv_terminal]] +No changes. + +[call [cmd ::grammar::me::tcl::isv_nonterminal_leaf] [arg nt] [arg startlocation]] + +The command takes the start location as argument as it comes from the +Tcl stack, not the machine state. + +[call [cmd ::grammar::me::tcl::isv_nonterminal_range] [arg nt] [arg startlocation]] + +The command takes the start location as argument as it comes from the +Tcl stack, not the machine state. + +[call [cmd ::grammar::me::tcl::isv_nonterminal_reduce] [arg nt] [arg startlocation] [opt [arg marker]]] + +The command takes start location and marker as argument as it comes +from the Tcl stack, not the machine state. + +[call [cmd ::grammar::me::tcl::ias_push]] +No changes. + +[call [cmd ::grammar::me::tcl::ias_mark]] + +This command returns a marker for the current state of the AST stack +[term AS]. The marker stack is not managed by the machine. + +[call [cmd ::grammar::me::tcl::ias_pop2mark] [arg marker]] + +The command takes the marker as argument as it comes from the +Tcl stack, not the machine state. It replaces [term ias_mpop]. + +[list_end] +[para] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_tcl.tcl b/tcllib/modules/grammar_me/me_tcl.tcl new file mode 100644 index 0000000..e0e86e4 --- /dev/null +++ b/tcllib/modules/grammar_me/me_tcl.tcl @@ -0,0 +1,521 @@ +# -*- tcl -*- +# ### ### ### ######### ######### ######### +## Package description + +## Implementation of the ME virtual machine as a singleton, tied to +## Tcl for control flow and stack handling (except the AST stack). + +# ### ### ### ######### ######### ######### +## Requisites + +# ### ### ### ######### ######### ######### +## Implementation + +namespace eval ::grammar::me::tcl { + namespace export \ + init lc tok sv tokens ast \ + astall ctok nc next ord \ + \ + isv_clear ict_advance inc_save \ + isv_terminal ict_match_token inc_restore \ + isv_nonterminal_leaf ict_match_tokrange icl_get \ + isv_nonterminal_range ict_match_tokclass icl_rewind \ + isv_nonterminal_reduce iok_ok \ + ier_clear iok_fail \ + ier_get iok_negate \ + ier_expected ias_push \ + ier_nonterminal ias_mark \ + ier_merge ias_pop2mark + + variable ok +} + +# ### ### ### ######### ######### ######### +## Implementation, API. Ensemble command. + +proc ::grammar::me::tcl {cmd args} { + # Dispatcher for the ensemble command. + variable tcl::cmds + return [uplevel 1 [linsert $args 0 $cmds($cmd)]] +} + +namespace eval grammar::me::tcl { + variable cmds + + # Mapping from cmd names to procedures for quick dispatch. The + # objects will shimmer into resolved command references. + + array set cmds { + init ::grammar::me::tcl::init + lc ::grammar::me::tcl::lc + tok ::grammar::me::tcl::tok + sv ::grammar::me::tcl::sv + tokens ::grammar::me::tcl::tokens + ast ::grammar::me::tcl::ast + astall ::grammar::me::tcl::astall + ctok ::grammar::me::tcl::ctok + nc ::grammar::me::tcl::nc + next ::grammar::me::tcl::next + ord ::grammar::me::tcl::ord + } +} + +# ### ### ### ######### ######### ######### +## API Implementation. + +proc ::grammar::me::tcl::init {nxcmd {tokmap {}}} { + variable next $nxcmd + variable as {} + variable ok 0 + variable error {} + variable sv {} + variable loc -1 + variable ct {} + variable tc {} + variable nc + variable tokOrd + variable tokUseOrd 0 + + array unset nc * + array unset tokOrd * + + if {[llength $tokmap]} { + if {[llength $tokmap] % 2 == 1} { + return -code error \ + "Bad token order map, not a dictionary" + } + array set tokOrd $tokmap + set tokUseOrd 1 + } + return +} + +proc ::grammar::me::tcl::lc {pos} { + variable tc + return [lrange [lindex $tc $pos] 2 3] +} + +proc ::grammar::me::tcl::tok {from {to {}}} { + variable tc + if {$to == {}} {set to $from} + return [lrange $tc $from $to] +} + +proc ::grammar::me::tcl::tokens {} { + variable tc + return [llength $tc] +} + +proc ::grammar::me::tcl::sv {} { + variable sv + return $sv +} + +proc ::grammar::me::tcl::ast {} { + variable as + return [lindex $as end] +} + +proc ::grammar::me::tcl::astall {} { + variable as + return $as +} + +proc ::grammar::me::tcl::ctok {} { + variable ct + return $ct +} + +proc ::grammar::me::tcl::nc {} { + variable nc + return [array get nc] +} + +proc ::grammar::me::tcl::next {} { + variable next + return $next +} + +proc ::grammar::me::tcl::ord {} { + variable tokOrd + return [array get tokOrd] +} + +# ### ### ### ######### ######### ######### +## Terminal matching + +proc ::grammar::me::tcl::ict_advance {msg} { + # Inlined: Getch, Expected, ClearErrors + + variable ok + variable error + # ------------------------ + variable tc + variable loc + variable ct + # ------------------------ + variable next + # ------------------------ + + # Satisfy from input cache if possible. + incr loc + if {$loc < [llength $tc]} { + set ct [lindex $tc $loc 0] + set ok 1 + set error {} + return + } + + # Actually read from the input, and remember + # the information. + + # Read from buffer, and remember. + # Note: loc is the instance variable. + # This implicitly increments the location! + + set tokdata [uplevel \#0 $next] + if {![llength $tokdata]} { + set ok 0 + set error [list $loc [list $msg]] + return + } elseif {[llength $tokdata] != 4} { + return -code error "Bad callback result, expected 4 elements" + } + + lappend tc $tokdata + set ct [lindex $tokdata 0] + set ok 1 + set error {} + return +} + +proc ::grammar::me::tcl::ict_match_token {tok msg} { + variable ct + variable ok + + set ok [expr {$tok eq $ct}] + + OkFail $msg + return +} + +proc ::grammar::me::tcl::ict_match_tokrange {toks toke msg} { + variable ct + variable ok + variable tokUseOrd + variable tokOrd + + if {$tokUseOrd} { + set ord $tokOrd($ct) + set ok [expr { + ($toks <= $ord) && + ($ord <= $toke) + }] ; # {} + } else { + set ok [expr { + ([string compare $toks $ct] <= 0) && + ([string compare $ct $toke] <= 0) + }] ; # {} + } + + OkFail $msg + return +} + +proc ::grammar::me::tcl::ict_match_tokclass {code msg} { + variable ct + variable ok + + set ok [string is $code -strict $ct] + + OkFail $msg + return +} + +proc ::grammar::me::tcl::OkFail {msg} { + variable ok + variable error + variable loc + + # Inlined: Expected, Unget, ClearErrors + + if {!$ok} { + set error [list $loc [list $msg]] + incr loc -1 + } else { + set error {} + } + return +} + +# ### ### ### ######### ######### ######### +## Nonterminal cache + +proc ::grammar::me::tcl::inc_restore {symbol} { + variable loc + variable nc + variable ok + variable error + variable sv + + # Satisfy from cache if possible. + if {[info exists nc($loc,$symbol)]} { + foreach {go ok error sv} $nc($loc,$symbol) break + + # Go forward, as the nonterminal matches (or not). + set loc $go + return 1 + } + return 0 +} + +proc ::grammar::me::tcl::inc_save {symbol at} { + variable loc + variable nc + variable ok + variable error + variable sv + + if 0 { + if {[info exists nc($at,$symbol)]} { + return -code error "Cannot overwrite\ + existing data @ ($at, $symbol)" + } + } + + # FIXME - end location should be argument. + + # Store not only the value, but also how far + # the match went (if it was a match). + + set nc($at,$symbol) [list $loc $ok $error $sv] + return +} + +# ### ### ### ######### ######### ######### +## Unconditional matching. + +proc ::grammar::me::tcl::iok_ok {} { + variable ok 1 + return +} + +proc ::grammar::me::tcl::iok_fail {} { + variable ok 0 + return +} + +proc ::grammar::me::tcl::iok_negate {} { + variable ok + set ok [expr {!$ok}] + return +} + +# ### ### ### ######### ######### ######### +## Basic input handling and tracking + +proc ::grammar::me::tcl::icl_get {} { + variable loc + return $loc +} + +proc ::grammar::me::tcl::icl_rewind {oldloc} { + variable loc + + if 0 { + if {($oldloc < -1) || ($oldloc > $loc)} { + return -code error "Bad location \"$oldloc\" (vs $loc)" + } + } + set loc $oldloc + return +} + +# ### ### ### ######### ######### ######### +## Error handling. + +proc ::grammar::me::tcl::ier_get {} { + variable error + return $error +} + +proc ::grammar::me::tcl::ier_clear {} { + variable error {} + return +} + +proc ::grammar::me::tcl::ier_nonterminal {msg pos} { + # Inlined: Errors, Expected. + + variable error + + if {[llength $error]} { + foreach {l m} $error break + incr pos + if {$l == $pos} { + set error [list $l [list $msg]] + } + } +} + +proc ::grammar::me::tcl::ier_merge {new} { + variable error + + # We have either old or new error data, keep it. + + if {![llength $error]} {set error $new ; return} + if {![llength $new]} {return} + + # If one of the errors is further on in the input choose that as + # the information to propagate. + + foreach {loe msgse} $error break + foreach {lon msgsn} $new break + + if {$lon > $loe} {set error $new ; return} + if {$loe > $lon} {return} + + # Equal locations, merge the message lists. + + foreach m $msgsn {lappend msgse $m} + set error [list $loe [lsort -uniq $msgse]] + return +} + +# ### ### ### ######### ######### ######### +## Operations for the construction of the +## abstract syntax tree (AST). + +proc ::grammar::me::tcl::isv_clear {} { + variable sv {} + return +} + +proc ::grammar::me::tcl::isv_terminal {} { + variable loc + variable sv + variable as + + set sv [list {} $loc $loc] + lappend as $sv + return +} + +proc ::grammar::me::tcl::isv_nonterminal_leaf {nt pos} { + # Inlined clear, reduce, and optimized. + variable ok + variable loc + variable sv {} + + # Clear ; if {$ok} {Reduce $nt} + + if {$ok} { + incr pos + set sv [list $nt $pos $loc] + } + return +} + +proc ::grammar::me::tcl::isv_nonterminal_range {nt pos} { + variable ok + variable loc + variable sv {} + + if {$ok} { + # TerminalString $pos + # Get all characters after 'pos' to current location as terminal data. + + incr pos + set sv [list $nt $pos $loc [list {} $pos $loc]] + + #set sv [linsert $sv 0 $nt] ;#Reduce $nt + } + return +} + +proc ::grammar::me::tcl::isv_nonterminal_reduce {nt pos {mrk 0}} { + variable ok + variable as + variable loc + variable sv {} + + if {$ok} { + incr pos + set sv [lrange $as $mrk end] ;#SaveToMark $mrk + set sv [linsert $sv 0 $nt $pos $loc] ;#Reduce $nt + } + return +} + +# ### ### ### ######### ######### ######### +## AST stack handling + +proc ::grammar::me::tcl::ias_push {} { + variable as + variable sv + lappend as $sv + return +} + +proc ::grammar::me::tcl::ias_mark {} { + variable as + return [llength $as] +} + +proc ::grammar::me::tcl::ias_pop2mark {mark} { + variable as + if {[llength $as] <= $mark} return + incr mark -1 + set as [lrange $as 0 $mark] + return +} + +# ### ### ### ######### ######### ######### +## Data structures. + +namespace eval ::grammar::me::tcl { + # ### ### ### ######### ######### ######### + ## Public State of MVM (Matching Virtual Machine) + + variable ok 0 ; # Boolean: Ok/Fail of last match operation. + + # ### ### ### ######### ######### ######### + ## Internal state. + + variable ct {} ; # Current token. + variable loc 0 ; # Location of 'ct' as offset in input. + + variable error {} ; # Error data for last match. + # ; # == List (loc, list of strings) + # ; # or empty list + variable sv {} ; # Semantic value for last match. + + # ### ### ### ######### ######### ######### + ## Data structures for AST construction + + variable as {} ; # Stack of values for AST + + # ### ### ### ######### ######### ######### + ## Memo data structures for tokens and match results. + + variable tc {} + variable nc ; array set nc {} + + # ### ### ### ######### ######### ######### + ## Input buffer, location of next character to read. + ## ASSERT (loc <= cloc) + + variable next ; # Callback to get next character. + + # Token ordering for range checks. Optional + + variable tokOrd ; array set tokOrd {} + variable tokUseOrd 0 + + # ### ### ### ######### ######### ######### +} + +# ### ### ### ######### ######### ######### +## Package Management + +package provide grammar::me::tcl 0.1 diff --git a/tcllib/modules/grammar_me/me_tcl.test b/tcllib/modules/grammar_me/me_tcl.test new file mode 100644 index 0000000..4847a4b --- /dev/null +++ b/tcllib/modules/grammar_me/me_tcl.test @@ -0,0 +1,1615 @@ +# me_tcl.test: Tests for the ME virtual machine -*- tcl -*- +# +# This file contains a collection of tests for one or more of the +# commands making up the ME virtual machine. Sourcing this file into +# Tcl runs the tests and generates output for errors. No output means +# no errors were found. +# +# Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# All rights reserved. +# +# RCS: @(#) $Id: me_tcl.test,v 1.8 2007/08/01 22:49:26 andreas_kupries Exp $ + +# ------------------------------------------------------------------------- + +source [file join \ + [file dirname [file dirname [file join [pwd] [info script]]]] \ + devtools testutilities.tcl] + +testsNeedTcl 8.4 +testsNeedTcltest 2.1 + +testing { + useLocal me_tcl.tcl grammar::me::tcl +} + +# ### ### ### ######### ######### ######### +## Pre-requisites. Helper commands to inspect the state of the ME +## virtual machine. + +proc ME_state {} { + # This command retrieves all parts of the ME virtual machine state + # for inspection by the testing commands. The result is a dictionary. + + set res {} + + lappend res [list tok__ [grammar::me::tcl ctok]] + lappend res [list loc__ [grammar::me::tcl::icl_get]] + lappend res [list ok___ $grammar::me::tcl::ok] + lappend res [list error [grammar::me::tcl::ier_get]] + lappend res [list sv___ [grammar::me::tcl sv]] + lappend res [list ast__ [grammar::me::tcl astall]] + + set nt [grammar::me::tcl tokens] + incr nt -1 + lappend res [list input [grammar::me::tcl tok 0 $nt]] + + lappend res [list cache [dictsort [grammar::me::tcl nc]]] + lappend res [list next_ [grammar::me::tcl next]] + lappend res [list ord__ [dictsort [grammar::me::tcl ord]]] + + return $res +} + +proc ME_stateText {} { + join [ME_state] \n +} + +proc next_badresult {} {return a} + +proc next_eof {} {return {}} + +proc next_char {x} {return [list $x 3 4 {}]} + +proc next_count {} { + global count + incr count + return [list T$count 1 $count $count] +} +proc nc_init {} { + global count + set count 0 + return +} + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-init-1.0 {Call without enough arguments} \ + -returnCodes error \ + -body { + grammar::me::tcl::init + } -result {wrong # args: should be "grammar::me::tcl::init nxcmd ?tokmap?"} + +test mevmtcl-init-1.1 {Call with too many arguments} \ + -returnCodes error \ + -body { + grammar::me::tcl::init a b c + } -result {wrong # args: should be "grammar::me::tcl::init nxcmd ?tokmap?"} + +test mevmtcl-init-1.2 {Call with bad token map} \ + -returnCodes error \ + -body { + grammar::me::tcl::init a b + } -result {Bad token order map, not a dictionary} + + +test mevmtcl-init-2.0 {Basic initialization} \ + -body { + grammar::me::tcl::init fake + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-init-2.1 {Basic initialization, with map} \ + -body { + grammar::me::tcl::init fakeB {ident 0} + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fakeB +ord__ {ident 0}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ict_advance-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_advance + } -result {wrong # args: should be "grammar::me::tcl::ict_advance msg"} + +test mevmtcl-ict_advance-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_advance a b + } -result {wrong # args: should be "grammar::me::tcl::ict_advance msg"} + +test mevmtcl-ict_advance-1.2 {Bad next callback} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_advance foo + } -result {invalid command name "fake"} + +test mevmtcl-ict_advance-1.3 {Bad next callback, bad results} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init next_badresult + } -body { + grammar::me::tcl::ict_advance foo + } -result {Bad callback result, expected 4 elements} + + +test mevmtcl-ict_advance-2.0 {Behaviour at eof} \ + -setup { + grammar::me::tcl::init next_eof + } -body { + grammar::me::tcl::ict_advance "foo (got EOF)" + ME_stateText + } -result {tok__ {} +loc__ 0 +ok___ 0 +error {0 {{foo (got EOF)}}} +sv___ {} +ast__ {} +input {} +cache {} +next_ next_eof +ord__ {}} + +test mevmtcl-ict_advance-2.1 {Behaviour for regular token} \ + -setup { + grammar::me::tcl::init next_count + nc_init + } -body { + grammar::me::tcl::ict_advance foo + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_advance-2.2 {Behaviour for backtracing in input} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::icl_rewind 0 + } -body { + grammar::me::tcl::ict_advance foo + ME_stateText + } -result {tok__ T2 +loc__ 1 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ict_match_token-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_token + } -result {wrong # args: should be "grammar::me::tcl::ict_match_token tok msg"} + +test mevmtcl-ict_match_token-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_token a b c + } -result {wrong # args: should be "grammar::me::tcl::ict_match_token tok msg"} + + +test mevmtcl-ict_match_token-2.0 {Token is matching} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_token T1 "Expected foo" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_match_token-2.1 {Token is not matching} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_token BOGUS "Expected 'BOGUS'" + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 {{Expected 'BOGUS'}}} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ict_match_tokrange-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokrange + } -result {wrong # args: should be "grammar::me::tcl::ict_match_tokrange toks toke msg"} + +test mevmtcl-ict_match_tokrange-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokrange a b c d + } -result {wrong # args: should be "grammar::me::tcl::ict_match_tokrange toks toke msg"} + + +test mevmtcl-ict_match_tokrange-2.0 {Token range, lexicographic compare, outside low} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange T2 T4 "\[T2 .. T4\]" + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 {{[T2 .. T4]}}} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_match_tokrange-2.1 {Token range, lexicographic compare, outside up} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange A S "\[A .. S\]" + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 {{[A .. S]}}} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_match_tokrange-2.2 {Token range, lexicographic compare, in range} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange A T2 "\[A .. T2\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_match_tokrange-2.3 {Token range, lexicographic compare, in range, low edge} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange T1 T5 "\[T1 .. T5\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ict_match_tokrange-2.4 {Token range, lexicographic compare, in range, upper edge} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange A T1 "\[A .. T1\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + + +test mevmtcl-ict_match_tokrange-3.0 {Token range, map order compare, outside low} \ + -setup { + grammar::me::tcl::init next_count {T1 0 A 1 B 2} + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange 1 2 "\[A .. B\]" + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 {{[A .. B]}}} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {A 1 B 2 T1 0}} + +test mevmtcl-ict_match_tokrange-3.1 {Token range, map order compare, outside up} \ + -setup { + grammar::me::tcl::init next_count {A 1 B 2 T1 3} + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange 1 2 "\[A .. B\]" + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 {{[A .. B]}}} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {A 1 B 2 T1 3}} + +test mevmtcl-ict_match_tokrange-3.2 {Token range, map order compare, in range} \ + -setup { + grammar::me::tcl::init next_count {A 1 T1 2 B 3} + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange 1 3 "\[A .. B\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {A 1 B 3 T1 2}} + +test mevmtcl-ict_match_tokrange-3.3 {Token range, map order compare, in range, low edge} \ + -setup { + grammar::me::tcl::init next_count {T1 0 A 1 B 2} + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange 0 1 "\[T1 .. A\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {A 1 B 2 T1 0}} + +test mevmtcl-ict_match_tokrange-3.4 {Token range, map order compare, in range, upper edge} \ + -setup { + grammar::me::tcl::init next_count {A 0 B 1 T1 2} + nc_init + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokrange 0 2 "\[A .. T1\]" + ME_stateText + } -result {tok__ T1 +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {A 0 B 1 T1 2}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ict_match_tokclass-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokclass + } -result {wrong # args: should be "grammar::me::tcl::ict_match_tokclass code msg"} + +test mevmtcl-ict_match_tokclass-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokclass a b c + } -result {wrong # args: should be "grammar::me::tcl::ict_match_tokclass code msg"} + +test mevmtcl-ict_match_tokclass-1.2a {Call with bad code} \ + -constraints {tcl8.5plus tcl8.5minus} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokclass gargle foo + } -result {bad class "gargle": must be alnum, alpha, ascii, control, boolean, digit, double, false, graph, integer, list, lower, print, punct, space, true, upper, wideinteger, wordchar, or xdigit} + +test mevmtcl-ict_match_tokclass-1.2b {Call with bad code} \ + -constraints {!tcl8.5plus} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokclass gargle foo + } -result {bad class "gargle": must be alnum, alpha, ascii, control, boolean, digit, double, false, graph, integer, lower, print, punct, space, true, upper, wordchar, or xdigit} + +test mevmtcl-ict_match_tokclass-1.2c {Call with bad code} \ + -constraints tcl8.6plus \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ict_match_tokclass gargle foo + } -result {bad class "gargle": must be alnum, alpha, ascii, control, boolean, digit, double, entier, false, graph, integer, list, lower, print, punct, space, true, upper, wideinteger, wordchar, or xdigit} + +test mevmtcl-ict_match_tokclass-2.0 {Token is matching} \ + -setup { + grammar::me::tcl::init {next_char X} + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokclass alpha "<alpha>" + ME_stateText + } -result {tok__ X +loc__ 0 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{X 3 4 {}}} +cache {} +next_ {next_char X} +ord__ {}} + +test mevmtcl-ict_match_tokclass-2.1 {Token is not matching} \ + -setup { + grammar::me::tcl::init {next_char 0} + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::ict_match_tokclass alpha "<alpha>" + ME_stateText + } -result {tok__ 0 +loc__ -1 +ok___ 0 +error {0 <alpha>} +sv___ {} +ast__ {} +input {{0 3 4 {}}} +cache {} +next_ {next_char 0} +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-inc_save-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::inc_save + } -result {wrong # args: should be "grammar::me::tcl::inc_save symbol at"} + +test mevmtcl-inc_save-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::inc_save a b c + } -result {wrong # args: should be "grammar::me::tcl::inc_save symbol at"} + + +test mevmtcl-inc_save-2.0 {Basic save of nonterminal match data} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::inc_save A -1 + ME_stateText + } -result {tok__ T2 +loc__ 1 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2}} +cache {-1,A {1 1 {} {}}} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-inc_restore-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::inc_restore + } -result {wrong # args: should be "grammar::me::tcl::inc_restore symbol"} + +test mevmtcl-inc_restore-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::inc_restore a b + } -result {wrong # args: should be "grammar::me::tcl::inc_restore symbol"} + + +test mevmtcl-inc_restore-2.0 {Restore match data, not present} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::icl_rewind -1 + grammar::me::tcl::iok_fail + } -body { + list [grammar::me::tcl::inc_restore A] [ME_stateText] + } -result {0 {tok__ T2 +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2}} +cache {} +next_ next_count +ord__ {}}} + +test mevmtcl-inc_restore-2.1 {Restore match data from cache} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::inc_save A -1 + grammar::me::tcl::icl_rewind -1 + grammar::me::tcl::iok_fail + } -body { + list [grammar::me::tcl::inc_restore A] [ME_stateText] + } -result {1 {tok__ T2 +loc__ 1 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2}} +cache {-1,A {1 1 {} {}}} +next_ next_count +ord__ {}}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-iok_ok-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::iok_ok a + } -result {wrong # args: should be "grammar::me::tcl::iok_ok"} + + +test mevmtcl-iok_ok-2.0 {Regular behaviour} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::iok_ok + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-iok_fail-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::iok_fail a + } -result {wrong # args: should be "grammar::me::tcl::iok_fail"} + + +test mevmtcl-iok_fail-2.0 {Regular behaviour} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + } -body { + grammar::me::tcl::iok_fail + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-iok_negate-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::iok_negate a + } -result {wrong # args: should be "grammar::me::tcl::iok_negate"} + + +test mevmtcl-iok_negate-2.0 {Regular behaviour} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::iok_negate + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 1 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-iok_negate-2.1 {Regular behaviour} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + } -body { + grammar::me::tcl::iok_negate + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-icl_get-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_get a + } -result {wrong # args: should be "grammar::me::tcl::icl_get"} + + +test mevmtcl-icl_get-2.0 {Get current location} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_get + } -result -1 + +test mevmtcl-icl_get-2.1 {Get current location after advancing} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::icl_get + } -result 2 + +test mevmtcl-icl_get-2.2 {Get current location after advance & rewind} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::icl_rewind 1 + } -body { + grammar::me::tcl::icl_get + } -result 1 + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-icl_rewind-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_rewind + } -result {wrong # args: should be "grammar::me::tcl::icl_rewind oldloc"} + +test mevmtcl-icl_rewind-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_rewind a b + } -result {wrong # args: should be "grammar::me::tcl::icl_rewind oldloc"} + + +test mevmtcl-icl_rewind-2.0 {Rewind travels back} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::icl_rewind 1 + grammar::me::tcl::icl_get + } -result 1 + +test mevmtcl-icl_rewind-2.1 {Rewind is not sanity checked} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_rewind -4 + grammar::me::tcl::icl_get + } -result -4 + +test mevmtcl-icl_rewind-2.2 {Rewind is not sanity checked} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::icl_rewind 50 + grammar::me::tcl::icl_get + } -result 50 + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ier_get-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_get a + } -result {wrong # args: should be "grammar::me::tcl::ier_get"} + + +test mevmtcl-ier_get-2.0 {Get current error} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_get + } -result {} + +test mevmtcl-ier_get-2.1 {Get current error} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_get + } -result {0 'BOGUS'} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ier_clear-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_clear a + } -result {wrong # args: should be "grammar::me::tcl::ier_clear"} + + +test mevmtcl-ier_clear-2.0 {Clear error, no preceding error} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_clear + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-ier_clear-2.1 {Clear error} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_clear + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ier_nonterminal-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_nonterminal + } -result {wrong # args: should be "grammar::me::tcl::ier_nonterminal msg pos"} + +test mevmtcl-ier_nonterminal-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_nonterminal a b c + } -result {wrong # args: should be "grammar::me::tcl::ier_nonterminal msg pos"} + + +test mevmtcl-ier_nonterminal-2.0 {No-op if there is no error} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_nonterminal A 4 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-ier_nonterminal-2.1 {No-op for non-matching locations} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_nonterminal A 4 + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 'BOGUS'} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ier_nonterminal-2.2 {Replace error for matching locations} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_nonterminal A -1 + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 A} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ier_merge-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_merge + } -result {wrong # args: should be "grammar::me::tcl::ier_merge new"} + +test mevmtcl-ier_merge-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_merge a b + } -result {wrong # args: should be "grammar::me::tcl::ier_merge new"} + + +test mevmtcl-ier_merge-2.0 {Both errors empty} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_merge {} + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-ier_merge-2.1 {Stored error empty, argument not} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ier_merge {3 {A dot bar}} + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {3 {A dot bar}} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-ier_merge-2.2 {Stored error non-empty, argument is} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_merge {} + ME_stateText + } -result {tok__ T1 +loc__ -1 +ok___ 0 +error {0 'BOGUS'} +sv___ {} +ast__ {} +input {{T1 1 1 1}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ier_merge-2.3 {Both errors non-empty, stored further} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_merge {0 {A C}} + ME_stateText + } -result {tok__ T3 +loc__ 1 +ok___ 0 +error {2 'BOGUS'} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ier_merge-2.4 {Both errors non-empty, argument further} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_merge {4 {A C}} + ME_stateText + } -result {tok__ T3 +loc__ 1 +ok___ 0 +error {4 {A C}} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ier_merge-2.5 {Both errors non-empty, same location} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_match_token BOGUS 'BOGUS' + } -body { + grammar::me::tcl::ier_merge {2 {A C}} + ME_stateText + } -result {tok__ T3 +loc__ 1 +ok___ 0 +error {2 {'BOGUS' A C}} +sv___ {} +ast__ {} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-isv_clear-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_clear a + } -result {wrong # args: should be "grammar::me::tcl::isv_clear"} + + +test mevmtcl-isv_clear-2.0 {Clear sv, was already clear} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_clear + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-isv_clear-2.1 {Clear sv, after creating something} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::isv_terminal + } -body { + grammar::me::tcl::isv_clear + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {{{} -1 -1}} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-isv_terminal-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_terminal a + } -result {wrong # args: should be "grammar::me::tcl::isv_terminal"} + + +test mevmtcl-isv_terminal-2.0 {Create terminal sv & push} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_terminal + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {{} -1 -1} +ast__ {{{} -1 -1}} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-isv_terminal-2.1 {Create terminal sv & push, after advancing} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + grammar::me::tcl::ict_advance foo + } -body { + grammar::me::tcl::isv_terminal + ME_stateText + } -result {tok__ T3 +loc__ 2 +ok___ 1 +error {} +sv___ {{} 2 2} +ast__ {{{} 2 2}} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-isv_nonterminal_leaf-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_leaf + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_leaf nt pos"} + +test mevmtcl-isv_nonterminal_leaf-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_leaf a b c + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_leaf nt pos"} + + +test mevmtcl-isv_nonterminal_leaf-2.0 {No-op if not ok} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_leaf A -3 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-isv_nonterminal_leaf-2.1 {Generate sv} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + } -body { + grammar::me::tcl::isv_nonterminal_leaf A -3 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 1 +error {} +sv___ {A -2 -1} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-isv_nonterminal_range-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_range + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_range nt pos"} + +test mevmtcl-isv_nonterminal_range-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_range a b c + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_range nt pos"} + + +test mevmtcl-isv_nonterminal_range-2.0 {No-op if not ok} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_range A -3 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-isv_nonterminal_range-2.1 {Generate sv} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + } -body { + grammar::me::tcl::isv_nonterminal_range A -3 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 1 +error {} +sv___ {A -2 -1 {{} -2 -1}} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-isv_nonterminal_reduce-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_reduce + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_reduce nt pos ?mrk?"} + +test mevmtcl-isv_nonterminal_reduce-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_reduce a b c d + } -result {wrong # args: should be "grammar::me::tcl::isv_nonterminal_reduce nt pos ?mrk?"} + + +test mevmtcl-isv_nonterminal_reduce-2.0 {No-op if not ok} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::isv_nonterminal_reduce A -3 + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 0 +error {} +sv___ {} +ast__ {} +input {} +cache {} +next_ fake +ord__ {}} + +test mevmtcl-isv_nonterminal_reduce-2.1 {Generate sv, reduce all} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + } -body { + grammar::me::tcl::isv_nonterminal_reduce A -1 + ME_stateText + } -result {tok__ T3 +loc__ 2 +ok___ 1 +error {} +sv___ {A 0 2 {{} 0 0} {{} 1 1} {{} 2 2}} +ast__ {{{} 0 0} {{} 1 1} {{} 2 2}} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-isv_nonterminal_reduce-2.2 {Generate sv, reduce partial} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + } -body { + grammar::me::tcl::isv_nonterminal_reduce A 0 1 + ME_stateText + } -result {tok__ T3 +loc__ 2 +ok___ 1 +error {} +sv___ {A 1 2 {{} 1 1} {{} 2 2}} +ast__ {{{} 0 0} {{} 1 1} {{} 2 2}} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ias_push-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ias_push a + } -result {wrong # args: should be "grammar::me::tcl::ias_push"} + + +test mevmtcl-ias_push-2.0 {Push sv to ast stack} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + grammar::me::tcl::isv_nonterminal_leaf A -3 + } -body { + grammar::me::tcl::ias_push + ME_stateText + } -result {tok__ {} +loc__ -1 +ok___ 1 +error {} +sv___ {A -2 -1} +ast__ {{A -2 -1}} +input {} +cache {} +next_ fake +ord__ {}} + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ias_mark-1.0 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ias_mark a + } -result {wrong # args: should be "grammar::me::tcl::ias_mark"} + + +test mevmtcl-ias_mark-2.0 {Get ast stack size} \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ias_mark + } -result 0 + +test mevmtcl-ias_mark-2.1 {Get ast stack size} \ + -setup { + grammar::me::tcl::init fake + grammar::me::tcl::iok_ok + grammar::me::tcl::isv_nonterminal_leaf A -3 + grammar::me::tcl::ias_push + } -body { + grammar::me::tcl::ias_mark + } -result 1 + + +# ### ### ### ######### ######### ######### +## + +test mevmtcl-ias_pop2mark-1.0 {Call without enough arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ias_pop2mark + } -result {wrong # args: should be "grammar::me::tcl::ias_pop2mark mark"} + +test mevmtcl-ias_pop2mark-1.1 {Call with too many arguments} \ + -returnCodes error \ + -setup { + grammar::me::tcl::init fake + } -body { + grammar::me::tcl::ias_pop2mark a b + } -result {wrong # args: should be "grammar::me::tcl::ias_pop2mark mark"} + + +test mevmtcl-ias_pop2mark-2.0 {No-op if stack smaller than mark} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + } -body { + grammar::me::tcl::ias_pop2mark 5 + ME_stateText + } -result {tok__ T3 +loc__ 2 +ok___ 1 +error {} +sv___ {{} 2 2} +ast__ {{{} 0 0} {{} 1 1} {{} 2 2}} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3}} +cache {} +next_ next_count +ord__ {}} + +test mevmtcl-ias_pop2mark-2.1 {Reduce to chosen size} \ + -setup { + grammar::me::tcl::init next_count + nc_init + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + grammar::me::tcl::ict_advance foo + grammar::me::tcl::isv_terminal + } -body { + grammar::me::tcl::ias_pop2mark 2 + ME_stateText + } -result {tok__ T4 +loc__ 3 +ok___ 1 +error {} +sv___ {{} 3 3} +ast__ {{{} 0 0} {{} 1 1}} +input {{T1 1 1 1} {T2 1 2 2} {T3 1 3 3} {T4 1 4 4}} +cache {} +next_ next_count +ord__ {}} + + +# ### ### ### ######### ######### ######### +## Cleanup and statistics. + +rename ME_state {} +testsuiteCleanup diff --git a/tcllib/modules/grammar_me/me_util.man b/tcllib/modules/grammar_me/me_util.man new file mode 100644 index 0000000..f8f660e --- /dev/null +++ b/tcllib/modules/grammar_me/me_util.man @@ -0,0 +1,83 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me::util n 0.1] +[keywords {abstract syntax tree}] +[keywords {syntax tree}] +[keywords tree] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {AST utilities}] +[category {Grammars and finite automata}] +[require Tcl 8.4] +[require grammar::me::util [opt 0.1]] +[description] +[para] + +This package provides a number of utility command for the conversion +between the various representations of abstract syntax trees as +specified in the document [syscmd grammar::me_ast]. + +[list_begin definitions] + +[call [cmd ::grammar::me::util::ast2tree] [arg ast] [arg tree] [opt [arg root]]] + +This command converts an [arg ast] from value to object +representation. All nodes in the [arg ast] will be converted into +nodes of this [arg tree], with the root of the AST a child of the node +[arg root]. If this node is not explicitly specified the root of the +tree is used. Existing content of tree is not touched, i.e. neither +removed nor changed, with the exception of the specified root node, +which will gain a new child. + +[call [cmd ::grammar::me::util::ast2etree] [arg ast] [arg mcmd] [arg tree] [opt [arg root]]] + +This command is like [cmd ::grammar::me::util::ast2tree], except that +the result is in the extended object representation of the input AST. + +The source of the extended information is the command prefix + +[arg mcmd]. + +It has to understand two methods, [method lc], and [method tok], with +the semantics specified below. + +[list_begin definitions] + +[call [cmd mcmd] [method lc] [arg location]] + +Takes the location of a token given as offset in the input stream and +return a 2-element list containing the associated line number and +column index, in this order. + +[call [cmd mcmd] [method tok] [arg from] [opt [arg to]]] + +Takes one or two locations [arg from] and [arg to] as offset in the +input stream and returns a Tcl list containing the specified part of +the input stream. Both location are inclusive. If [arg to] is not +specified it will default to the value of [arg from]. + +[para] + +Each element of the returned list is a list containing the token, its +associated lexeme, the line number, and column index, in this order. + +[list_end] +[para] + +Both the ensemble command [cmd ::grammar::me::tcl] provided by the +package [package grammar::me::tcl] and the objects command created by +the package [package ::grammar::me::cpu] fit the above specification. + +[call [cmd ::grammar::me::util::tree2ast] [arg tree] [opt [arg root]]] + +This command converts an [arg ast] in (extended) object representation +into a value and returns it. + +If a [arg root] node is specified the AST is generated from that node +downward. Otherwise the root of the tree object is used as the +starting point. + +[list_end] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/me_util.tcl b/tcllib/modules/grammar_me/me_util.tcl new file mode 100644 index 0000000..625e894 --- /dev/null +++ b/tcllib/modules/grammar_me/me_util.tcl @@ -0,0 +1,188 @@ +# -*- tcl -*- +# ### ### ### ######### ######### ######### +## Package description + +## Utility commands for the conversion between various representations +## of abstract syntax trees. + +# ### ### ### ######### ######### ######### +## Requisites + +namespace eval ::grammar::me::util { + namespace export ast2tree ast2etree tree2ast +} + +# ### ### ### ######### ######### ######### +## Implementation + +# ### ### ### ######### ######### ######### +## API Implementation. + +proc ::grammar::me::util::ast2tree {ast tree {root {}}} { + # See grammar::me_ast for the specification of both value and tree + # representations. + + if {$root eq ""} { + set root [$tree rootname] + } + + # Decompose the AST value into its components. + + if {[llength $ast] < 3} { + return -code error "Bad node \"$ast\", not enough elements" + } + + set type [lindex $ast 0] + set range [lrange $ast 1 2] + set children [lrange $ast 3 end] + + if {($type eq "") && [llength $children]} { + return -code error \ + "Terminal node \"[lrange $ast 0 2]\" has children" + } + foreach {s e} $range break + if { + ![string is integer -strict $s] || ($s < 0) || + ![string is integer -strict $e] || ($e < 0) + } { + return -code error "Bad range information \"$range\"" + } + + # Create a node for the root of the AST and fill it with the data + # from the value. Afterward recurse and build the tree for the + # children of the root. + + set new [lindex [$tree insert $root end] 0] + + if {$type eq ""} { + $tree set $new type terminal + } else { + $tree set $new type nonterminal + $tree set $new detail $type + } + + $tree set $new range $range + + foreach child $children { + ast2tree $child $tree $new + } + return +} + +proc ::grammar::me::util::ast2etree {ast mcmd tree {root {}}} { + # See grammar::me_ast for the specification of both value and tree + # representations. + + if {$root eq ""} { + set root [$tree rootname] + } + + # Decompose the AST value into its components. + + if {[llength $ast] < 3} { + return -code error "Bad node \"$ast\", not enough elements" + } + + set type [lindex $ast 0] + set range [lrange $ast 1 2] + set children [lrange $ast 3 end] + + if {($type eq "") && [llength $children]} { + return -code error \ + "Terminal node \"[lrange $ast 0 2]\" has children" + } + foreach {s e} $range break + if { + ![string is integer -strict $s] || ($s < 0) || + ![string is integer -strict $e] || ($e < 0) + } { + return -code error "Bad range information \"$range\"" + } + + # Create a node for the root of the AST and fill it with the data + # from the value. Afterward recurse and build the tree for the + # children of the root. + + set new [lindex [$tree insert $root end] 0] + + if {$type eq ""} { + set cmd $mcmd + lappend cmd tok + foreach loc $range {lappend cmd $loc} + + $tree set $new type terminal + $tree set $new detail [uplevel \#0 $cmd] + } else { + $tree set $new type nonterminal + $tree set $new detail $type + } + + set range_lc {} + foreach loc $range { + lappend range_lc [uplevel \#0 \ + [linsert $mcmd end lc $loc]] + } + + $tree set $new range $range + $tree set $new range_lc $range_lc + + foreach child $children { + ast2etree $child $mcmd $tree $new + } + return +} + +proc ::grammar::me::util::tree2ast {tree {root {}}} { + # See grammar::me_ast for the specification of both value and tree + # representations. + + if {$root eq ""} { + set root [$tree rootname] + } + + set value {} + + if {![$tree keyexists $root type]} { + return -code error "Bad node \"$root\", type information is missing" + } + if {![$tree keyexists $root range]} { + return -code error "Bad node \"$root\", range information is missing" + } + + set range [$tree get $root range] + if {[llength $range] != 2} { + return -code error "Bad node \"root\", bad range information \"$range\"" + } + + foreach {s e} $range break + if { + ![string is integer -strict $s] || ($s < 0) || + ![string is integer -strict $e] || ($e < 0) + } { + return -code error "Bad node \"root\", bad range information \"$range\"" + } + + if {[$tree get $root type] eq "terminal"} { + lappend value {} + } else { + if {![$tree keyexists $root detail]} { + return -code error "Bad node \"$root\", nonterminal detail is missing" + } + + lappend value [$tree get $root detail] + } + + # Range data ... + lappend value $s $e + + foreach child [$tree children $root] { + lappend value [tree2ast $tree $child] + } + + return $value +} + +# ### ### ### ######### ######### ######### +## Package Management + +package provide grammar::me::util 0.1 diff --git a/tcllib/modules/grammar_me/me_util.test b/tcllib/modules/grammar_me/me_util.test new file mode 100644 index 0000000..c49dd73 --- /dev/null +++ b/tcllib/modules/grammar_me/me_util.test @@ -0,0 +1,168 @@ +# me_util.test: tests for the AST utilities -*- tcl -*- +# +# This file contains a collection of tests for one or more of the Tcl +# built-in commands. Sourcing this file into Tcl runs the tests and +# generates output for errors. No output means no errors were found. +# +# Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net> +# All rights reserved. +# +# RCS: @(#) $Id: me_util.test,v 1.7 2007/08/01 22:49:26 andreas_kupries Exp $ + +# ------------------------------------------------------------------------- + +source [file join \ + [file dirname [file dirname [file join [pwd] [info script]]]] \ + devtools testutilities.tcl] + +testsNeedTcl 8.4 +testsNeedTcltest 2.1 + +support { + useAccel [useTcllibC] struct/tree.tcl struct::tree + TestAccelInit struct::tree +} +testing { + useLocal me_util.tcl grammar::me::util +} + +# ------------------------------------------------------------------------- + +# ------------------------------------------------------------------------- + +## Pre-requisites. An AST value and various serializations of plain +## and extended tree representations of the same AST. Plus helper +## commands for the checking of trees for structural equality. + +set ast {a 0 56 {{} 3 15} {b 16 40 {d 16 20} {{} 21 40}} {c 41 56}} + +set serial_0 { + root {} {} + node0 0 {type nonterminal detail a range {0 56}} + node1 3 {type terminal range {3 15}} + node2 3 {type nonterminal detail b range {16 40}} + node3 3 {type nonterminal detail c range {41 56}} + node4 9 {type nonterminal detail d range {16 20}} + node5 9 {type terminal range {21 40}} +} + +set serial_0a { + node0 {} {type nonterminal detail a range {0 56}} + node1 0 {type terminal range {3 15}} + node2 0 {type nonterminal detail b range {16 40}} + node3 0 {type nonterminal detail c range {41 56}} + node4 6 {type nonterminal detail d range {16 20}} + node5 6 {type terminal range {21 40}} +} + +set serial_1 { + root {} {} + foo 0 {} + node0 3 {type nonterminal detail a range {0 56}} + node1 6 {type terminal range {3 15}} + node2 6 {type nonterminal detail b range {16 40}} + node3 6 {type nonterminal detail c range {41 56}} + node4 12 {type nonterminal detail d range {16 20}} + node5 12 {type terminal range {21 40}} +} + +set serial_2 { + root {} {} + node0 0 {type nonterminal detail a range {0 56} range_lc {{l0 c0} {l56 c56}}} + node1 3 {type terminal range {3 15} range_lc {{l3 c3} {l15 c15}} detail {{T3 l3 c3 L3} {T4 l4 c4 L4} {T5 l5 c5 L5} {T6 l6 c6 L6} {T7 l7 c7 L7} {T8 l8 c8 L8} {T9 l9 c9 L9} {T10 l10 c10 L10} {T11 l11 c11 L11} {T12 l12 c12 L12} {T13 l13 c13 L13} {T14 l14 c14 L14} {T15 l15 c15 L15}}} + node2 3 {type nonterminal detail b range {16 40} range_lc {{l16 c16} {l40 c40}}} + node3 3 {type nonterminal detail c range {41 56} range_lc {{l41 c41} {l56 c56}}} + node4 9 {type nonterminal detail d range {16 20} range_lc {{l16 c16} {l20 c20}}} + node5 9 {type terminal range {21 40} range_lc {{l21 c21} {l40 c40}} detail {{T21 l21 c21 L21} {T22 l22 c22 L22} {T23 l23 c23 L23} {T24 l24 c24 L24} {T25 l25 c25 L25} {T26 l26 c26 L26} {T27 l27 c27 L27} {T28 l28 c28 L28} {T29 l29 c29 L29} {T30 l30 c30 L30} {T31 l31 c31 L31} {T32 l32 c32 L32} {T33 l33 c33 L33} {T34 l34 c34 L34} {T35 l35 c35 L35} {T36 l36 c36 L36} {T37 l37 c37 L37} {T38 l38 c38 L38} {T39 l39 c39 L39} {T40 l40 c40 L40}}} +} + +set serial_2a { + node0 {} {type nonterminal detail a range {0 56}} + node1 0 {type terminal range {3 15}} + node2 0 {type nonterminal detail b range {16 40}} + node3 0 {type nonterminal detail c range {41 56}} + node4 6 {type nonterminal detail d range {16 20}} + node5 6 {type terminal range {21 40}} +} + +set serial_3 { + root {} {} + foo 0 {} + node0 3 {type nonterminal detail a range {0 56} range_lc {{l0 c0} {l56 c56}}} + node1 6 {type terminal range {3 15} range_lc {{l3 c3} {l15 c15}} detail {{T3 l3 c3 L3} {T4 l4 c4 L4} {T5 l5 c5 L5} {T6 l6 c6 L6} {T7 l7 c7 L7} {T8 l8 c8 L8} {T9 l9 c9 L9} {T10 l10 c10 L10} {T11 l11 c11 L11} {T12 l12 c12 L12} {T13 l13 c13 L13} {T14 l14 c14 L14} {T15 l15 c15 L15}}} + node2 6 {type nonterminal detail b range {16 40} range_lc {{l16 c16} {l40 c40}}} + node3 6 {type nonterminal detail c range {41 56} range_lc {{l41 c41} {l56 c56}}} + node4 12 {type nonterminal detail d range {16 20} range_lc {{l16 c16} {l20 c20}}} + node5 12 {type terminal range {21 40} range_lc {{l21 c21} {l40 c40}} detail {{T21 l21 c21 L21} {T22 l22 c22 L22} {T23 l23 c23 L23} {T24 l24 c24 L24} {T25 l25 c25 L25} {T26 l26 c26 L26} {T27 l27 c27 L27} {T28 l28 c28 L28} {T29 l29 c29 L29} {T30 l30 c30 L30} {T31 l31 c31 L31} {T32 l32 c32 L32} {T33 l33 c33 L33} {T34 l34 c34 L34} {T35 l35 c35 L35} {T36 l36 c36 L36} {T37 l37 c37 L37} {T38 l38 c38 L38} {T39 l39 c39 L39} {T40 l40 c40 L40}}} +} + +proc tree_equal {ta tb} { + set tna [llength [$ta nodes]] + set tnb [llength [$tb nodes]] + + if {$tna != $tnb} { + puts "sizes: $ta n = $tna != $tnb = $tb n" + return 0 + } + node_equal $ta $tb [$ta rootname] [$tb rootname] +} + +proc node_equal {ta tb na nb} { + if {[dictsort [$ta getall $na]] ne [dictsort [$tb getall $nb]]} { + puts "attr delta $ta $na: [dictsort [$ta getall $na]]\n $tb $nb: [dictsort [$tb getall $nb]]" + return 0 + } + if {[$ta numchildren $na] != [$tb numchildren $nb]} { + puts "#c $na / $nb: [$ta numchildren $na] != [$tb numchildren $nb]" + return 0 + } + foreach ca [$ta children $na] cb [$tb children $nb] { + if {![node_equal $ta $tb $ca $cb]} { + return 0 + } + } + return 1 +} + +proc tsdump {ser} { + set line {} + foreach {a b c} $ser { + lappend line [list $a $b $c] + } + return \t[join $line \n\t] +} + +# ------------------------------------------------------------------------- +# In this section we run all the tests depending on a struct::tree, +# and thus have to test all the available implementations. + +set tests [file join [file dirname [info script]] me_util.testsuite] + +catch {memory validate on} + +TestAccelDo struct::tree impl { + # The global variable 'impl' is part of the public API the + # testsuit (in htmlparse_tree.testsuite) can expect from the + # environment. + + namespace import -force struct::tree + + set usec [time {source $tests} 1] + + #puts "$impl:\t$usec" +} + +catch {memory validate off} + +unset usec +unset tests + +# ------------------------------------------------------------------------- + +## Cleanup and statistics. + +rename tree_equal {} +rename node_equal {} +rename tsdump {} +TestAccelExit struct::tree +testsuiteCleanup diff --git a/tcllib/modules/grammar_me/me_util.testsuite b/tcllib/modules/grammar_me/me_util.testsuite new file mode 100644 index 0000000..6423544 --- /dev/null +++ b/tcllib/modules/grammar_me/me_util.testsuite @@ -0,0 +1,384 @@ +# -*- tcl -*- me_util.test +# ### ### ### ######### ######### ######### +## Suite 1: Values to tree objects. + +set tname [expr {$impl eq "critcl" ? "t" : "::t"}] + + +test ast2tree-${impl}-1.0 {Call without enough arguments} \ + -returnCodes error \ + -body { + grammar::me::util::ast2tree + } -result {wrong # args: should be "grammar::me::util::ast2tree ast tree ?root?"} + +test ast2tree-${impl}-1.1 {Call with to many arguments} \ + -returnCodes error \ + -body { + grammar::me::util::ast2tree a b c d + } -result {wrong # args: should be "grammar::me::util::ast2tree ast tree ?root?"} + +test ast2tree-${impl}-1.2 {Call with bad tree object} \ + -returnCodes error \ + -body { + grammar::me::util::ast2tree {a 1 2} foo + } -result {invalid command name "foo"} + +test ast2tree-${impl}-1.3 {Call with bad node in tree} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2tree {a 1 2} t blub + } -result "parent node \"blub\" does not exist in tree \"$tname\"" + +test ast2tree-${impl}-1.4 {Call with bad AST, terminal node with children} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2tree {{} 1 2 {a 3 4} {c 5 6}} t + } -result {Terminal node "{} 1 2" has children} + +foreach {n range} { + 0 {0 a} + 1 {0 -1} + 2 {a 0} + 3 {-1 0} + 4 {a b} + 5 {a -1} + 6 {-1 b} + 7 {-1 -1} +} { + test ast2tree-${impl}-1.5.$n {Call with bad AST, bad location information} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2tree [linsert $range 0 {}] t + } -result "Bad range information \"[lrange $range 0 end]\"" +} + +foreach {n node} { + 0 {} + 1 {{}} + 2 {{} 0} +} { + test ast2tree-${impl}-1.6.$n {Call with bad AST, node representation too short} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2tree $node t + } -result "Bad node \"[lrange $node 0 end]\", not enough elements" +} + + + +test ast2tree-${impl}-2.0 {Regular conversion} \ + -setup { + struct::tree t + struct::tree tex deserialize $serial_0 + } -cleanup { + t destroy + tex destroy + } -body { + grammar::me::util::ast2tree $ast t + tree_equal t tex + } -result 1 + +test ast2tree-${impl}-2.1 {Regular conversion under non-root root} \ + -setup { + struct::tree t + t insert root end foo + struct::tree tex deserialize $serial_1 + } -cleanup { + t destroy + tex destroy + } -body { + grammar::me::util::ast2tree $ast t foo + tree_equal t tex + } -result 1 + +# ### ### ### ######### ######### ######### +## Suite 2: Values to extended tree objects + +proc tinfo {cmd args} { + # 'tinfo lc 0' is a nice check that things work. + switch -exact -- $cmd { + lc { + return [list l[lindex $args 0] c[lindex $args 0]] + } + tok { + foreach {s e} $args break + set res {} + for {set i $s} {$i <= $e} {incr i} { + lappend res [list T$i l$i c$i L$i] + } + return $res + } + } + return -code error BOGUS +} + +test ast2etree-${impl}-1.0 {Call without enough arguments} \ + -returnCodes error \ + -body { + grammar::me::util::ast2etree + } -result {wrong # args: should be "grammar::me::util::ast2etree ast mcmd tree ?root?"} + +test ast2etree-${impl}-1.1 {Call with to many arguments} \ + -returnCodes error \ + -body { + grammar::me::util::ast2etree a b c d e + } -result {wrong # args: should be "grammar::me::util::ast2etree ast mcmd tree ?root?"} + +test ast2etree-${impl}-1.2 {Call with bad tree object} \ + -returnCodes error \ + -body { + grammar::me::util::ast2etree {a 1 2} tinfo foo + } -result {invalid command name "foo"} + +test ast2etree-${impl}-1.3 {Call with bad info callback} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2etree {a 1 2} foo t + } -result {invalid command name "foo"} + +test ast2etree-${impl}-1.4 {Call with bad node in tree} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2etree {a 1 2} tinfo t blub + } -result "parent node \"blub\" does not exist in tree \"$tname\"" + +test ast2etree-${impl}-1.6 {Call with bad AST, terminal node with children} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2etree {{} 1 2 {a 3 4} {c 5 6}} tinfo t + } -result {Terminal node "{} 1 2" has children} + +foreach {n range} { + 0 {0 a} + 1 {0 -1} + 2 {a 0} + 3 {-1 0} + 4 {a b} + 5 {a -1} + 6 {-1 b} + 7 {-1 -1} +} { + test ast2etree-${impl}-1.7.$n {Call with bad AST, bad location information} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2etree [linsert $range 0 {}] tinfo t + } -result "Bad range information \"[lrange $range 0 end]\"" +} + +foreach {n node} { + 0 {} + 1 {{}} + 2 {{} 0} +} { + test ast2tree-${impl}-1.8.$n {Call with bad AST, node representation too short} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::ast2etree $node tinfo t + } -result "Bad node \"[lrange $node 0 end]\", not enough elements" +} + + + +test ast2etree-${impl}-2.0 {Regular conversion} \ + -setup { + struct::tree t + struct::tree tex deserialize $serial_2 + } -cleanup { + t destroy + tex destroy + } -body { + grammar::me::util::ast2etree $ast tinfo t + tree_equal t tex + } -result 1 + +test ast2etree-${impl}-2.1 {Regular conversion under non-root root} \ + -setup { + struct::tree t + t insert root end foo + struct::tree tex deserialize $serial_3 + } -cleanup { + t destroy + tex destroy + } -body { + grammar::me::util::ast2etree $ast tinfo t foo + tree_equal t tex + } -result 1 + +# ### ### ### ######### ######### ######### +## Suite 3: Tree objects to values. + +test tree2ast-1.0 {Call without enough arguments} \ + -returnCodes error \ + -body { + grammar::me::util::tree2ast + } -result {wrong # args: should be "grammar::me::util::tree2ast tree ?root?"} + +test tree2ast-1.1 {Call with to many arguments} \ + -returnCodes error \ + -body { + grammar::me::util::tree2ast a b c + } -result {wrong # args: should be "grammar::me::util::tree2ast tree ?root?"} + +test tree2ast-1.2 {Call with bad tree object} \ + -returnCodes error \ + -body { + grammar::me::util::tree2ast foo + } -result {invalid command name "foo"} + +test tree2ast-1.3 {Call with bad node in tree} \ + -returnCodes error \ + -setup { + struct::tree t + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t blub + } -result "node \"blub\" does not exist in tree \"$tname\"" + +test tree2ast-1.4 {Call with broken tree, missing type} \ + -returnCodes error \ + -setup { + struct::tree t deserialize {root {} {range {0 2} detail x}} + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result {Bad node "root", type information is missing} + +test tree2ast-1.5.0 {Call with broken tree, missing range, nonterminal} \ + -returnCodes error \ + -setup { + struct::tree t deserialize {root {} {type nonterminal detail x}} + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result {Bad node "root", range information is missing} + +test tree2ast-1.5.1 {Call with broken tree, missing range, terminal} \ + -returnCodes error \ + -setup { + struct::tree t deserialize {root {} {type terminal}} + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result {Bad node "root", range information is missing} + +test tree2ast-1.6 {Call with broken tree, missing detail} \ + -returnCodes error \ + -setup { + struct::tree t deserialize {root {} {type nonterminal range {0 2}}} + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result {Bad node "root", nonterminal detail is missing} + +foreach {n range} { + 0 {0 a} + 1 {0 -1} + 2 {a 0} + 3 {-1 0} + 4 {a b} + 5 {a -1} + 6 {-1 b} + 7 {-1 -1} + 8 {} + 9 {1} + 10 {1 2 3} +} { + test tree2ast-1.7.$n {Call with broken tree, bad location, terminal} \ + -returnCodes error \ + -setup { + struct::tree t deserialize [list root {} [list type terminal range $range]] + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result "Bad node \"root\", bad range information \"$range\"" + + test tree2ast-1.8.$n {Call with broken tree, bad location, nonterminal} \ + -returnCodes error \ + -setup { + struct::tree t deserialize [list root {} [list type nonterminal detail x range $range]] + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result "Bad node \"root\", bad range information \"$range\"" +} + + + +test tree2ast-2.0 {Regular conversion} \ + -setup { + struct::tree t deserialize $serial_0a + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result $ast + +test tree2ast-2.1 {Regular conversion under non-root root} \ + -setup { + struct::tree t deserialize $serial_1 + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t node0 + } -result $ast + +test tree2ast-2.2 {Regular conversion, of extended tree} \ + -setup { + struct::tree t deserialize $serial_2a + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t + } -result $ast + +test tree2ast-2.3 {Regular conversion under non-root root} \ + -setup { + struct::tree t deserialize $serial_3 + } -cleanup { + t destroy + } -body { + grammar::me::util::tree2ast t node0 + } -result $ast diff --git a/tcllib/modules/grammar_me/me_vm.man b/tcllib/modules/grammar_me/me_vm.man new file mode 100644 index 0000000..b5bd36e --- /dev/null +++ b/tcllib/modules/grammar_me/me_vm.man @@ -0,0 +1,663 @@ +[comment {-*- tcl -*- doctools manpage}] +[manpage_begin grammar::me_vm n 0.1] +[keywords grammar] +[keywords parsing] +[keywords {virtual machine}] +[copyright {2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>}] +[moddesc {Grammar operations and usage}] +[titledesc {Virtual machine for parsing token streams}] +[category {Grammars and finite automata}] +[description] + +Please go and read the document [syscmd grammar::me_intro] first for +an overview of the various documents and their relations. + +[para] + +This document specifies a virtual machine for the controlled matching +and parsing of token streams, creating an + +[term {abstract syntax tree}] (short [term AST]) reflecting the +structure of the input. Special machine features are the caching and +reuse of partial results, caching of the encountered input, and the +ability to backtrack in both input and AST creation. + +[para] + +These features make the specified virtual machine especially useful to +packrat parsers based on parsing expression grammars. It is however +not restricted to this type of parser. Normal LL and LR parsers can be +implemented with it as well. + +[para] + +The following sections will discuss first the abstract state kept by +ME virtual machines, and then their instruction set. + +[section {MACHINE STATE}] + +A ME virtual machine manages the following state: + +[list_begin definitions] +[def "[term {Current token}] CT"] + +The token from the input under consideration by the machine. + +[para] + +This information is used and modified by the instructions defined in +the section + +[sectref {TERMINAL MATCHING}]. + +[def "[term {Current location}] CL"] + +The location of the [term {current token}] in the input stream, as +offset relative to the beginning of the stream. The first token is +considered to be at offset [const 0]. + +[para] + +This information is implicitly used and modified by the instructions +defined in the sections + +[sectref {TERMINAL MATCHING}] and +[sectref {NONTERMINAL MATCHING}], + +and can be directly queried and modified by the instructions defined +in section + +[sectref {INPUT LOCATION HANDLING}]. + +[def "[term {Location stack}] LS"] + +In addition to the above a stack of locations, for backtracking. +Locations can put on the stack, removed from it, and removed with +setting the current location. + +[para] + +This information is implicitly used and modified by the instructions +defined in the sections + +[sectref {TERMINAL MATCHING}] and +[sectref {NONTERMINAL MATCHING}], + +and can be directly queried and modified by the instructions defined +in section + +[sectref {INPUT LOCATION HANDLING}]. + +[def "[term {Match status}] OK"] + +A boolean value, the result of the last attempt at matching input. +It is set to [const true] if that attempt was successful, and +[const false] otherwise. + +[para] + +This information is influenced by the instructions defined in the +sections + +[sectref {TERMINAL MATCHING}], +[sectref {NONTERMINAL MATCHING}], and +[sectref {UNCONDITIONAL MATCHING}]. + +It is queried by the instructions defined in the section + +[sectref {CONTROL FLOW}]. + +[def "[term {Semantic value}] SV"] + +The semantic value associated with (generated by) the last attempt at +matching input. Contains either the empty string or a node for the +abstract syntax tree constructed from the input. + +[para] + +This information is influenced by the instructions defined in the +sections + +[sectref {SEMANTIC VALUES}], and +[sectref {AST STACK HANDLING}]. + +[def "[term {AST stack}] AS"] + +A stack of partial abstract syntax trees constructed by the machine +during matching. + +[para] + +This information is influenced by the instructions defined in the +sections + +[sectref {SEMANTIC VALUES}], and +[sectref {AST STACK HANDLING}]. + +[def "[term {AST Marker stack}] MS"] + +In addition to the above a stack of stacks, for backtracking. This is +actually a stack of markers into the AST stack, thus implicitly +snapshooting the state of the AST stack at some point in time. Markers +can be put on the stack, dropped from it, and used to roll back the +AST stack to an earlier state. + +[para] + +This information is influenced by the instructions defined in the +sections + +[sectref {SEMANTIC VALUES}], and +[sectref {AST STACK HANDLING}]. + +[def "[term {Error status}] ER"] + +Error information associated with the last attempt at matching +input. Contains either the empty string or a list of 2 elements, a +location in the input and a list of error messages associated with +it, in this order. + +[para] + +[emph Note] that error information can be set even if the last attempt +at matching input was successful. For example the *-operator (matching +a sub-expression zero or more times) in a parsing expression grammar +is always successful, even if it encounters a problem further in the +input and has to backtrack. Such problems must not be forgotten when +continuing to match. + +[para] + +This information is queried and influenced by the instructions defined +in the sections + +[sectref {TERMINAL MATCHING}], +[sectref {NONTERMINAL MATCHING}], and +[sectref {ERROR HANDLING}]. + +[def "[term {Error stack}] ES"] + +In addition to the above a stack of error information, to allow the +merging of current and older error information when performing +backtracking in choices after an unsucessful match. + +[para] + +This information is queried and influenced by the instructions defined +in the sections + +[sectref {TERMINAL MATCHING}], +[sectref {NONTERMINAL MATCHING}], and +[sectref {ERROR HANDLING}]. + +[def "[term {Return stack}] RS"] + +A stack of program counter values, i.e. locations in the code +controlling the virtual machine, for the management of subroutine +calls, i.e. the matching of nonterminal symbols. + +[para] + +This information is queried and influenced by the instructions defined +in the section + +[sectref {NONTERMINAL MATCHING}]. + +[def "[term {Nonterminal cache}] NC"] + +A cache of machine states (A 4-tuple containing a location in the +input, match status [term OK], semantic value [term SV], and error +status [term ER]) keyed by name of nonterminal symbol and location in +the input stream. + +[para] + +The key location is where machine started the attempt to match the +named nonterminal symbol, and the location in the value is where +machine ended up after the attempt completed, independent of the +success of the attempt. + +[para] + +This status is queried and influenced by the instructions defined in +the section + +[sectref {NONTERMINAL MATCHING}]. + +[list_end] + +[section {MACHINE INSTRUCTIONS}] + +With the machine state specified it is now possible to explain the +instruction set of ME virtual machines. They are grouped roughly by +the machine state they influence and/or query. + +[subsection {TERMINAL MATCHING}] + +First the instructions to match tokens from the input stream, and +by extension all terminal symbols. + +[para] + +These instructions are the only ones which may retrieve a new token +from the input stream. This is a [emph may] and not a [emph will] +because the instructions will a retrieve new token if, and only if the +current location [term CL] is at the head of the stream. + +If the machine has backtracked (see [cmd icl_rewind]) the instructions +will retrieve the token to compare against from the internal cache. + +[para] +[list_begin definitions] + +[def "[cmd ict_advance] [arg message]"] + +This instruction tries to advance to the next token in the input +stream, i.e. the one after the current location [term CL]. The +instruction will fail if, and only if the end of the input stream is +reached, i.e. if there is no next token. + +[para] + +The sucess/failure of the instruction is remembered in the match +status [term OK]. In the case of failure the error status [term ER] is +set to the current location and the message [arg message]. + +In the case of success the error status [term ER] is cleared, the new +token is made the current token [term CT], and the new location is +made the current location [term CL]. + +[para] + +The argument [arg message] is a reference to the string to put into +the error status [term ER], if such is needed. + +[def "[cmd ict_match_token] [arg tok] [arg message]"] + +This instruction tests the current token [term CT] for equality +with the argument [arg tok] and records the result in the match +status [term OK]. The instruction fails if the current token is +not equal to [arg tok]. + +[para] + +In case of failure the error status [term ER] is set to the current +location [term CL] and the message [arg message], and the +current location [term CL] is moved one token backwards. + +Otherwise, i.e. upon success, the error status [term ER] is cleared +and the current location [term CL] is not touched. + +[def "[cmd ict_match_tokrange] [arg tokbegin] [arg tokend] [arg message]"] + +This instruction tests the current token [term CT] for being in +the range of tokens from [arg tokbegin] to [arg tokend] +(inclusive) and records the result in the match status [term OK]. The +instruction fails if the current token is not inside the range. + +[para] + +In case of failure the error status [term ER] is set to the current +location [term CL] and the message [arg message], and the current location +[term CL] is moved one token backwards. + +Otherwise, i.e. upon success, the error status [term ER] is cleared +and the current location [term CL] is not touched. + +[def "[cmd ict_match_tokclass] [arg code] [arg message]"] + +This instruction tests the current token [term CT] for being a member +of the token class [arg code] and records the result in the match +status [term OK]. The instruction fails if the current token is not a +member of the specified class. + +[para] + +In case of failure the error status [term ER] is set to the current +location [term CL] and the message [arg message], and the +current location [term CL] is moved one token backwards. + +Otherwise, i.e. upon success, the error status [term ER] is cleared +and the current location [term CL] is not touched. + +[para] + +Currently the following classes are legal: + +[list_begin definitions] +[def alnum] +A token is accepted if it is a unicode alphabetical character, or a digit. +[def alpha] +A token is accepted if it is a unicode alphabetical character. +[def digit] +A token is accepted if it is a unicode digit character. +[def xdigit] +A token is accepted if it is a hexadecimal digit character. +[def punct] +A token is accepted if it is a unicode punctuation character. +[def space] +A token is accepted if it is a unicode space character. +[list_end] + +[list_end] +[para] + +[subsection {NONTERMINAL MATCHING}] + +The instructions in this section handle the matching of nonterminal +symbols. They query the nonterminal cache [term NC] for saved +information, and put such information into the cache. + +[para] + +The usage of the cache is a performance aid for backtracking parsers, +allowing them to avoid an expensive rematch of complex nonterminal +symbols if they have been encountered before. + +[para] + +[list_begin definitions] + +[def "[cmd inc_restore] [arg branchlabel] [arg nt]"] + +This instruction checks if the nonterminal cache [term NC] contains +information about the nonterminal symbol [arg nt], at the current +location [term CL]. If that is the case the instruction will update +the machine state (current location [term CL], match status [term OK], +semantic value [term SV], and error status [term ER]) with the found +information and continue execution at the instruction refered to by +the [arg branchlabel]. The new current location [term CL] will be the +last token matched by the nonterminal symbol, i.e. belonging to it. + +[para] + +If no information was found the instruction will continue execution at +the next instruction. + +[para] + +Together with [cmd icf_ntcall] it is possible to generate code for +memoized and non-memoized matching of nonterminal symbols, either as +subroutine calls, or inlined in the caller. + +[def "[cmd inc_save] [arg nt]"] + +This instruction saves the current state of the machine (current +location [term CL], match status [term OK], semantic value [term SV], +and error status [term ER]), to the nonterminal cache [term NC]. It +will also pop an entry from the location stack [term LS] and save it +as the start location of the match. + +[para] + +It is expected to be called at the end of matching a nonterminal +symbol, with [arg nt] the name of the nonterminal symbol the code was +working on. This allows the instruction [cmd inc_restore] to check for +and retrieve the data, should we have to match this nonterminal symbol +at the same location again, during backtracking. + +[def "[cmd icf_ntcall] [arg branchlabel]"] + +This instruction invokes the code for matching the nonterminal symbol +[arg nt] as a subroutine. To this end it stores the current program +counter [term PC] on the return stack [term RS], the current location +[term CL] on the location stack [term LS], and then continues +execution at the address [arg branchlabel]. + +[para] + +The next matching [cmd icf_ntreturn] will cause the execution to +continue at the instruction coming after the call. + +[def [cmd icf_ntreturn]] + +This instruction will pop an entry from the return stack [term RS], +assign it to the program counter [term PC], and then continue +execution at the new address. + +[list_end] +[para] + +[subsection {UNCONDITIONAL MATCHING}] + +The instructions in this section are the remaining match +operators. They change the match status [term OK] directly and +unconditionally. + +[list_begin definitions] + +[def [cmd iok_ok]] + +This instruction sets the match status [term OK] to [const true], +indicating a successful match. + +[def [cmd iok_fail]] + +This instruction sets the match status [term OK] to [const false], +indicating a failed match. + +[def [cmd iok_negate]] + +This instruction negates the match status [term OK], turning a failure +into a success and vice versa. + +[list_end] +[para] + +[subsection {CONTROL FLOW}] + +The instructions in this section implement both conditional and +unconditional control flow. The conditional jumps query the match +status [term OK]. + +[list_begin definitions] + +[def "[cmd icf_jalways] [arg branchlabel]"] + +This instruction sets the program counter [term PC] to the address +specified by [arg branchlabel] and then continues execution from +there. This is an unconditional jump. + +[def "[cmd icf_jok] [arg branchlabel]"] + +This instruction sets the program counter [term PC] to the address +specified by [arg branchlabel]. This happens if, and only if the match +status [term OK] indicates a success. Otherwise it simply continues +execution at the next instruction. This is a conditional jump. + +[def "[cmd icf_jfail] [arg branchlabel]"] + +This instruction sets the program counter [term PC] to the address +specified by [arg branchlabel]. This happens if, and only if the match +status [term OK] indicates a failure. Otherwise it simply continues +execution at the next instruction. This is a conditional jump. + +[def [cmd icf_halt]] + +This instruction halts the machine and blocks any further execution. + +[list_end] + +[subsection {INPUT LOCATION HANDLING}] + +The instructions in this section are for backtracking, they manipulate +the current location [term CL] of the machine state. + +They allow a user of the machine to query and save locations in the +input, and to rewind the current location [term CL] to saved +locations, making them one of the components enabling the +implementation of backtracking parsers. + +[list_begin definitions] + +[def [cmd icl_push]] + +This instruction pushes a copy of the current location [term CL] on +the location stack [term LS]. + +[def [cmd icl_rewind]] + +This instruction pops an entry from the location stack [term LS] and +then moves the current location [term CL] back to this point in the +input. + +[def [cmd icl_pop]] + +This instruction pops an entry from the location stack [term LS] and +discards it. + +[list_end] +[para] + +[subsection {ERROR HANDLING}] + +The instructions in this section provide read and write access to the +error status [term ER] of the machine. + +[list_begin definitions] + +[def [cmd ier_push]] + +This instruction pushes a copy of the current error status [term ER] +on the error stack [term ES]. + +[def [cmd ier_clear]] + +This instruction clears the error status [term ER]. + +[def "[cmd ier_nonterminal] [arg message]"] + +This instruction checks if the error status [term ER] contains an +error whose location is just past the location found in the top entry +of the location stack [term LS]. + +Nothing happens if no such error is found. + +Otherwise the found error is replaced by an error at the location +found on the stack, having the message [arg message]. + +[def [cmd ier_merge]] + +This instruction pops an entry from the error stack [term ES], merges +it with the current error status [term ER] and stores the result of +the merge as the new error status [term ER]. + +[para] + +The merge is performed as described below: + +[para] + +If one of the two error states is empty the other is chosen. If +neither error state is empty, and refering to different locations, +then the error state with the location further in the input is +chosen. If both error states refer to the same location their messages +are merged (with removing duplicates). + +[list_end] + +[subsection {SEMANTIC VALUES}] + +The instructions in this section manipulate the semantic value +[term SV]. + +[list_begin definitions] + +[def [cmd isv_clear]] + +This instruction clears the semantic value [term SV]. + +[def [cmd isv_terminal]] + +This instruction creates a terminal AST node for the current token +[term CT], makes it the semantic value [term SV], and also pushes the +node on the AST stack [term AS]. + +[def "[cmd isv_nonterminal_leaf] [arg nt]"] + +This instruction creates a nonterminal AST node without any children +for the nonterminal [arg nt], and makes it the semantic value +[term SV]. + +[para] + +This instruction should be executed if, and only if the match status +[term OK] indicates a success. + +In the case of a failure [cmd isv_clear] should be called. + +[def "[cmd isv_nonterminal_range] [arg nt]"] + +This instruction creates a nonterminal AST node for the nonterminal + +[arg nt], with a single terminal node as its child, and makes this AST +the semantic value [term SV]. The terminal node refers to the input +string from the location found on top of the location stack [term LS] +to the current location [term CL] (both inclusive). + +[para] + +This instruction should be executed if, and only if the match status +[term OK] indicates a success. + +In the case of a failure [cmd isv_clear] should be called. + +[def "[cmd isv_nonterminal_reduce] [arg nt]"] + +This instruction creates a nonterminal AST node for the nonterminal +[arg nt] and makes it the semantic value [term SV]. + +[para] + +All entries on the AST stack [term AS] above the marker found in the +top entry of the AST Marker stack [term MS] become children of the new +node, with the entry at the stack top becoming the rightmost child. If +the AST Marker stack [term MS] is empty the whole stack is used. The +AST marker stack [term MS] is left unchanged. + +[para] + +This instruction should be executed if, and only if the match status +[term OK] indicates a success. + +In the case of a failure [cmd isv_clear] should be called. + +[list_end] +[para] + +[subsection {AST STACK HANDLING}] + +The instructions in this section manipulate the AST stack [term AS], +and the AST Marker stack [term MS]. + +[list_begin definitions] + +[def [cmd ias_push]] + +This instruction pushes the semantic value [term SV] on the AST stack +[term AS]. + +[def [cmd ias_mark]] + +This instruction pushes a marker for the current state of the AST +stack [term AS] on the AST Marker stack [term MS]. + +[def [cmd ias_mrewind]] + +This instruction pops an entry from the AST Marker stack [term MS] and +then proceeds to pop entries from the AST stack [term AS] until the +state represented by the popped marker has been reached again. + +Nothing is done if the AST stack [term AS] is already smaller than +indicated by the popped marker. + +[def [cmd ias_mpop]] + +This instruction pops an entry from the AST Marker stack [term MS] and +discards it. + +[list_end] + +[vset CATEGORY grammar_me] +[include ../doctools2base/include/feedback.inc] +[manpage_end] diff --git a/tcllib/modules/grammar_me/pkgIndex.tcl b/tcllib/modules/grammar_me/pkgIndex.tcl new file mode 100644 index 0000000..f43762a --- /dev/null +++ b/tcllib/modules/grammar_me/pkgIndex.tcl @@ -0,0 +1,7 @@ +if {![package vsatisfies [package provide Tcl] 8.4]} {return} + +package ifneeded grammar::me::util 0.1 [list source [file join $dir me_util.tcl]] +package ifneeded grammar::me::tcl 0.1 [list source [file join $dir me_tcl.tcl]] +package ifneeded grammar::me::cpu 0.2 [list source [file join $dir me_cpu.tcl]] +package ifneeded grammar::me::cpu::core 0.2 [list source [file join $dir me_cpucore.tcl]] +package ifneeded grammar::me::cpu::gasm 0.1 [list source [file join $dir gasm.tcl]] |