author | Guido van Rossum <guido@python.org> | 2023-06-14 15:19:24 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-06-14 15:19:24 (GMT) |
commit | 74c2422fa2d7e17969a7554a3bf17f91e4e5a85f (patch) | |
tree | 44153cf969938e0766b723a0b71e07eb06654ee4 | |
parent | 1d857da7f0e4858e561223f319ae5afe737d5657 (diff) | |
Update DSL docs for cases generator (#105753)
* Clarify things around goto error/ERROR_IF a bit
* Remove docs for super-instructions
* Add pseudo; fix heading markup
-rw-r--r-- | Tools/cases_generator/interpreter_definition.md | 104 |
1 files changed, 60 insertions, 44 deletions
diff --git a/Tools/cases_generator/interpreter_definition.md b/Tools/cases_generator/interpreter_definition.md
index 6f902f6..c03870e 100644
--- a/Tools/cases_generator/interpreter_definition.md
+++ b/Tools/cases_generator/interpreter_definition.md
@@ -67,17 +67,17 @@ parts of instructions, we can reduce the potential for errors considerably.
 
 ## Specification
 
-This specification is at an early stage and is likely to change considerably.
+This specification is a work in progress.
+We update it as the need arises.
 
-Syntax
-------
+### Syntax
 
 Each op definition has a kind, a name, a stack and instruction stream effect,
 and a piece of C code describing its semantics::
 
 ```
   file:
-    (definition | family)+
+    (definition | family | pseudo)+
 
   definition:
     "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}"
@@ -85,8 +85,6 @@ and a piece of C code describing its semantics::
     "op" "(" NAME "," stack_effect ")" "{" C-code "}"
     |
     "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
-    |
-    "super" "(" NAME ")" "=" NAME ("+" NAME)* ";"
 
   stack_effect:
     "(" [inputs] "--" [outputs] ")"
@@ -122,7 +120,10 @@ and a piece of C code describing its semantics::
     object "[" C-expression "]"
 
   family:
-    "family" "(" NAME ")" = "{" NAME ("," NAME)+ "}" ";"
+    "family" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
+
+  pseudo:
+    "pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
 ```
 
 The following definitions may occur:
@@ -130,8 +131,6 @@ The following definitions may occur:
 * `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
 * `op`: A part instruction from which macros can be constructed.
 * `macro`: A bytecode instruction constructed from ops and cache effects.
-* `super`: A super-instruction, such as `LOAD_FAST__LOAD_FAST`, constructed from
-  normal or macro instructions.
 
 `NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
 `foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
@@ -159,15 +158,21 @@ By convention cache effects (`stream`) must precede the input effects.
 
 The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream.
 
+### Special functions/macros
+
 The C code may include special functions that are understood by the tools as
 part of the DSL.
 
 Those functions include:
 
 * `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
-* `ERROR_IF(cond, label)`. Jump to error handler if `cond` is true.
+* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
 * `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
 
+Note that the use of `DECREF_INPUTS()` is optional -- manual calls
+to `Py_DECREF()` or other approaches are also acceptable
+(e.g. calling an API that "steals" a reference).
+
 Variables can either be defined in the input, output, or in the C code.
 Variables defined in the input may not be assigned in the C code.
 If an `ERROR_IF` occurs, all values will be removed from the stack;
@@ -187,17 +192,39 @@ These requirements result in the following constraints on the use of
    intermediate results.)
 3. No `DEOPT_IF` may follow an `ERROR_IF` in the same block.
 
-Semantics
----------
+(There is some wiggle room: these rules apply to dynamic code paths,
+not to static occurrences in the source code.)
+
+If code detects an error condition before the first `DECREF` of an input,
+two idioms are valid:
+
+- Use `goto error`.
+- Use a block containing the appropriate `DECREF` calls ending in
+  `ERROR_IF(true, error)`.
+
+An example of the latter would be:
+```cc
+    res = PyObject_Add(left, right);
+    if (res == NULL) {
+        DECREF_INPUTS();
+        ERROR_IF(true, error);
+    }
+```
+
+### Semantics
 
 The underlying execution model is a stack machine.
 Operations pop values from the stack, and push values to the stack.
 They also can look at, and consume, values from the instruction stream.
 
-All members of a family must have the same stack and instruction stream effect.
+All members of a family
+(which represents a specializable instruction and its specializations)
+must have the same stack and instruction stream effect.
+
+The same is true for all members of a pseudo instruction
+(which is mapped by the bytecode compiler to one of its members).
 
-Examples
---------
+## Examples
 
 (Another source of examples can be found in the [tests](test_generator.py).)
 
@@ -237,27 +264,6 @@ This would generate:
     }
 ```
 
-### Super-instruction definition
-
-```C
-    super ( LOAD_FAST__LOAD_FAST ) = LOAD_FAST + LOAD_FAST ;
-```
-This might get translated into the following:
-```C
-    TARGET(LOAD_FAST__LOAD_FAST) {
-        PyObject *value;
-        value = frame->f_localsplus[oparg];
-        Py_INCREF(value);
-        PUSH(value);
-        NEXTOPARG();
-        next_instr++;
-        value = frame->f_localsplus[oparg];
-        Py_INCREF(value);
-        PUSH(value);
-        DISPATCH();
-    }
-```
-
 ### Input stack effect and cache effect
 ```C
     op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
@@ -339,14 +345,26 @@ For explanations see "Generating the interpreter" below.)
     }
 ```
 
-### Define an instruction family
-These opcodes all share the same instruction format):
+### Defining an instruction family
+
+A _family_ represents a specializable instruction and its specializations.
+
+Example: These opcodes all share the same instruction format):
+```C
+    family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
+```
+
+### Defining a pseudo instruction
+
+A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions.
+
+Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`:
 ```C
-    family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT } ;
+    pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };
 ```
 
-Generating the interpreter
-==========================
+
+## Generating the interpreter
 
 The generated C code for a single instruction includes a preamble and dispatch at the end
 which can be easily inserted. What is more complex is ensuring the correct stack effects
@@ -401,9 +419,7 @@ rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something
     }
 ```
 
-Other tools
-===========
+## Other tools
 
 From the instruction definitions we can generate the stack marking code
 used in `frame.set_lineno()`, and the tables for use by disassemblers.
-