summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorGuido van Rossum <guido@python.org>2023-02-07 05:03:58 (GMT)
committerGitHub <noreply@github.com>2023-02-07 05:03:58 (GMT)
commit694e346a01f407c7f78138331db006dea79f82a8 (patch)
treee49e3232888d56da0b8e89db43e9e706398fe136
parent1fcc0efdaa84b3602c236391633b70ff36df149b (diff)
downloadcpython-694e346a01f407c7f78138331db006dea79f82a8.zip
cpython-694e346a01f407c7f78138331db006dea79f82a8.tar.gz
cpython-694e346a01f407c7f78138331db006dea79f82a8.tar.bz2
gh-98831: Move DSL documentation here from ideas repo (#101629)
-rw-r--r--Tools/cases_generator/README.md9
-rw-r--r--Tools/cases_generator/interpreter_definition.md409
2 files changed, 415 insertions, 3 deletions
diff --git a/Tools/cases_generator/README.md b/Tools/cases_generator/README.md
index dc055ea..c595a93 100644
--- a/Tools/cases_generator/README.md
+++ b/Tools/cases_generator/README.md
@@ -1,5 +1,8 @@
# Tooling to generate interpreters
+Documentation for the instruction definitions in `Python/bytecodes.c`
+("the DSL") is [here](interpreter_definition.md).
+
What's currently here:
- `lexer.py`: lexer for C, originally written by Mark Shannon
@@ -7,10 +10,10 @@ What's currently here:
- `parser.py`: Parser for instruction definition DSL; main class `Parser`
- `generate_cases.py`: driver script to read `Python/bytecodes.c` and
write `Python/generated_cases.c.h`
+- `test_generator.py`: tests, require manual running using `pytest`
-The DSL for the instruction definitions in `Python/bytecodes.c` is described
-[here](https://github.com/faster-cpython/ideas/blob/main/3.12/interpreter_definition.md).
-Note that there is some dummy C code at the top and bottom of the file
+Note that there is some dummy C code at the top and bottom of
+`Python/bytecodes.c`
to fool text editors like VS Code into believing this is valid C code.
## A bit about the parser
diff --git a/Tools/cases_generator/interpreter_definition.md b/Tools/cases_generator/interpreter_definition.md
new file mode 100644
index 0000000..c7bd38d
--- /dev/null
+++ b/Tools/cases_generator/interpreter_definition.md
@@ -0,0 +1,409 @@
+# A higher level definition of the bytecode interpreter
+
+## Abstract
+
+The CPython interpreter is defined in C, meaning that the semantics of the
+bytecode instructions, the dispatching mechanism, error handling, and
+tracing and instrumentation are all intermixed.
+
+This document proposes defining a custom C-like DSL for defining the
+instruction semantics and tools for generating the code deriving from
+the instruction definitions.
+
+These tools would be used to:
+* Generate the main interpreter (done)
+* Generate the tier 2 interpreter
+* Generate documentation for instructions
+* Generate metadata about instructions, such as stack use (done).
+
+Having a single definition file ensures that there is a single source
+of truth for bytecode semantics.
+
+Other tools that operate on bytecodes, like `frame.setlineno`
+and the `dis` module, will be derived from the common semantic
+definition, reducing errors.
+
+## Motivation
+
+The bytecode interpreter of CPython has traditionally been defined as standard
+C code, but with a lot of macros.
+The presence of these macros and the nature of bytecode interpreters means
+that the interpreter is effectively defined in a domain specific language (DSL).
+
+Rather than using an ad-hoc DSL embedded in the C code for the interpreter,
+a custom DSL should be defined and the semantics of the bytecode instructions,
+and the instructions defined in that DSL.
+
+Generating the interpreter decouples low-level details of dispatching
+and error handling from the semantics of the instructions, resulting
+in more maintainable code and a potentially faster interpreter.
+
+It also provides the ability to create and check optimizers and optimization
+passes from the semantic definition, reducing errors.
+
+## Rationale
+
+As we improve the performance of CPython, we need to optimize larger regions
+of code, use more complex optimizations and, ultimately, translate to machine
+code.
+
+All of these steps introduce the possibility of more bugs, and require more code
+to be written. One way to mitigate this is through the use of code generators.
+Code generators decouple the debugging of the code (the generator) from checking
+the correctness (the DSL input).
+
+For example, we are likely to want a new interpreter for the tier 2 optimizer
+to be added in 3.12. That interpreter will have a different API, a different
+set of instructions and potentially different dispatching mechanism.
+But the instructions it will interpret will be built from the same building
+blocks as the instructions for the tier 1 (PEP 659) interpreter.
+
+Rewriting all the instructions is tedious and error-prone, and changing the
+instructions is a maintenance headache as both versions need to be kept in sync.
+
+By using a code generator and using a common source for the instructions, or
+parts of instructions, we can reduce the potential for errors considerably.
+
+
+## Specification
+
+This specification is at an early stage and is likely to change considerably.
+
+Syntax
+------
+
+Each op definition has a kind, a name, a stack and instruction stream effect,
+and a piece of C code describing its semantics::
+
+```
+ file:
+ (definition | family)+
+
+ definition:
+ "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}"
+ |
+ "op" "(" NAME "," stack_effect ")" "{" C-code "}"
+ |
+ "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
+ |
+ "super" "(" NAME ")" "=" NAME ("+" NAME)* ";"
+
+ stack_effect:
+ "(" [inputs] "--" [outputs] ")"
+
+ inputs:
+ input ("," input)*
+
+ outputs:
+ output ("," output)*
+
+ input:
+ object | stream | array
+
+ output:
+ object | array
+
+ uop:
+ NAME | stream
+
+ object:
+ NAME [":" type] [ "if" "(" C-expression ")" ]
+
+ type:
+ NAME
+
+ stream:
+ NAME "/" size
+
+ size:
+ INTEGER
+
+ array:
+ object "[" C-expression "]"
+
+ family:
+ "family" "(" NAME ")" = "{" NAME ("," NAME)+ "}" ";"
+```
+
+The following definitions may occur:
+
+* `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
+* `op`: A part instruction from which macros can be constructed.
+* `macro`: A bytecode instruction constructed from ops and cache effects.
+* `super`: A super-instruction, such as `LOAD_FAST__LOAD_FAST`, constructed from
+ normal or macro instructions.
+
+`NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
+`foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
+
+The optional `type` in an `object` is the C type. It defaults to `PyObject *`.
+The objects before the "--" are the objects on top of the the stack at the start
+of the instruction. Those after the "--" are the objects on top of the the stack
+at the end of the instruction.
+
+An `inst` without `stack_effect` is a transitional form to allow the original C code
+definitions to be copied. It lacks information to generate anything other than the
+interpreter, but is useful for initial porting of code.
+
+Stack effect names may be `unused`, indicating the space is to be reserved
+but no use of it will be made in the instruction definition.
+This is useful to ensure that all instructions in a family have the same
+stack effect.
+
+The number in a `stream` define how many codeunits are consumed from the
+instruction stream. It returns a 16, 32 or 64 bit value.
+If the name is `unused` the size can be any value and that many codeunits
+will be skipped in the instruction stream.
+
+By convention cache effects (`stream`) must precede the input effects.
+
+The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream.
+
+The C code may include special functions that are understood by the tools as
+part of the DSL.
+
+Those functions include:
+
+* `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
+* `ERROR_IF(cond, label)`. Jump to error handler if `cond` is true.
+* `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
+
+Variables can either be defined in the input, output, or in the C code.
+Variables defined in the input may not be assigned in the C code.
+If an `ERROR_IF` occurs, all values will be removed from the stack;
+they must already be `DECREF`'ed by the code block.
+If a `DEOPT_IF` occurs, no values will be removed from the stack or
+the instruction stream; no values must have been `DECREF`'ed or created.
+
+These requirements result in the following constraints on the use of
+`DEOPT_IF` and `ERROR_IF` in any instruction's code block:
+
+1. Until the last `DEOPT_IF`, no objects may be allocated, `INCREF`ed,
+ or `DECREF`ed.
+2. Before the first `ERROR_IF`, all input values must be `DECREF`ed,
+ and no objects may be allocated or `INCREF`ed, with the exception
+ of attempting to create an object and checking for success using
+ `ERROR_IF(result == NULL, label)`. (TODO: Unclear what to do with
+ intermediate results.)
+3. No `DEOPT_IF` may follow an `ERROR_IF` in the same block.
+
+Semantics
+---------
+
+The underlying execution model is a stack machine.
+Operations pop values from the stack, and push values to the stack.
+They also can look at, and consume, values from the instruction stream.
+
+All members of a family must have the same stack and instruction stream effect.
+
+Examples
+--------
+
+(Another source of examples can be found in the [tests](test_generator.py).)
+
+Some examples:
+
+### Output stack effect
+```C
+ inst ( LOAD_FAST, (-- value) ) {
+ value = frame->f_localsplus[oparg];
+ Py_INCREF(value);
+ }
+```
+This would generate:
+```C
+ TARGET(LOAD_FAST) {
+ PyObject *value;
+ value = frame->f_localsplus[oparg];
+ Py_INCREF(value);
+ PUSH(value);
+ DISPATCH();
+ }
+```
+
+### Input stack effect
+```C
+ inst ( STORE_FAST, (value --) ) {
+ SETLOCAL(oparg, value);
+ }
+```
+This would generate:
+```C
+ TARGET(STORE_FAST) {
+ PyObject *value = PEEK(1);
+ SETLOCAL(oparg, value);
+ STACK_SHRINK(1);
+ DISPATCH();
+ }
+```
+
+### Super-instruction definition
+
+```C
+ super ( LOAD_FAST__LOAD_FAST ) = LOAD_FAST + LOAD_FAST ;
+```
+This might get translated into the following:
+```C
+ TARGET(LOAD_FAST__LOAD_FAST) {
+ PyObject *value;
+ value = frame->f_localsplus[oparg];
+ Py_INCREF(value);
+ PUSH(value);
+ NEXTOPARG();
+ next_instr++;
+ value = frame->f_localsplus[oparg];
+ Py_INCREF(value);
+ PUSH(value);
+ DISPATCH();
+ }
+```
+
+### Input stack effect and cache effect
+```C
+ op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
+ PyTypeObject *tp = Py_TYPE(owner);
+ assert(type_version != 0);
+ DEOPT_IF(tp->tp_version_tag != type_version);
+ }
+```
+This might become (if it was an instruction):
+```C
+ TARGET(CHECK_OBJECT_TYPE) {
+ PyObject *owner = PEEK(1);
+ uint32 type_version = read32(next_instr);
+ PyTypeObject *tp = Py_TYPE(owner);
+ assert(type_version != 0);
+ DEOPT_IF(tp->tp_version_tag != type_version);
+ next_instr += 2;
+ DISPATCH();
+ }
+```
+
+### More examples
+
+For explanations see "Generating the interpreter" below.)
+```C
+ op ( CHECK_HAS_INSTANCE_VALUES, (owner -- owner) ) {
+ PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
+ DEOPT_IF(!_PyDictOrValues_IsValues(dorv));
+ }
+```
+```C
+ op ( LOAD_INSTANCE_VALUE, (owner, index/1 -- null if (oparg & 1), res) ) {
+ res = _PyDictOrValues_GetValues(dorv)->values[index];
+ DEOPT_IF(res == NULL);
+ Py_INCREF(res);
+ null = NULL;
+ Py_DECREF(owner);
+ }
+```
+```C
+ macro ( LOAD_ATTR_INSTANCE_VALUE ) =
+ counter/1 + CHECK_OBJECT_TYPE + CHECK_HAS_INSTANCE_VALUES +
+ LOAD_INSTANCE_VALUE + unused/4 ;
+```
+```C
+ op ( LOAD_SLOT, (owner, index/1 -- null if (oparg & 1), res) ) {
+ char *addr = (char *)owner + index;
+ res = *(PyObject **)addr;
+ DEOPT_IF(res == NULL);
+ Py_INCREF(res);
+ null = NULL;
+ Py_DECREF(owner);
+ }
+```
+```C
+ macro ( LOAD_ATTR_SLOT ) = counter/1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused/4;
+```
+```C
+ inst ( BUILD_TUPLE, (items[oparg] -- tuple) ) {
+ tuple = _PyTuple_FromArraySteal(items, oparg);
+ ERROR_IF(tuple == NULL, error);
+ }
+```
+```C
+ inst ( PRINT_EXPR ) {
+ PyObject *value = POP();
+ PyObject *hook = _PySys_GetAttr(tstate, &_Py_ID(displayhook));
+ PyObject *res;
+ if (hook == NULL) {
+ _PyErr_SetString(tstate, PyExc_RuntimeError,
+ "lost sys.displayhook");
+ Py_DECREF(value);
+ goto error;
+ }
+ res = PyObject_CallOneArg(hook, value);
+ Py_DECREF(value);
+ ERROR_IF(res == NULL);
+ Py_DECREF(res);
+ }
+```
+
+### Define an instruction family
+These opcodes all share the same instruction format):
+```C
+ family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT } ;
+```
+
+Generating the interpreter
+==========================
+
+The generated C code for a single instruction includes a preamble and dispatch at the end
+which can be easily inserted. What is more complex is ensuring the correct stack effects
+and not generating excess pops and pushes.
+
+For example, in `CHECK_HAS_INSTANCE_VALUES`, `owner` occurs in the input, so it cannot be
+redefined. Thus it doesn't need to written and can be read without adjusting the stack pointer.
+The C code generated for `CHECK_HAS_INSTANCE_VALUES` would look something like:
+
+```C
+ {
+ PyObject *owner = stack_pointer[-1];
+ PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
+ DEOPT_IF(!_PyDictOrValues_IsValues(dorv));
+ }
+```
+
+When combining ops together to form instructions, temporary values should be used,
+rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something like:
+
+```C
+ case LOAD_ATTR_SLOT: {
+ PyObject *s1 = stack_pointer[-1];
+ /* CHECK_OBJECT_TYPE */
+ {
+ PyObject *owner = s1;
+ uint32_t type_version = read32(next_instr + 1);
+ PyTypeObject *tp = Py_TYPE(owner);
+ assert(type_version != 0);
+ if (tp->tp_version_tag != type_version) goto deopt;
+ }
+ /* LOAD_SLOT */
+ {
+ PyObject *owner = s1;
+ uint16_t index = *(next_instr + 1 + 2);
+ char *addr = (char *)owner + index;
+ PyObject *null;
+ PyObject *res = *(PyObject **)addr;
+ if (res == NULL) goto deopt;
+ Py_INCREF(res);
+ null = NULL;
+ Py_DECREF(owner);
+ if (oparg & 1) {
+ stack_pointer[0] = null;
+ stack_pointer += 1;
+ }
+ s1 = res;
+ }
+ next_instr += (1 + 1 + 2 + 1 + 4);
+ stack_pointer[-1] = s1;
+ DISPATCH();
+ }
+```
+
+Other tools
+===========
+
+From the instruction definitions we can generate the stack marking code used in `frame.set_lineno()`,
+and the tables for use by disassemblers.
+