-rw-r--r-- InternalDocs/README.md 2
-rw-r--r-- InternalDocs/compiler.md 651
2 files changed, 653 insertions, 0 deletions
diff --git a/InternalDocs/README.md b/InternalDocs/README.md
index a2502fb..42f6125 100644
--- a/InternalDocs/README.md
+++ b/InternalDocs/README.md
@@ -12,6 +12,8 @@ it is not, please report that through the
[issue tracker](https://github.com/python/cpython/issues).
+[Compiler Design](compiler.md)
+
[Exception Handling](exception_handling.md)
[Adaptive Instruction Families](adaptive.md)
diff --git a/InternalDocs/compiler.md b/InternalDocs/compiler.md
new file mode 100644
index 0000000..0abc10d
--- /dev/null
+++ b/InternalDocs/compiler.md
@@ -0,0 +1,651 @@
+
+Compiler design
+===============
+
+Abstract
+--------
+
+In CPython, the compilation from source code to bytecode involves several steps:
+
+1. Tokenize the source code
+ [Parser/lexer/](https://github.com/python/cpython/blob/main/Parser/lexer/)
+ and [Parser/tokenizer/](https://github.com/python/cpython/blob/main/Parser/tokenizer/).
+2. Parse the stream of tokens into an Abstract Syntax Tree
+ [Parser/parser.c](https://github.com/python/cpython/blob/main/Parser/parser.c).
+3. Transform AST into an instruction sequence
+ [Python/compile.c](https://github.com/python/cpython/blob/main/Python/compile.c).
+4. Construct a Control Flow Graph and apply optimizations to it
+ [Python/flowgraph.c](https://github.com/python/cpython/blob/main/Python/flowgraph.c).
+5. Emit bytecode based on the Control Flow Graph
+ [Python/assemble.c](https://github.com/python/cpython/blob/main/Python/assemble.c).
+
+This document outlines how these steps of the process work.
+
+This document only describes parsing in enough depth to explain what is needed
+for understanding compilation. It provides a detailed, though not
+exhaustive, view of how the entire system works. You will most likely need
+to read some source code to have an exact understanding of all details.
+
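The steps listed above can be observed from Python itself. The following is a minimal sketch, not a description of the C implementation: the standard `tokenize` module stands in for the C tokenizer, and the built-in `compile()` covers steps 3 to 5 in a single call. The variable names and the `"<demo>"` filename are illustrative.

```python
import ast
import io
import tokenize

source = "answer = 6 * 7\n"

# Step 1: tokenize the source text (pure-Python view of the token stream).
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [tok.string for tok in tokens if tok.string.strip()]

# Step 2: parse the source into an AST (ast.parse tokenizes internally).
tree = ast.parse(source, filename="<demo>")

# Steps 3-5: compile() generates instructions, optimizes them, and assembles
# the final code object.
code = compile(tree, filename="<demo>", mode="exec")

# Run the resulting code object to confirm that it is executable.
namespace = {}
exec(code, namespace)

print(token_strings)
print(type(tree).__name__, type(code).__name__, namespace["answer"])
```

Passing the tree rather than the source string to `compile()` makes the boundary between parsing and code generation visible.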
+
+Parsing
+=======
+
+As of Python 3.9, Python's parser is a PEG parser of a somewhat
+unusual design. It is unusual in the sense that the parser's input is a stream
+of tokens rather than a stream of characters, which is more common with PEG
+parsers.
+
+The grammar file for Python can be found in
+[Grammar/python.gram](https://github.com/python/cpython/blob/main/Grammar/python.gram).
+The definitions for literal tokens (such as ``:``, numbers, etc.) can be found in
+[Grammar/Tokens](https://github.com/python/cpython/blob/main/Grammar/Tokens).
+Various C files, including
+[Parser/parser.c](https://github.com/python/cpython/blob/main/Parser/parser.c)
+are generated from these.
+
+See Also:
+
+* [Guide to the parser](https://devguide.python.org/internals/parser/index.html)
+ for a detailed description of the parser.
+
+* [Changing CPython’s grammar](https://devguide.python.org/developer-workflow/grammar/#grammar)
+ for a detailed description of the grammar.
+
+
+Abstract syntax trees (AST)
+===========================
+
+
+The abstract syntax tree (AST) is a high-level representation of the
+program structure that does not need to contain the source code;
+it can be thought of as an abstract representation of the source code. The
+AST nodes are specified using the Zephyr Abstract
+Syntax Description Language (ASDL) [^1], [^2].
+
+The definition of the AST nodes for Python is found in the file
+[Parser/Python.asdl](https://github.com/python/cpython/blob/main/Parser/Python.asdl).
+
+Each AST node (representing statements, expressions, and several
+specialized types, like list comprehensions and exception handlers) is
+defined by the ASDL. Most definitions in the AST correspond to a
+particular source construct, such as an 'if' statement or an attribute
+lookup. The definition is independent of its realization in any
+particular programming language.
+
+The following fragment of the Python ASDL construct demonstrates the
+approach and syntax:
+
+```
+    module Python
+    {
+        stmt = FunctionDef(identifier name, arguments args, stmt* body,
+                           expr* decorators)
+              | Return(expr? value) | Yield(expr? value)
+              attributes (int lineno)
+    }
+```
+
+The preceding example describes two different kinds of statements and an
+expression: function definitions, return statements, and yield expressions.
+All three kinds are considered of type ``stmt`` as shown by ``|`` separating
+the various kinds. They all take arguments of various kinds and amounts.
+
+Modifiers on the argument type specify the number of values needed; ``?``
+means it is optional, ``*`` means 0 or more, while no modifier means only one
+value for the argument and it is required. ``FunctionDef``, for instance,
+takes an ``identifier`` for the *name*, ``arguments`` for *args*, zero or more
+``stmt`` arguments for *body*, and zero or more ``expr`` arguments for
+*decorators*.
+
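The effect of the modifiers can be checked with the `ast` module, which exposes the same node definitions to Python. This is a small sketch; note that the real `Python.asdl` names the decorator field `decorator_list`, whereas the fragment above is simplified.

```python
import ast

tree = ast.parse(
    "@decorator\n"
    "def f(x):\n"
    "    return\n"
)
func = tree.body[0]
ret = func.body[0]

# 'identifier name' has no modifier: exactly one required value.
print(repr(func.name))
# 'arguments args' is a single node, not a sequence.
print(type(func.args).__name__)
# Fields declared with '*' hold zero or more values and appear as lists.
print(type(func.body).__name__, len(func.decorator_list))
# 'expr? value' on Return is optional: a bare 'return' leaves it as None.
print(ret.value)
```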
+Note that something like ``arguments``, which is a node type, is
+represented as a single AST node and not as a sequence of nodes, as one
+might expect and as is the case for ``stmt``.
+
+All three kinds also have an ``attributes`` argument; this is shown by the
+fact that ``attributes`` lacks a ``|`` before it.
+
+The statement definitions above generate the following C structure type:
+
+
+```
+    typedef struct _stmt *stmt_ty;
+
+    struct _stmt {
+        enum { FunctionDef_kind=1, Return_kind=2, Yield_kind=3 } kind;
+        union {
+            struct {
+                identifier name;
+                arguments_ty args;
+                asdl_seq *body;
+                asdl_seq *decorators;
+            } FunctionDef;
+
+            struct {
+                expr_ty value;
+            } Return;
+
+            struct {
+                expr_ty value;
+            } Yield;
+        } v;
+        int lineno;
+    };
+```
+
+Also generated are a series of constructor functions that allocate (in
+this case) a ``stmt_ty`` struct with the appropriate initialization. The
+``kind`` field specifies which component of the union is initialized. The
+``FunctionDef()`` constructor function sets ``kind`` to ``FunctionDef_kind`` and
+initializes the *name*, *args*, *body*, *decorators*, and *attributes* fields.
+
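The Python-level `ast` classes mirror these constructor functions, so the idea can be tried without writing C. The sketch below builds a tree by hand; the node's class plays the role of the C ``kind`` tag, and `ast.fix_missing_locations()` fills in the location attributes that hand-built nodes lack. The statement being built (`result = 40 + 2`) is only an illustration.

```python
import ast

# Build the AST for "result = 40 + 2" by calling node constructors directly.
assign = ast.Assign(
    targets=[ast.Name(id="result", ctx=ast.Store())],
    value=ast.BinOp(
        left=ast.Constant(value=40),
        op=ast.Add(),
        right=ast.Constant(value=2),
    ),
)
module = ast.Module(body=[assign], type_ignores=[])

# Nodes built by hand have no location attributes yet; fill them in.
ast.fix_missing_locations(module)

# The node's class identifies which kind of statement this is.
kind = type(module.body[0]).__name__

namespace = {}
exec(compile(module, "<demo>", "exec"), namespace)
print(kind, module.body[0].lineno, namespace["result"])
```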
+See also
+[Green Tree Snakes - The missing Python AST docs](https://greentreesnakes.readthedocs.io/en/latest)
+ by Thomas Kluyver.
+
+Memory management
+=================
+
+Before discussing the actual implementation of the compiler, a discussion of
+how memory is handled is in order. To make memory management simple, an **arena**
+is used that pools memory in a single location for easy
+allocation and removal. This removes the need for explicit memory
+deallocation. Because every memory allocation in the compiler
+registers that memory with the arena, a single call to free the arena is all
+that is needed to completely free all memory used by the compiler.
+
+In general, unless you are working on the critical core of the compiler, memory
+management can be completely ignored. But if you are working at either the
+very beginning of the compiler or the end, you need to care about how the arena
+works. All code relating to the arena is in either
+[Include/internal/pycore_pyarena.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_pyarena.h)
+or [Python/pyarena.c](https://github.com/python/cpython/blob/main/Python/pyarena.c).
+
+``_PyArena_New()`` will create a new arena. The returned ``PyArena`` structure
+will store pointers to all memory given to it. This does the bookkeeping of
+what memory needs to be freed when the compiler is finished with the memory it
+used. That freeing is done with ``_PyArena_Free()``. This only needs to be
+called in strategic areas where the compiler exits.
+
+As stated above, in general you should not have to worry about memory
+management when working on the compiler. The technical details of memory
+management have been designed to be hidden from you for most cases.
+
+The only exception comes about when managing a PyObject. Since the rest
+of Python uses reference counting, there is extra support added
+to the arena to clean up each PyObject that was allocated. These cases
+are very rare. However, if you've allocated a PyObject, you must tell
+the arena about it by calling ``_PyArena_AddPyObject()``.
+
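The arena pattern described in this section can be modelled in a few lines. This is a toy sketch in Python, not the C implementation: the class `Arena` and its methods `malloc`, `add_object`, `live_count`, and `free` are hypothetical names chosen for illustration.

```python
class Arena:
    """Toy arena: remembers every allocation so one call can release them all."""

    def __init__(self):
        self._blocks = []    # raw buffers handed out by malloc()
        self._objects = []   # extra objects registered for cleanup

    def malloc(self, size):
        # Every allocation is recorded in the arena as a side effect.
        block = bytearray(size)
        self._blocks.append(block)
        return block

    def add_object(self, obj):
        # Analogous to telling the arena about an object it should clean up.
        self._objects.append(obj)
        return obj

    def live_count(self):
        return len(self._blocks) + len(self._objects)

    def free(self):
        # A single call drops every reference the arena holds.
        self._blocks.clear()
        self._objects.clear()


arena = Arena()
node_a = arena.malloc(32)
node_b = arena.malloc(64)
arena.add_object("interned name")
count_before = arena.live_count()
arena.free()
count_after = arena.live_count()
print(count_before, count_after)
```

The design choice this illustrates is that callers never pair an allocation with its own release; only the code that owns the arena decides when everything goes away.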
+
+Source code to AST
+==================
+
+The AST is generated from source code using the function
+``_PyParser_ASTFromString()`` or ``_PyParser_ASTFromFile()``
+[Parser/peg_api.c](https://github.com/python/cpython/blob/main/Parser/peg_api.c).
+
+After some checks, a helper function in
+[Parser/parser.c](https://github.com/python/cpython/blob/main/Parser/parser.c)
+begins applying production rules on the source code it receives, converting source
+code to tokens and matching these tokens recursively to their corresponding rule. The
+production rule's corresponding rule function is called on every match. These rule
+functions follow the format `xx_rule`, where *xx* is the grammar rule
+that the function handles. They are automatically derived from
+[Grammar/python.gram](https://github.com/python/cpython/blob/main/Grammar/python.gram) by
+[Tools/peg_generator/pegen/c_generator.py](https://github.com/python/cpython/blob/main/Tools/peg_generator/pegen/c_generator.py).
+
+Each rule function in turn creates an AST node as it goes along. It does this
+by allocating all the new nodes it needs, calling the proper AST node creation
+functions, and connecting the nodes as needed.
+This continues until all nonterminal symbols are replaced with terminals. If an
+error occurs, the rule functions backtrack and try another rule function. If
+there are no more rules, an error is set and the parsing ends.
+
+The AST node creation helper functions have the name `_PyAST_{xx}`
+where *xx* is the AST node that the function creates. These are defined by the
+ASDL grammar and contained in
+[Python/Python-ast.c](https://github.com/python/cpython/blob/main/Python/Python-ast.c)
+(which is generated by
+[Parser/asdl_c.py](https://github.com/python/cpython/blob/main/Parser/asdl_c.py)
+from
+[Parser/Python.asdl](https://github.com/python/cpython/blob/main/Parser/Python.asdl)).
+This all leads to a sequence of AST nodes stored in ``asdl_seq`` structs.
+
+To demonstrate everything explained so far, here's the
+rule function responsible for a simple named import statement such as
+``import sys``. Note that error-checking and debugging code has been
+omitted. Removed parts are represented by ``...``.
+Furthermore, some comments have been added for explanation. These comments
+may not be present in the actual code.
+
+
+```
+    // This is the production rule (from python.gram) the rule function
+    // corresponds to:
+    // import_name: 'import' dotted_as_names
+    static stmt_ty
+    import_name_rule(Parser *p)
+    {
+        ...
+        stmt_ty _res = NULL;
+        { // 'import' dotted_as_names
+            ...
+            Token * _keyword;
+            asdl_alias_seq* a;
+            // The tokenizing steps.
+            if (
+                (_keyword = _PyPegen_expect_token(p, 513))  // token='import'
+                &&
+                (a = dotted_as_names_rule(p))  // dotted_as_names
+            )
+            {
+                ...
+                // Generate an AST for the import statement.
+                _res = _PyAST_Import ( a , ...);
+                ...
+                goto done;
+            }
+            ...
+        }
+        _res = NULL;
+      done:
+        ...
+        return _res;
+    }
+```
+
+
+To improve backtracking performance, some rules (chosen by applying a
+``(memo)`` flag in the grammar file) are memoized. Each rule function checks if
+a memoized version exists and returns that if so, else it continues in the
+manner stated in the previous paragraphs.
+
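Rule functions, backtracking, and memoization can be sketched in Python for a much smaller grammar. The class `MiniParser` below is hypothetical and handles only two alternatives for an import statement; it is not how the generated C parser is organised, but it follows the same pattern: each rule consumes tokens, resets the position on failure, and caches results by rule name and start position.

```python
import ast
import io
import tokenize


class MiniParser:
    """Toy token-based PEG parser for a simplified import statement."""

    def __init__(self, source):
        readline = io.StringIO(source).readline
        self.tokens = [
            tok for tok in tokenize.generate_tokens(readline)
            if tok.type in (tokenize.NAME, tokenize.OP)
        ]
        self.pos = 0
        self.memo = {}       # (rule name, start position) -> (result, end position)
        self.rule_calls = 0  # real (non-memoized) evaluations of dotted_name

    def expect(self, text):
        # Consume the next token only if it matches; otherwise leave pos unchanged.
        if self.pos < len(self.tokens) and self.tokens[self.pos].string == text:
            self.pos += 1
            return True
        return False

    def name(self):
        if self.pos < len(self.tokens) and self.tokens[self.pos].type == tokenize.NAME:
            self.pos += 1
            return self.tokens[self.pos - 1].string
        return None

    def dotted_name_rule(self):
        # dotted_name: NAME ('.' NAME)*
        key = ("dotted_name", self.pos)
        if key in self.memo:
            result, self.pos = self.memo[key]  # reuse the earlier result
            return result
        self.rule_calls += 1
        parts = []
        part = self.name()
        while part is not None:
            parts.append(part)
            mark = self.pos
            part = self.name() if self.expect(".") else None
            if part is None:
                self.pos = mark  # undo a trailing '.' with no name after it
        result = ".".join(parts) if parts else None
        self.memo[key] = (result, self.pos)
        return result

    def import_rule(self):
        start = self.pos
        # Alternative 1: 'import' dotted_name 'as' NAME
        if self.expect("import"):
            module = self.dotted_name_rule()
            if module is not None and self.expect("as"):
                alias = self.name()
                if alias is not None:
                    return ast.Import(names=[ast.alias(name=module, asname=alias)])
        self.pos = start  # backtrack and try the next alternative
        # Alternative 2: 'import' dotted_name
        if self.expect("import"):
            module = self.dotted_name_rule()
            if module is not None:
                return ast.Import(names=[ast.alias(name=module, asname=None)])
        self.pos = start
        return None


parser = MiniParser("import os.path\n")
node = parser.import_rule()
print(node.names[0].name, parser.rule_calls)
```

For `import os.path` the first alternative fails at the missing `as`, the parser backtracks, and the second alternative finds `dotted_name` already in the memo, so `rule_calls` stays at one.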
+There are macros for creating and using ``asdl_xx_seq *`` types, where *xx* is
+a type of the ASDL sequence. Three main types are defined
+manually -- ``generic``, ``identifier`` and ``int``. These types are found in
+[Python/asdl.c](https://github.com/python/cpython/blob/main/Python/asdl.c)
+and its corresponding header file
+[Include/internal/pycore_asdl.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_asdl.h).
+Functions and macros for creating ``asdl_xx_seq *`` types are as follows:
+
+* ``_Py_asdl_generic_seq_new(Py_ssize_t, PyArena *)``
+ Allocate memory for an ``asdl_generic_seq`` of the specified length
+* ``_Py_asdl_identifier_seq_new(Py_ssize_t, PyArena *)``
+ Allocate memory for an ``asdl_identifier_seq`` of the specified length
+* ``_Py_asdl_int_seq_new(Py_ssize_t, PyArena *)``
+ Allocate memory for an ``asdl_int_seq`` of the specified length
+
+In addition to the three types mentioned above, some ASDL sequence types are
+automatically generated by
+[Parser/asdl_c.py](https://github.com/python/cpython/blob/main/Parser/asdl_c.py)
+and found in
+[Include/internal/pycore_ast.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_ast.h).
+Macros for using both manually defined and automatically generated ASDL
+sequence types are as follows:
+
+* ``asdl_seq_GET(asdl_xx_seq *, int)``
+ Get item held at a specific position in an ``asdl_xx_seq``
+* ``asdl_seq_SET(asdl_xx_seq *, int, stmt_ty)``
+ Set a specific index in an ``asdl_xx_seq`` to the specified value
+
+Untyped counterparts exist for some of the typed macros. These are useful
+when a function needs to manipulate a generic ASDL sequence:
+
+* ``asdl_seq_GET_UNTYPED(asdl_seq *, int)``
+ Get item held at a specific position in an ``asdl_seq``
+* ``asdl_seq_SET_UNTYPED(asdl_seq *, int, stmt_ty)``
+ Set a specific index in an ``asdl_seq`` to the specified value
+* ``asdl_seq_LEN(asdl_seq *)``
+ Return the length of an ``asdl_seq`` or ``asdl_xx_seq``
+
+Note that typed macros and functions are recommended over their untyped
+counterparts. Typed macros carry out checks in debug mode and aid
+debugging errors caused by incorrectly casting from ``void *``.
+
+If you are working with statements, you must also worry about keeping
+track of what line number generated the statement. Currently the line
+number is passed as the last parameter to each ``stmt_ty`` function.
+
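The location information attached to statements is visible on the Python side as node attributes. A short sketch, using the `ast` module rather than the C constructors:

```python
import ast

tree = ast.parse("x = 1\n\ny = x + 1\n")

# Every statement node records where it started in the source.
locations = [
    (type(node).__name__, node.lineno, node.col_offset)
    for node in tree.body
]
print(locations)
```

The blank second line is why the second assignment reports line 3.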
+See also [PEP 617: New PEG parser for CPython](https://peps.python.org/pep-0617/).
+
+
+Control flow graphs
+===================
+
+A **control flow graph** (often referenced by its acronym, **CFG**) is a
+directed graph that models the flow of a program. A node of a CFG is
+not an individual bytecode instruction, but instead represents a
+sequence of bytecode instructions that always execute sequentially.
+Each node is called a *basic block* and must always execute from
+start to finish, with a single entry point at the beginning and a
+single exit point at the end. If some bytecode instruction *a* needs
+to jump to some other bytecode instruction *b*, then *a* must occur at
+the end of its basic block, and *b* must occur at the start of its
+basic block.
+
+As an example, consider the following code snippet:
+
+```
+    if x < 10:
+        f1()
+        f2()
+    else:
+        g()
+    end()
+```
+
+The ``x < 10`` guard is represented by its own basic block that
+compares ``x`` with ``10`` and then ends in a conditional jump based on
+the result of the comparison. This conditional jump allows the block
+to point to both the body of the ``if`` and the body of the ``else``. The
+``if`` basic block contains the ``f1()`` and ``f2()`` calls and points to
+the ``end()`` basic block. The ``else`` basic block contains the ``g()``
+call and similarly points to the ``end()`` block.
+
+Note that more complex code in the guard, the ``if`` body, or the ``else``
+body may be represented by multiple basic blocks. For instance,
+short-circuiting boolean logic in a guard like ``if x or y:``
+will produce one basic block that tests the truth value of ``x``
+and then points both (1) to the start of the ``if`` body and (2) to
+a different basic block that tests the truth value of ``y``.
+
+CFGs are useful as an intermediate representation of the code because
+they are a convenient data structure for optimizations.
+
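As an illustration of how a linear instruction list is partitioned into basic blocks, here is a minimal sketch. The instruction format `(label, opname, jump_target)`, the opcode names, and the function `build_cfg` are all made up for this example; the real data structures in `flowgraph.c` are different.

```python
# Blocks ending in one of these never fall through to the next block.
UNCONDITIONAL = {"JUMP", "RETURN"}


def build_cfg(instrs):
    """Split (label, opname, target) triples into basic blocks and edges."""
    # A block starts at the first instruction, at every label, and
    # immediately after every jump.
    leaders = {0}
    for i, (label, _, target) in enumerate(instrs):
        if label is not None:
            leaders.add(i)
        if target is not None and i + 1 < len(instrs):
            leaders.add(i + 1)
    starts = sorted(leaders)
    blocks = [instrs[s:e] for s, e in zip(starts, starts[1:] + [len(instrs)])]

    # Labels always sit on the first instruction of a block.
    label_to_block = {
        blk[0][0]: n for n, blk in enumerate(blocks) if blk[0][0] is not None
    }
    edges = {}
    for n, blk in enumerate(blocks):
        _, opname, target = blk[-1]
        successors = []
        if target is not None:
            successors.append(label_to_block[target])
        if opname not in UNCONDITIONAL and n + 1 < len(blocks):
            successors.append(n + 1)  # fall through to the next block
        edges[n] = successors
    return blocks, edges


# The if/else example from the text, written as toy pseudo-instructions.
program = [
    (None, "COMPARE_X_LT_10", None),
    (None, "POP_JUMP_IF_FALSE", "else"),
    (None, "CALL_F1", None),
    (None, "CALL_F2", None),
    (None, "JUMP", "end"),
    ("else", "CALL_G", None),
    ("end", "CALL_END", None),
    (None, "RETURN", None),
]
blocks, edges = build_cfg(program)
print(len(blocks), edges)
```

Block 0 is the guard and points at both the ``else`` block (2) and the ``if`` body (1); both bodies then lead to block 3, which holds the ``end()`` call.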
+AST to CFG to bytecode
+======================
+
+The conversion of an ``AST`` to bytecode is initiated by a call to the function
+``_PyAST_Compile()`` in
+[Python/compile.c](https://github.com/python/cpython/blob/main/Python/compile.c).
+
+The first step is to construct the symbol table. This is implemented by
+``_PySymtable_Build()`` in
+[Python/symtable.c](https://github.com/python/cpython/blob/main/Python/symtable.c).
+This function begins by entering the starting code block for the AST (passed-in)
+and then calling the proper `symtable_visit_{xx}` function (with *xx* being the
+AST node type). Next, the AST is walked, and the various code blocks that
+delineate the reach of a local variable are entered and exited using
+``symtable_enter_block()`` and ``symtable_exit_block()``, respectively.
+
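The result of symbol table construction can be inspected through the standard `symtable` module. The sketch below classifies three names inside a function; the source snippet and the `scopes` dictionary are illustrative only.

```python
import symtable

source = (
    "counter = 0\n"
    "def bump(step):\n"
    "    total = counter + step\n"
    "    return total\n"
)
module_table = symtable.symtable(source, "<demo>", "exec")
# The module block has one child block: the function body.
function_table = module_table.get_children()[0]

scopes = {}
for name in ("step", "total", "counter"):
    symbol = function_table.lookup(name)
    if symbol.is_parameter():
        scopes[name] = "parameter"
    elif symbol.is_local():
        scopes[name] = "local"
    elif symbol.is_global():
        scopes[name] = "global"
print(function_table.get_name(), scopes)
```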
+Once the symbol table is created, the ``AST`` is transformed by ``compiler_codegen()``
+in [Python/compile.c](https://github.com/python/cpython/blob/main/Python/compile.c)
+into a sequence of pseudo instructions. These are similar to bytecode, but
+in some cases they are more abstract, and are resolved later into actual
+bytecode. The construction of this instruction sequence is handled by several
+functions that break the task down by various AST node types. The functions are
+all named `compiler_visit_{xx}` where *xx* is the name of the node type (such
+as ``stmt``, ``expr``, etc.). Each function receives a ``struct compiler *``
+and `{xx}_ty` where *xx* is the AST node type. Typically these functions
+consist of a large 'switch' statement, branching based on the kind of
+node type passed to it. Simple things are handled inline in the
+'switch' statement with more complex transformations farmed out to other
+functions named `compiler_{xx}` with *xx* being a descriptive name of what is
+being handled.
+
+When transforming an arbitrary AST node, use the ``VISIT()`` macro.
+The appropriate `compiler_visit_{xx}` function is called, based on the node type
+passed as the second argument (so `VISIT({c}, expr, {node})` calls
+`compiler_visit_expr({c}, {node})`). The ``VISIT_SEQ()`` macro is very similar,
+but is called on AST node sequences (those values that were created as
+arguments to a node that used the '*' modifier).
+
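The visit-and-dispatch structure can be sketched in Python for a tiny subset of expressions. The class `ExprCodegen`, its `addop` method, and the pseudo-opcode names are hypothetical; the sketch only mirrors the shape of the C code, with `isinstance` checks standing in for the 'switch' on the node kind.

```python
import ast


class ExprCodegen:
    """Toy code generator: dispatch on node kind, emit pseudo-instructions."""

    def __init__(self):
        self.instructions = []  # list of (opname, argument) pairs

    def addop(self, opname, arg=None):
        self.instructions.append((opname, arg))

    def visit(self, node):
        if isinstance(node, ast.Constant):
            self.addop("LOAD_CONST", node.value)
        elif isinstance(node, ast.Name):
            self.addop("LOAD_NAME", node.id)
        elif isinstance(node, ast.BinOp):
            # Operands are visited first so their values are on the stack
            # when the operator instruction runs.
            self.visit(node.left)
            self.visit(node.right)
            self.addop("BINARY_OP", type(node.op).__name__)
        else:
            raise NotImplementedError(type(node).__name__)


expr = ast.parse("x * 2 + 1", mode="eval").body
gen = ExprCodegen()
gen.visit(expr)
for instruction in gen.instructions:
    print(instruction)
```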
+Emission of bytecode is handled by the following macros:
+
+* ``ADDOP(struct compiler *, location, int)``
+ add a specified opcode
+* ``ADDOP_IN_SCOPE(struct compiler *, location, int)``
+ like ``ADDOP``, but also exits current scope; used for adding return value
+ opcodes in lambdas and closures
+* ``ADDOP_I(struct compiler *, location, int, Py_ssize_t)``
+ add an opcode that takes an integer argument
+* ``ADDOP_O(struct compiler *, location, int, PyObject *, TYPE)``
+ add an opcode with the proper argument based on the position of the
+ specified PyObject in PyObject sequence object, but with no handling of
+ mangled names; used for when you
+ need to do named lookups of objects such as globals, consts, or
+ parameters where name mangling is not possible and the scope of the
+ name is known; *TYPE* is the name of PyObject sequence
+ (``names`` or ``varnames``)
+* ``ADDOP_N(struct compiler *, location, int, PyObject *, TYPE)``
+ just like ``ADDOP_O``, but steals a reference to PyObject
+* ``ADDOP_NAME(struct compiler *, location, int, PyObject *, TYPE)``
+ just like ``ADDOP_O``, but name mangling is also handled; used for
+ attribute loading or importing based on name
+* ``ADDOP_LOAD_CONST(struct compiler *, location, PyObject *)``
+ add the ``LOAD_CONST`` opcode with the proper argument based on the
+ position of the specified PyObject in the consts table.
+* ``ADDOP_LOAD_CONST_NEW(struct compiler *, location, PyObject *)``
+ just like ``ADDOP_LOAD_CONST``, but steals a reference to PyObject
+* ``ADDOP_JUMP(struct compiler *, location, int, basicblock *)``
+ create a jump to a basic block
+
+The ``location`` argument is a struct with the source location to be
+associated with this instruction. It is typically extracted from an
+``AST`` node with the ``LOC`` macro. ``NO_LOCATION`` can be used
+for *synthetic* instructions, which we do not associate with a line
+number at this stage. For example, the implicit ``return None``
+which is added at the end of a function is not associated with any
+line in the source code.
+
+There are several helper functions that will emit pseudo-instructions
+and are named `compiler_{xx}()` where *xx* is what the function helps
+with (``list``, ``boolop``, etc.). A rather useful one is ``compiler_nameop()``.
+This function looks up the scope of a variable and, based on the
+expression context, emits the proper opcode to load, store, or delete
+the variable.
+
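The effect of this scope-based opcode selection shows up in disassembly. The sketch below uses the `dis` module on a small function; exact opcode names vary between CPython versions, so it only checks for the general families (a global load for `len`, a fast-local store for `size`). The function `measure` is just an example.

```python
import dis


def measure(items):
    size = len(items)  # 'len' is a global name, 'size' is a local
    return size


opnames = [ins.opname for ins in dis.get_instructions(measure)]

# Global lookups and local stores use different opcode families.
uses_global_load = any(name == "LOAD_GLOBAL" for name in opnames)
uses_fast_store = any(name.startswith("STORE_FAST") for name in opnames)
print(opnames)
print(uses_global_load, uses_fast_store)
```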
+Once the instruction sequence is created, it is transformed into a CFG
+by ``_PyCfg_FromInstructionSequence()``. Then ``_PyCfg_OptimizeCodeUnit()``
+applies various peephole optimizations, and
+``_PyCfg_OptimizedCfgToInstructionSequence()`` converts the optimized ``CFG``
+back into an instruction sequence. These conversions and optimizations are
+implemented in
+[Python/flowgraph.c](https://github.com/python/cpython/blob/main/Python/flowgraph.c).
+
+Finally, the sequence of pseudo-instructions is converted into actual
+bytecode. This includes transforming pseudo instructions into actual instructions,
+converting jump targets from logical labels to relative offsets, and
+construction of the
+[exception table](exception_handling.md) and
+[locations table](https://github.com/python/cpython/blob/main/Objects/locations.md).
+The bytecode and tables are then wrapped into a ``PyCodeObject`` along with additional
+metadata, including the ``consts`` and ``names`` arrays, information about the function,
+and references to the source code (filename, etc.). All of this is implemented by
+``_PyAssemble_MakeCodeObject()`` in
+[Python/assemble.c](https://github.com/python/cpython/blob/main/Python/assemble.c).
+
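The conversion of jump targets from logical labels to relative offsets can be sketched as a two-pass procedure. This is a simplification under stated assumptions: every instruction is taken to occupy exactly one unit, the offset is measured from the instruction following the jump, and the function `resolve_jumps` and the opcode names are invented for the example. The real assembler also has to deal with variable instruction sizes.

```python
def resolve_jumps(instrs):
    """Turn (label, opname, target_label) triples into (opname, offset) pairs."""
    # First pass: record the position of every label.
    positions = {
        label: index
        for index, (label, _, _) in enumerate(instrs)
        if label is not None
    }
    # Second pass: replace each symbolic target with a relative offset.
    resolved = []
    for index, (_, opname, target) in enumerate(instrs):
        if target is None:
            resolved.append((opname, None))
        else:
            resolved.append((opname, positions[target] - (index + 1)))
    return resolved


program = [
    (None, "LOAD_X", None),
    (None, "POP_JUMP_IF_FALSE", "else"),
    (None, "CALL_F", None),
    (None, "JUMP", "end"),
    ("else", "CALL_G", None),
    ("end", "RETURN", None),
]
resolved = resolve_jumps(program)
for instruction in resolved:
    print(instruction)
```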
+
+Code objects
+============
+
+The result of ``_PyAST_Compile()`` is a ``PyCodeObject``, which is defined in
+[Include/cpython/code.h](https://github.com/python/cpython/blob/main/Include/cpython/code.h).
+And with that you now have executable Python bytecode!
+
+The code objects (bytecode) are executed in
+[Python/ceval.c](https://github.com/python/cpython/blob/main/Python/ceval.c).
+If you add a new opcode, this file also needs a new case for it in the big switch
+statement in ``_PyEval_EvalFrameDefault()``.
+
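A code object can be examined and run from Python. The following sketch compiles a one-line module and looks at a few of its attributes; the source line and the `"<demo>"` filename are illustrative.

```python
source = "greeting = 'hello ' + name\n"
code = compile(source, "<demo>", "exec")

# The code object carries the bytecode plus the metadata described above.
print(type(code).__name__)
print(code.co_filename, sorted(code.co_names))
print(len(code.co_code) > 0)

# Executing it requires a namespace that supplies the free name 'name'.
namespace = {"name": "world"}
exec(code, namespace)
print(namespace["greeting"])
```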
+
+Important files
+===============
+
+* [Parser/](https://github.com/python/cpython/blob/main/Parser/)
+
+ * [Parser/Python.asdl](https://github.com/python/cpython/blob/main/Parser/Python.asdl):
+ ASDL syntax file.
+
+ * [Parser/asdl.py](https://github.com/python/cpython/blob/main/Parser/asdl.py):
+ Parser for ASDL definition files.
+ Reads in an ASDL description and parses it into an AST that describes it.
+
+ * [Parser/asdl_c.py](https://github.com/python/cpython/blob/main/Parser/asdl_c.py):
+ Generate C code from an ASDL description. Generates
+ [Python/Python-ast.c](https://github.com/python/cpython/blob/main/Python/Python-ast.c)
+ and
+ [Include/internal/pycore_ast.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_ast.h).
+
+ * [Parser/parser.c](https://github.com/python/cpython/blob/main/Parser/parser.c):
+ The new PEG parser introduced in Python 3.9.
+ Generated by
+ [Tools/peg_generator/pegen/c_generator.py](https://github.com/python/cpython/blob/main/Tools/peg_generator/pegen/c_generator.py)
+ from the grammar [Grammar/python.gram](https://github.com/python/cpython/blob/main/Grammar/python.gram).
+ Creates the AST from source code. Rule functions for their corresponding production
+ rules are found here.
+
+ * [Parser/peg_api.c](https://github.com/python/cpython/blob/main/Parser/peg_api.c):
+ Contains high-level functions which are
+ used by the interpreter to create an AST from source code.
+
+ * [Parser/pegen.c](https://github.com/python/cpython/blob/main/Parser/pegen.c):
+ Contains helper functions which are used by functions in
+ [Parser/parser.c](https://github.com/python/cpython/blob/main/Parser/parser.c)
+ to construct the AST. Also contains helper functions which help raise better error messages
+ when parsing source code.
+
+ * [Parser/pegen.h](https://github.com/python/cpython/blob/main/Parser/pegen.h):
+ Header file for the corresponding
+ [Parser/pegen.c](https://github.com/python/cpython/blob/main/Parser/pegen.c).
+ Also contains definitions of the ``Parser`` and ``Token`` structs.
+
+* [Python/](https://github.com/python/cpython/blob/main/Python)
+
+ * [Python/Python-ast.c](https://github.com/python/cpython/blob/main/Python/Python-ast.c):
+ Creates C structs corresponding to the ASDL types. Also contains code for
+ marshalling AST nodes (core ASDL types have marshalling code in
+ [Python/asdl.c](https://github.com/python/cpython/blob/main/Python/asdl.c)).
+ File automatically generated by
+ [Parser/asdl_c.py](https://github.com/python/cpython/blob/main/Parser/asdl_c.py).
+ This file must be committed separately after every grammar change
+ is committed since the ``__version__`` value is set to the latest
+ grammar change revision number.
+
+ * [Python/asdl.c](https://github.com/python/cpython/blob/main/Python/asdl.c):
+ Contains code to handle the ASDL sequence type.
+ Also has code to handle marshalling the core ASDL types, such as number
+ and identifier. Used by
+ [Python/Python-ast.c](https://github.com/python/cpython/blob/main/Python/Python-ast.c)
+ for marshalling AST nodes.
+
+ * [Python/ast.c](https://github.com/python/cpython/blob/main/Python/ast.c):
+ Used for validating the AST.
+
+ * [Python/ast_opt.c](https://github.com/python/cpython/blob/main/Python/ast_opt.c):
+ Optimizes the AST.
+
+ * [Python/ast_unparse.c](https://github.com/python/cpython/blob/main/Python/ast_unparse.c):
+ Converts the AST expression node back into a string (for string annotations).
+
+ * [Python/ceval.c](https://github.com/python/cpython/blob/main/Python/ceval.c):
+ Executes byte code (aka, eval loop).
+
+ * [Python/symtable.c](https://github.com/python/cpython/blob/main/Python/symtable.c):
+ Generates a symbol table from AST.
+
+ * [Python/pyarena.c](https://github.com/python/cpython/blob/main/Python/pyarena.c):
+ Implementation of the arena memory manager.
+
+ * [Python/compile.c](https://github.com/python/cpython/blob/main/Python/compile.c):
+ Emits pseudo bytecode based on the AST.
+
+ * [Python/flowgraph.c](https://github.com/python/cpython/blob/main/Python/flowgraph.c):
+ Implements peephole optimizations.
+
+ * [Python/assemble.c](https://github.com/python/cpython/blob/main/Python/assemble.c):
+ Constructs a code object from a sequence of pseudo instructions.
+
+ * [Python/instruction_sequence.c](https://github.com/python/cpython/blob/main/Python/instruction_sequence.c):
+ A data structure representing a sequence of bytecode-like pseudo-instructions.
+
+* [Include/](https://github.com/python/cpython/blob/main/Include/)
+
+ * [Include/cpython/code.h](https://github.com/python/cpython/blob/main/Include/cpython/code.h)
+ : Header file for
+ [Objects/codeobject.c](https://github.com/python/cpython/blob/main/Objects/codeobject.c);
+ contains definition of ``PyCodeObject``.
+
+ * [Include/opcode.h](https://github.com/python/cpython/blob/main/Include/opcode.h)
+ : One of the files that must be modified if
+ [Lib/opcode.py](https://github.com/python/cpython/blob/main/Lib/opcode.py) is.
+
+ * [Include/internal/pycore_ast.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_ast.h)
+ : Contains the actual definitions of the C structs as generated by
+ [Python/Python-ast.c](https://github.com/python/cpython/blob/main/Python/Python-ast.c).
+ Automatically generated by
+ [Parser/asdl_c.py](https://github.com/python/cpython/blob/main/Parser/asdl_c.py).
+
+ * [Include/internal/pycore_asdl.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_asdl.h)
+ : Header for the corresponding
+ [Python/asdl.c](https://github.com/python/cpython/blob/main/Python/asdl.c).
+
+ * [Include/internal/pycore_ast.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_ast.h)
+ : Declares ``_PyAST_Validate()`` external (from
+ [Python/ast.c](https://github.com/python/cpython/blob/main/Python/ast.c)).
+
+ * [Include/internal/pycore_symtable.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_symtable.h)
+ : Header for
+ [Python/symtable.c](https://github.com/python/cpython/blob/main/Python/symtable.c).
+ ``struct symtable`` and ``PySTEntryObject`` are defined here.
+
+ * [Include/internal/pycore_parser.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_parser.h)
+ : Header for the corresponding
+ [Parser/peg_api.c](https://github.com/python/cpython/blob/main/Parser/peg_api.c).
+
+ * [Include/internal/pycore_pyarena.h](https://github.com/python/cpython/blob/main/Include/internal/pycore_pyarena.h)
+ : Header file for the corresponding
+ [Python/pyarena.c](https://github.com/python/cpython/blob/main/Python/pyarena.c).
+
+ * [Include/opcode_ids.h](https://github.com/python/cpython/blob/main/Include/opcode_ids.h)
+ : List of opcodes. Generated from
+ [Python/bytecodes.c](https://github.com/python/cpython/blob/main/Python/bytecodes.c)
+ by
+ [Tools/cases_generator/opcode_id_generator.py](https://github.com/python/cpython/blob/main/Tools/cases_generator/opcode_id_generator.py).
+
+* [Objects/](https://github.com/python/cpython/blob/main/Objects/)
+
+ * [Objects/codeobject.c](https://github.com/python/cpython/blob/main/Objects/codeobject.c)
+ : Contains PyCodeObject-related code.
+
+ * [Objects/frameobject.c](https://github.com/python/cpython/blob/main/Objects/frameobject.c)
+ : Contains the ``frame_setlineno()`` function which should determine whether it is allowed
+ to make a jump between two points in a bytecode.
+
+* [Lib/](https://github.com/python/cpython/blob/main/Lib/)
+
+ * [Lib/opcode.py](https://github.com/python/cpython/blob/main/Lib/opcode.py)
+ : opcode utilities exposed to Python.
+
+ * [Lib/importlib/_bootstrap_external.py](https://github.com/python/cpython/blob/main/Lib/importlib/_bootstrap_external.py)
+ : Home of the magic number (named ``MAGIC_NUMBER``) for bytecode versioning.
+
+
+Objects
+=======
+
+* [Objects/locations.md](https://github.com/python/cpython/blob/main/Objects/locations.md): Describes the location table
+* [Objects/frame_layout.md](https://github.com/python/cpython/blob/main/Objects/frame_layout.md): Describes the frame stack
+* [Objects/object_layout.md](https://github.com/python/cpython/blob/main/Objects/object_layout.md): Describes object layout for 3.11 and later
+* [Exception Handling](exception_handling.md): Describes the exception table
+
+
+Specializing Adaptive Interpreter
+=================================
+
+Adding a specializing, adaptive interpreter to CPython will bring significant
+performance improvements. These documents provide more information:
+
+* [PEP 659: Specializing Adaptive Interpreter](https://peps.python.org/pep-0659/).
+* [Adding or extending a family of adaptive instructions](adaptive.md)
+
+
+References
+==========
+
+[^1]: Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris
+ S. Serra. The Zephyr Abstract Syntax Description Language.
+ In Proceedings of the Conference on Domain-Specific Languages,
+ pp. 213--227, 1997.
+
+[^2]: The Zephyr Abstract Syntax Description Language:
+ https://www.cs.princeton.edu/research/techreps/TR-554-97