summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--InternalDocs/README.md (renamed from InternalDocs/index.md)4
-rw-r--r--InternalDocs/exception_handling.md201
-rw-r--r--Objects/exception_handling_notes.txt182
3 files changed, 205 insertions, 182 deletions
diff --git a/InternalDocs/index.md b/InternalDocs/README.md
index 32b66a2..e69e27d 100644
--- a/InternalDocs/index.md
+++ b/InternalDocs/README.md
@@ -10,3 +10,7 @@ to hold for other implementations of the Python language.
The core dev team attempts to keep this documentation up to date. If
it is not, please report that through the
[issue tracker](https://github.com/python/cpython/issues).
+
+
+[Exception Handling](exception_handling.md)
+
diff --git a/InternalDocs/exception_handling.md b/InternalDocs/exception_handling.md
new file mode 100644
index 0000000..22d9c3b
--- /dev/null
+++ b/InternalDocs/exception_handling.md
@@ -0,0 +1,201 @@
+Description of exception handling
+---------------------------------
+
+Python uses a technique known as "zero-cost" exception handling, which
+minimizes the cost of supporting exceptions. In the common case (where
+no exception is raised) the cost is reduced to zero (or close to zero).
+The cost of raising an exception is increased, but not by much.
+
+The following code:
+
+```
+try:
+ g(0)
+except:
+ res = "fail"
+
+```
+
+compiles into intermediate code like the following:
+
+```
+ RESUME 0
+
+ 1 SETUP_FINALLY 8 (to L1)
+
+ 2 LOAD_NAME 0 (g)
+ PUSH_NULL
+ LOAD_CONST 0 (0)
+ CALL 1
+ POP_TOP
+ POP_BLOCK
+
+ -- L1: PUSH_EXC_INFO
+
+ 3 POP_TOP
+
+ 4 LOAD_CONST 1 ('fail')
+ STORE_NAME 1 (res)
+```
+
+`SETUP_FINALLY` and `POP_BLOCK` are pseudo-instructions. This means
+that they can appear in intermediate code but they are not bytecode
+instructions. `SETUP_FINALLY` specifies that henceforth, exceptions
+are handled by the code at label L1. The `POP_BLOCK` instruction
+reverses the effect of the last `SETUP` instruction, so that the
+active exception handler reverts to what it was before.
+
+`SETUP_FINALLY` and `POP_BLOCK` have no effect when no exceptions
+are raised. The idea of zero-cost exception handling is to replace
+these pseudo-instructions by metadata which is stored alongside the
+bytecode, and which is inspected only when an exception occurs.
+This metadata is the exception table, and it is stored in the code
+object's `co_exceptiontable` field.
+
+When the pseudo-instructions are translated into bytecode,
+`SETUP_FINALLY` and `POP_BLOCK` are removed, and the exception
+table is constructed, mapping each instruction to the exception
+handler that covers it, if any. Instructions which are not
+covered by any exception handler within the same code object's
+bytecode, do not appear in the exception table at all.
+
+For the code object in our example above, the table has a single
+entry specifying that all instructions that were between the
+`SETUP_FINALLY` and the `POP_BLOCK` are covered by the exception
+handler located at label `L1`.
+
+Handling Exceptions
+-------------------
+
+At runtime, when an exception occurs, the interpreter looks up
+the offset of the current instruction in the exception table. If
+it finds a handler, control flow transfers to it. Otherwise, the
+exception bubbles up to the caller, and the caller's frame is
+checked for a handler covering the `CALL` instruction. This
+repeats until a handler is found or the topmost frame is reached.
+If no handler is found, the program terminates. During unwinding,
+the traceback is constructed as each frame is added to it.
+
+Along with the location of an exception handler, each entry of the
+exception table also contains the stack depth of the `try` instruction
+and a boolean `lasti` value, which indicates whether the instruction
+offset of the raising instruction should be pushed to the stack.
+
+Handling an exception, once an exception table entry is found, consists
+of the following steps:
+
+ 1. pop values from the stack until it matches the stack depth for the handler.
+ 2. if `lasti` is true, then push the offset that the exception was raised at.
+ 3. push the exception to the stack.
+ 4. jump to the target offset and resume execution.
+
+
+Reraising Exceptions and `lasti`
+--------------------------------
+
+The purpose of pushing `lasti` to the stack is for cases where an exception
+needs to be re-raised, and be associated with the original instruction that
+raised it. This happens, for example, at the end of a `finally` block, when
+any in-flight exception needs to be propagated on. As the frame's instruction
+pointer now points into the finally block, a `RERAISE` instruction
+(with `oparg > 0`) sets it to the `lasti` value from the stack.
+
+Format of the exception table
+-----------------------------
+
+Conceptually, the exception table consists of a sequence of 5-tuples:
+```
+ 1. `start-offset` (inclusive)
+ 2. `end-offset` (exclusive)
+ 3. `target`
+ 4. `stack-depth`
+ 5. `push-lasti` (boolean)
+```
+
+All offsets and lengths are in code units, not bytes.
+
+We want the format to be compact, but quickly searchable.
+For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
+For it to be searchable quickly, we need to support binary search giving us log(n) performance in all cases.
+Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.
+
+It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
+ `start, size, target, depth, push-lasti`.
+
+Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each code unit takes 2 bytes.
+It also happens that depth is generally quite small.
+
+So, we need to encode:
+```
+ `start` (up to 30 bits)
+ `size` (up to 30 bits)
+ `target` (up to 30 bits)
+ `depth` (up to ~8 bits)
+ `lasti` (1 bit)
+```
+
+We need a marker for the start of the entry, so the first byte of entry will have the most significant bit set.
+Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
+Encoding uses a standard varint encoding, but with only 7 bits instead of the usual 8.
+The 8 bits of a byte are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
+
+In addition, we combine `depth` and `lasti` into a single value, `((depth<<1)+lasti)`, before encoding.
+
+For example, the exception entry:
+```
+ `start`: 20
+ `end`: 28
+ `target`: 100
+ `depth`: 3
+ `lasti`: False
+```
+
+is encoded by first converting to the more compact four value form:
+```
+ `start`: 20
+ `size`: 8
+ `target`: 100
+ `depth<<1+lasti`: 6
+```
+
+which is then encoded as:
+```
+ 148 (MSB + 20 for start)
+ 8 (size)
+ 65 (Extend bit + 1)
+ 36 (Remainder of target, 100 == (1<<6)+36)
+ 6
+```
+
+for a total of five bytes.
+
+
+Script to parse the exception table
+-----------------------------------
+
+```
+def parse_varint(iterator):
+ b = next(iterator)
+ val = b & 63
+ while b&64:
+ val <<= 6
+ b = next(iterator)
+ val |= b&63
+ return val
+```
+```
+def parse_exception_table(code):
+ iterator = iter(code.co_exceptiontable)
+ try:
+ while True:
+ start = parse_varint(iterator)*2
+ length = parse_varint(iterator)*2
+ end = start + length - 2 # Present as inclusive, not exclusive
+ target = parse_varint(iterator)*2
+ dl = parse_varint(iterator)
+ depth = dl >> 1
+ lasti = bool(dl&1)
+ yield start, end, target, depth, lasti
+ except StopIteration:
+ return
+```
diff --git a/Objects/exception_handling_notes.txt b/Objects/exception_handling_notes.txt
deleted file mode 100644
index 387ef93..0000000
--- a/Objects/exception_handling_notes.txt
+++ /dev/null
@@ -1,182 +0,0 @@
-Description of exception handling in Python 3.11
-------------------------------------------------
-
-Python 3.11 uses what is known as "zero-cost" exception handling.
-Prior to 3.11, exceptions were handled by a runtime stack of "blocks".
-
-In zero-cost exception handling, the cost of supporting exceptions is minimized.
-In the common case (where no exception is raised) the cost is reduced
-to zero (or close to zero).
-The cost of raising an exception is increased, but not by much.
-
-The following code:
-
-def f():
- try:
- g(0)
- except:
- return "fail"
-
-compiles as follows in 3.10:
-
- 2 0 SETUP_FINALLY 7 (to 16)
-
- 3 2 LOAD_GLOBAL 0 (g)
- 4 LOAD_CONST 1 (0)
- 6 CALL_NO_KW 1
- 8 POP_TOP
- 10 POP_BLOCK
- 12 LOAD_CONST 0 (None)
- 14 RETURN_VALUE
-
- 4 >> 16 POP_TOP
- 18 POP_TOP
- 20 POP_TOP
-
- 5 22 POP_EXCEPT
- 24 LOAD_CONST 3 ('fail')
- 26 RETURN_VALUE
-
-Note the explicit instructions to push and pop from the "block" stack:
-SETUP_FINALLY and POP_BLOCK.
-
-In 3.11, the SETUP_FINALLY and POP_BLOCK are eliminated, replaced with
-a table to determine where to jump to when an exception is raised.
-
- 1 0 RESUME 0
-
- 2 2 NOP
-
- 3 4 LOAD_GLOBAL 1 (g + NULL)
- 16 LOAD_CONST 1 (0)
- 18 PRECALL 1
- 22 CALL 1
- 32 POP_TOP
- 34 LOAD_CONST 0 (None)
- 36 RETURN_VALUE
- >> 38 PUSH_EXC_INFO
-
- 4 40 POP_TOP
-
- 5 42 POP_EXCEPT
- 44 LOAD_CONST 2 ('fail')
- 46 RETURN_VALUE
- >> 48 COPY 3
- 50 POP_EXCEPT
- 52 RERAISE 1
-ExceptionTable:
- 4 to 32 -> 38 [0]
- 38 to 40 -> 48 [1] lasti
-
-(Note this code is from 3.11, later versions may have slightly different bytecode.)
-
-If an instruction raises an exception then its offset is used to find the target to jump to.
-For example, the CALL at offset 22, falls into the range 4 to 32.
-So, if g() raises an exception, then control jumps to offset 38.
-
-
-Unwinding
----------
-
-When an exception is raised, the current instruction offset is used to find following:
-target to jump to, stack depth, and 'lasti', which determines whether the instruction
-offset of the raising instruction should be pushed.
-
-This information is stored in the exception table, described below.
-
-If there is no relevant entry, the exception bubbles up to the caller.
-
-If there is an entry, then:
- 1. pop values from the stack until it matches the stack depth for the handler.
- 2. if 'lasti' is true, then push the offset that the exception was raised at.
- 3. push the exception to the stack.
- 4. jump to the target offset and resume execution.
-
-
-Format of the exception table
------------------------------
-
-Conceptually, the exception table consists of a sequence of 5-tuples:
- 1. start-offset (inclusive)
- 2. end-offset (exclusive)
- 3. target
- 4. stack-depth
- 5. push-lasti (boolean)
-
-All offsets and lengths are in instructions, not bytes.
-
-We want the format to be compact, but quickly searchable.
-For it to be compact, it needs to have variable sized entries so that we can store common (small) offsets compactly, but handle large offsets if needed.
-For it to be searchable quickly, we need to support binary search giving us log(n) performance in all cases.
-Binary search typically assumes fixed size entries, but that is not necessary, as long as we can identify the start of an entry.
-
-It is worth noting that the size (end-start) is always smaller than the end, so we encode the entries as:
- start, size, target, depth, push-lasti
-
-Also, sizes are limited to 2**30 as the code length cannot exceed 2**31 and each instruction takes 2 bytes.
-It also happens that depth is generally quite small.
-
-So, we need to encode:
- start (up to 30 bits)
- size (up to 30 bits)
- target (up to 30 bits)
- depth (up to ~8 bits)
- lasti (1 bit)
-
-We need a marker for the start of the entry, so the first byte of entry will have the most significant bit set.
-Since the most significant bit is reserved for marking the start of an entry, we have 7 bits per byte to encode offsets.
-Encoding uses a standard varint encoding, but with only 7 bits instead of the usual 8.
-The 8 bits of a bit are (msb left) SXdddddd where S is the start bit. X is the extend bit meaning that the next byte is required to extend the offset.
-
-In addition, we will combine depth and lasti into a single value, ((depth<<1)+lasti), before encoding.
-
-For example, the exception entry:
- start: 20
- end: 28
- target: 100
- depth: 3
- lasti: False
-
-is encoded first by converting to the more compact four value form:
- start: 20
- size: 8
- target: 100
- depth<<1+lasti: 6
-
-which is then encoded as:
- 148 (MSB + 20 for start)
- 8 (size)
- 65 (Extend bit + 1)
- 36 (Remainder of target, 100 == (1<<6)+36)
- 6
-
-for a total of five bytes.
-
-
-
-Script to parse the exception table
------------------------------------
-
-def parse_varint(iterator):
- b = next(iterator)
- val = b & 63
- while b&64:
- val <<= 6
- b = next(iterator)
- val |= b&63
- return val
-
-def parse_exception_table(code):
- iterator = iter(code.co_exceptiontable)
- try:
- while True:
- start = parse_varint(iterator)*2
- length = parse_varint(iterator)*2
- end = start + length - 2 # Present as inclusive, not exclusive
- target = parse_varint(iterator)*2
- dl = parse_varint(iterator)
- depth = dl >> 1
- lasti = bool(dl&1)
- yield start, end, target, depth, lasti
- except StopIteration:
- return