From 0711642eec58d00f31031279df799fbddaf6f2c1 Mon Sep 17 00:00:00 2001
From: Greg Price <gnprice@gmail.com>
Date: Tue, 10 Sep 2019 01:51:04 -0700
Subject: Cut tricky `goto` that isn't needed, in _PyBytes_DecodeEscape.
 (GH-15825)

This is the sort of `goto` that requires the reader to stare hard at
the code to unpick what it's doing.

On doing so, the answer is... not very much!

* It jumps from the bottom of the loop to almost the top; the effect
  is to bypass the loop condition `s < end` and also the
  `if`-condition `*s != '\\'`, acting as if both are true.

* We've just decremented `s`, after incrementing it in the `switch`
  condition.  So it has the same value it had when the `s == end`
  check came out false.  Before that check was another increment...
  and before that we had `s < end`.  So `s < end` true, then
  increment, then `s == end` false... that means `s < end` is
  still true.

* Also this means `s` points to the same character as it did for the
  `switch` condition.  And there was a `case '\\'`, which we didn't
  hit -- so `*s != '\\'` is also true.

* That means the `goto` has no effect on the behavior!  At most it is
  an optimization -- we get to skip those two checks, because (as
  just proven above) we know they're true.

* But gosh, this is the *invalid escape sequence* path.  This does not
  seem like the kind of code path that calls for extreme optimization
  tricks.
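
To make the argument above concrete, here is a minimal sketch of the
loop's shape.  It is a simplified model, not the real
`_PyBytes_DecodeEscape`: the function names `decode_escape_model` and
`same_result` are hypothetical, only `\\` and `\n` are treated as valid
escapes, and the `recode_encoding` branch is left out.  The `use_goto`
flag switches between the old jump and simply falling back to the top
of the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of the loop.  `out` must have room
   for at least `len` bytes.  Returns the number of bytes written, or
   -1 on a trailing backslash. */
static ptrdiff_t decode_escape_model(const char *s, size_t len,
                                     char *out, int use_goto)
{
    const char *end = s + len;
    char *p = out;

    while (s < end) {
        if (*s != '\\') {
          non_esc:
            *p++ = *s++;
            continue;
        }
        s++;
        if (s == end)
            return -1;              /* trailing backslash */
        switch (*s++) {
        case '\\': *p++ = '\\'; break;
        case 'n':  *p++ = '\n'; break;
        default:
            /* Invalid escape: keep the backslash, then step back so
               the byte after it is copied as an ordinary byte. */
            *p++ = '\\';
            s--;
            if (use_goto)
                goto non_esc;       /* old code: skip both checks */
            break;                  /* new code: re-run both checks */
        }
    }
    return p - out;
}

/* Runs both variants on one input (at most 64 bytes) and reports
   whether they agree on the result. */
static int same_result(const char *s, size_t len)
{
    char a[64], b[64];
    assert(len <= sizeof a);
    ptrdiff_t na = decode_escape_model(s, len, a, 1);
    ptrdiff_t nb = decode_escape_model(s, len, b, 0);
    return na == nb && (na < 0 || memcmp(a, b, (size_t)na) == 0);
}
```

Under these assumptions, `same_result` should hold for any input,
including ones with invalid escapes such as `a\qb`, since the two
re-run checks are the ones shown above to be true.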

So, take the `goto` and the label out.

Perhaps the compiler will notice the exact same facts we showed above,
and generate identical code.  Or perhaps it won't!  That'll be OK.

But then, crucially, if some future edit to this loop causes the
reasoning above to *stop* holding true... the compiler will adjust
this jump accordingly.  One of us fallible humans might not.
---
 Objects/bytesobject.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/Objects/bytesobject.c b/Objects/bytesobject.c
index 6d55330..72f9cd2 100644
--- a/Objects/bytesobject.c
+++ b/Objects/bytesobject.c
@@ -1142,7 +1142,6 @@ PyObject *_PyBytes_DecodeEscape(const char *s,
     end = s + len;
     while (s < end) {
         if (*s != '\\') {
-          non_esc:
             if (!(recode_encoding && (*s & 0x80))) {
                 *p++ = *s++;
             }
@@ -1229,8 +1228,6 @@ PyObject *_PyBytes_DecodeEscape(const char *s,
             }
             *p++ = '\\';
             s--;
-            goto non_esc; /* an arbitrary number of unescaped
-                             UTF-8 bytes may follow. */
         }
     }
 
-- 
cgit v0.12