closes bpo-34056: Always return bytes from _HackedGetData.get_data(). (GH-8130)

* Always return bytes from _HackedGetData.get_data().

Ensure the imp.load_source shim always returns bytes by reopening the file in binary mode if needed. Hash-based pycs have to receive the source code in bytes.

It's tempting to change imp.get_suffixes() to always return 'rb' as a mode, but that breaks some stdlib tests and likely 3rdparty code, too.

(cherry picked from commit b0274f2cddd36b49fe5080efbe160277ef546471)

Co-authored-by: Benjamin Peterson <benjamin@python.org>
author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com> 2018-07-07 04:00:45 (GMT)
committer: GitHub <noreply@github.com> 2018-07-07 04:00:45 (GMT)
commit: 7bd6f0e5500f778e940374237b94651f60ae1990 (patch)
tree: a3200c2fce1d51d88a51eb44bd09fc9522df83cf /Lib/imp.py
parent: 127bd9bfd591c8ec1a97eb7f4037c8b884eef973 (diff)
download: cpython-7bd6f0e5500f778e940374237b94651f60ae1990.zip
cpython-7bd6f0e5500f778e940374237b94651f60ae1990.tar.gz
cpython-7bd6f0e5500f778e940374237b94651f60ae1990.tar.bz2
1 files changed, 6 insertions, 7 deletions
diff --git a/Lib/imp.py b/Lib/imp.py
index 866464b..31f8c76 100644
--- a/Lib/imp.py
+++ b/Lib/imp.py
@@ -142,17 +142,16 @@ class _HackedGetData:
     def get_data(self, path):
         """Gross hack to contort loader to deal w/ load_*()'s bad API."""
         if self.file and path == self.path:
+            # The contract of get_data() requires us to return bytes. Reopen the
+            # file in binary mode if needed.
             if not self.file.closed:
                 file = self.file
-            else:
-                self.file = file = open(self.path, 'r')
+                if 'b' not in file.mode:
+                    file.close()
+            if self.file.closed:
+                self.file = file = open(self.path, 'rb')
 
             with file:
-                # Technically should be returning bytes, but
-                # SourceLoader.get_code() just passed what is returned to
-                # compile() which can handle str. And converting to bytes would
-                # require figuring out the encoding to decode to and
-                # tokenize.detect_encoding() only accepts bytes.
                 return file.read()
         else:
             return super().get_data(path)
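
The reopen-in-binary-mode logic in the patch above can be tried outside of imp.py with a minimal standalone sketch. The helper names read_source_bytes and demo are hypothetical and are not part of the commit; the sketch assumes Python 3.7+, where importlib.util.source_hash() (used for hash-based pycs) takes the source as bytes.

```python
import importlib.util
import os
import tempfile


def read_source_bytes(file, path):
    """Return the contents of `path` as bytes.

    Hypothetical helper mirroring the patched get_data() branch: an open
    text-mode handle is closed, and a closed handle is reopened in 'rb'.
    """
    if not file.closed and 'b' not in file.mode:
        # Text-mode handle: close it so it gets reopened in binary mode.
        file.close()
    if file.closed:
        file = open(path, 'rb')
    with file:
        return file.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mod.py")
        # Write in binary mode so the bytes on disk are platform-independent.
        with open(path, "wb") as f:
            f.write(b"x = 1\n")

        # Simulate a caller that hands over a text-mode file object.
        text_handle = open(path, "r")
        data = read_source_bytes(text_handle, path)

        # Hash-based pycs hash the raw source, so bytes are required here.
        digest = importlib.util.source_hash(data)
        return data, digest, text_handle.closed


if __name__ == "__main__":
    data, digest, closed = demo()
    print(type(data).__name__, len(digest), closed)
```

A handle that is already open in binary mode passes through both checks and is read directly, which matches the patched behaviour of reusing self.file when its mode contains 'b'.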