gh-124452: Fix header mismatches when folding/unfolding with email message (#125919)

The header folder of the new email API has a long-standing known buglet: if the first token is longer than max_line_length, it puts that token on the next line.

It turns out there is also a *parsing* bug when parsing such a header: the space prefixing that first non-empty line gets preserved and tacked on to the start of the header value, which is not the expected behavior per the RFCs. The bug arises because the parser assumed that there would be at least one token on the line with the header name, which is probably true for every email producer other than the Python email library with its folding buglet. Clearly, though, this is a case that needs to be handled correctly.

The fix is simple: strip the blanks off the start of the whole value, not just the first physical line of the value.

Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
author: RanKKI <hliu86.me@gmail.com> 2024-11-16 23:01:52 (GMT)
committer: GitHub <noreply@github.com> 2024-11-16 23:01:52 (GMT)
commit: ed81971e6b26c34445f06850192b34458b029337 (patch)
tree: 1089c83be8c5ae5bcd12f9f4c087789e76ce2e31 /Lib/email
parent: 2313f8421080ceb3343c6f5d291279adea85e073 (diff)
download: cpython-ed81971e6b26c34445f06850192b34458b029337.zip
cpython-ed81971e6b26c34445f06850192b34458b029337.tar.gz
cpython-ed81971e6b26c34445f06850192b34458b029337.tar.bz2
2 files changed, 4 insertions, 4 deletions
diff --git a/Lib/email/_policybase.py b/Lib/email/_policybase.py
index c7694a4..4b63b97 100644
--- a/Lib/email/_policybase.py
+++ b/Lib/email/_policybase.py
@@ -302,12 +302,12 @@ class Compat32(Policy):
         """+
         The name is parsed as everything up to the ':' and returned unmodified.
         The value is determined by stripping leading whitespace off the
-        remainder of the first line, joining all subsequent lines together, and
+        remainder of the first line joined with all subsequent lines, and
         stripping any trailing carriage return or linefeed characters.
 
         """
         name, value = sourcelines[0].split(':', 1)
-        value = value.lstrip(' \t') + ''.join(sourcelines[1:])
+        value = ''.join((value, *sourcelines[1:])).lstrip(' \t\r\n')
         return (name, value.rstrip('\r\n'))
 
     def header_store_parse(self, name, value):
diff --git a/Lib/email/policy.py b/Lib/email/policy.py
index 46b7de5..6e109b6 100644
--- a/Lib/email/policy.py
+++ b/Lib/email/policy.py
@@ -119,13 +119,13 @@ class EmailPolicy(Policy):
         """+
         The name is parsed as everything up to the ':' and returned unmodified.
         The value is determined by stripping leading whitespace off the
-        remainder of the first line, joining all subsequent lines together, and
+        remainder of the first line joined with all subsequent lines, and
         stripping any trailing carriage return or linefeed characters.  (This
         is the same as Compat32).
 
         """
         name, value = sourcelines[0].split(':', 1)
-        value = value.lstrip(' \t') + ''.join(sourcelines[1:])
+        value = ''.join((value, *sourcelines[1:])).lstrip(' \t\r\n')
         return (name, value.rstrip('\r\n'))
 
     def header_store_parse(self, name, value):
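
The behavioral difference between the removed and added lines in the two hunks above can be checked in isolation. The following is a minimal sketch, not the library code itself: `parse_old` and `parse_new` are hypothetical standalone copies of the method body before and after the change, and the sample `Subject` header is an invented input in which the first token was folded onto the continuation line.

```python
def parse_old(sourcelines):
    # Pre-fix logic: only the remainder of the first physical line is
    # left-stripped, so whitespace on a continuation line survives.
    name, value = sourcelines[0].split(':', 1)
    value = value.lstrip(' \t') + ''.join(sourcelines[1:])
    return (name, value.rstrip('\r\n'))


def parse_new(sourcelines):
    # Post-fix logic: join all physical lines first, then strip leading
    # blanks and line endings from the whole value.
    name, value = sourcelines[0].split(':', 1)
    value = ''.join((value, *sourcelines[1:])).lstrip(' \t\r\n')
    return (name, value.rstrip('\r\n'))


# Hypothetical input: nothing follows the colon on the first line, and the
# (over-long) first token sits on the continuation line.
folded_first_token = ["Subject:\n", " averyveryverylongtoken\n"]

# Ordinary folded header with a token on the first line.
ordinary = ["Subject: hello\n", " world\n"]

print(parse_old(folded_first_token))  # ('Subject', '\n averyveryverylongtoken')
print(parse_new(folded_first_token))  # ('Subject', 'averyveryverylongtoken')
print(parse_old(ordinary) == parse_new(ordinary))  # True
```

For a header with at least one token on the first line, both versions return the same value, so the change only affects the first-token-on-next-line case described in the commit message.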