path: root/Lib/email/FeedParser.py
Commit message (author, date, files changed, lines removed/added)
* Fixes for SF #1076485, which I'll apply to the CVS head too. The problem was caused by a self._input.readline() call that wasn't checking for the NeedMoreData marker. msg_43.txt contains a message that illustrates the problem when email.message_from_*() is called. That interface uses the Parser API, which splits reads into 8192-byte chunks. It so happens that for the test message, an 8192-byte chunk boundary falls inside a message/delivery-status part, which is where the FeedParser made the readline() call that didn't check for NeedMoreData. I also added an assert to unreadline() so it'll be more evident if an attempt to push back NeedMoreData ever happens again. Bump the email package version number.
  (Barry Warsaw, 2004-12-05, 1 file, -2/+13 lines)
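The chunk-boundary problem described above can be probed with a small sketch. It assumes the Python 3 module spelling `email.feedparser` (this log refers to the older `FeedParser.py`), and the sample message and the `parse_in_chunks` helper are made up for illustration; they are not msg_43.txt or part of the package.

```python
from email import message_from_string
from email.feedparser import FeedParser

# Hypothetical delivery report; not the msg_43.txt test message.
RAW = (
    "From: a@example.com\n"
    "Subject: chunked\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"\n'
    "\n"
    "--BOUND\n"
    "Content-Type: text/plain\n"
    "\n"
    "Delivery failed.\n"
    "--BOUND\n"
    "Content-Type: message/delivery-status\n"
    "\n"
    "Reporting-MTA: dns; mail.example.com\n"
    "\n"
    "Final-Recipient: rfc822; b@example.com\n"
    "Action: failed\n"
    "\n"
    "--BOUND--\n"
)


def parse_in_chunks(text, chunk_size):
    """Feed text to a FeedParser in fixed-size pieces (illustrative helper)."""
    parser = FeedParser()
    for start in range(0, len(text), chunk_size):
        parser.feed(text[start:start + chunk_size])
    return parser.close()


# Parse once in a single pass, then again with several chunk sizes so that
# chunk edges land in different places, including inside the status part.
whole = message_from_string(RAW)
chunked = [parse_in_chunks(RAW, size) for size in (1, 7, 64, 8192)]
```

If the parser handles every chunk edge correctly, each chunked parse should serialize to the same text as the single-pass parse.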
* RFC 2822 describes the characters allowed in a header field name. Conform to this, and add test cases.
  (Barry Warsaw, 2004-11-29, 1 file, -1/+3 lines)
* Fix for SF bug #1072623. When the last line of the input string does not end in a newline, and it's an end boundary, the FeedParser wasn't recognizing it as such. Tweak the regexp to make the ending linesep optional. For grins, clear self._partial when closing the BufferedSubFile. Added a test case.
  (Barry Warsaw, 2004-11-28, 1 file, -1/+2 lines)
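A minimal sketch of the case this fix targets, written against the current Python 3 `email` package: the input ends with an end boundary that has no trailing newline. The boundary string and message text are invented for illustration.

```python
from email import message_from_string

# The final line is the end boundary, deliberately without a newline.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ--"
)

msg = message_from_string(RAW)
parts = msg.get_payload()  # list of sub-messages when the parse succeeds
```

With the end boundary recognized, the message should come back as a one-part multipart with no defects recorded.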
* Fix SF bug #1030941. In _parsegen(), in the clause where we're capturing_preamble but we found a StartBoundaryNotFoundDefect, we need to consume all lines from the current position to the EOF, which we'll set as the epilogue of the current message. If we're not at EOF when we return from here, the outer message's capturing_preamble assertion will fail.
  (Barry Warsaw, 2004-10-09, 1 file, -3/+7 lines)
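The situation this entry deals with can be reproduced with a short sketch: a message that declares a multipart boundary but never uses it. This uses the Python 3 `email` package and a made-up message; exactly how the leftover lines are stored may differ between versions, so the sketch only inspects the recorded defects and the payload type.

```python
from email import message_from_string
from email.errors import StartBoundaryNotFoundDefect

# Declares boundary "XYZ", but no "--XYZ" line ever appears in the body.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "This body never starts a part.\n"
    "Still no boundary here.\n"
)

msg = message_from_string(RAW)
defect_types = [type(defect) for defect in msg.defects]
payload = msg.get_payload()
```

The parser reads to the end of the input rather than stopping early, and the missing start boundary shows up in `msg.defects` instead of raising an exception.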
* Big email 3.0 API changes, with updated unit tests and documentation. Briefly (from the NEWS file):
  - Updates for the email package:
    + All deprecated APIs that in email 2.x issued warnings have been removed: _encoder argument to the MIMEText constructor, Message.add_payload(), Utils.dump_address_pair(), Utils.decode(), Utils.encode()
    + New deprecations: Generator.__call__(), Message.get_type(), Message.get_main_type(), Message.get_subtype(), the 'strict' argument to the Parser constructor. These will be removed in email 3.1.
    + Support for Python earlier than 2.3 has been removed (see PEP 291).
    + All defect classes have been renamed to end in 'Defect'.
    + Some FeedParser fixes; also a MultipartInvariantViolationDefect will be added to messages that claim to be multipart but really aren't.
    + Updates to documentation.
  (Barry Warsaw, 2004-10-03, 1 file, -8/+14 lines)
* Resolution of SF bug #1002475 and patch #1003693; Header lines that end in \r\n only get the \n stripped, not the \r (unless it's the last header which does get the \r stripped). Patch by Tony Meyer.
  test_whitespace_continuation_last_header(), test_strip_line_feed_and_carriage_return_in_headers(): New tests.
  _parse_headers(): Be sure to strip \r\n from the right side of header lines.
  (Barry Warsaw, 2004-08-07, 1 file, -2/+3 lines)
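A quick way to see the intended behavior, as a sketch against the Python 3 `email` package with an invented message: parse headers terminated by \r\n and check that no carriage return survives in any header value.

```python
from email import message_from_string

# Every line ends in CRLF, including the non-final header.
RAW = "Subject: hello\r\nTo: someone@example.com\r\n\r\nbody\r\n"

msg = message_from_string(RAW)
headers = dict(msg.items())  # header name -> parsed value
```

Both the first and the last header should come back without a trailing \r or \n.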
* _parsegen(): Add a missing check for NeedMoreData.
  (Barry Warsaw, 2004-05-15, 1 file, -0/+3 lines)
* readline(): RFC 2046, section 5.1.2 (and partially 5.1) both state that the parser must recognize outer boundaries in inner parts. So cruise through the EOF stack backwards testing each predicate against the current line. There's still some discussion about whether this is (always) the best thing to do. Anthony would rather parse these messages as if the outer boundaries were ignored. I think that's counter to the RFC, but might be practically more useful. Can you say behavior flag? (ug).
  (Barry Warsaw, 2004-05-13, 1 file, -3/+5 lines)
* Tests for message/external-body and for duplicate boundary lines.
  (Barry Warsaw, 2004-05-11, 1 file, -3/+12 lines)
* _parsegen(): Move the message/rfc822 clause to after the message/delivery-status clause, and genericize it to handle all (other) message/* content types. This lets us correctly parse 2 more of Anthony's MIME torture tests (specifically, the message/external-body examples).
  (Barry Warsaw, 2004-05-11, 1 file, -12/+13 lines)
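For readers unfamiliar with how message/* parts come out of the parser, here is a small sketch using the simplest such type, message/rfc822, rather than the message/external-body cases from the torture tests. It uses the Python 3 `email` package and a made-up message.

```python
from email import message_from_string

# The body of the outer message is itself a complete message.
RAW = (
    "Subject: outer\n"
    "MIME-Version: 1.0\n"
    "Content-Type: message/rfc822\n"
    "\n"
    "Subject: inner\n"
    "\n"
    "inner body\n"
)

outer = message_from_string(RAW)
inner = outer.get_payload(0)  # the single nested message
```

The outer payload is a one-element list holding the nested message, so the inner headers and body are reachable as ordinary Message attributes.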
* _parsegen(): Watch out for empty epilogues.
  (Barry Warsaw, 2004-05-11, 1 file, -4/+5 lines)
* _parse_headers(): Strip a trailing newline from the envelope header. Closes SF #951088.
  (Barry Warsaw, 2004-05-10, 1 file, -0/+4 lines)
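The envelope header is the mbox-style "From " line that may precede the real headers. A brief sketch with the Python 3 `email` package and an invented message shows what the stripped value looks like:

```python
from email import message_from_string

# A Unix-From envelope line followed by one header and a body.
RAW = (
    "From sender@example.com Sat May  8 12:00:00 2004\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)

msg = message_from_string(RAW)
envelope = msg.get_unixfrom()  # envelope line as stored by the parser
```

The stored envelope line should carry no trailing newline, and the regular headers should parse as usual.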
* An updated FeedParser that should be RFC compliant, passes all existing (standard) tests, and doesn't throw parse errors. I still need to throw Anthony's torture test at it, but I wanted to get this checked in and off my disk.
  (Barry Warsaw, 2004-05-09, 1 file, -289/+359 lines)
* New parser. Next up, making the current parser use this parser
  (Anthony Baxter, 2004-03-22, 1 file, -0/+362 lines)