| Commit message (Collapse) | Author | Age | Files | Lines |
|
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
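A short Python 3 illustration of the behaviour described in this message. The sample string and variable names are made up for the example; only `re.ASCII` and the default Unicode matching for str patterns come from the message itself.

```python
import re

text = "café 42"  # sample input, chosen for illustration only

# str pattern: \w follows Unicode rules by default, so "é" counts as a word character.
unicode_words = re.findall(r"\w+", text)

# re.ASCII asks for ASCII-only matching of \w, \d, \s and \b instead.
ascii_words = re.findall(r"\w+", text, re.ASCII)

# bytes pattern: matching is ASCII-only unless re.LOCALE is requested.
byte_words = re.findall(rb"\w+", text.encode("latin-1"))

print(unicode_words)
print(ascii_words)
print(byte_words)
```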
|
I've applied a modified version of Greg Chapman's patch. I've included
the fixes without introducing the reorganization mentioned, for the sake
of stability. Also, the second fix mentioned in the patch doesn't fix the
mentioned problem anymore, because of the change introduced by patch
#720991 (by Greg as well). The new fix wasn't complicated though, and is
included as well.
As a note, it seems that there are other places that require the
"protection" of LASTMARK_SAVE()/LASTMARK_RESTORE(), and are just waiting
for someone to find how to break them. In particular, I believe that every
recursion of SRE_MATCH() should be protected by these macros. I won't
do that right now since I'm not completely sure about this, and we don't
have much time for testing until the next release.
|
The problem is in sre_compile.py: the call to
_compile_charset near the end of _compile_info forgets to
pass in the flags, so that the info charset is not compiled
with re.U. (The info charset is used when searching to find
the first character at which a match could start; it is not
generated for patterns beginning with a repeat like '\w{1}'.)
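A minimal check of the symptom described above, written for Python 3, where str patterns already use Unicode matching (the equivalent of re.U). The input string is an assumption chosen for illustration, and whether a current interpreter still builds the info charset in the same way is not something this log confirms. The point is only that a pattern starting with a plain charset and one starting with a repeat should agree.

```python
import re

text = "!é"  # illustrative input: the first possible match is a non-ASCII letter

# Pattern begins with a charset, so the search may use a precompiled
# "info charset" to skip ahead to a plausible starting position.
plain = re.search(r"\w", text)

# Pattern begins with a repeat; per the message, no info charset is generated.
repeat = re.search(r"\w{1}", text)

print(plain.group(), plain.start())
print(repeat.group(), repeat.start())
```

Both searches are expected to find the same character at the same position; a disagreement would point at the prefilter rather than at the main matcher.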
|
from Greg Chapman.
* Modules/_sre.c
(lastmark_restore): New function, implementing algorithm to restore
a state to a given lastmark. In addition to the similar algorithm used
in a few places of SRE_MATCH, restore lastindex when restoring lastmark.
(SRE_MATCH): Replace inline lastmark restoring with the lastmark_restore()
function. Also include it where missing. In SRE_OP_MARK, set lastindex
only if i > lastmark.
* Lib/test/re_tests.py
* Lib/test/test_sre.py
Included regression tests for the fixed bugs.
* Misc/NEWS
Mention fixes.
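The effect of restoring marks on backtracking can be observed from Python. The pattern below is a hypothetical example, not one of the regression tests added by this change: the first alternative captures group 1 and then fails, so the group and lastindex should be reset before the second alternative is tried.

```python
import re

# First branch matches "(a)" and then fails on "x"; the matcher backtracks
# into the second branch, which has no capturing group.
m = re.match(r"(?:(a)x|a)y", "ay")

print(m.group(0))   # the overall match
print(m.group(1))   # group from the abandoned branch
print(m.lastindex)  # index of the last matched group, if any
```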
|
backed out of broken minimal repeat patch from July
also fixed a couple of minor potential resource leaks in pattern_subx
(Guido had already fixed the big one)
|
sure about this one, but test #133283 now works even with the fix in
place, and so does the test suite. we'll see what comes up...
|
- take locale into account for word boundary anchors (#410271)
- restored 2.0's *? behaviour (#233283, #408936 and others)
- speed up re.sub/re.subn
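For readers unfamiliar with the `*?` (minimal repeat) operator mentioned in the list, here is a generic sketch of its documented behaviour. The inputs are illustrative and are not taken from the referenced bug reports.

```python
import re

markup = "<a><b>"  # illustrative input

# Greedy repeat: consume as much as possible, then back off.
greedy = re.match(r"<.*>", markup).group()

# Minimal repeat: consume as little as possible, extending only when needed.
minimal = re.match(r"<.*?>", markup).group()

# With nothing after it, a minimal repeat is satisfied by zero repetitions.
empty = re.match(r"a*?", "aaa").group()

print(greedy, minimal, repr(empty))
```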
|
uppercase strings also when the IGNORECASE flag is set (bug #128899)
(also added test cases for recently fixed bugs to the regression suite
-- or in other words, check in re_tests.py too...)
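A small sketch of the expected behaviour with IGNORECASE. The exact failing input from bug #128899 is not shown in this log, so the patterns and strings here are assumptions picked to cover a literal and a character class.

```python
import re

haystack = "xxABCxx"  # illustrative uppercase input

# Lowercase literal pattern, uppercase text.
literal = re.search(r"abc", haystack, re.IGNORECASE)

# Lowercase character class, uppercase text.
charset = re.search(r"[a-c]+", haystack, re.IGNORECASE)

print(literal.group(), charset.group())
```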
|
character class. Fix provided by Andrew Kuchling. Closes bug
#116251.
|
first scan. Closes bug #115040.
|
I fixed a bug in the regression test harness...)
|
-- added basic unicode tests to test_re
-- added test case for Sjoerd's xmllib problem to re_tests
|
groups that have no value and groups that are out of bounds.
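The distinction between the two cases can be seen on a match object in current Python. The beginning of this message is missing from the log, so the sketch only shows the general difference and does not claim to reproduce the code path that was changed; the pattern is hypothetical.

```python
import re

# Group 2 exists in the pattern but does not participate in the match.
m = re.match(r"(a)(b)?", "a")

unset = m.group(2)  # defined group with no value

out_of_bounds = None
try:
    m.group(3)      # the pattern has only two groups
except IndexError as exc:
    out_of_bounds = exc

print(unset, type(out_of_bounds).__name__)
```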
|