| Commit message | Author | Age | Files | Lines |
| |
its documentation.
* Documented that the compiled re methods are supposed to be more fully
featured than their simplified function counterparts.
* Documented the existing start and stop position arguments for the
findall() and finditer() methods of compiled regular expression objects.
* Added an optional flags argument to the re.findall() and re.finditer()
functions. This aligns their API with that for re.search() and
re.match().
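
A minimal sketch of the two call styles described above, assuming the modern
`re` module (the successor to `sre`); the sample text and variable names are
made up for illustration:

```python
import re

text = "Apple apple APPLE banana"

# Module-level functions accept an optional flags argument.
found = re.findall(r"apple", text, re.IGNORECASE)
spans = [m.span() for m in re.finditer(r"apple", text, flags=re.IGNORECASE)]

# Compiled pattern methods additionally accept start and stop positions.
pattern = re.compile(r"apple", re.IGNORECASE)
tail = pattern.findall(text, 6)                               # start at index 6
window = [m.group() for m in pattern.finditer(text, 6, 11)]   # only text[6:11]

print(found, spans, tail, window)
```

The position arguments exist only on the compiled methods, which is the sense
in which those methods are the more fully featured interface.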
| |
This patch includes test cases and documentation updates, as well as NEWS file
updates.
This patch also updates the sre modules so that they don't import the string
module, breaking direct circular imports.
| |
a string or unicode object in sre.compile() when a different type
pattern with the same value exists.
| |
Use isinstance() instead of comparing types directly, to enable
subclasses of str and unicode to be used as patterns.
Blessed by /F.
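
A small illustration of what the isinstance() check permits, assuming the
modern `re` module; the `Pattern` subclass is hypothetical and exists only for
this sketch:

```python
import re


class Pattern(str):
    """Hypothetical str subclass used only for this illustration."""


# A subclass instance is accepted wherever a plain str pattern would be.
compiled = re.compile(Pattern(r"\d+"))
result = compiled.findall("a1 b22 c333")
print(result)
```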
| |
SF bug 585882. Will forward-port.
| |
like findall, but returns an iterator (which returns match objects)
instead of a list of strings/tuples.
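
To show the difference described above, here is a short sketch using the
modern `re` module (an assumption; the commit refers to `sre`), with made-up
sample input:

```python
import re

text = "x=1, y=22"

# findall returns a list of strings (or tuples when there are groups).
pairs = re.findall(r"(\w)=(\d+)", text)

# finditer yields match objects, so positions and groups stay available.
matches = re.finditer(r"(\w)=(\d+)", text)
positions = [(m.group(1), m.start(2)) for m in matches]

print(pairs, positions)
```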
| |
strings, not C strings)
removed USE_PYTHON defines, and related sre.py helpers
skip calling the subx helper if the template is callable.
interestingly enough, this means that
    def callback(m):
        return literal
    result = pattern.sub(callback, string)
is much faster than
    result = pattern.sub(literal, string)
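
The two forms above can be compared with a runnable sketch. It assumes the
modern `re` module and makes no claim about relative speed in current
releases; it only shows that both calls produce the same result for a plain
literal, and that a callable's return value is inserted without template
processing:

```python
import re

pattern = re.compile(r"\s+")
literal = "_"
string = "a  b \t c"


def callback(m):
    # The return value is inserted as-is; no template parsing takes place.
    return literal


via_callback = pattern.sub(callback, string)
via_literal = pattern.sub(literal, string)

# With a backslash escape the two forms differ: the string template is
# parsed (\n becomes a newline), the callable's result is used verbatim.
template_result = pattern.sub(r"\n", "a b")
verbatim_result = pattern.sub(lambda m: r"\n", "a b")

print(via_callback, via_literal, repr(template_result), repr(verbatim_result))
```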
| |
check to the test suite.
added a few missing exception checks in the _sre module
| |
removed (conceptually flawed) getliteral helper; the new sub/subn code
uses a faster code path for literal replacement strings, but doesn't
(yet) look for literal patterns.
added STATE_OFFSET macro, and use it to convert state.start/ptr to
char indexes
| |
compile should raise error for non-strings
SRE bug #432570, 448951:
reset group after failed match
also bumped version number to 2.2.0
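
As a hedged illustration of "compile should raise error for non-strings",
this is how the modern `re` module (assumed here in place of `sre`) behaves
when handed an integer; the exact error message is not shown because it may
vary between versions:

```python
import re

error = None
try:
    re.compile(42)  # not a str, bytes, or compiled pattern
except TypeError as exc:
    error = exc

print(type(error).__name__)
```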
| |
\g<x> group reference followed by a character escape
(also restructured a few things on the way to fixing #449000)
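
A brief sketch of a \g<x> group reference followed by a character escape in a
replacement template, assuming the modern `re` module; the inputs are made up:

```python
import re

# \g<1> followed by the \n escape: group 1, then a newline.
with_escape = re.sub(r"(\w+)", r"\g<1>\n", "abc")

# \g<1> also keeps a following digit from being read as part of the
# group number (r"\10" would mean group 10).
with_digit = re.sub(r"(a)", r"\g<1>0", "a")

print(repr(with_escape), repr(with_digit))
```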
| |
#462270: sub-tle difference between pre.sub and sre.sub. PRE ignored
an empty match at the previous location, SRE didn't.
also synced with Secret Labs "sreopen" codebase.
| |
bug #449000, "re.sub(r'\n', ...) broke". This was Fredrik's
suggestion -- he's on vacation and said he wouldn't be able to work on
this until next week.
| |
re.findall doesn't take a maxsplit argument
| |
- take locale into account for word boundary anchors (#410271)
- restored 2.0's *? behaviour (#233283, #408936 and others)
- speed up re.sub/re.subn
| |
- removed __all__ cruft from internal modules (sorry, skip)
- don't assume ASCII for string escapes (sorry, per)
|
| |
also modified check_all function to suppress all warnings since they aren't
relevant to what this test is doing (allows quiet checking of regsub, for
instance)
| |
compatibility patches.
| |
-- added some more docstrings
-- fixed typo in scanner class (#125531)
-- the multiline flag (?m) shouldn't affect the \Z operator (#127259)
-- fixed non-greedy backtracking bug (#123769, #127259)
-- added sre.DEBUG flag (currently dumps the parsed pattern structure)
-- fixed a couple of glitches in groupdict (the #126587 memory leak
had already been fixed by AMK)
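
The \Z item in the list above can be checked with a short sketch, assuming the
modern `re` module: under the multiline flag `$` matches at every line end,
while `\Z` still matches only at the very end of the string.

```python
import re

text = "ab\ncd"

# $ is affected by MULTILINE: it matches before each newline and at the end.
dollar = re.findall(r"\w+$", text, re.MULTILINE)

# \Z is not affected: it matches only at the end of the whole string.
end_only = re.findall(r"\w+\Z", text, re.MULTILINE)

print(dollar, end_only)
```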
| |
- added experimental "expand" method to match objects
- don't use the buffer interface on unicode strings
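
For reference, a minimal sketch of the "expand" method as it exists on match
objects in the modern `re` module (assumed here; the commit describes an
experimental version). The names in the pattern are illustrative only:

```python
import re

m = re.match(r"(?P<first>\w+) (?P<last>\w+)", "Ada Lovelace")

# expand() fills a sub()-style template from this match's groups.
swapped = m.expand(r"\g<last>, \1")
print(swapped)
```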
| |
(this should fix Sjoerd's xmllib problem)
-- added skip field to INFO header
-- changed compiler to generate charset INFO header
-- changed trace messages to support post-mortem analysis
| |
-- added REPEAT_ONE operator
-- added ANY_ALL operator (used to represent "(?s).")
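
The user-visible behaviour behind "(?s)." can be sketched as follows, assuming
the modern `re` module; the internal operator names are not exercised here,
only the pattern semantics:

```python
import re

# Without (?s), "." does not match a newline.
plain = re.match(r".", "\n")

# With (?s) (or re.DOTALL), "." matches any character, newline included.
dotall = re.match(r"(?s).", "\n")
spanning = re.findall(r"a.b", "a\nb", re.DOTALL)

print(plain, dotall, spanning)
```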
| |
-- reverted REPEAT operator to use "repeat context" strategy
(from 0.8.X), but done right this time.
-- got rid of backtracking stack; use nested SRE_MATCH calls
instead (should probably put it back again in 0.9.9 ;-)
-- properly reset state in scanner mode
-- don't use aggressive inlining by default
| |
+ added "regs" attribute
+ fixed "pos" and "endpos" attributes
+ reset "lastindex" and "lastgroup" in scanner methods
+ removed (?P#id) syntax; the "lastindex" and "lastgroup"
attributes are now always set
+ removed string module dependencies in sre_parse
+ better debugging support in sre_parse
+ various tweaks to build under 1.5.2
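
The match-object attributes named in the list above can be inspected with a
short sketch, assuming the modern `re` module; the pattern and input are made
up for illustration:

```python
import re

pattern = re.compile(r"(?P<word>[a-z]+) (\d+)")
text = "  abc 123"

# Search a window of the string: from index 2 up to index 9.
m = pattern.search(text, 2, 9)

regs = m.regs            # spans of the whole match and of each group
pos, endpos = m.pos, m.endpos
lastindex = m.lastindex  # index of the last matched capturing group
lastgroup = m.lastgroup  # its name, or None if that group is unnamed

print(regs, pos, endpos, lastindex, lastgroup)
```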
| |
used by the code generator)
- changed max repeat value in engine (to match earlier array fix)
- added experimental "which part matched?" mechanism to sre; see
http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
or python-dev for details.
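
One way to read "which part matched?" is through the `lastgroup` attribute of
match objects in the modern `re` module. The sketch below assumes that
reading; the token names and the pattern are hypothetical:

```python
import re

# Each alternative is a named group; lastgroup reports which one matched.
token = re.compile(r"(?P<number>\d+)|(?P<name>[A-Za-z_]\w*)|(?P<op>[-+*/=])")

kinds = [(m.lastgroup, m.group()) for m in token.finditer("x = 42 + y1")]
print(kinds)
```

This avoids testing each group in turn after every match, which is the usual
motivation for a simple tokenizer built on alternation.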
| |
-- changed 1.6 to 2.0 in the file headers
-- fixed ISALNUM macro for the unicode locale. this
solution isn't perfect, but the best I can do with
Python's current unicode database.
| |
-- added pickling support (only works if sre is imported)
-- fixed wordsize problems in engine
(instead of casting literals down to the character size,
cast characters up to the literal size (same as the code
word size). this prevents false hits when you're matching
a unicode pattern against an 8-bit string. unfortunately,
this broke another test, but I think the test should be
changed in this case; more on that on python-dev)
-- added sre.purge function
(unofficial, clears the cache)
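
A hedged sketch of pickling a compiled pattern and clearing the cache, using
the modern `re` module, where `purge` is a documented function (an assumption
relative to the "unofficial" status noted above):

```python
import pickle
import re

pattern = re.compile(r"(\d+)-(\d+)", re.IGNORECASE)

# Round-trip the compiled pattern through pickle.
restored = pickle.loads(pickle.dumps(pattern))
groups = restored.match("10-20").groups()

# Clear the module's internal cache of compiled patterns.
re.purge()

print(restored.pattern, groups)
```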
| |
- added test suite
| |
- fixed compiler problems when using locale/unicode flags
- fixed group/octal code parsing in sub/subn templates
| |
(those semantics are weird...)
- got rid of $Id$'s (for the moment, at least). in other
words, there should be no more "empty" checkins.
- internal: some minor cleanups.
| |
(test_sre still complains about split, but that's caused by
the group reset bug, not split itself)
- added more mark slots
(should be dynamically allocated, but 100 is better than 32.
and checking for the upper limit is better than overwriting
the memory ;-)
- internal: renamed the cursor helper class
- internal: removed some bloat from sre_compile
|
| |
search() functions didn't even work because _fixflags() isn't
idempotent. I'm adding another stop-gap measure so that you can at
least use sre.search() and sre.match() with a zero flags arg.
|
NOTE: THIS IS VERY ROUGH ALPHA CODE!