path: root/Modules/_sre.c
Commit message | Author | Age | Files | Lines
* (experimental) "finditer" method/function. this works pretty much like findall, but returns an iterator (which returns match objects) instead of a list of strings/tuples.
  Fredrik Lundh | 2001-10-24 | 1 file | -0/+28
* another major speedup: let sre.sub/subn check for escapes in the template string, and don't call the template compiler if we can avoid it.
  Fredrik Lundh | 2001-10-22 | 1 file | -30/+89
* sre.split should return the last segment, even if empty (sorry, barry)
  Fredrik Lundh | 2001-10-22 | 1 file | -11/+10
* fixed character set description in docstring (SRE uses Python strings, not C strings)
  removed USE_PYTHON defines, and related sre.py helpers
  skip calling the subx helper if the template is callable. interestingly enough, this means that
      def callback(m): return literal
      result = pattern.sub(callback, string)
  is much faster than
      result = pattern.sub(literal, string)
  Fredrik Lundh | 2001-10-21 | 1 file | -96/+41
* sre.Scanner fixes (from Greg Chapman). also added a Scanner sanity check to the test suite.
  added a few missing exception checks in the _sre module
  Fredrik Lundh | 2001-10-21 | 1 file | -0/+17
* rewrote the pattern.sub and pattern.subn methods in C
  removed (conceptually flawed) getliteral helper; the new sub/subn code uses a faster code path for literal replacement strings, but doesn't (yet) look for literal patterns.
  added STATE_OFFSET macro, and use it to convert state.start/ptr to char indexes
  Fredrik Lundh | 2001-10-21 | 1 file | -113/+306
* rewrote the pattern.split method in C
  also restored SRE Unicode support for 1.6/2.0/2.1
  Fredrik Lundh | 2001-10-20 | 1 file | -12/+136
* SRE bug #441409: compile should raise error for non-strings
  SRE bug #432570, 448951: reset group after failed match
  also bumped version number to 2.2.0
  Fredrik Lundh | 2001-10-18 | 1 file | -1/+3
* fixed #449964: sre.sub raises an exception if the template contains a \g<x> group reference followed by a character escape
  (also restructured a few things on the way to fixing #449000)
  Fredrik Lundh | 2001-09-18 | 1 file | -12/+16
* an SRE bugfix a day keeps Guido away...
  #462270: sub-tle difference between pre.sub and sre.sub. PRE ignored an empty match at the previous location, SRE didn't.
  also synced with Secret Labs "sreopen" codebase.
  Fredrik Lundh | 2001-09-18 | 1 file | -9/+14
* Removed unreachable return to silence SGI compiler.
  Sjoerd Mullender | 2001-08-30 | 1 file | -2/+1
* Patch #445762: Support --disable-unicode
  - Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled
  - check for Py_USING_UNICODE in all places that use Unicode functions
  - disables unicode literals, and the builtin functions
  - add the types.StringTypes list
  - remove Unicode literals from most tests.
  Martin v. Löwis | 2001-08-17 | 1 file | -1/+1
* init_sre(): Plug a little leak reported by Insure.
  Barry Warsaw | 2001-08-16 | 1 file | -2/+5
* map re.sub() to string.replace(), when possible
  Fredrik Lundh | 2001-07-08 | 1 file | -0/+23
* bug #416670
  added copy/deepcopy support to SRE (still not enabled, since it's not covered by the test suite)
  Fredrik Lundh | 2001-07-03 | 1 file | -16/+87
* reapplied darryl gallion's minimizing repeat fix. I'm still not 100% sure about this one, but test #133283 now works even with the fix in place, and so does the test suite. we'll see what comes up...
  Fredrik Lundh | 2001-07-02 | 1 file | -1/+1
* pythonware repository roundtrip (untabification)
  Fredrik Lundh | 2001-07-02 | 1 file | -12/+13
* added martin's BIGCHARSET patch to SRE 2.1.1. martin reports 2x speedups for certain unicode character ranges.
  Fredrik Lundh | 2001-07-02 | 1 file | -0/+13
* merged with pythonware's SRE 2.1.1 codebase
  Fredrik Lundh | 2001-07-02 | 1 file | -2/+92
* SRE: made "copyright" string static, to avoid potential linking conflicts.
  Fredrik Lundh | 2001-04-15 | 1 file | -1/+8
* sre 2.1b2 update:
  - take locale into account for word boundary anchors (#410271)
  - restored 2.0's *? behaviour (#233283, #408936 and others)
  - speed up re.sub/re.subn
  Fredrik Lundh | 2001-03-22 | 1 file | -16/+58
* SF patch 404928: Support for next Cygwin gcc (2.95.2-8)
  Tim Peters | 2001-02-28 | 1 file | -4/+1
* bumped SRE version number to 2.1. cleaned up and added 1.5.2 compatibility patches.
  Fredrik Lundh | 2001-01-16 | 1 file | -25/+41
* fixed a memory leak in pattern cleanup (patch #103248 by cgw)
  Fredrik Lundh | 2001-01-16 | 1 file | -2/+6
* added "magic" number to the _sre module, to avoid weird errors caused by compiler/engine mismatches
  Fredrik Lundh | 2001-01-15 | 1 file | -1/+9
* -- don't use recursion for unbounded non-greedy repeat (bugs #115903, #115696)
  This is based on a patch by Darrel Gallion. I'm not 100% sure about this fix, but I haven't managed to come up with any test case it cannot handle...
  Fredrik Lundh | 2001-01-14 | 1 file | -2/+13
* SRE fixes for 2.1 alpha:
  -- added some more docstrings
  -- fixed typo in scanner class (#125531)
  -- the multiline flag (?m) shouldn't affect the \Z operator (#127259)
  -- fixed non-greedy backtracking bug (#123769, #127259)
  -- added sre.DEBUG flag (currently dumps the parsed pattern structure)
  -- fixed a couple of glitches in groupdict (the #126587 memory leak had already been fixed by AMK)
  Fredrik Lundh | 2001-01-14 | 1 file | -23/+33
* Fix bug 126587: matchobject.groupdict() leaks memory because of a missing DECREF
  Andrew M. Kuchling | 2000-12-22 | 1 file | -0/+1
* -- properly reset groups in findall (bug #117612)
  -- fixed negative lookbehind to work correctly at the beginning of the target string (bug #117242)
  -- improved syntax check; you can no longer refer to a group inside itself (bug #110866)
  Fredrik Lundh | 2000-10-28 | 1 file | -15/+18
* Accept keyword arguments for (most) pattern and match object methods. Closes buglet #115845.
  Fredrik Lundh | 2000-10-03 | 1 file | -31/+45
* Fixed negative lookahead/lookbehind. Closes bug #115618.
  Fredrik Lundh | 2000-10-03 | 1 file | -4/+1
* Rationalize use of limits.h, moving the inclusion to Python.h. Add definitions of INT_MAX and LONG_MAX to pyport.h. Remove includes of limits.h and conditional definitions of INT_MAX and LONG_MAX elsewhere.
  This closes SourceForge patch #101659 and bug #115323.
  Fred Drake | 2000-09-26 | 1 file | -6/+0
* - fixed yet another gcc -pedantic warning
  - added experimental "expand" method to match objects
  - don't use the buffer interface on unicode strings
  Fredrik Lundh | 2000-09-21 | 1 file | -16/+47
* return -1 for undefined groups (as implemented in 1.5.2) instead of None (as documented) from start/end/span. closes bug #113254
  Fredrik Lundh | 2000-09-02 | 1 file | -16/+4
* oops. accidentally reintroduced a memory leak. put the bugfix back.
  Fredrik Lundh | 2000-08-27 | 1 file | -3/+4
* don't mistake memory errors (including reaching the recursion limit) with success. also, check return values from the mark functions.
  this addresses (but doesn't really solve) bug #112693, and low-memory problems reported by jack jansen.
  Fredrik Lundh | 2000-08-27 | 1 file | -18/+24
* pattern_findall(): Plug small memory leak discovered by Insure. PyList_Append() always incref's the inserted item. Be sure to decref it regardless of whether the append succeeds or fails.
  Barry Warsaw | 2000-08-18 | 1 file | -3/+3
* The sre test suite currently overruns the stack on Win64, Linux64, and Monterey (64-bit AIX). This is because the RECURSION_LIMIT is too low. This patch lowers the recursion limit to 7500 such that the recursion check fires before a segfault.
  Fredrik suggested/approved the fix in private email, modulo sre's recursion limit checking not being necessary when PyOS_CheckStack is implemented for Windows.
  Trent Mick | 2000-08-16 | 1 file | -2/+11
* -- changed findall to return empty strings instead of None for undefined groups
  Fredrik Lundh | 2000-08-09 | 1 file | -11/+11
* Added a missing } in the USE_STACKCHECK code.
  Jack Jansen | 2000-08-07 | 1 file | -0/+1
* -- reset marks if repeat_one tail doesn't match (this should fix Sjoerd's xmllib problem)
  -- added skip field to INFO header
  -- changed compiler to generate charset INFO header
  -- changed trace messages to support post-mortem analysis
  Fredrik Lundh | 2000-08-07 | 1 file | -93/+128
* + if USE_STACKCHECK is defined, use PyOS_CheckStack to look for excessive recursion.
  Fredrik Lundh | 2000-08-07 | 1 file | -0/+7
* -- added recursion limit (currently ~10,000 levels)
  -- improved error messages
  -- factored out SRE_COUNT; the same code is used by SRE_OP_REPEAT_ONE_TEMPLATE
  -- minor cleanups
  Fredrik Lundh | 2000-08-03 | 1 file | -156/+172
* final 0.9.8 updates:
  -- added REPEAT_ONE operator
  -- added ANY_ALL operator (used to represent "(?s).")
  Fredrik Lundh | 2000-08-01 | 1 file | -15/+46
* -- fixed width calculations for alternations
  -- fixed literal check in branch operator (this broke test_tokenize, as reported by Mark Favas)
  -- added REPEAT_ONE operator (still not enabled, though)
  -- added some debugging stuff (maxlevel)
  Fredrik Lundh | 2000-08-01 | 1 file | -28/+189
* SRE 0.9.8: passes the entire test suite
  -- reverted REPEAT operator to use "repeat context" strategy (from 0.8.X), but done right this time.
  -- got rid of backtracking stack; use nested SRE_MATCH calls instead (should probably put it back again in 0.9.9 ;-)
  -- properly reset state in scanner mode
  -- don't use aggressive inlining by default
  Fredrik Lundh | 2000-08-01 | 1 file | -448/+295
* -- SRE 0.9.6 sync. this includes:
  + added "regs" attribute
  + fixed "pos" and "endpos" attributes
  + reset "lastindex" and "lastgroup" in scanner methods
  + removed (?P#id) syntax; the "lastindex" and "lastgroup" attributes are now always set
  + removed string module dependencies in sre_parse
  + better debugging support in sre_parse
  + various tweaks to build under 1.5.2
  Fredrik Lundh | 2000-07-23 | 1 file | -1096/+1189
* Bunch of minor ANSIfications: 'void initfunc()' -> 'void initfunc(void)', and a couple of functions that were missed in the previous batches. Not terribly tested, but very carefully scrutinized, three times.
  All these were found by the little findkrc.py that I posted to python-dev, which means there might be more lurking. Cases such as this:
      long
      func(a, b)
          long a;
          long b; /* flagword */
      {
  and other cases where the last ; in the argument list isn't followed by a newline and an opening curly bracket. Regexps to catch all are welcome, of course ;)
  Thomas Wouters | 2000-07-21 | 1 file | -1/+1
* replace PyXXX_Length calls with PyXXX_Size calls
  Jeremy Hylton | 2000-07-12 | 1 file | -1/+1
* maintenance release:
  - reorganized some code to get rid of -Wall and -W4 warnings
  - fixed default argument handling for sub/subn/split methods (reported by Peter Schneider-Kamp).
  Fredrik Lundh | 2000-07-05 | 1 file | -32/+31