path: root/Modules/_sre.c
Commit message | Author | Age | Files | Lines
* map re.sub() to string.replace(), when possible | Fredrik Lundh | 2001-07-08 | 1 | -0/+23
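
  A minimal sketch of the idea behind this commit, written against today's `re` module rather than the 2001 engine. The helper name `literal_sub` is hypothetical, and the literal checks are an assumption about what "when possible" means: the pattern has no metacharacters and the replacement has no backslash escapes.

```python
import re


def literal_sub(pattern: str, repl: str, text: str) -> str:
    """Hypothetical helper: fall back to str.replace() for plain literals."""
    # re.escape() leaves a string unchanged only if it has no characters it escapes.
    pattern_is_literal = re.escape(pattern) == pattern
    # A backslash in the template could be a group reference or an escape.
    repl_is_literal = "\\" not in repl
    if pattern and pattern_is_literal and repl_is_literal:
        return text.replace(pattern, repl)
    return re.sub(pattern, repl, text)


fast = literal_sub("cat", "dog", "cat, concat, cats")  # takes the str.replace() path
slow = literal_sub(r"c.t", "dog", "cat, cot, cut")     # takes the re.sub() path
```

  Both paths are expected to give the same result as calling `re.sub()` directly; only the cost differs.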
* bug #416670 | Fredrik Lundh | 2001-07-03 | 1 | -16/+87
  added copy/deepcopy support to SRE (still not enabled, since it's not covered by the test suite)
* reapplied darryl gallion's minimizing repeat fix. I'm still not 100% | Fredrik Lundh | 2001-07-02 | 1 | -1/+1
  sure about this one, but test #133283 now works even with the fix in place, and so does the test suite. we'll see what comes up...
* pythonware repository roundtrip (untabification) | Fredrik Lundh | 2001-07-02 | 1 | -12/+13
* added martin's BIGCHARSET patch to SRE 2.1.1. martin reports 2x | Fredrik Lundh | 2001-07-02 | 1 | -0/+13
  speedups for certain unicode character ranges.
* merged with pythonware's SRE 2.1.1 codebase | Fredrik Lundh | 2001-07-02 | 1 | -2/+92
* SRE: made "copyright" string static, to avoid potential linking | Fredrik Lundh | 2001-04-15 | 1 | -1/+8
  conflicts.
* sre 2.1b2 update: | Fredrik Lundh | 2001-03-22 | 1 | -16/+58
  - take locale into account for word boundary anchors (#410271)
  - restored 2.0's *? behaviour (#233283, #408936 and others)
  - speed up re.sub/re.subn
* SF patch 404928: Support for next Cygwin gcc (2.95.2-8) | Tim Peters | 2001-02-28 | 1 | -4/+1
* bumped SRE version number to 2.1. cleaned up and added 1.5.2 | Fredrik Lundh | 2001-01-16 | 1 | -25/+41
  compatibility patches.
* fixed a memory leak in pattern cleanup (patch #103248 by cgw) | Fredrik Lundh | 2001-01-16 | 1 | -2/+6
* added "magic" number to the _sre module, to avoid weird errors caused | Fredrik Lundh | 2001-01-15 | 1 | -1/+9
  by compiler/engine mismatches
* -- don't use recursion for unbounded non-greedy repeat | Fredrik Lundh | 2001-01-14 | 1 | -2/+13
  (bugs #115903, #115696)
  This is based on a patch by Darrel Gallion. I'm not 100% sure about this fix, but I haven't managed to come up with any test case it cannot handle...
* SRE fixes for 2.1 alpha: | Fredrik Lundh | 2001-01-14 | 1 | -23/+33
  -- added some more docstrings
  -- fixed typo in scanner class (#125531)
  -- the multiline flag (?m) shouldn't affect the \Z operator (#127259)
  -- fixed non-greedy backtracking bug (#123769, #127259)
  -- added sre.DEBUG flag (currently dumps the parsed pattern structure)
  -- fixed a couple of glitches in groupdict (the #126587 memory leak had already been fixed by AMK)
* Fix bug 126587: matchobject.groupdict() leaks memory because of a missing | Andrew M. Kuchling | 2000-12-22 | 1 | -0/+1
  DECREF
* -- properly reset groups in findall (bug #117612) | Fredrik Lundh | 2000-10-28 | 1 | -15/+18
  -- fixed negative lookbehind to work correctly at the beginning of the target string (bug #117242)
  -- improved syntax check; you can no longer refer to a group inside itself (bug #110866)
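
  Two of the behaviours named in this commit can still be observed in the current `re` module. The sketch below assumes a modern Python 3 interpreter, not SRE as it stood in 2000; the variable names are illustrative only.

```python
import re

# Negative lookbehind succeeds at position 0, where nothing precedes the match.
at_start = re.search(r"(?<!a)b", "b")

# A group may not refer to itself while it is still open; the compiler rejects it.
try:
    re.compile(r"(a\1)")
    self_reference_rejected = False
except re.error:
    self_reference_rejected = True
```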
* Accept keyword arguments for (most) pattern and match object | Fredrik Lundh | 2000-10-03 | 1 | -31/+45
  methods. Closes buglet #115845.
* Fixed negative lookahead/lookbehind. Closes bug #115618. | Fredrik Lundh | 2000-10-03 | 1 | -4/+1
* Rationalize use of limits.h, moving the inclusion to Python.h. | Fred Drake | 2000-09-26 | 1 | -6/+0
  Add definitions of INT_MAX and LONG_MAX to pyport.h.
  Remove includes of limits.h and conditional definitions of INT_MAX and LONG_MAX elsewhere.
  This closes SourceForge patch #101659 and bug #115323.
* - fixed yet another gcc -pedantic warning | Fredrik Lundh | 2000-09-21 | 1 | -16/+47
  - added experimental "expand" method to match objects
  - don't use the buffer interface on unicode strings
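
  For orientation, this is how the "expand" method behaves in the current `re` module, where it is no longer experimental. The pattern and group names below are illustrative assumptions, not taken from the commit.

```python
import re

m = re.match(r"(?P<first>\w+) (?P<second>\w+)", "hello world")

# expand() fills a template the same way re.sub() fills its replacement string.
swapped = m.expand(r"\2 \1")               # numeric group references
named = m.expand(r"\g<second>, \g<first>")  # named group references
```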
* return -1 for undefined groups (as implemented in 1.5.2) instead of | Fredrik Lundh | 2000-09-02 | 1 | -16/+4
  None (as documented) from start/end/span. closes bug #113254
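
  The behaviour settled here is still what the `re` module does today. A short sketch, assuming a modern Python 3 interpreter:

```python
import re

# Group 2 exists in the pattern but takes no part in this match.
m = re.match(r"(a)|(b)", "a")

undefined_value = m.group(2)  # the group's text is None
undefined_start = m.start(2)  # but its positions are reported as -1
undefined_end = m.end(2)
undefined_span = m.span(2)
```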
* oops. accidentally reintroduced a memory leak. put the bugfix back. | Fredrik Lundh | 2000-08-27 | 1 | -3/+4
* don't mistake memory errors (including reaching the recursion limit) | Fredrik Lundh | 2000-08-27 | 1 | -18/+24
  with success. also, check return values from the mark functions. this addresses (but doesn't really solve) bug #112693, and low-memory problems reported by jack jansen.
* pattern_findall(): Plug small memory leak discovered by Insure. | Barry Warsaw | 2000-08-18 | 1 | -3/+3
  PyList_Append() always incref's the inserted item. Be sure to decref it regardless of whether the append succeeds or fails.
* The sre test suite currently overruns the stack on Win64, Linux64, and Monterey | Trent Mick | 2000-08-16 | 1 | -2/+11
  (64-bit AIX). This is because the RECURSION_LIMIT is too high. This patch lowers the recursion limit to 7500 such that the recursion check fires before a segfault. Fredrik suggested/approved the fix in private email, modulo sre's recursion limit checking not being necessary when PyOS_CheckStack is implemented for Windows.
* -- changed findall to return empty strings instead of None | Fredrik Lundh | 2000-08-09 | 1 | -11/+11
  for undefined groups
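
  The same convention holds in the current `re` module. A short sketch, assuming a modern Python 3 interpreter:

```python
import re

# Each match fills only one of the two groups; the other is undefined.
pairs = re.findall(r"(a)|(b)", "ab")

# findall() reports undefined groups as "" rather than None,
# so every tuple holds only strings.
all_strings = all(isinstance(item, str) for pair in pairs for item in pair)
```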
* Added a missing } in the USE_STACKCHECK code. | Jack Jansen | 2000-08-07 | 1 | -0/+1
* -- reset marks if repeat_one tail doesn't match | Fredrik Lundh | 2000-08-07 | 1 | -93/+128
  (this should fix Sjoerd's xmllib problem)
  -- added skip field to INFO header
  -- changed compiler to generate charset INFO header
  -- changed trace messages to support post-mortem analysis
* + if USE_STACKCHECK is defined, use PyOS_CheckStack to look | Fredrik Lundh | 2000-08-07 | 1 | -0/+7
  for excessive recursion.
* -- added recursion limit (currently ~10,000 levels) | Fredrik Lundh | 2000-08-03 | 1 | -156/+172
  -- improved error messages
  -- factored out SRE_COUNT; the same code is used by SRE_OP_REPEAT_ONE_TEMPLATE
  -- minor cleanups
* final 0.9.8 updates: | Fredrik Lundh | 2000-08-01 | 1 | -15/+46
  -- added REPEAT_ONE operator
  -- added ANY_ALL operator (used to represent "(?s).")
* -- fixed width calculations for alternations | Fredrik Lundh | 2000-08-01 | 1 | -28/+189
  -- fixed literal check in branch operator (this broke test_tokenize, as reported by Mark Favas)
  -- added REPEAT_ONE operator (still not enabled, though)
  -- added some debugging stuff (maxlevel)
* SRE 0.9.8: passes the entire test suite | Fredrik Lundh | 2000-08-01 | 1 | -448/+295
  -- reverted REPEAT operator to use "repeat context" strategy (from 0.8.X), but done right this time.
  -- got rid of backtracking stack; use nested SRE_MATCH calls instead (should probably put it back again in 0.9.9 ;-)
  -- properly reset state in scanner mode
  -- don't use aggressive inlining by default
* -- SRE 0.9.6 sync. this includes: | Fredrik Lundh | 2000-07-23 | 1 | -1096/+1189
  + added "regs" attribute
  + fixed "pos" and "endpos" attributes
  + reset "lastindex" and "lastgroup" in scanner methods
  + removed (?P#id) syntax; the "lastindex" and "lastgroup" attributes are now always set
  + removed string module dependencies in sre_parse
  + better debugging support in sre_parse
  + various tweaks to build under 1.5.2
* Bunch of minor ANSIfications: 'void initfunc()' -> 'void initfunc(void)', | Thomas Wouters | 2000-07-21 | 1 | -1/+1
  and a couple of functions that were missed in the previous batches. Not terribly tested, but very carefully scrutinized, three times.
  All these were found by the little findkrc.py that I posted to python-dev, which means there might be more lurking. Cases such as this:
      long
      func(a, b)
              long a;
              long b; /* flagword */
      {
  and other cases where the last ; in the argument list isn't followed by a newline and an opening curly bracket. Regexps to catch all are welcome, of course ;)
* replace PyXXX_Length calls with PyXXX_Size calls | Jeremy Hylton | 2000-07-12 | 1 | -1/+1
* maintenance release: | Fredrik Lundh | 2000-07-05 | 1 | -32/+31
  - reorganized some code to get rid of -Wall and -W4 warnings
  - fixed default argument handling for sub/subn/split methods (reported by Peter Schneider-Kamp).
* - fixed grouping error bug | Fredrik Lundh | 2000-07-03 | 1 | -16/+33
  - changed "group" operator to "groupref"
* - added lookbehind support (?<=pattern), (?<!pattern). | Fredrik Lundh | 2000-07-03 | 1 | -55/+77
  the pattern must have a fixed width.
  - got rid of array-module dependencies; the match program is now stored inside the pattern object, rather than in an extra string buffer.
  - cleaned up various potential leaks, api abuses, and other minors in the engine module.
  - use mal's new isalnum macro, rather than my own workaround.
  - untabified test_sre.py. seems like I removed a couple of trailing spaces in the process...
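
  The fixed-width rule for lookbehind still applies in the current `re` module. A short sketch, assuming a modern CPython 3 interpreter; the sample strings are illustrative.

```python
import re

# Positive lookbehind: match digits only when preceded by the literal "USD ".
positive = re.search(r"(?<=USD )\d+", "price: USD 42")

# Negative lookbehind: skip numbers that directly follow a minus sign.
unsigned = re.findall(r"(?<!-)\b\d+", "3 -4 5")

# A variable-width lookbehind pattern is rejected at compile time.
try:
    re.compile(r"(?<=a+)b")
    variable_width_rejected = False
except re.error:
    variable_width_rejected = True
```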
* - experimental: added two new attributes to the match object: | Fredrik Lundh | 2000-07-02 | 1 | -12/+28
  "lastgroup" is the name of the last matched capturing group, "lastindex" is the index of the same group. if no group was matched, both attributes are set to None.
  the (?P#) feature will be removed in the next release.
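
  Both attributes survive in the current `re` module. The sketch below assumes a modern Python 3 interpreter; the tokenizer-style pattern and its group names are illustrative, not taken from the commit.

```python
import re

token = re.compile(r"(?P<num>\d+)|(?P<word>[a-z]+)")

word_match = token.match("hello")  # only the second group takes part
num_match = token.match("42")      # only the first group takes part

# With no capturing groups at all, both attributes are None.
no_groups = re.match(r"[a-z]+", "hello")
```

  Checking `lastgroup` is a compact way to learn which alternative of a pattern matched.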
* - actually enabled charset anchors in the engine (still not | Fredrik Lundh | 2000-07-02 | 1 | -4/+26
  used by the code generator)
  - changed max repeat value in engine (to match earlier array fix)
  - added experimental "which part matched?" mechanism to sre; see http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954 or python-dev for details.
* -- use charset bitmaps where appropriate. this gives a 5-10% | Fredrik Lundh | 2000-07-02 | 1 | -18/+45
  speedup for some tests, including the python tokenizer.
  -- added support for an optional charset anchor to the engine (currently unused by the code generator).
  -- removed workaround for array module bug.
* - fixed "{ in any other context" bug | Fredrik Lundh | 2000-07-01 | 1 | -1/+4
  - minor comment touchups in the C module
* today's SRE update: | Fredrik Lundh | 2000-07-01 | 1 | -4/+11
  -- changed 1.6 to 2.0 in the file headers
  -- fixed ISALNUM macro for the unicode locale. this solution isn't perfect, but the best I can do with Python's current unicode database.
* -- changed $ to match before a trailing newline, even | Fredrik Lundh | 2000-06-30 | 1 | -1/+3
  if the multiline flag isn't given.
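
  This is still how `$` behaves in the current `re` module. A short sketch, assuming a modern Python 3 interpreter, with `\Z` shown for contrast:

```python
import re

text = "abc\n"

# Without re.MULTILINE, $ still matches just before a single trailing newline.
dollar = re.search(r"abc$", text)

# \Z matches only at the very end of the string, so the newline blocks it.
strict = re.search(r"abc\Z", text)
```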
* the mad patcher strikes again: | Fredrik Lundh | 2000-06-30 | 1 | -25/+24
  -- added pickling support (only works if sre is imported)
  -- fixed wordsize problems in engine: instead of casting literals down to the character size, cast characters up to the literal size (same as the code word size). this prevents false hits when you're matching a unicode pattern against an 8-bit string. (unfortunately, this broke another test, but I think the test should be changed in this case; more on that on python-dev)
  -- added sre.purge function (unofficial, clears the cache)
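
  Pickling and cache purging are both available through the current `re` module, where `purge` is a documented function. A short sketch, assuming a modern Python 3 interpreter; the pattern is illustrative.

```python
import pickle
import re

original = re.compile(r"(\d+)-(\d+)", re.IGNORECASE)

# A compiled pattern is pickled as its source string and flags,
# then recompiled when it is loaded.
restored = pickle.loads(pickle.dumps(original))
groups = restored.match("10-20").groups()

# Clear the module-level cache of compiled patterns.
re.purge()
```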
* - fixed lookahead assertions (#10, #11, #12) | Fredrik Lundh | 2000-06-30 | 1 | -7/+18
  - untabified sre_constants.py
* - fixed default value handling in group/groupdict | Fredrik Lundh | 2000-06-30 | 1 | -18/+23
  - added test suite
* - fixed split behaviour on empty matches | Fredrik Lundh | 2000-06-30 | 1 | -3/+3
  - fixed compiler problems when using locale/unicode flags
  - fixed group/octal code parsing in sub/subn templates
* still trying to figure out how to fix the remaining | Fredrik Lundh | 2000-06-29 | 1 | -12/+78
  group reset problem. in the meantime, I added some optimizations:
  - added "inline" directive to LOCAL (this assumes that AC_C_INLINE does what it's supposed to do). to compile SRE on a non-unix platform that doesn't support inline, you have to add a "#define inline" somewhere...
  - added code to generate a SRE_OP_INFO primitive
  - added code to do fast prefix search (enabled by the USE_FAST_SEARCH define; default is on, in this release)