path: root/generic/tclUtf.c
Commit message  Author  Age  Files  Lines
* Make sure that "string is space \u202f" will continue to return "1", even if in future Unicode this character (NARROW_NO_BREAK_SPACE) will cease to be a space. See: [http://www.unicode.org/review/pri249/]. Don't hardcode "tclWinError.o" for Cygwin  jan.nijtmans  2013-07-29  1  -1/+1
  FossilOrigin-Name: a72287aa7ddee6335514b3d64cd4eea243fd995d
| * Make sure that "string is space \u202f" will continue to return "1", even if in future Unicode this character (NARROW_NO_BREAK_SPACE) will cease to be a space. See: [http://www.unicode.org/review/pri249/]  jan.nijtmans  2013-07-29  1  -1/+1
    FossilOrigin-Name: 334ab96e5ee173a0863d8d1e10d8f229ddf83511
* | Unbreak MSVC6 debug build (thanks Andreas Kupries!)  jan.nijtmans  2013-07-08  1  -1/+1
    FossilOrigin-Name: d369017148f266851caca148d77e505603a58fd1
| * Unbreak MSVC6 debug build (thanks Andreas Kupries!)  jan.nijtmans  2013-07-08  1  -1/+1
    FossilOrigin-Name: 728fb2f25bda6df03d2d972f6fce7cd9720aa7ed
* | Use more portable TclIsSpaceProc() instead of isspace().  jan.nijtmans  2013-06-17  1  -1/+1
    FossilOrigin-Name: 4bfe3111b15faf3aef860431fd88ab5ab58df398
| * Use more portable TclIsSpaceProc() instead of isspace().  jan.nijtmans  2013-06-17  1  -1/+3
    Make sure that "string is space \u180e" continues to return 1 for whatever Unicode version.
    FossilOrigin-Name: 571205495811b7c072968bd0605370dbe55603db
* | [3613609]: Replace strcasecmp() with UTF-8-aware version.  dkf  2013-05-22  1  -0/+40
    FossilOrigin-Name: 89f027f118c0cae741036295827ac6105d9dc781
| * Fixed the weird edge case.  dkf  2013-05-22  1  -12/+25
    FossilOrigin-Name: 93dd8bb33b1ac77a85c94deac4ef874105ed1334
| * Slight improvement: if cs = "\xC0\x80" and ct = "\x00", loop would continue after NUL-byte, this should not happen.  jan.nijtmans  2013-05-21  1  -4/+4
    FossilOrigin-Name: a765f37f7893e23f18bdc72a8e982cb2b5086eb9
| * Proposed solution for 3613609: lsort -nocase does not sort non-ASCII correctly  jan.nijtmans  2013-05-21  1  -0/+27
    FossilOrigin-Name: 66c30c43690c5f6d5cc5776422f654515dde88b7
* | For Unicode 6.3, mongolian vowel separator (U+180e) is nominated to change character class from Space to Control character. Make sure that "string is space" will continue to return 1 for this character. See TIP #413.  jan.nijtmans  2013-02-25  1  -2/+3
    FossilOrigin-Name: b553432c31ffbef4ce83729e2cef8abecf3a3260
* | merge trunk  jan.nijtmans  2012-10-09  1  -2/+1
    Don't include U+0082 and U+0082 in the Tcl space set
    FossilOrigin-Name: 227a4f0b70aca0c87863a8a814c4042e5baedfcd
* | | tip 318 update  jan.nijtmans  2012-09-23  1  -0/+4
      FossilOrigin-Name: f09c1bc37736a61802bae0583d8e97c1376ca3fd
* | [Frq 3473670]: Various Unicode-related  jan.nijtmans  2012-01-22  1  -10/+7
    FossilOrigin-Name: d772d08f8ace74d5cf188e91f7a3dd02bf631d16
| * [Frq 3473670]: Various Unicode-related speedups/robustness  jan.nijtmans  2012-01-22  1  -10/+7
    FossilOrigin-Name: 2ccfd0f771a94a5f054deab617aaf33c7fe1dec3
| | * rfe-3473670: Various Unicode-related speedups/robustness  jan.nijtmans  2012-01-14  1  -10/+7
      FossilOrigin-Name: 92168a99c12bd4eca8cb5d3142f6fa1e9898d5cc
* | | [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2012-01-09  1  -35/+23
      FossilOrigin-Name: e9a619e9dc3cc5d617ea1d0682e9a763ece64435
| * | [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2012-01-09  1  -35/+23
      FossilOrigin-Name: 14fc5c19b75591a508cf3d708a1d9450716b60ad
| | * [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2012-01-09  1  -69/+56
      FossilOrigin-Name: a0c0feafe98ee7b31470ec34d9a61af224552cd7
* | | [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2011-12-24  1  -1/+1
      FossilOrigin-Name: 0c1ac83954446a04679d583e0e4914cf2d33c3a1
| * | [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2011-12-24  1  -1/+1
      FossilOrigin-Name: 005fc77cde8bf71bb89a9498c56e9b1bb9d1dc85
| | * [Bug 3464428] string is graph \u0120 is wrong  jan.nijtmans  2011-12-23  1  -1/+1
      FossilOrigin-Name: 13071df962ef95f5f101d0fffbb9850c3c97e9e8
* | | More isspace() callers.  dgp  2011-04-28  1  -1/+1
      FossilOrigin-Name: 41acfe91eae4a425fa99d8d60b26e32d775a2bc6
| * | More isspace() callers.  dgp  2011-04-28  1  -1/+1
      FossilOrigin-Name: 88095bbde01c7ae249326f391cd8e8f071a9b4c6
* | | Now that we're no longer using SCM based on RCS, the RCS Keyword lines cause more harm than good. Purged them (except in zlib files).  dgp  2011-03-02  1  -2/+0
      FossilOrigin-Name: c64f310d38b977e7ae26a48bcf8bb8c50e453af7
| * | Now that we're no longer using SCM based on RCS, the RCS Keyword lines cause more harm than good. Purged them.  dgp  2011-03-02  1  -2/+0
      FossilOrigin-Name: 79367df0f0e01a96f037f893e889e7cb9b807847
| | * Now that we're no longer using SCM based on RCS, the RCS Keyword lines cause more harm than good. Purged them.  dgp  2011-03-01  1  -2/+0
      FossilOrigin-Name: 90b4acd7bdab65433169a232124967885c18d972
| | * * generic/tclUtf.c (Tcl_UniCharToUtf), tests/utf.test (utf-1.5): Corrected handling of negative Tcl_UniChar input value. Incorrect handling was producing byte sequences outside of Tcl's legal internal encoding. [Bug 1283976].  dgp  2005-09-07  1  -36/+38
      FossilOrigin-Name: 8d8a47a587f731997708d8fd41a95cfbe8bb910a
| | * Made Tcl_NumUtfChars do the right thing with \u0000 when guessing the length because of a negative 'length' parameter. [Bug 769812]  dkf  2003-10-08  1  -5/+2
      FossilOrigin-Name: 257a93c349af03b8547b9f19208bc34c7e5ceebd
| | * * generic/TclUtf.c (Tcl_UniCharNcasecmp), tests/utf.test (utf-25.*): Corrected failure to properly compare Unicode strings of different case in a case insensitive manner. [Bug 699042]  dgp  2003-03-06  1  -4/+7
      FossilOrigin-Name: 8003bbacd13be0a41c170c4ad424ba42f96f50c3
* | | * generic/tclExecute.c, generic/tclFCmd.c, generic/tclProc.c, generic/tclTimer.c, generic/tclUtf.c: fix potential uninitialized variable use and null dereference flagged by clang static analyzer.  das  2009-09-07  1  -2/+2
      * generic/tclExecute.c, generic/tclIO.c, generic/tclScan.c, generic/tclCompExpr.c: silence false positives from clang static analyzer about potential null dereference.
      FossilOrigin-Name: e93f957325ed49415f01559c8162a31df1ee28cf
* | | * generic/tclStringObj.c: Changed type of the 'allocated' field of the String struct from size_t to int since only int values are ever stored in it.  dgp  2009-02-11  1  -1/+2
      FossilOrigin-Name: 93efedde3fce70892b3790eaddb1a0ac32d737fa
* | | Get rid of pre-C89-isms (esp. CONST vs const).  dkf  2008-04-27  1  -40/+40
      FossilOrigin-Name: 2d205c22fbe5def21ccd36bc6f7b2d3831f6122d
* | Convert to using ANSI decls/definitions and using the (ANSI) assumption that NULL can be cast to any pointer type transparently.  dkf  2005-10-31  1  -163/+167
    FossilOrigin-Name: 1e0170d2bfe15a4d78436b9a806cfeca52081da4
* | * generic/tclUtf.c (Tcl_UniCharToUtf), tests/utf.test (utf-1.5): Corrected handling of negative Tcl_UniChar input value. Incorrect handling was producing byte sequences outside of Tcl's legal internal encoding. [Bug 1283976].  dgp  2005-09-07  1  -36/+38
    FossilOrigin-Name: c76f2a19660533f3e1778ae865628f8fb35dc006
* | Systematizing the formatting  dkf  2005-07-21  1  -198/+217
    FossilOrigin-Name: ac613e6b948e9387516ecc1f16a35824c93ab347
* | Merged kennykb-numerics-branch back to the head; TIPs 132 and 232  kennykb  2005-05-10  1  -1/+1
    FossilOrigin-Name: 1cc2336920c70c6b9f7825b88dec87fc223f2c4e
* | * doc/DString.3, doc/Environment.3, doc/Eval.3, doc/ExprLong.3, doc/ExprLongObj.3, doc/GetInt.3, doc/GetOpnFl.3, doc/ParseCmd.3, doc/RegExp.3, doc/SetResult.3, doc/StrMatch.3, doc/Utf.3, generic/tcl.decls, generic/tclBasic.c, generic/tclEnv.c, generic/tclGet.c, generic/tclParse.c, generic/tclParseExpr.c, generic/tclRegexp.c, generic/tclResult.c, generic/tclUtf.c, generic/tclUtil.c, unix/tclUnixChan.c: Eliminated use of identifier "string" in Tcl's public C API to avoid conflict/confusion with the std::string of C++.  dgp  2005-05-03  1  -161/+161
    * generic/tclDecls.h: `make genstubs`
    FossilOrigin-Name: 83aa957ebe8d942b417ec080d6731e06e930ba73
* | Made Tcl_NumUtfChars do the right thing with \u0000 when guessing the length because of a negative 'length' parameter. [Bug 769812]  dkf  2003-10-08  1  -5/+2
    FossilOrigin-Name: 6b243da1f01e6a318f8f0d85a35dccef11e5d718
* | * generic/TclUtf.c (Tcl_UniCharNcasecmp), tests/utf.test (utf-25.*): Corrected failure to properly compare Unicode strings of different case in a case insensitive manner. [Bug 699042]  dgp  2003-03-06  1  -4/+7
    FossilOrigin-Name: a7fde7d55a48e8454fe7a8147d509dd9c4df8329
* * generic/tclExecute.c (TclExecuteByteCode INST_STR_MATCH), generic/tclCmdMZ.c (Tcl_StringObjCmd STR_MATCH), generic/tclUtf.c (TclUniCharMatch), generic/tclInt.decls, generic/tclIntDecls.h, generic/tclStubInit.c, tests/string.test, tests/stringComp.test: add private TclUniCharMatch function that does string match on counted unicode strings. Tcl_UniCharCaseMatch has the failing that it can't handle strings or patterns with embedded NULLs. Added tests that actually try strings/pats with NULLs. TclUniCharMatch should be TIPed and made public in the next minor version rev.  hobbs  2003-02-18  1  -2/+193
  FossilOrigin-Name: 28dcdcf39e0981d8917cd869b26dbdb4c0aa8ff6
* * generic/tclUtf.c: make use of TclUtfToUniChar macro throughout the functions, and add extra optimization to Tcl_NumUtfChars for one-byte/char case.  hobbs  2002-11-12  1  -22/+31
  FossilOrigin-Name: af7f25d96af6971821edd111f83ab4a90e4e4393
* * Applied Patch 585105 to fully CONST-ify all remaining public interfaces of Tcl.  dgp  2002-08-05  1  -121/+11
  Notably, the parser no longer writes on the string it is parsing, so it is no longer necessary for Tcl_Eval() to be given a writable string. Also, the refactoring of the Tcl_*Var* routines by Miguel Sofer is included, so that the "part1" argument for them no longer needs to be writable either.
  Compatibility support has been enhanced so that a #define of USE_NON_CONST will remove all possible source incompatibilities with the 8.3 version of the header file(s). The new #define of USE_COMPAT_CONST now does what USE_NON_CONST used to do -- disable only those new CONST's that introduce irreconcilable incompatibilities.
  Several bugs are also fixed by this patch. [Bugs 584051,580433] [Patches 585105,582429]
  Files touched: doc/CmdCmplt.3, doc/Concat.3, doc/CrtCommand.3, doc/CrtSlave.3, doc/CrtTrace.3, doc/Eval.3, doc/ExprLong.3, doc/LinkVar.3, doc/ParseCmd.3, doc/SetVar.3, doc/TraceVar.3, doc/UpVar.3, generic/tcl.decls, generic/tcl.h, generic/tclBasic.c, generic/tclCmdMZ.c, generic/tclCompCmds.c, generic/tclCompExpr.c, generic/tclCompile.c, generic/tclCompile.h, generic/tclDecls.h, generic/tclEnv.c, generic/tclEvent.c, generic/tclInt.decls, generic/tclInt.h, generic/tclIntDecls.h, generic/tclInterp.c, generic/tclLink.c, generic/tclObj.c, generic/tclParse.c, generic/tclParseExpr.c, generic/tclProc.c, generic/tclTest.c, generic/tclUtf.c, generic/tclUtil.c, generic/tclVar.c, mac/tclMacTest.c, tests/expr-old.test, tests/parseExpr.test, unix/tclUnixTest.c, unix/tclXtTest.c, win/tclWinTest.c
  FossilOrigin-Name: e476c22fecaa0dd7fea635d29d8ea1d5579365a1
* Global symbols are now all either prefixed with 'tcl' (or 'Tcl' or ...) or have file-scope.  dkf  2002-07-19  1  -3/+3
  FossilOrigin-Name: 86e27ff753182370088914b09b67faefe53a8d37
* * unix/configure: regen'ed  hobbs  2002-05-30  1  -4/+4
  * unix/configure.in: replaced bigendian check with autoconf standard AC_C_BIG_ENDIAN, which defined WORDS_BIGENDIAN on bigendian systems.
  * generic/tclUtf.c (Tcl_UniCharNcmp), generic/tclInt.h (TclUniCharNcmp): use WORDS_BIGENDIAN instead of TCL_OPTIMIZE_UNICODE_COMPARE to enable memcmp alternative.
  FossilOrigin-Name: 5a5c16e5a7f51a49329ffd01213fa96bbf77eddd
* Made Tcl_UniCharNcmp faster on big-endian machines; the system memcmp() is probably optimized far in excess of anything we could do! Little-endian just use the old code...  dkf  2002-05-29  1  -4/+11
  FossilOrigin-Name: b3535ea3919b47c72ac881fe7096124210b2d529
* * generic/tclInt.decls, generic/tclIntDecls.h, generic/tclStubInit.c, generic/tclUtf.c: added TclpUtfNcmp2 private command that mirrors Tcl_UtfNcmp, but takes n in bytes, not utf-8 chars. This provides a faster alternative for comparing utf strings internally. (Tcl_UniCharNcmp, Tcl_UniCharNcasecmp): removed the explicit end of string check as it wasn't correct for the function (by doc and logic).  hobbs  2002-05-29  1  -12/+55
  * generic/tclCmdMZ.c (Tcl_StringObjCmd): reworked the string equal comparison code to use TclpUtfNcmp2 as well as short-circuit for equal objects or unequal length strings in the equal case. Removed the use of goto and streamlined the other parts.
  * generic/tclExecute.c (TclExecuteByteCode): added check for object equality in the comparison instructions. Added short-circuit for != length strings in INST_EQ, INST_NEQ and INST_STR_CMP. Reworked INST_STR_CMP to use TclpUtfNcmp2 where appropriate, and only use Tcl_UniCharNcmp when at least one of the objects is a Unicode obj with no utf bytes.
  FossilOrigin-Name: c78da914bed148014464cf2ed45f9ffb37c6b626
* * Partial TIP 27 rollback. Following routines restored to return (char *): Tcl_DStringAppend, Tcl_DStringAppendElement, Tcl_JoinPath, Tcl_TranslateFileName, Tcl_ExternalToUtfDString, Tcl_UtfToExternalDString, Tcl_UniCharToUtfDString, Tcl_GetCwd, Tcl_WinTCharToUtf. Also restored Tcl_WinUtfToTChar to return (TCHAR *) and Tcl_UtfToUniCharDString to return (Tcl_UniChar *). Modified some callers. This change recognizes that Tcl_DStrings are de-facto white-box objects.  dgp  2002-02-08  1  -3/+3
  * generic/tclCmdMZ.c: corrected use of C++-style comment.
  FossilOrigin-Name: bb1a244cde9f05a5477cf5dd8e8ab44cd978459f
* * Sought out and eliminated instances of CONST-casting that are no longer needed after the TIP 27 effort.  dgp  2002-01-26  1  -2/+2
  FossilOrigin-Name: 4bca1d26dbe0eea4e2c7807477efc846faa7ca75
* * Updated APIs in generic/tclUtf.c and generic/tclRegexp.c according to the guidelines of TIP 27. Updated callers.  dgp  2002-01-17  1  -14/+14
  FossilOrigin-Name: 17ade1570084cb2d14c947ac65b1832f709d3bb6
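
Note on the 2002-05-29 entry: Tcl_UniCharNcmp was switched to memcmp() on big-endian machines only. The following is a minimal sketch of why byte order matters, assuming 16-bit code units. The helpers cmp_units and cmp_as_bytes and the MAX_UNITS limit are hypothetical names for illustration and are not taken from tclUtf.c. Comparing the raw bytes agrees with comparing the units themselves only when the high byte is stored first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAX_UNITS = 16 };  /* illustrative buffer limit, not a Tcl constant */

/* Reduce any comparison result to -1, 0 or 1. */
static int sign_of(int v) { return (v > 0) - (v < 0); }

/* Reference ordering: compare 16-bit code units one at a time. */
static int cmp_units(const uint16_t *a, const uint16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return (a[i] < b[i]) ? -1 : 1;
        }
    }
    return 0;
}

/* Lay each unit out as two bytes in the requested order, then memcmp(). */
static int cmp_as_bytes(const uint16_t *a, const uint16_t *b, size_t n,
                        int big_endian)
{
    unsigned char ba[2 * MAX_UNITS], bb[2 * MAX_UNITS];
    size_t hi = big_endian ? 0 : 1;  /* offset of the high byte in a pair */
    size_t lo = 1 - hi;              /* offset of the low byte in a pair */

    assert(n <= MAX_UNITS);
    for (size_t i = 0; i < n; i++) {
        ba[2 * i + hi] = (unsigned char)(a[i] >> 8);
        ba[2 * i + lo] = (unsigned char)(a[i] & 0xFF);
        bb[2 * i + hi] = (unsigned char)(b[i] >> 8);
        bb[2 * i + lo] = (unsigned char)(b[i] & 0xFF);
    }
    return sign_of(memcmp(ba, bb, 2 * n));
}
```

For a = {0x0100} and b = {0x00FF}, cmp_units and the big-endian layout both order a after b, while the little-endian layout orders a before b because the low bytes 0x00 and 0xFF are compared first. That mismatch is consistent with the log entry keeping the old element-wise loop on little-endian machines.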