Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
* | Remove some tip389 restrictions in test-cases, which are no longer necessary. | jan.nijtmans | 2018-05-07 | 1 | -2/+2 | |
| | | | | Eliminate gcc compiler warnings when compiling with -DTCL_UTF_MAX=6. Other code clean-up and comment improvements. No change in functionality. | |||||
* | Implement special "string totitle" for Extended Georgian characters (new ↵ | jan.nijtmans | 2018-05-01 | 1 | -3/+9 | |
| | | | | behavior in Unicode 11) | |||||
* | Merge 8.6 (bug-fix and test-case for Tcl_UtfAtIndex with TCL_UTF_MAX=4) | dgp | 2018-04-24 | 1 | -4/+21 | |
|\ | | | | | ((Replacement checkin for earlier attempt with botched timestamp)) | |||||
| * | Bug-fix in Tcl_UtfAtIndex (for TCL_UTF_MAX=4 only). With test-case (in ↵ | jan.nijtmans | 2018-04-23 | 1 | -0/+8 | |
| | | | | | | | | "string totitle") demonstrating the bug. | |||||
| | | ||||||
| \ | ||||||
*-. \ | TIP #389 implementation. | jan.nijtmans | 2018-04-20 | 1 | -91/+120 | |
|\ \ \ | | |/ | ||||||
| | * | Slightly improved (more fail-safe) surrogate handling for TCL_UTF_MAX>3. ↵ | jan.nijtmans | 2018-04-19 | 1 | -7/+14 | |
| | | | | | | | | | | | | Backported from latest TIP 389 implementation. (to be used for androwish) | |||||
| * | | Slightly better unmatched-surrogates handling. Unmatched High surrogates ↵ | jan.nijtmans | 2018-04-17 | 1 | -6/+13 | |
| | | | | | | | | | | | | will still be silently removed, but Unmatched Low surrogates will pass through as-is now. Inspired by Kevin Kenny's remarks. Thanks! | |||||
| * | | merge core-8-branch | jan.nijtmans | 2018-01-10 | 1 | -10/+14 | |
| |\ \ | |/ / |/| | | ||||||
| * | | merge core-8-branch | jan.nijtmans | 2017-12-01 | 1 | -34/+40 | |
| |\ \ | ||||||
| * \ \ | Merge core-8-branch. Also, use a different value for TCL_STUB_MAGIC when ↵ | jan.nijtmans | 2017-11-29 | 1 | -9/+23 | |
| |\ \ \ | | | | | | | | | | | | | | | | TCL_UTF_MAX>4. | |||||
| * | | | | Fix [8e1e31eac0fd6b6c4452bc108a98ab08c6b64588|8e1e31eac0]: lsort treats NUL ↵ | jan.nijtmans | 2017-11-29 | 1 | -3/+76 | |
| | | | | | | | | | | | | | | | | | | | | chars strangely | |||||
| * | | | | Treat invalid UTF-8 characters in the range 0x80-0x9F as cp1252: See ↵ | jan.nijtmans | 2017-11-29 | 1 | -2/+15 | |
| | | | | | | | | | | | | | | | | | | | | [https://en.wikipedia.org/wiki/UTF-8]. To be added to TIP #389 | |||||
| * | | | | merge core-8-branch | jan.nijtmans | 2017-11-20 | 1 | -13/+11 | |
| |\ \ \ \ | ||||||
| * | | | | | merge core-8-branch. Fix some Tcl_UniChar initialization, in case ↵ | jan.nijtmans | 2017-11-17 | 1 | -10/+24 | |
| | | | | | | | | | | | | | | | | | | | | | | | | TCL_UTF_MAX == 4 | |||||
| * | | | | | Somewhat simplified implementation of TIP #389, in which the "string length" ↵ | jan.nijtmans | 2017-11-07 | 1 | -29/+66 | |
| |/ / / / | | | | | | | | | | | | | | | of characters > U+FFFF is considered to be 2, not 1. | |||||
* | | | | | Fix ↵ | jan.nijtmans | 2018-01-10 | 1 | -10/+14 | |
|\ \ \ \ \ | |_|_|/ / |/| | | | | | | | | | | | | | [https://core.tcl.tk/tk/info/00a27923ee26437611e1ed83f96e15b6caabcd8b|00a27923ee]: (Tcl part, remaining is in Tk) text/entry dysfunctional when pasting an emoji on MacOSX. This changes the handling of incoming valid 4-byte UTF-8 sequences: Those are no longer split into 4 separate characters (as was done for invalid byte sequences) but replaced by a single 'replacement character'. | |||||
| * | | | | (partial) fix for ↵ | jan.nijtmans | 2018-01-09 | 1 | -10/+14 | |
| | |_|/ | |/| | | | | | | | | | | | | | | [https://core.tcl.tk/tk/info/00a27923ee26437611e1ed83f96e15b6caabcd8b|00a27923ee]: text/entry dysfunctional when pasting an emoji on MacOSX. Don't handle incoming valid 4-byte UTF-8 characters as invalid byte sequences (since they aren't), but as being the Unicode replacement character. | |||||
| * | | | Fix handling of surrogates (when TCL_UTF_MAX > 3) in ↵ | jan.nijtmans | 2017-12-28 | 1 | -28/+29 | |
| | | | | | | | | | | | | | | | | Tcl_UtfNcmp()/Tcl_UtfNcasecmp()/TclUtfCasecmp(). Backported from core-8-branch, where this was fixed already. | |||||
* | | | | Fix [8e1e31eac0fd6b6c4452bc108a98ab08c6b64588|8e1e31eac0]: lsort treats NUL ↵ | jan.nijtmans | 2017-11-30 | 1 | -28/+75 | |
| | | | | | | | | | | | | | | | | | | | | chars strangely. Also fix various initializations, which only make a difference when TCL_UTF_MAX == 4. Add new test-cases which demonstrate the fix. For TCL_UTF_MAX == 4, surrogates will now be handled as expected as well when sorting. | |||||
* | | | | merge core-8-6-branch | jan.nijtmans | 2017-11-29 | 1 | -7/+7 | |
|\ \ \ \ | |/ / / | | | / | |_|/ |/| | | ||||||
| * | | Fix Tcl_UtfFindFirst()/Tcl_UtfFindLast(), which were broken by [83c0c569d6]. ↵ | jan.nijtmans | 2017-11-29 | 1 | -7/+7 | |
| | | | | | | | | | | | | | | | Not detected, because those functions aren't used anywhere in Tcl. So, added new test-cases, making sure this doesn't happen again. | |||||
* | | | merge core-8-6-branch | jan.nijtmans | 2017-11-29 | 1 | -16/+58 | |
|\ \ \ | |/ / | | / | |/ |/| | ||||||
| * | Update some functions in tclUtf.c to handle surrogate pairs when TCL_UTF_MAX ↵ | jan.nijtmans | 2017-11-29 | 1 | -16/+58 | |
| | | | | | | | | | | == 4. Also update documentation to distinguish better between "Tcl_UniChar" and "Unicode character": Those are not necessarily the same when TCL_UTF_MAX == 4. No change when TCL_UTF_MAX == 3 or TCL_UTF_MAX == 6. | |||||
* | | merge core-8-6-branch | jan.nijtmans | 2017-08-18 | 1 | -25/+53 | |
|\ \ | |/ | ||||||
| * | merge core-8-6-branch | jan.nijtmans | 2017-07-03 | 1 | -1/+1 | |
| |\ | ||||||
| | * | 'inline static' -> 'static inline' and 'INLINE' -> 'inline', for consistency. | jan.nijtmans | 2017-07-03 | 1 | -2/+2 | |
| | | | ||||||
| * | | merge core-8-6-branch | jan.nijtmans | 2017-06-13 | 1 | -22/+21 | |
| |\ \ | | |/ | ||||||
| * | | Better UTF-8 surrogate handling, only functional when TCL_UTF_MAX>3 | jan.nijtmans | 2017-06-08 | 1 | -19/+49 | |
| | | | ||||||
* | | | merge core-8-6-branch | jan.nijtmans | 2017-06-08 | 1 | -13/+14 | |
|\ \ \ | | |/ | |/| | ||||||
| * | | Fix [2738427]: Tcl_NumUtfChars(...) no overflow check. | jan.nijtmans | 2017-06-08 | 1 | -13/+14 | |
| |\ \ | | |/ | |/| | ||||||
| | * | Fix [2738427]: Tcl_NumUtfChars(...) no overflow check. | jan.nijtmans | 2017-06-08 | 1 | -13/+14 | |
| | | | ||||||
* | | | merge core-8-6-branch | jan.nijtmans | 2017-06-06 | 1 | -18/+18 | |
|\ \ \ | |/ / | ||||||
| * | | Follow-up to [67aa9a2070]: Use uppercase consistently, slight optimization ↵ | jan.nijtmans | 2017-06-06 | 1 | -18/+18 | |
| |\ \ | | |/ | | | | | | | in character tests, comment fixes. No change in functionality. | |||||
| | * | [67aa9a2070] Tcl_UtfToUniChar returns single byte for invalid UTF-8 input as ↵ | jan.nijtmans | 2017-06-06 | 1 | -75/+52 | |
| | | | | | | | | | | | | documented. | |||||
* | | | [67aa9a2070] Tcl_UtfToUniChar returns single byte for invalid UTF-8 input as ↵ | dgp | 2017-06-05 | 1 | -3/+9 | |
|\ \ \ | |/ / | | | | | | | documented. | |||||
| * | | Fix [67aa9a207037ae67f9014b544c3db34fa732f2dc|67aa9a2070]: Security: Invalid ↵ | jan.nijtmans | 2017-06-02 | 1 | -3/+9 | |
| | | | | | | | | | | | | UTF-8 can inject unexpected characters | |||||
* | | | Merge core-8-6-branch. This removes the work currently being done in ↵ | jan.nijtmans | 2017-06-02 | 1 | -9/+3 | |
|\ \ \ | |/ / | | | | | | | | | | "sebres-8-6-clock-speedup-cr1" branch, but that will be merged again as soon as the work is done. All other changes in "trunk" since then (e.g. the INST_STR_CONCAT1 performance improvement, and the removal of SunOS-4) are retained. | |||||
* | | | merge core-8-6-branch | jan.nijtmans | 2017-05-31 | 1 | -3/+9 | |
|\ \ \ | ||||||
| * | | | Fix [67aa9a207037ae67f9014b544c3db34fa732f2dc|67aa9a2070]: Security: Invalid ↵ | jan.nijtmans | 2017-05-31 | 1 | -3/+9 | |
| |/ / | | | | | | | | | | UTF-8 can inject unexpected characters | |||||
* | | | Don't ever allow UTF-8 sequences of more than 4 bytes to be generated ↵ | jan.nijtmans | 2016-08-30 | 1 | -44/+24 | |
|\ \ \ | |/ / | | | | | | | | | or parsed, even when TCL_UTF_MAX>4: According to the current Unicode standard, a byte string of >4 bytes can never form a single UTF-8 character. And a few minor micro-optimizations related to UTF-8 handling. | |||||
| * | | Don't ever allow UTF-8 sequences of more than 4 bytes to be generated ↵ | jan.nijtmans | 2016-08-30 | 1 | -44/+24 | |
| | | | | | | | | | | | | | | or parsed, even when TCL_UTF_MAX>4: According to the current Unicode standard, a byte string of >4 bytes can never form a single UTF-8 character. And a few minor micro-optimizations related to UTF-8 handling. | |||||
* | | | Rename UtfCount() to TclUtfCount() and use it in more places. Suggested by ↵ | jan.nijtmans | 2016-04-05 | 1 | -14/+8 | |
|/ / | | | | | | | pspjuth here: [e99a79a32650e7e5] | |||||
* | | Various Unicode handling enhancements, when building with TCL_UTF_MAX > 3, ↵ | jan.nijtmans | 2015-09-01 | 1 | -32/+93 | |
| | | | | | | | | inspired by androwish. No effect if TCL_UTF_MAX=3 (which is the default) | |||||
* | | Make sure that "string is space \u202f" will continue to return "1", even if ↵ | jan.nijtmans | 2013-07-29 | 1 | -1/+1 | |
|\ \ | |/ | | | | | | | in future Unicode this character (NARROW_NO_BREAK_SPACE) will cease to be a space. See: [http://www.unicode.org/review/pri249/]. Don't hardcode "tclWinError.o" for Cygwin | |||||
| * | Make sure that "string is space \u202f" will continue to return "1", even if ↵ | jan.nijtmans | 2013-07-29 | 1 | -1/+1 | |
| | | | | | | | | in future Unicode this character (NARROW_NO_BREAK_SPACE) will cease to be a space. See: [http://www.unicode.org/review/pri249/] | |||||
* | | Unbreak MSVC6 debug build (thanks Andreas Kupries!) | jan.nijtmans | 2013-07-08 | 1 | -1/+1 | |
|\ \ | |/ | ||||||
| * | Unbreak MSVC6 debug build (thanks Andreas Kupries!) | jan.nijtmans | 2013-07-08 | 1 | -1/+1 | |
| | | ||||||
* | | Use more portable TclIsSpaceProc() instead of isspace(). | jan.nijtmans | 2013-06-17 | 1 | -1/+1 | |
|\ \ | |/ | ||||||
| * | Use more portable TclIsSpaceProc() instead of isspace(). | jan.nijtmans | 2013-06-17 | 1 | -1/+3 | |
| | | | | | | Make sure that "string is space \u180e" continues to return 1 for whatever unicode version. | |||||
* | | [3613609]: Replace strcasecmp() with UTF-8-aware version. | dkf | 2013-05-22 | 1 | -0/+40 | |
|\ \ | |/ |