| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |\ \ \ \ \ \ \ \ |
|
| | | | | | | | | | |
|
| | | | | | | | | | |
|
|\ \ \ \ \ \ \ \ \ \
| |_|_|_|_|_|_|/ / /
|/| | | | | | | / /
| | |_|_|_|_|_|/ /
| |/| | | | | | | |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
bytes after the string-end are read. The command will return -1 in that case. No need for additional arguments any more.
|
|\ \ \ \ \ \ \ \ \
| |/ / / / / / / / |
|
| | | | | | | | | |
|
|\ \ \ \ \ \ \ \ \
| |/ / / / / / / / |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
character, we cannot conclude that the next byte also will be, or can be
taken as a single byte. At least we cannot when TCL_UTF_MAX > 3 so that we
have room for valid two-byte sequences after incomplete sequence detection.
No need for conditional code, just use an algorithm that always works.
|
| |_|_|_|_|_|/ /
|/| | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
more than 3 bytes. This is more consistent with what Tcl 8.7 does too.
For TCL_UTF_MAX==6: Make sure that Tcl_UtfNext()/Tcl_UtfPrev() never move more than 4 bytes.
For TCL_UTF_MAX==3: No change.
Introduce ucs2_utf16 test constraint, since many test results now become the same for ucs2 and utf16.
|
|\ \ \ \ \ \ \ \
| |/ / / / / / / |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Tcl_NumUtfChars(). Don't use "-1" in the Tcl_NumUtfChars() calculation, since that raises more questions than it answers, but that's easily remedied as well: just use >= instead of > in the comparison. Great idea, Don!
Backport more code formatting from Tcl 8.6 (e.g. use of CONST, which makes no sense any more in c-files)
|
|\ \ \ \ \ \ \ \
| |/ / / / / / / |
|
| | | | | | | | |
|
|\ \ \ \ \ \ \ \
| |/ / / / / / / |
|
| | | | | | | | |
|
| |\ \ \ \ \ \ \ |
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Update the comments to describe what it does now, and the cautions that
callers must take into account.
|
|\ \ \ \ \ \ \ \ \
| | |/ / / / / / /
| |/| | | | | | | |
|
| |/ / / / / / /
| | | | | | | |
| | | | | | | |
| | | | | | | | |
test descriptions/test results more alike across branches, so the real differences are more visible.
|
|\ \ \ \ \ \ \ \
| |/ / / / / / / |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
first in Tcl_UtfNext(), so if src[1] is invalid src[2] doesn't need to be checked any more.
Note: This order change (calling Invalid() first) was wrong, and is corrected in later commits. Thanks, Don, for noticing this!
|
| |_|_|_|_|/ /
|/| | | | | |
| | | | | | |
| | | | | | | |
in TCL_UTF_MAX>3 builds.
|
|\ \ \ \ \ \ \
| |/ / / / / /
| | | | | / /
| |_|_|_|/ /
|/| | | | | |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
"knownBug" testcase utf-6.93.1.
Rename tip389 selector to utf16, since that's what it actually is, in contrast to ucs2 and ucs4.
|
| |_|_|/ /
|/| | | |
| | | | |
| | | | | |
fixes all "knownBug" testcases related to tip389.
|
| |_|/ /
|/| | |
| | | |
| | | |
| | | | |
TCL_UTF_MAX=4. (even though TCL_UTF_MAX=4 is unsupported, it would be nice to make it work)
Marked various test-cases as "knownBug"; those work correctly in core-8-branch (8.7). The fix there could be backported. Low prio.
|
| | | |
| | | |
| | | |
| | | |
| | | | |
commit, it was not correct).
Perfect TclUtfToUCS4()/TclUCS4Complete() and new (internal) function TclUCS4ToUtf(). They can help prevent bugs regarding splitting/joining surrogates. Used them in a few more places.
|
| |/ /
|/| |
| | |
| | |
| | |
| | | |
always for whatever testConstraints.
Fix one invalid use of TclUCS4Complete(), and let TclUtfToUCS4() handle (invalid) 4-byte sequences.
Test-case cleanup (removal of unnecessary quoting)
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | | |
bytes.
Tcl_UtfToUniChar() now never reads more than TCL_UTF_MAX bytes any more. The UtfToUtf encoder/decoder is adapted to do additional checks (more tricky than in Tcl 8.7, since we want compatibility with earlier 8.6 releases).
Other callers of Tcl_UtfToUniChar() need to be revised for the same problem. Most callers will need to change Tcl_UtfToUniChar() -> TclUtfToUCS4() and Tcl_UtfCharComplete() -> TclUCS4Complete(), but that's not done yet.
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Tcl_UtfToUniChar() are suspicious, because those cannot handle 4-byte UTF-8 sequences reliably.
So, there's more work to do, but this part can already be backported to Tcl 8.6 and see where we get.
|
| | | |
| | | |
| | | |
| | | | |
test-cases fail when we no longer check the validity of the 3rd trail byte.
|
| | | | |
|
| |\ \ \ |
|
| |\ \ \ \
| | | | | |
| | | | | | |
Quick exit from Tcl_UtfToChar16()/Tcl_UtfToUniChar() when lead-byte is 0xF5 - 0xF7.
|
| |\ \ \ \ \ |
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
bytes than the end of the string.
If you want to null-terminate the string explicitly, use \x00
|
|\ \ \ \ \ \ \
| |_|_|_|/ / /
|/| | | | / /
| | |_|_|/ /
| |/| | | | |
|
| | | | | | |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
[69634d51fb74551b] for Tcl 8.5 (with TCL_UTF_MAX=4) too. Also fix some comments which were not up to date.
No change at all in behavior for TCL_UTF_MAX=3.
|
| |_|_|/ /
|/| | | | |
|
| |_|/ /
|/| | |
| | | |
| | | | |
from 8.7. This fixes [69634d51fb]: handling out of range UCS4 values (at least, it's fixed in 8.6 now the same way as it's fixed in 8.7).
|
|\ \ \ \
| |/ / / |
|
| |\ \ \ |
|
| | |/ / |
|
| | |\ \ |
|
| | |\ \ \ |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
adapted to the needs of TIPs 389/542.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
problem.
|
| | |\ \ \ \ |
|