author andreask <andreask> 2014-06-05 00:22:59 (GMT)
committer andreask <andreask> 2014-06-05 00:22:59 (GMT)
commit a17f5d02f8c3ec8babbb27325cba2039e56f1f10 (patch)
tree cc034fcadb374472cdd4ff9ea89a87930bb39dbd
parent 22e5e3f23f66d83f09a06b80a22a8215ab1dfc24 (diff)
Fixed a tricky interaction of IO system and encodings which could
result in a panic.

The relevant function is ReadChars() (short RC in the following). When the encoding and translation transforms deliver more characters than were requested, the iterative algorithm used by RC reduces the value of "dstLimit" (= the number of bytes allowed to be copied into the destination buffer) to force the next round to deliver fewer characters, hopefully the number requested.

The existing code used the byte located just after the last wanted character to determine the new limit. The resulting value could _undershoot_ the best possible limit, because Tcl_ExternalToUtf would effectively reduce this limit further, by TCL_UTF_MAX+1, to have enough space in the buffer for a single multi-byte character and a closing '\0' as well.

One effect of this was additional calls to ReadChars() to retrieve the characters missed by a call with an undershot limit. In the limit (sic), however, this was also able to cause a full-blown "Buffer Underflow" panic if the original request was for less than TCL_UTF_MAX characters (*) and we are using a single-byte encoding like iso-8859-1. In that case the undershot dstLimit would prevent the next round from copying anything, causing it to try to consolidate the current buffer with the next buffer, thinking that it had to merge a multi-byte character split across buffer boundaries.

(*) For example, because the previous call had already undershot and left only such a small number of characters behind!

The basic fix to the problem is to add TCL_UTF_MAX back to the limit, as is done in all the (three) other places in RC that set a new one. Note, however, that this naive fix may generate a new limit which is the same as the old, or possibly larger. If that happens we act very conservatively and reduce the limit by only one byte instead.

While I believe that this last conservative approach will never reduce the limit to TCL_UTF_MAX or less before reaching a state where it returns the exact number of requested characters, I still added a check against this situation anyway, causing a new panic if triggered.
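The limit arithmetic described above can be sketched in isolation. This is only an illustration, not code from tclIO.c: the helpers NextDstLimit() and UsableBytes() are hypothetical, SKETCH_UTF_MAX stands in for TCL_UTF_MAX (assumed to be 3 here, the real value depends on the Tcl build), and "offset" stands in for the byte offset the patch derives via Tcl_UtfAtIndex().

```c
#include <assert.h>

/* Assumption: stand-in for TCL_UTF_MAX; the real value depends on the build. */
#define SKETCH_UTF_MAX 3

/*
 * Hypothetical helper: bytes left for payload once the converter has
 * reserved room for one multi-byte character plus the closing '\0',
 * as described in the commit message (limit reduced by TCL_UTF_MAX+1).
 */
static int
UsableBytes(int dstLimit)
{
    return dstLimit - (SKETCH_UTF_MAX + 1);
}

/*
 * Hypothetical helper mirroring the patched limit adjustment. Returns the
 * limit for the next round, or -1 where the patched code would panic.
 */
static int
NextDstLimit(int dstLimit, int offset)
{
    int newLimit = offset + SKETCH_UTF_MAX;   /* add TCL_UTF_MAX back */

    assert(dstLimit > 0 && offset >= 0);
    if (newLimit >= dstLimit) {
        dstLimit--;             /* no progress: shrink by one byte only */
    } else {
        dstLimit = newLimit;
    }
    /* Guard: not enough space left for a single multi-byte character. */
    return (dstLimit <= SKETCH_UTF_MAX) ? -1 : dstLimit;
}
```

Under these assumptions, an old-style limit of 2 leaves UsableBytes(2) == -2, so nothing can be copied, while NextDstLimit(100, 2) yields 5 and leaves one usable byte.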
-rw-r--r-- generic/tclIO.c 23
1 files changed, 21 insertions, 2 deletions
diff --git a/generic/tclIO.c b/generic/tclIO.c
index b7135e9..e414668 100644
--- a/generic/tclIO.c
+++ b/generic/tclIO.c
@@ -5596,9 +5596,27 @@ ReadChars(
/*
* We read more chars than allowed. Reset limits to
* prevent that and try again.
+ *
+ * Note how we are adding back TCL_UTF_MAX to ensure that the
+ * Tcl_External2Utf invoked by the next round will have enough
+ * space in the destination for at least one multi-byte
+ * character. Without that nothing will be copied and the system
+ * will try to consolidate the entire current and next buffer,
+ * likely triggering the "Buffer Underflow" panic.
*/
- dstLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst;
+ int newLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst + TCL_UTF_MAX;
+
+ if (newLimit >= dstLimit) {
+ dstLimit --;
+ } else {
+ dstLimit = newLimit;
+ }
+
+ if (dstLimit <= TCL_UTF_MAX) {
+ Tcl_Panic ("Not enough space left for a single multi-byte character.");
+ }
+
statePtr->flags = savedFlags;
statePtr->inputEncodingFlags = savedIEFlags;
statePtr->inputEncodingState = savedState;
@@ -5661,7 +5679,8 @@ ReadChars(
*/
if (nextPtr->nextRemoved - srcLen < 0) {
- Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough");
+ Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough (%d < %d)",
+ nextPtr->nextRemoved, srcLen);
}
nextPtr->nextRemoved -= srcLen;