From a17f5d02f8c3ec8babbb27325cba2039e56f1f10 Mon Sep 17 00:00:00 2001 From: andreask <andreask> Date: Thu, 5 Jun 2014 00:22:59 +0000 Subject: Fixed a tricky interaction of IO system and encodings which could result in a panic. The relevant function is ReadChars() (short RC in the following). When the encoding and translation transforms deliver more characters than were requested the iterative algorithm used by RC reduces the value of "dstLimit" (= the number of bytes allowed to be copied into the destination buffer) to force the next round to deliver less characters, hopefully the number requested. The existing code used the byte located just after the last wanted character to determine the new limit. The resulting value could _undershoot_ the best possible limit because Tcl_ExternalToUtf would effectively reduce this limit further, by TCL_UTF_MAX+1, to have enough space for a single multi-byte character in the buffer, and a closing '\0' as well. One effect of this were additional calls to ReadChars() to retrieve the characters missed by a call with an undershot limit. In the limit (sic) however this was also able to cause a full-blown "Buffer Underflow" panic if the original request was for less than TCL_UTF_MAX characters (*), and we are using a single-byte encoding like iso-8859-1. Because then the undershot dstLimit would prevent the next round from copying anything, and causing it to try and consolidate the current buffer with the next buffer, thinking that it had to merge a multi-byte character split across buffer boundaries. (Ad *) For example because the previous call had undershot already and left only such a small amount of characters behind! The basic fix to the problem is to add TCL_UTF_MAX back to the limit, like is done in all the (three) other places in RC setting a new one. Note however that this naive fix may generate a new limit which is the same as the old, or possibly larger. If that happens we act very conservatively and reduce the limit by only one byte instead. While I believe that this last conservative approach will never reduce the limit to TCL_UTF_MAX or less before reaching a state where it returnds the exact amount of requested characters I still added a check against this situation anyway, causing a new panic if triggered. --- generic/tclIO.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/generic/tclIO.c b/generic/tclIO.c index b7135e9..e414668 100644 --- a/generic/tclIO.c +++ b/generic/tclIO.c @@ -5596,9 +5596,27 @@ ReadChars( /* * We read more chars than allowed. Reset limits to * prevent that and try again. + * + * Note how we are adding back TCL_UTF_MAX to ensure that the + * Tcl_External2Utf invoked by the next round will have enough + * space in the destination for at least one multi-byte + * character. Without that nothing will be copied and the system + * will try to consolidate the entire current and next buffer, + * likely triggering the "Buffer Underflow" panic. */ - dstLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst; + int newLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst + TCL_UTF_MAX; + + if (newLimit >= dstLimit) { + dstLimit --; + } else { + dstLimit = newLimit; + } + + if (dstLimit <= TCL_UTF_MAX) { + Tcl_Panic ("Not enough space left for a single multi-byte character."); + } + statePtr->flags = savedFlags; statePtr->inputEncodingFlags = savedIEFlags; statePtr->inputEncodingState = savedState; @@ -5661,7 +5679,8 @@ ReadChars( */ if (nextPtr->nextRemoved - srcLen < 0) { - Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough"); + Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough (%d < %d)", + nextPtr->nextRemoved, srcLen); } nextPtr->nextRemoved -= srcLen; -- cgit v0.12