author | andreask <andreask> | 2014-06-05 00:22:59 (GMT) |
---|---|---|
committer | andreask <andreask> | 2014-06-05 00:22:59 (GMT) |
commit | a17f5d02f8c3ec8babbb27325cba2039e56f1f10 (patch) | |
tree | cc034fcadb374472cdd4ff9ea89a87930bb39dbd | |
parent | 22e5e3f23f66d83f09a06b80a22a8215ab1dfc24 (diff) | |
download | tcl-a17f5d02f8c3ec8babbb27325cba2039e56f1f10.zip tcl-a17f5d02f8c3ec8babbb27325cba2039e56f1f10.tar.gz tcl-a17f5d02f8c3ec8babbb27325cba2039e56f1f10.tar.bz2 |
Fixed a tricky interaction between the I/O system and encodings which
could result in a panic.

The relevant function is ReadChars() (RC for short in the following).

When the encoding and translation transforms deliver more characters
than were requested, the iterative algorithm used by RC reduces the
value of "dstLimit" (= the number of bytes allowed to be copied into
the destination buffer) to force the next round to deliver fewer
characters, hopefully the number requested.

The existing code used the byte located just after the last wanted
character to determine the new limit. The resulting value could
_undershoot_ the best possible limit because Tcl_ExternalToUtf would
effectively reduce this limit further, by TCL_UTF_MAX+1, to reserve
enough space in the buffer for a single multi-byte character and a
closing '\0' as well.
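
The undershoot described above can be sketched with a toy model. This is not Tcl's actual arithmetic: it only encodes the statement that TCL_UTF_MAX+1 bytes of the limit are reserved. The names usable_bytes, old_limit_single_byte and SKETCH_UTF_MAX are hypothetical, the value 3 for TCL_UTF_MAX is an assumption (it is a build-time setting), and every character is assumed to occupy exactly one byte in the destination (plain ASCII).

```c
#include <assert.h>

/* Assumption: stand-in for TCL_UTF_MAX; the real value is set at build time. */
#define SKETCH_UTF_MAX 3

/*
 * Hypothetical model: the number of bytes a conversion can actually fill
 * when handed dstLimit bytes, after reserving room for one multi-byte
 * character plus the closing '\0' (TCL_UTF_MAX + 1 bytes in total).
 */
static int usable_bytes(int dstLimit)
{
    assert(dstLimit >= 0);
    int usable = dstLimit - (SKETCH_UTF_MAX + 1);
    return usable > 0 ? usable : 0;
}

/*
 * Old rule, for one-byte characters only: the byte offset of the character
 * at index charsToRead + 1 is simply charsToRead + 1.
 */
static int old_limit_single_byte(int charsToRead)
{
    assert(charsToRead >= 0);
    return charsToRead + 1;
}
```

Under these assumptions a request for 10 characters yields a limit of 11 bytes, of which only 7 are usable, so the round falls 3 characters short. A request for 2 characters yields a limit of 3 bytes and no usable space at all, which is the case that leads to the panic.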

One effect of this was additional calls to ReadChars() to retrieve
the characters missed by a call with an undershot limit.

In the limit (sic), however, this could also cause a full-blown
"Buffer Underflow" panic if the original request was for fewer than
TCL_UTF_MAX characters (*) and a single-byte encoding like iso-8859-1
was in use. In that case the undershot dstLimit would prevent the next
round from copying anything, causing it to try to consolidate the
current buffer with the next buffer, thinking that it had to merge a
multi-byte character split across buffer boundaries.

(Ad *) For example, because the previous call had already undershot
and left only so few characters behind!

The basic fix to the problem is to add TCL_UTF_MAX back to the limit,
as is done in all the (three) other places in RC that set a new
one. Note, however, that this naive fix may generate a new limit which
is the same as the old one, or possibly larger. If that happens we act
very conservatively and reduce the limit by only one byte instead.

While I believe that this last conservative approach will never reduce
the limit to TCL_UTF_MAX or less before reaching a state where it
returns the exact number of requested characters, I still added a
check against this situation anyway, causing a new panic if triggered.
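
The update rule and the guard described above can be isolated into a small standalone function for illustration. This is a sketch that mirrors the logic of the patch, not the code in tclIO.c itself: next_dst_limit, wantedEndOffset and SKETCH_UTF_MAX are hypothetical names, the value 3 for TCL_UTF_MAX is an assumption, and a return value of -1 stands in for the place where the real code calls Tcl_Panic.

```c
#include <assert.h>

/* Assumption: stand-in for TCL_UTF_MAX; the real value is set at build time. */
#define SKETCH_UTF_MAX 3

/*
 * dstLimit        - limit used by the round that delivered too many chars.
 * wantedEndOffset - byte offset just after the last wanted character
 *                   (the patch computes this with Tcl_UtfAtIndex).
 * Returns the limit for the next round, or -1 where the patch would panic.
 */
static int next_dst_limit(int dstLimit, int wantedEndOffset)
{
    assert(dstLimit >= 0 && wantedEndOffset >= 0);

    /* Add TCL_UTF_MAX back so the next conversion has room to work with. */
    int newLimit = wantedEndOffset + SKETCH_UTF_MAX;

    if (newLimit >= dstLimit) {
        /* No progress possible with the naive value: shrink by one byte. */
        dstLimit--;
    } else {
        dstLimit = newLimit;
    }

    /* Guard: no room left for even a single multi-byte character. */
    if (dstLimit <= SKETCH_UTF_MAX) {
        return -1;
    }
    return dstLimit;
}
```

With these assumptions, a previous limit of 100 and an offset of 11 give a new limit of 14. A previous limit of 12 with the same offset would give 14 again, which is not smaller, so the limit drops by one byte to 11 instead.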
-rw-r--r-- | generic/tclIO.c | 23 |
1 files changed, 21 insertions, 2 deletions
diff --git a/generic/tclIO.c b/generic/tclIO.c
index b7135e9..e414668 100644
--- a/generic/tclIO.c
+++ b/generic/tclIO.c
@@ -5596,9 +5596,27 @@ ReadChars(
 	    /*
 	     * We read more chars than allowed. Reset limits to
 	     * prevent that and try again.
+	     *
+	     * Note how we are adding back TCL_UTF_MAX to ensure that the
+	     * Tcl_External2Utf invoked by the next round will have enough
+	     * space in the destination for at least one multi-byte
+	     * character. Without that nothing will be copied and the system
+	     * will try to consolidate the entire current and next buffer,
+	     * likely triggering the "Buffer Underflow" panic.
 	     */
 
-	    dstLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst;
+	    int newLimit = Tcl_UtfAtIndex(dst, charsToRead + 1) - dst + TCL_UTF_MAX;
+
+	    if (newLimit >= dstLimit) {
+		dstLimit --;
+	    } else {
+		dstLimit = newLimit;
+	    }
+
+	    if (dstLimit <= TCL_UTF_MAX) {
+		Tcl_Panic ("Not enough space left for a single multi-byte character.");
+	    }
+
 	    statePtr->flags = savedFlags;
 	    statePtr->inputEncodingFlags = savedIEFlags;
 	    statePtr->inputEncodingState = savedState;
@@ -5661,7 +5679,8 @@ ReadChars(
 	     */
 
 	    if (nextPtr->nextRemoved - srcLen < 0) {
-		Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough");
+		Tcl_Panic("Buffer Underflow, BUFFER_PADDING not enough (%d < %d)",
+			nextPtr->nextRemoved, srcLen);
 	    }
 
 	    nextPtr->nextRemoved -= srcLen;