author | Gregory P. Smith <greg@krypto.org> | 2022-09-04 16:54:56 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-09-04 16:54:56 (GMT) |
commit | eace09e63ed7978dbdfeb1ae537fac505e6b5b0e (patch) | |
tree | 28c33c23faec72038fd65cd37d476e6b8ee3de2b /Objects | |
parent | 4dea99f18eeef15a6d69236146793c801d12c329 (diff) | |
[3.10] gh-95778: Correctly pre-check for int-to-str conversion (GH-96537) (#96563)
Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DoS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)
The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.
The justification for the current check is as follows. The check in the C code is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```
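As a rough illustration, the check can be lifted into a standalone function. This is only a sketch: the name `precheck_rejects`, the plain `long long` types, and the `size_a >= 11` guard are assumptions made here so the snippet compiles outside CPython (the real code relies on an outer size gate instead), and `shift` stands in for `PyLong_SHIFT`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical standalone version of the pre-check.
   size_a:          number of internal digits of the integer
   max_str_digits:  configured decimal digit limit (<= 0 means disabled)
   shift:           bits per internal digit (stand-in for PyLong_SHIFT) */
bool
precheck_rejects(long long size_a, long long max_str_digits, long long shift)
{
    assert(shift > 0);
    if (max_str_digits <= 0) {
        return false;           /* limit disabled: never reject */
    }
    if (size_a < 11) {
        return false;           /* assumption: keeps (size_a - 11) non-negative,
                                   so C truncation agrees with floor */
    }
    /* Same integer arithmetic as the expression in the patch. */
    return max_str_digits / (3 * shift) <= (size_a - 11) / 10;
}
```

With `max_str_digits = 4300` and `shift = 30` (example values), the sketch first rejects at `size_a = 481`, since `4300 / 90 = 47` and `(481 - 11) / 10 = 47`.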
In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$
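Since $\frac{M}{3L} < \left\lfloor\frac{M}{3L}\right\rfloor + 1$ and $\left\lfloor\frac{s-11}{10}\right\rfloor + 1 \le \frac{s-11}{10} + 1 = \frac{s-1}{10}$, it follows that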
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$ and therefore has more than $M$ decimal digits. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
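The argument can also be spot-checked with exact integer arithmetic. The sketch below is not part of the patch, and both helper names are hypothetical. It uses the fact that the smallest $s \ge 11$ satisfying the check is $10\lfloor M/(3L)\rfloor + 11$, and tests the intermediate inequality in the cross-multiplied form $3L(s-1) > 10M$, which avoids floating point. The remaining step, $10/3 > \log_2(10)$, is equivalent to $2^{10} = 1024 > 1000 = 10^3$.

```c
#include <assert.h>
#include <stdbool.h>

/* Smallest internal digit count s (with s >= 11) for which
   max_str_digits / (3 * shift) <= (s - 11) / 10 holds. */
long long
smallest_flagged_size(long long max_str_digits, long long shift)
{
    assert(max_str_digits > 0 && shift > 0);
    return 10 * (max_str_digits / (3 * shift)) + 11;
}

/* Exact integer form of L*(s-1)/M > 10/3, i.e. 3*L*(s-1) > 10*M.
   When this holds, 2^(L*(s-1)) > 10^M because 10/3 > log2(10). */
bool
implies_over_limit(long long size_a, long long max_str_digits, long long shift)
{
    return 3 * shift * (size_a - 1) > 10 * max_str_digits;
}
```

Because the left-hand side grows with `size_a`, it is enough to test the smallest flagged size: for example, `smallest_flagged_size(4300, 30)` is 481, and `3 * 30 * 480 = 43200` exceeds `10 * 4300 = 43000`.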
<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->
Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
(cherry picked from commit b126196838bbaf5f4d35120e0e6bcde435b0b480)
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Diffstat (limited to 'Objects')
-rw-r--r-- | Objects/longobject.c | 26 |
1 files changed, 22 insertions, 4 deletions
diff --git a/Objects/longobject.c b/Objects/longobject.c
index 780ea81..aea5edc 100644
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -36,7 +36,8 @@ _Py_IDENTIFIER(big);
 #define IS_SMALL_INT(ival) (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS)
 #define IS_SMALL_UINT(ival) ((ival) < NSMALLPOSINTS)
 
-#define _MAX_STR_DIGITS_ERROR_FMT "Exceeds the limit (%d) for integer string conversion: value has %zd digits"
+#define _MAX_STR_DIGITS_ERROR_FMT_TO_INT "Exceeds the limit (%d) for integer string conversion: value has %zd digits"
+#define _MAX_STR_DIGITS_ERROR_FMT_TO_STR "Exceeds the limit (%d) for integer string conversion"
 
 static PyObject *
 get_small_int(sdigit ival)
@@ -1604,6 +1605,23 @@ long_to_decimal_string_internal(PyObject *aa,
     size_a = Py_ABS(Py_SIZE(a));
     negative = Py_SIZE(a) < 0;
 
+    /* quick and dirty pre-check for overflowing the decimal digit limit,
+       based on the inequality 10/3 >= log2(10)
+
+       explanation in https://github.com/python/cpython/pull/96537
+    */
+    if (size_a >= 10 * _PY_LONG_MAX_STR_DIGITS_THRESHOLD
+                  / (3 * PyLong_SHIFT) + 2) {
+        PyInterpreterState *interp = _PyInterpreterState_GET();
+        int max_str_digits = interp->int_max_str_digits;
+        if ((max_str_digits > 0) &&
+            (max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10)) {
+            PyErr_Format(PyExc_ValueError, _MAX_STR_DIGITS_ERROR_FMT_TO_STR,
+                         max_str_digits);
+            return -1;
+        }
+    }
+
     /* quick and dirty upper bound for the number of digits
        required to express a in base _PyLong_DECIMAL_BASE:
 
@@ -1669,8 +1687,8 @@ long_to_decimal_string_internal(PyObject *aa,
         Py_ssize_t strlen_nosign = strlen - negative;
         if ((max_str_digits > 0) && (strlen_nosign > max_str_digits)) {
             Py_DECREF(scratch);
-            PyErr_Format(PyExc_ValueError, _MAX_STR_DIGITS_ERROR_FMT,
-                         max_str_digits, strlen_nosign);
+            PyErr_Format(PyExc_ValueError, _MAX_STR_DIGITS_ERROR_FMT_TO_STR,
+                         max_str_digits);
             return -1;
         }
     }
@@ -2344,7 +2362,7 @@ digit beyond the first.
         PyInterpreterState *interp = _PyInterpreterState_GET();
         int max_str_digits = interp->int_max_str_digits;
         if ((max_str_digits > 0) && (digits > max_str_digits)) {
-            PyErr_Format(PyExc_ValueError, _MAX_STR_DIGITS_ERROR_FMT,
+            PyErr_Format(PyExc_ValueError, _MAX_STR_DIGITS_ERROR_FMT_TO_INT,
                          max_str_digits, digits);
             return NULL;
         }
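The outer `if` in the second hunk appears to act as a cheap gate that skips the interpreter-state lookup for small integers. The sketch below compares that gate with the smallest size the inner check can reject. It is an illustration under stated assumptions: the helper names are hypothetical, the value of `_PY_LONG_MAX_STR_DIGITS_THRESHOLD` is not shown in this diff (640 is assumed in the usage note), and any nonzero limit is assumed to be at least that threshold.

```c
#include <assert.h>

/* Size at which the outer gate starts letting values through to the
   inner check: 10 * threshold / (3 * shift) + 2, as in the patch. */
long long
precheck_gate(long long threshold, long long shift)
{
    assert(threshold > 0 && shift > 0);
    return 10 * threshold / (3 * shift) + 2;
}

/* Smallest size (>= 11) that the inner check
   max_str_digits / (3 * shift) <= (size_a - 11) / 10 rejects. */
long long
smallest_rejected_size(long long max_str_digits, long long shift)
{
    assert(max_str_digits > 0 && shift > 0);
    return 10 * (max_str_digits / (3 * shift)) + 11;
}
```

Under the assumed threshold of 640 and a 30-bit digit, the gate opens at size 73 while the inner check cannot reject anything below size 81, so in this configuration the gate does not hide any value the inner check would catch.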