gh-118164: Break a loop between _pydecimal and _pylong and optimize int to str conversion (GH-118483)

For converting large ints to strings, CPython invokes a function in _pylong.py, which uses the decimal module to implement an asymptotically waaaaay sub-quadratic algorithm. But if the C decimal module isn't available, CPython uses _pydecimal.py instead. Which in turn frequently does str(int). If the int is very large, _pylong ends up doing the work, which in turn asks decimal to do "big" arithmetic, which in turn calls str(big_int), which in turn ... it can become infinite mutual recursion. This change introduces a different int->str function that doesn't use decimal. It's asymptotically worse, "Karatsuba time" instead of quadratic time, so still a huge improvement. _pylong switches to that when the C decimal isn't available. It is also used for not too large integers (less than 450_000 bits), where it is faster (up to 2 times for 30_000 bits) than the asymptotically better implementation that uses the C decimal. Co-authored-by: Tim Peters <tim.peters@gmail.com>
author: Serhiy Storchaka <storchaka@gmail.com> 2024-05-05 05:20:06 (GMT)
committer: GitHub <noreply@github.com> 2024-05-05 05:20:06 (GMT)
commit: 711c80bfca5dd17cb7c6ec26f0e44848b33aec04 (patch)
tree: 62d8956a00dd3f7c8c5a81fd0e3229ca7c2018ea /Lib/test/test_int.py
parent: 5dd36732c850084ce262b7869ed90d73a281296a (diff)
download: cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.zip
cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.tar.gz
cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.tar.bz2
1 files changed, 21 insertions, 10 deletions
diff --git a/Lib/test/test_int.py b/Lib/test/test_int.py
index 47fc50a..c862639 100644
--- a/Lib/test/test_int.py
+++ b/Lib/test/test_int.py
@@ -829,17 +829,28 @@ class PyLongModuleTests(unittest.TestCase):
         sys.set_int_max_str_digits(self._previous_limit)
         super().tearDown()
 
-    def test_pylong_int_to_decimal(self):
-        n = (1 << 100_000) - 1
-        suffix = '9883109375'
+    def _test_pylong_int_to_decimal(self, n, suffix):
         s = str(n)
-        assert s[-10:] == suffix
-        s = str(-n)
-        assert s[-10:] == suffix
-        s = '%d' % n
-        assert s[-10:] == suffix
-        s = b'%d' % n
-        assert s[-10:] == suffix.encode('ascii')
+        self.assertEqual(s[-10:], suffix)
+        s2 = str(-n)
+        self.assertEqual(s2, '-' + s)
+        s3 = '%d' % n
+        self.assertEqual(s3, s)
+        s4 = b'%d' % n
+        self.assertEqual(s4, s.encode('ascii'))
+
+    def test_pylong_int_to_decimal(self):
+        self._test_pylong_int_to_decimal((1 << 100_000), '9883109376')
+        self._test_pylong_int_to_decimal((1 << 100_000) - 1, '9883109375')
+        self._test_pylong_int_to_decimal(10**30_000, '0000000000')
+        self._test_pylong_int_to_decimal(10**30_000 - 1, '9999999999')
+        self._test_pylong_int_to_decimal(3**60_000, '9313200001')
+
+    @support.requires_resource('cpu')
+    def test_pylong_int_to_decimal_2(self):
+        self._test_pylong_int_to_decimal(2**1_000_000, '2747109376')
+        self._test_pylong_int_to_decimal(10**300_000, '0000000000')
+        self._test_pylong_int_to_decimal(3**600_000, '3132000001')
 
     def test_pylong_int_divmod(self):
         n = (1 << 100_000)
author	Serhiy Storchaka <storchaka@gmail.com>	2024-05-05 05:20:06 (GMT)
committer	GitHub <noreply@github.com>	2024-05-05 05:20:06 (GMT)
commit	711c80bfca5dd17cb7c6ec26f0e44848b33aec04 (patch)
tree	62d8956a00dd3f7c8c5a81fd0e3229ca7c2018ea /Lib/test/test_int.py
parent	5dd36732c850084ce262b7869ed90d73a281296a (diff)
download	cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.zip cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.tar.gz cpython-711c80bfca5dd17cb7c6ec26f0e44848b33aec04.tar.bz2