author | Antoine Pitrou <solipsis@pitrou.net> | 2011-11-25 15:34:23 (GMT) |
---|---|---|
committer | Antoine Pitrou <solipsis@pitrou.net> | 2011-11-25 15:34:23 (GMT) |
commit | e333d00d3ae71467454d94fbd9ece93c1e821022 (patch) | |
tree | 72bf479303b7b31f629deeb5c8ae7cfc350a2a5e /Doc | |
parent | 0481f4bca45fbe9b3c3f2a2d7135fcbd8b34dc30 (diff) | |
parent | fd9ebd4a361805607baea3e038652f207575ced8 (diff) | |
Clarify concatenation behaviour of immutable strings, and remove explicit
mention of the CPython optimization hack.
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/faq/programming.rst | 26 |
-rw-r--r-- | Doc/library/stdtypes.rst | 21 |
2 files changed, 38 insertions, 9 deletions
diff --git a/Doc/faq/programming.rst b/Doc/faq/programming.rst
index d1a3daf..f157a94 100644
--- a/Doc/faq/programming.rst
+++ b/Doc/faq/programming.rst
@@ -989,6 +989,32 @@ What does 'UnicodeDecodeError' or 'UnicodeEncodeError' error mean?
 See the :ref:`unicode-howto`.
 
 
+What is the most efficient way to concatenate many strings together?
+--------------------------------------------------------------------
+
+:class:`str` and :class:`bytes` objects are immutable, therefore concatenating
+many strings together is inefficient as each concatenation creates a new
+object. In the general case, the total runtime cost is quadratic in the
+total string length.
+
+To accumulate many :class:`str` objects, the recommended idiom is to place
+them into a list and call :meth:`str.join` at the end::
+
+   chunks = []
+   for s in my_strings:
+       chunks.append(s)
+   result = ''.join(chunks)
+
+(another reasonably efficient idiom is to use :class:`io.StringIO`)
+
+To accumulate many :class:`bytes` objects, the recommended idiom is to extend
+a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::
+
+   result = bytearray()
+   for b in my_bytes_objects:
+       result += b
+
+
 Sequences (Tuples/Lists)
 ========================
 
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index cdb2a4a..5bb4324 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -968,15 +968,18 @@ Notes:
    If *k* is ``None``, it is treated like ``1``.
 
 (6)
-   .. impl-detail::
-
-      If *s* and *t* are both strings, some Python implementations such as
-      CPython can usually perform an in-place optimization for assignments of
-      the form ``s = s + t`` or ``s += t``. When applicable, this optimization
-      makes quadratic run-time much less likely. This optimization is both
-      version and implementation dependent. For performance sensitive code, it
-      is preferable to use the :meth:`str.join` method which assures consistent
-      linear concatenation performance across versions and implementations.
+   Concatenating immutable strings always results in a new object. This means
+   that building up a string by repeated concatenation will have a quadratic
+   runtime cost in the total string length. To get a linear runtime cost,
+   you must switch to one of the alternatives below:
+
+   * if concatenating :class:`str` objects, you can build a list and use
+     :meth:`str.join` at the end;
+
+   * if concatenating :class:`bytes` objects, you can similarly use
+     :meth:`bytes.join`, or you can do in-place concatenation with a
+     :class:`bytearray` object. :class:`bytearray` objects are mutable and
+     have an efficient overallocation mechanism.
 
 
 .. _string-methods:
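
The patch names two alternatives without showing code for them: `io.StringIO` for `str` objects and `bytes.join` for `bytes` objects. The sketch below is one possible illustration of those two idioms, alongside the `bytearray` idiom from the patch. It is not part of the commit. The function names and sample inputs are hypothetical and chosen only for this example.

```python
import io


def concat_with_stringio(pieces):
    """Accumulate str pieces in an in-memory text buffer.

    Hypothetical helper: illustrates the io.StringIO idiom mentioned
    (but not shown) in the FAQ entry.
    """
    buf = io.StringIO()
    for s in pieces:
        buf.write(s)          # appends to the buffer, no new str per step
    return buf.getvalue()     # build the final str once


def concat_with_bytes_join(pieces):
    """Join bytes pieces in a single call, mirroring ''.join for str."""
    return b"".join(pieces)


def concat_with_bytearray(pieces):
    """Extend a mutable bytearray in place, as in the FAQ entry."""
    result = bytearray()
    for b in pieces:
        result += b           # in-place extension of the same object
    return bytes(result)      # optional: convert to immutable bytes


# Hypothetical sample inputs, for illustration only.
my_strings = ["spam", "eggs", "ham"]
my_bytes_objects = [b"spam", b"eggs", b"ham"]

text = concat_with_stringio(my_strings)
data_joined = concat_with_bytes_join(my_bytes_objects)
data_extended = concat_with_bytearray(my_bytes_objects)
```

Both bytes helpers return the same value. `bytes.join` suits the case where all pieces are already collected, while the `bytearray` form suits pieces that arrive incrementally.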