From bd866e972c1db654105a5204742eeff342a91719 Mon Sep 17 00:00:00 2001
From: Antoine Pitrou
Date: Sat, 5 Feb 2011 12:13:38 +0000
Subject: Everybody hates this one :) (bytes indexing)

---
 Doc/howto/pyporting.rst | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/Doc/howto/pyporting.rst b/Doc/howto/pyporting.rst
index 38a13af..0c7721c 100644
--- a/Doc/howto/pyporting.rst
+++ b/Doc/howto/pyporting.rst
@@ -367,6 +367,37 @@ To turn the warning into an exception, use the ``-bb`` flag instead::
    BytesWarning: Comparison between bytes and string
 
 
+Indexing bytes objects
+''''''''''''''''''''''
+
+Another potentially surprising change is the indexing behaviour of bytes
+objects in Python 3::
+
+   >>> b"xyz"[0]
+   120
+
+Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects)
+are sequences of integers. But code converted from Python 2 will often
+assume that indexing a bytestring produces another bytestring, not an
+integer. To reconcile both behaviours, use slicing::
+
+   >>> b"xyz"[0:1]
+   b'x'
+   >>> n = 1
+   >>> b"xyz"[n:n+1]
+   b'y'
+
+The only remaining gotcha is that an out-of-bounds slice returns an empty
+bytes object instead of raising ``IndexError``::
+
+   >>> b"xyz"[3]
+   Traceback (most recent call last):
+     File "<stdin>", line 1, in <module>
+   IndexError: index out of range
+   >>> b"xyz"[3:4]
+   b''
+
+
 ``__str__()``/``__unicode__()``
 '''''''''''''''''''''''''''''''
 In Python 2, objects can specify both a string and unicode representation of
-- 
cgit v0.12
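
Note on the out-of-bounds gotcha described in the patch: code that relies on ``IndexError`` can keep that behaviour by checking the index before slicing. The following is only a minimal sketch; ``byte_at`` is a hypothetical helper name and is not part of the patch or of the standard library.

```python
def byte_at(data, index):
    """Return the byte at ``index`` as a length-1 bytes object.

    Hypothetical helper (an assumption for illustration only): it combines
    the slicing idiom from the patch with the bounds check that plain
    indexing performs.
    """
    length = len(data)
    # Accept negative indices the same way plain indexing does.
    if index < 0:
        index += length
    # A slice never raises, so check the bounds explicitly.
    if not 0 <= index < length:
        raise IndexError("index out of range")
    # Slicing yields a bytestring on both Python 2 and Python 3.
    return data[index:index + 1]


print(byte_at(b"xyz", 0))   # a one-byte bytestring, not an integer
print(byte_at(b"xyz", -1))  # negative indices count from the end
```

With this sketch, ``byte_at(b"xyz", 3)`` raises ``IndexError`` rather than returning an empty bytestring.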