From d6b20dc54fb46e699fe6358534ee8302ad91d303 Mon Sep 17 00:00:00 2001
From: Georg Brandl
Date: Thu, 6 Dec 2007 09:45:39 +0000
Subject: Better re.split examples.

---
 Doc/library/re.rst | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/Doc/library/re.rst b/Doc/library/re.rst
index fbc9267..dae765e 100644
--- a/Doc/library/re.rst
+++ b/Doc/library/re.rst
@@ -1091,16 +1091,21 @@ method is invaluable for converting textual data into data structures that can b
 easily read and modified by Python as demonstrated in the following example that
 creates a phonebook.
 
-First, get the input using triple-quoted string syntax::
+First, here is the input. Normally it may come from a file, here we are using
+triple-quoted string syntax::
 
-   >>> input = """Ross McFluff 834.345.1254 155 Elm Street
-   Ronald Heathmore 892.345.3428 436 Finley Avenue
-   Frank Burger 925.541.7625 662 South Dogwood Way
-   Heather Albrecht 548.326.4584 919 Park Place"""
+   >>> input = """Ross McFluff: 834.345.1254 155 Elm Street
 
-Then, convert the string into a list with each line having its own entry::
+   Ronald Heathmore: 892.345.3428 436 Finley Avenue
+   Frank Burger: 925.541.7625 662 South Dogwood Way
 
-   >>> entries = re.split("\n", input)
+
+   Heather Albrecht: 548.326.4584 919 Park Place"""
+
+The entries are separated by one or more newlines. Now we convert the string
+into a list with each nonempty line having its own entry::
+
+   >>> entries = re.split("\n+", input)
    >>> entries
    ['Ross McFluff 834.345.1254 155 Elm Street',
    'Ronald Heathmore 892.345.3428 436 Finley Avenue',
@@ -1111,16 +1116,17 @@ Finally, split each entry into a list with first name, last name, telephone
 number, and address. We use the ``maxsplit`` paramater of :func:`split`
 because the address has spaces, our splitting pattern, in it::
 
-   >>> [re.split(" ", entry, 3) for entry in entries]
+   >>> [re.split(":? ", entry, 3) for entry in entries]
    [['Ross', 'McFluff', '834.345.1254', '155 Elm Street'],
    ['Ronald', 'Heathmore', '892.345.3428', '436 Finley Avenue'],
    ['Frank', 'Burger', '925.541.7625', '662 South Dogwood Way'],
    ['Heather', 'Albrecht', '548.326.4584', '919 Park Place']]
 
-With a ``maxsplit`` of ``4``, we could seperate the house number from the street
-name::
+The ``:?`` pattern matches the colon after the last name, so that it does not
+occur in the result list. With a ``maxsplit`` of ``4``, we could seperate the
+house number from the street name::
 
-   >>> [re.split(" ", entry, 4) for entry in entries]
+   >>> [re.split(":? ", entry, 4) for entry in entries]
    [['Ross', 'McFluff', '834.345.1254', '155', 'Elm Street'],
    ['Ronald', 'Heathmore', '892.345.3428', '436', 'Finley Avenue'],
    ['Frank', 'Burger', '925.541.7625', '662', 'South Dogwood Way'],
-- 
cgit v0.12