author | Raymond Hettinger <python@rcn.com> | 2015-03-22 22:29:09 (GMT) |
---|---|---|
committer | Raymond Hettinger <python@rcn.com> | 2015-03-22 22:29:09 (GMT) |
commit | f6e31b79a873cce039070289bcf1d6fe434cb19e (patch) | |
tree | 635374ddb7708c3e1d2f876411b0ee4d7d1a2a42 /Doc | |
parent | 936da2a796459fcb09a292d749971db8d9a7a0dd (diff) | |
Issue 23729: Document ElementTree namespace handling and fix an omission in the XPATH predicate table.
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/xml.etree.elementtree.rst | 68 |
1 files changed, 68 insertions, 0 deletions
diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst
index 3263dc2a..f09934b 100644
--- a/Doc/library/xml.etree.elementtree.rst
+++ b/Doc/library/xml.etree.elementtree.rst
@@ -284,6 +284,71 @@ sub-elements for a given element::
    >>> ET.dump(a)
    <a><b /><c><d /></c></a>
 
+Parsing XML with Namespaces
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the XML input has `namespaces
+<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
+with prefixes in the form ``prefix:sometag`` get expanded to
+``{uri}tag`` where the *prefix* is replaced by the full *URI*. Also,
+if there is a `default namespace
+<http://www.w3.org/TR/2006/REC-xml-names-20060816/#defaulting>`__,
+that full URI gets prepended to all of the non-prefixed tags.
+
+Here is an XML example that incorporates two namespaces, one with the
+prefix "fictional" and the other serving as the default namespace:
+
+.. code-block:: xml
+
+    <?xml version="1.0"?>
+    <actors xmlns:fictional="http://characters.example.com"
+            xmlns="http://people.example.com">
+        <actor>
+            <name>John Cleese</name>
+            <fictional:character>Lancelot</fictional:character>
+            <fictional:character>Archie Leach</fictional:character>
+        </actor>
+        <actor>
+            <name>Eric Idle</name>
+            <fictional:character>Sir Robin</fictional:character>
+            <fictional:character>Gunther</fictional:character>
+            <fictional:character>Commander Clement</fictional:character>
+        </actor>
+    </actors>
+
+One way to search and explore this XML example is to manually add the
+URI to every tag or attribute in the xpath of a *find()* or *findall()*::
+
+    root = from_string(xml_text)
+    for actor in root.findall('{http://people.example.com}actor'):
+        name = actor.find('{http://people.example.com}name')
+        print(name.text)
+        for char in actor.findall('{http://characters.example.com}character'):
+            print(' |-->', char.text)
+
+Another way to search the namespaced XML example is to create a
+dictionary with your own prefixes and use those in the search::
+
+    ns = {'real_person': 'http://people.example.com',
+          'role': 'http://characters.example.com'}
+
+    for actor in root.findall('real_person:actor', ns):
+        name = actor.find('real_person:name', ns)
+        print(name.text)
+        for char in actor.findall('role:character', ns):
+            print(' |-->', char.text)
+
+These two approaches both output::
+
+    John Cleese
+     |--> Lancelot
+     |--> Archie Leach
+    Eric Idle
+     |--> Sir Robin
+     |--> Gunther
+     |--> Commander Clement
+
+
 Additional resources
 ^^^^^^^^^^^^^^^^^^^^
 
@@ -366,6 +431,9 @@ Supported XPath syntax
 | ``[tag]``             | Selects all elements that have a child named         |
 |                       | ``tag``. Only immediate children are supported.      |
 +-----------------------+------------------------------------------------------+
+| ``[tag=text]``        | Selects all elements that have a child named         |
+|                       | ``tag`` that includes the given ``text``.            |
++-----------------------+------------------------------------------------------+
 | ``[position]``        | Selects all elements that are located at the given   |
 |                       | position. The position can be either an integer      |
 |                       | (1 is the first position), the expression ``last()`` |
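
For readers who want to try the documented behaviour, the following is a minimal, self-contained sketch of the two namespace search styles and the ``[tag=text]`` predicate added in this diff. It is an illustration under stated assumptions, not part of the commit. The names ``XML_TEXT``, ``cast_with_uris``, ``cast_with_prefixes`` and ``actors_named`` are hypothetical helpers introduced here. The snippet in the diff calls ``from_string``, which is not a function in ``xml.etree.ElementTree``, so the sketch uses ``ET.fromstring`` instead. It also assumes that the predicate value has to be quoted in the path expression and that it is compared against the child's complete text, which is how the standard-library implementation appears to behave.

```python
import xml.etree.ElementTree as ET

# The XML sample from the diff: a default namespace plus a "fictional" prefix.
XML_TEXT = """<?xml version="1.0"?>
<actors xmlns:fictional="http://characters.example.com"
        xmlns="http://people.example.com">
    <actor>
        <name>John Cleese</name>
        <fictional:character>Lancelot</fictional:character>
        <fictional:character>Archie Leach</fictional:character>
    </actor>
    <actor>
        <name>Eric Idle</name>
        <fictional:character>Sir Robin</fictional:character>
        <fictional:character>Gunther</fictional:character>
        <fictional:character>Commander Clement</fictional:character>
    </actor>
</actors>
"""

PEOPLE = "http://people.example.com"
CHARACTERS = "http://characters.example.com"

# Prefixes chosen by the caller; they need not match those in the XML.
NS = {"real_person": PEOPLE, "role": CHARACTERS}


def cast_with_uris(root):
    """Approach 1: spell out the full {uri}tag form in every path."""
    lines = []
    for actor in root.findall("{%s}actor" % PEOPLE):
        name = actor.find("{%s}name" % PEOPLE)
        lines.append(name.text)
        for char in actor.findall("{%s}character" % CHARACTERS):
            lines.append(" |--> " + char.text)
    return lines


def cast_with_prefixes(root):
    """Approach 2: pass a prefix-to-URI dictionary to find()/findall()."""
    lines = []
    for actor in root.findall("real_person:actor", NS):
        name = actor.find("real_person:name", NS)
        lines.append(name.text)
        for char in actor.findall("role:character", NS):
            lines.append(" |--> " + char.text)
    return lines


def actors_named(root, wanted):
    """Use the [tag='text'] predicate to select actors by their name child.

    Assumes `wanted` contains no single quote, since it is embedded in the path.
    """
    path = "real_person:actor[real_person:name='%s']" % wanted
    return root.findall(path, NS)


root = ET.fromstring(XML_TEXT)

if __name__ == "__main__":
    print("\n".join(cast_with_uris(root)))
    print(cast_with_uris(root) == cast_with_prefixes(root))
    print([a.find("real_person:name", NS).text
           for a in actors_named(root, "Eric Idle")])
```

The prefix dictionary keeps the path expressions short and lets the caller pick readable prefixes, while the ``{uri}tag`` form needs no extra argument; both resolve to the same expanded tag names, so they return the same elements.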