author | Stefan Behnel <stefan_ml@behnel.de> | 2019-05-01 20:34:13 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2019-05-01 20:34:13 (GMT) |
commit | e1d5dd645d5f59867cb0ad63179110f310cbca89 (patch) | |
tree | 08f42f6dbd41508652886b10c78dfb190d395933 /Doc | |
parent | ee88af3f4f7493df4ecf52faf429e63351bbcd5c (diff) | |
bpo-13611: C14N 2.0 implementation for ElementTree (GH-12966)
* Implement C14N 2.0 as a new canonicalize() function in ElementTree.
Missing features:
- prefix renaming in XPath expressions (tag and attribute text is supported)
- preservation of original prefixes given redundant namespace declarations
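As a rough illustration of what the new function is meant to do, the sketch below canonicalizes two differently serialised copies of the same document and compares the results. The sample documents and variable names are made up for this example, and it assumes Python 3.8 or later, where canonicalize() is available.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical serialisations of the same document: attribute order,
# quoting style, and the empty-element form all differ.
doc_a = '<root b="2" a="1"><child/></root>'
doc_b = "<root a='1'   b='2'><child></child></root>"

canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

# After canonicalization the two texts can be compared directly.
print(canon_a == canon_b)
print(canon_a)

# strip_text=True additionally drops whitespace around text content;
# comments are left out unless with_comments=True is passed.
messy = "<root>\n  <!-- note -->\n  <item>  text  </item>\n</root>"
stripped = canonicalize(messy, strip_text=True)
print(stripped)
```

Comparing canonical forms rather than raw input is the usual reason to reach for C14N, for example before hashing or signing a document.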
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/xml.etree.elementtree.rst | 60 |
-rw-r--r-- | Doc/whatsnew/3.8.rst | 4 |
2 files changed, 64 insertions, 0 deletions
diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst
index 66090af..ef74d0c 100644
--- a/Doc/library/xml.etree.elementtree.rst
+++ b/Doc/library/xml.etree.elementtree.rst
@@ -465,6 +465,53 @@ Reference
 Functions
 ^^^^^^^^^
 
+.. function:: canonicalize(xml_data=None, *, out=None, from_file=None, **options)
+
+   `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ transformation function.
+
+   Canonicalization is a way to normalise XML output in a way that allows
+   byte-by-byte comparisons and digital signatures.  It reduces the freedom
+   that XML serializers have and instead generates a more constrained XML
+   representation.  The main restrictions regard the placement of namespace
+   declarations, the ordering of attributes, and ignorable whitespace.
+
+   This function takes an XML data string (*xml_data*) or a file path or
+   file-like object (*from_file*) as input, converts it to the canonical
+   form, and writes it out using the *out* file(-like) object, if provided,
+   or returns it as a text string if not.  The output file receives text,
+   not bytes.  It should therefore be opened in text mode with ``utf-8``
+   encoding.
+
+   Typical uses::
+
+      xml_data = "<root>...</root>"
+      print(canonicalize(xml_data))
+
+      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
+          canonicalize(xml_data, out=out_file)
+
+      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
+          canonicalize(from_file="inputfile.xml", out=out_file)
+
+   The configuration *options* are as follows:
+
+   - *with_comments*: set to true to include comments (default: false)
+   - *strip_text*: set to true to strip whitespace before and after text content
+     (default: false)
+   - *rewrite_prefixes*: set to true to replace namespace prefixes by "n{number}"
+     (default: false)
+   - *qname_aware_tags*: a set of qname aware tag names in which prefixes
+     should be replaced in text content (default: empty)
+   - *qname_aware_attrs*: a set of qname aware attribute names in which prefixes
+     should be replaced in text content (default: empty)
+   - *exclude_attrs*: a set of attribute names that should not be serialised
+   - *exclude_tags*: a set of tag names that should not be serialised
+
+   In the option list above, "a set" refers to any collection or iterable of
+   strings, no ordering is expected.
+
+   .. versionadded:: 3.8
+
 
 .. function:: Comment(text=None)
 
@@ -1114,6 +1161,19 @@ TreeBuilder Objects
       .. versionadded:: 3.8
 
 
+.. class:: C14NWriterTarget(write, *, \
+                            with_comments=False, strip_text=False, rewrite_prefixes=False, \
+                            qname_aware_tags=None, qname_aware_attrs=None, \
+                            exclude_attrs=None, exclude_tags=None)
+
+   A `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ writer.  Arguments are the
+   same as for the :func:`canonicalize` function.  This class does not build a
+   tree but translates the callback events directly into a serialised form
+   using the *write* function.
+
+   .. versionadded:: 3.8
+
+
 .. _elementtree-xmlparser-objects:
 
 XMLParser Objects
diff --git a/Doc/whatsnew/3.8.rst b/Doc/whatsnew/3.8.rst
index bbc55dd..37570bc 100644
--- a/Doc/whatsnew/3.8.rst
+++ b/Doc/whatsnew/3.8.rst
@@ -525,6 +525,10 @@ xml
   external entities by default.
   (Contributed by Christian Heimes in :issue:`17239`.)
 
+* The :mod:`xml.etree.ElementTree` module provides a new function
+  :func:`~xml.etree.ElementTree.canonicalize()` that implements C14N 2.0.
+  (Contributed by Stefan Behnel in :issue:`13611`.)
+
 
 Optimizations
 =============
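The C14NWriterTarget class documented in the diff above can also be used directly as a parser target. The following is a minimal sketch, assuming Python 3.8 or later; the sample document and the use of an io.StringIO buffer as the write destination are choices made for this example, not something the commit prescribes.

```python
import io
from xml.etree.ElementTree import C14NWriterTarget, XMLParser

# Collect the canonical output in an in-memory text buffer (illustrative choice).
buffer = io.StringIO()

# The target receives parser events and writes C14N 2.0 text via buffer.write.
target = C14NWriterTarget(buffer.write, with_comments=True)
parser = XMLParser(target=target)

# Hypothetical input document.
parser.feed('<root b="2" a="1"><!-- kept --><child/></root>')
parser.close()

result = buffer.getvalue()
print(result)
```

Because no tree is built, this form may suit inputs that are fed to the parser in chunks; any callable that accepts a text string could stand in for buffer.write.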