author | Christian Heimes <christian@cheimes.de> | 2013-03-26 16:47:23 (GMT) |
---|---|---|
committer | Christian Heimes <christian@cheimes.de> | 2013-03-26 16:47:23 (GMT) |
commit | 768f6a53601a6c4e0b914aaedb977dd2ca97532a (patch) | |
tree | 0a15e62fa957038dd0e6ad2cd704d3378ac336a5 /Doc | |
parent | c40f97f8beaacfb834d3f4f22d581e37dd82c14d (diff) | |
parent | 7380a67267d9ec59b70617ea59ff31819f530942 (diff) | |
Issue 17538: Document XML vulnerabilities
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/pyexpat.rst | 7 | ||||
-rw-r--r-- | Doc/library/xml.dom.minidom.rst | 8 | ||||
-rw-r--r-- | Doc/library/xml.dom.pulldom.rst | 8 | ||||
-rw-r--r-- | Doc/library/xml.etree.elementtree.rst | 7 | ||||
-rw-r--r-- | Doc/library/xml.rst | 104 | ||||
-rw-r--r-- | Doc/library/xml.sax.rst | 8 | ||||
-rw-r--r-- | Doc/library/xmlrpc.client.rst | 7 | ||||
-rw-r--r-- | Doc/library/xmlrpc.server.rst | 7 |
8 files changed, 156 insertions, 0 deletions
diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst
index 861546c..420e407 100644
--- a/Doc/library/pyexpat.rst
+++ b/Doc/library/pyexpat.rst
@@ -14,6 +14,13 @@
    references to these attributes should be marked using the :member: role.
 
 
+.. warning::
+
+   The :mod:`pyexpat` module is not secure against maliciously
+   constructed data. If you need to parse untrusted or unauthenticated data see
+   :ref:`xml-vulnerabilities`.
+
+
 .. index:: single: Expat
 
 The :mod:`xml.parsers.expat` module is a Python interface to the Expat
diff --git a/Doc/library/xml.dom.minidom.rst b/Doc/library/xml.dom.minidom.rst
index a75325f..e90c177 100644
--- a/Doc/library/xml.dom.minidom.rst
+++ b/Doc/library/xml.dom.minidom.rst
@@ -17,6 +17,14 @@ to be simpler than the full DOM and also significantly smaller. Users who are
 not already proficient with the DOM should consider using the
 :mod:`xml.etree.ElementTree` module for their XML processing instead
 
+
+.. warning::
+
+   The :mod:`xml.dom.minidom` module is not secure against
+   maliciously constructed data. If you need to parse untrusted or
+   unauthenticated data see :ref:`xml-vulnerabilities`.
+
+
 DOM applications typically start by parsing some XML into a DOM. With
 :mod:`xml.dom.minidom`, this is done through the parse functions::
 
diff --git a/Doc/library/xml.dom.pulldom.rst b/Doc/library/xml.dom.pulldom.rst
index eb16a09..8aa9cfb 100644
--- a/Doc/library/xml.dom.pulldom.rst
+++ b/Doc/library/xml.dom.pulldom.rst
@@ -17,6 +17,14 @@ processing model together with callbacks, the user of a pull parser is
 responsible for explicitly pulling events from the stream, looping over those
 events until either processing is finished or an error condition occurs.
 
+
+.. warning::
+
+   The :mod:`xml.dom.pulldom` module is not secure against
+   maliciously constructed data. If you need to parse untrusted or
+   unauthenticated data see :ref:`xml-vulnerabilities`.
+
+
 Example::
 
    from xml.dom import pulldom
diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst
index 144e344..e429f04 100644
--- a/Doc/library/xml.etree.elementtree.rst
+++ b/Doc/library/xml.etree.elementtree.rst
@@ -12,6 +12,13 @@ for parsing and creating XML data.
    This module will use a fast implementation whenever available.
    The :mod:`xml.etree.cElementTree` module is deprecated.
 
+
+.. warning::
+
+   The :mod:`xml.etree.ElementTree` module is not secure against
+   maliciously constructed data. If you need to parse untrusted or
+   unauthenticated data see :ref:`xml-vulnerabilities`.
+
 Tutorial
 --------
 
diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index 21b2e23..b86d51a 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -3,8 +3,21 @@
 XML Processing Modules
 ======================
 
+.. module:: xml
+   :synopsis: Package containing XML processing modules
+.. sectionauthor:: Christian Heimes <christian@python.org>
+.. sectionauthor:: Georg Brandl <georg@python.org>
+
+
 Python's interfaces for processing XML are grouped in the ``xml`` package.
 
+.. warning::
+
+   The XML modules are not secure against erroneous or maliciously
+   constructed data. If you need to parse untrusted or unauthenticated data see
+   :ref:`xml-vulnerabilities`.
+
+
 It is important to note that modules in the :mod:`xml` package require that
 there be at least one SAX-compliant XML parser available. The Expat parser is
 included with Python, so the :mod:`xml.parsers.expat` module will always be
@@ -27,3 +40,94 @@ The XML handling submodules are:
 
 * :mod:`xml.sax`: SAX2 base classes and convenience functions
 * :mod:`xml.parsers.expat`: the Expat parser binding
+
+
+.. _xml-vulnerabilities:
+
+XML vulnerabilities
+===================
+
+The XML processing modules are not secure against maliciously constructed data.
+An attacker can abuse XML features to carry out denial of service attacks,
+access local files, generate network connections to other machines, or
+circumvent firewalls. The attacks abuse rarely used features such as an
+inline `DTD`_ (document type definition) with entities.
+
+
+=========================  ========  =========  =========  ========  =========
+kind                       sax       etree      minidom    pulldom   xmlrpc
+=========================  ========  =========  =========  ========  =========
+billion laughs             **True**  **True**   **True**   **True**  **True**
+quadratic blowup           **True**  **True**   **True**   **True**  **True**
+external entity expansion  **True**  False (1)  False (2)  **True**  False (3)
+DTD retrieval              **True**  False      False      **True**  False
+decompression bomb         False     False      False      False     **True**
+=========================  ========  =========  =========  ========  =========
+
+1. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a
+   :exc:`~xml.etree.ElementTree.ParseError` when an entity occurs.
+2. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
+   the unexpanded entity verbatim.
+3. :mod:`xmlrpc.client` doesn't expand external entities and omits them.
+
+
+billion laughs / exponential entity expansion
+   The `Billion Laughs`_ attack -- also known as exponential entity expansion --
+   uses multiple levels of nested entities. Each entity refers to another entity
+   several times, and the final entity definition contains a small string. The
+   exponential expansion results in several gigabytes of text and consumes lots
+   of memory and CPU time.
+
+quadratic blowup entity expansion
+   A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
+   entity expansion, too. Instead of nested entities it repeats one large entity
+   with a couple of thousand characters over and over again. The attack isn't as
+   efficient as the exponential case but it avoids triggering parser
+   countermeasures that forbid deeply nested entities.
+
+external entity expansion
+   Entity declarations can contain more than just text for replacement. They can
+   also point to external resources by public identifiers or system identifiers.
+   System identifiers are standard URIs or can refer to local files. The XML
+   parser retrieves the resource with e.g. HTTP or FTP requests and embeds the
+   content into the XML document.
+
+DTD retrieval
+   Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type
+   definitions from remote or local locations. The feature has implications
+   similar to those of the external entity expansion issue.
+
+decompression bomb
+   Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries
+   that can parse compressed XML streams such as gzipped HTTP streams or
+   LZMA-compressed files. For an attacker it can reduce the amount of transmitted
+   data by three orders of magnitude or more.
+
+The documentation of `defusedxml`_ on PyPI has further information about
+all known attack vectors with examples and references.
+
+defused packages
+----------------
+
+`defusedxml`_ is a pure Python package with modified subclasses of all stdlib
+XML parsers that prevent any potentially malicious operation. Use of this
+package is recommended for any server code that parses untrusted XML data. The
+package also ships with example exploits and extended documentation on more
+XML exploits such as XPath injection.
+
+`defusedexpat`_ provides a modified libexpat and a patched replacement
+:mod:`pyexpat` extension module with countermeasures against entity expansion
+DoS attacks. Defusedexpat still allows a sane and configurable number of entity
+expansions. The modifications will be merged into future releases of Python.
+
+The workarounds and modifications are not included in patch releases as they
+break backward compatibility. After all, inline DTD and entity expansion are
+well-defined XML features.
+
+
+.. _defusedxml: https://pypi.python.org/pypi/defusedxml/
+.. _defusedexpat: https://pypi.python.org/pypi/defusedexpat/
+.. _Billion Laughs: http://en.wikipedia.org/wiki/Billion_laughs
+.. _ZIP bomb: http://en.wikipedia.org/wiki/Zip_bomb
+.. _DTD: http://en.wikipedia.org/wiki/Document_Type_Definition
+
diff --git a/Doc/library/xml.sax.rst b/Doc/library/xml.sax.rst
index 1bf55b4..d5c56b6 100644
--- a/Doc/library/xml.sax.rst
+++ b/Doc/library/xml.sax.rst
@@ -13,6 +13,14 @@ Simple API for XML (SAX) interface for Python. The package itself provides the
 SAX exceptions and the convenience functions which will be most used by users of
 the SAX API.
 
+
+.. warning::
+
+   The :mod:`xml.sax` module is not secure against maliciously
+   constructed data. If you need to parse untrusted or unauthenticated data see
+   :ref:`xml-vulnerabilities`.
+
+
 The convenience functions are:
 
 
diff --git a/Doc/library/xmlrpc.client.rst b/Doc/library/xmlrpc.client.rst
index 1871c99..3a53655 100644
--- a/Doc/library/xmlrpc.client.rst
+++ b/Doc/library/xmlrpc.client.rst
@@ -21,6 +21,13 @@ supports writing XML-RPC client code; it handles all the details of translating
 between conformable Python objects and XML on the wire.
 
 
+.. warning::
+
+   The :mod:`xmlrpc.client` module is not secure against maliciously
+   constructed data. If you need to parse untrusted or unauthenticated data see
+   :ref:`xml-vulnerabilities`.
+
+
 .. class:: ServerProxy(uri, transport=None, encoding=None, verbose=False, \
                        allow_none=False, use_datetime=False, \
                        use_builtin_types=False)
diff --git a/Doc/library/xmlrpc.server.rst b/Doc/library/xmlrpc.server.rst
index 6493fd4..18fee2f 100644
--- a/Doc/library/xmlrpc.server.rst
+++ b/Doc/library/xmlrpc.server.rst
@@ -16,6 +16,13 @@ servers written in Python. Servers can either be free standing, using
 :class:`CGIXMLRPCRequestHandler`.
 
 
+.. warning::
+
+   The :mod:`xmlrpc.server` module is not secure against maliciously
+   constructed data. If you need to parse untrusted or unauthenticated data see
+   :ref:`xml-vulnerabilities`.
+
+
 .. class:: SimpleXMLRPCServer(addr, requestHandler=SimpleXMLRPCRequestHandler,\
                               logRequests=True, allow_none=False, encoding=None,\
                               bind_and_activate=True, use_builtin_types=False)
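
Note (not part of the commit): the behaviour documented in the new "XML vulnerabilities" section of Doc/library/xml.rst can be observed with a few lines of standard-library Python. The following is a minimal sketch under stated assumptions. The helper names (`nested_entity_doc`, `expanded_length`, `external_entity_is_rejected`, `DTDForbidden`, `reject_dtd`) are hypothetical and chosen only for illustration. The entity nesting is kept deliberately tiny. The DTD check shows only the general idea of refusing documents that carry a DTD; it is not how defusedxml or defusedexpat are implemented.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


def nested_entity_doc(levels, fanout, payload="ha"):
    """Build a small document whose entities nest `levels` deep.

    Each entity e<i> refers to e<i-1> `fanout` times, so the expanded text
    is len(payload) * fanout ** levels characters long.
    """
    decls = ['<!ENTITY e0 "%s">' % payload]
    for i in range(1, levels + 1):
        refs = ("&e%d;" % (i - 1)) * fanout
        decls.append('<!ENTITY e%d "%s">' % (i, refs))
    return "<!DOCTYPE d [%s]><d>&e%d;</d>" % ("".join(decls), levels)


def expanded_length(levels, fanout, payload="ha"):
    """Parse the nested-entity document and return the expanded text length."""
    root = ET.fromstring(nested_entity_doc(levels, fanout, payload))
    return len(root.text)


def external_entity_is_rejected():
    """Return True if ElementTree raises ParseError for an external entity."""
    doc = '<!DOCTYPE d [<!ENTITY ext SYSTEM "file:///nonexistent">]><d>&ext;</d>'
    try:
        ET.fromstring(doc)
    except ET.ParseError:
        return True
    return False


class DTDForbidden(ValueError):
    """Hypothetical exception: raised when a document contains a DTD."""


def reject_dtd(data):
    """Parse `data` with Expat and refuse any document that declares a DTD."""
    parser = expat.ParserCreate()

    def forbid(name, sysid, pubid, has_internal_subset):
        # Called by Expat as soon as a <!DOCTYPE ...> declaration starts.
        raise DTDForbidden("DTD not allowed (doctype %r)" % name)

    parser.StartDoctypeDeclHandler = forbid
    parser.Parse(data, True)


if __name__ == "__main__":
    # Three levels with a fan-out of four: 2 * 4**3 characters of text
    # from a document that is much shorter than that.
    print(len(nested_entity_doc(3, 4)), "->", expanded_length(3, 4))
    print("external entity rejected:", external_entity_is_rejected())
    try:
        reject_dtd(nested_entity_doc(3, 4).encode("utf-8"))
    except DTDForbidden as exc:
        print("rejected:", exc)
```

The growth factor is `fanout ** levels`, which is why a handful of extra levels is enough to reach the gigabyte range described in the new section. Newer Expat releases may enforce their own expansion limits, so results for larger parameters can differ between installations.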