author | Andrew Kuchling <amk@amk.ca> | 2014-02-15 20:33:44 (GMT) |
---|---|---|
committer | Andrew Kuchling <amk@amk.ca> | 2014-02-15 20:33:44 (GMT) |
commit | 4da9ab0357c9c49a81df4a752d15ffe090381df8 (patch) | |
tree | 6377958a6e69c102759f273762fa52d5e8fa2734 /Doc | |
parent | 29352c436cc2489c76736cb1cff00fbdcf7bb0cd (diff) | |
#20237: make a revision pass over the XML vulnerabilities section
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/xml.rst | 73 |
1 files changed, 35 insertions, 38 deletions
diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst
index f793bae..0188219 100644
--- a/Doc/library/xml.rst
+++ b/Doc/library/xml.rst
@@ -14,9 +14,9 @@ Python's interfaces for processing XML are grouped in the ``xml`` package.
 .. warning::
 
    The XML modules are not secure against erroneous or maliciously
-   constructed data. If you need to parse untrusted or unauthenticated data see
-   :ref:`xml-vulnerabilities`.
-
+   constructed data. If you need to parse untrusted or
+   unauthenticated data see the :ref:`xml-vulnerabilities` and
+   :ref:`defused-packages` sections.
 
 It is important to note that modules in the :mod:`xml` package require that
 there be at least one SAX-compliant XML parser available. The Expat parser is
@@ -46,16 +46,15 @@ The XML handling submodules are:
 .. _xml-vulnerabilities:
 
 XML vulnerabilities
-===================
+-------------------
 
 The XML processing modules are not secure against maliciously constructed data.
-An attacker can abuse vulnerabilities for e.g. denial of service attacks, to
-access local files, to generate network connections to other machines, or
-to or circumvent firewalls. The attacks on XML abuse unfamiliar features
-like inline `DTD`_ (document type definition) with entities.
+An attacker can abuse XML features to carry out denial of service attacks,
+access local files, generate network connections to other machines, or
+circumvent firewalls.
 
-The following table gives an overview of the known attacks and if the various
-modules are vulnerable to them.
+The following table gives an overview of the known attacks and whether
+the various modules are vulnerable to them.
 
 ========================= ======== ========= ========= ======== =========
 kind sax etree minidom pulldom xmlrpc
@@ -68,7 +67,7 @@ decompression bomb No No No No **Yes**
 ========================= ======== ========= ========= ======== =========
 
 1. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a
-   ParserError when an entity occurs.
+   :exc:`ParserError` when an entity occurs.
 2. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns
    the unexpanded entity verbatim.
 3. :mod:`xmlrpclib` doesn't expand external entities and omits them.
@@ -77,23 +76,21 @@ decompression bomb No No No No **Yes**
 billion laughs / exponential entity expansion
   The `Billion Laughs`_ attack -- also known as exponential entity expansion --
   uses multiple levels of nested entities. Each entity refers to another entity
-  several times, the final entity definition contains a small string. Eventually
-  the small string is expanded to several gigabytes. The exponential expansion
-  consumes lots of CPU time, too.
+  several times, and the final entity definition contains a small string.
+  The exponential expansion results in several gigabytes of text and
+  consumes lots of memory and CPU time.
 
 quadratic blowup entity expansion
   A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
   entity expansion, too. Instead of nested entities it repeats one large entity
   with a couple of thousand chars over and over again. The attack isn't as
-  efficient as the exponential case but it avoids triggering countermeasures of
-  parsers against heavily nested entities.
+  efficient as the exponential case but it avoids triggering parser countermeasures
+  that forbid deeply-nested entities.
 
 external entity expansion
   Entity declarations can contain more than just text for replacement. They can
-  also point to external resources by public identifiers or system identifiers.
-  System identifiers are standard URIs or can refer to local files. The XML
-  parser retrieves the resource with e.g. HTTP or FTP requests and embeds the
-  content into the XML document.
+  also point to external resources or local files. The XML
+  parser accesses the resource and embeds the content into the XML document.
 
 DTD retrieval
   Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type
@@ -101,31 +98,32 @@ DTD retrieval
   implications as the external entity expansion issue.
 
 decompression bomb
-  The issue of decompression bombs (aka `ZIP bomb`_) apply to all XML libraries
-  that can parse compressed XML stream like gzipped HTTP streams or LZMA-ed
+  Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries
+  that can parse compressed XML streams such as gzipped HTTP streams or
+  LZMA-compressed
   files. For an attacker it can reduce the amount of transmitted data by three
   magnitudes or more.
 
-The documentation of `defusedxml`_ on PyPI has further information about
+The documentation for `defusedxml`_ on PyPI has further information about
 all known attack vectors with examples and references.
 
-defused packages
-----------------
+.. _defused-packages:
 
-`defusedxml`_ is a pure Python package with modified subclasses of all stdlib
-XML parsers that prevent any potentially malicious operation. The courses of
-action are recommended for any server code that parses untrusted XML data. The
-package also ships with example exploits and an extended documentation on more
-XML exploits like xpath injection.
+The :mod:`defusedxml` and :mod:`defusedexpat` Packages
+------------------------------------------------------
 
-`defusedexpat`_ provides a modified libexpat and patched replacment
-:mod:`pyexpat` extension module with countermeasures against entity expansion
-DoS attacks. Defusedexpat still allows a sane and configurable amount of entity
-expansions. The modifications will be merged into future releases of Python.
+`defusedxml`_ is a pure Python package with modified subclasses of all stdlib
+XML parsers that prevent any potentially malicious operation. Use of this
+package is recommended for any server code that parses untrusted XML data. The
+package also ships with example exploits and extended documentation on more
+XML exploits such as XPath injection.
 
-The workarounds and modifications are not included in patch releases as they
-break backward compatibility. After all inline DTD and entity expansion are
-well-definied XML features.
+`defusedexpat`_ provides a modified libexpat and a patched
+:mod:`pyexpat` module that have countermeasures against entity expansion
+DoS attacks. The :mod:`defusedexpat` module still allows a sane and configurable amount of entity
+expansions. The modifications may be included in some future release of Python,
+but will not be included in any bugfix releases of
+Python because they break backward compatibility.
 
 
 .. _defusedxml: https://pypi.python.org/pypi/defusedxml/
@@ -133,4 +131,3 @@ well-definied XML features.
 .. _Billion Laughs: http://en.wikipedia.org/wiki/Billion_laughs
 .. _ZIP bomb: http://en.wikipedia.org/wiki/Zip_bomb
 .. _DTD: http://en.wikipedia.org/wiki/Document_Type_Definition
-