summaryrefslogtreecommitdiffstats
path: root/Doc/library/html.entities.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/html.entities.rst')
-rw-r--r--Doc/library/html.entities.rst23
1 files changed, 18 insertions, 5 deletions
diff --git a/Doc/library/html.entities.rst b/Doc/library/html.entities.rst
index b8b4aa8..65ce817 100644
--- a/Doc/library/html.entities.rst
+++ b/Doc/library/html.entities.rst
@@ -9,11 +9,19 @@
--------------
-This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
-and ``entitydefs``. ``entitydefs`` is used to provide the :attr:`entitydefs`
-attribute of the :class:`html.parser.HTMLParser` class. The definition provided
-here contains all the entities defined by XHTML 1.0 that can be handled using
-simple textual substitution in the Latin-1 character set (ISO-8859-1).
+This module defines four dictionaries, :data:`html5`,
+:data:`name2codepoint`, :data:`codepoint2name`, and :data:`entitydefs`.
+
+
+.. data:: html5
+
+ A dictionary that maps HTML5 named character references [#]_ to the
+ equivalent Unicode character(s), e.g. ``html5['gt;'] == '>'``.
+ Note that the trailing semicolon is included in the name (e.g. ``'gt;'``),
+ however some of the names are accepted by the standard even without the
+ semicolon: in this case the name is present with and without the ``';'``.
+
+ .. versionadded:: 3.3
.. data:: entitydefs
@@ -30,3 +38,8 @@ simple textual substitution in the Latin-1 character set (ISO-8859-1).
.. data:: codepoint2name
A dictionary that maps Unicode codepoints to HTML entity names.
+
+
+.. rubric:: Footnotes
+
+.. [#] See http://www.w3.org/TR/html5/named-character-references.html