Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/html.entities.rst | 21 |
1 files changed, 19 insertions, 2 deletions
diff --git a/Doc/library/html.entities.rst b/Doc/library/html.entities.rst
index b8b4aa8..f0dd7aa 100644
--- a/Doc/library/html.entities.rst
+++ b/Doc/library/html.entities.rst
@@ -9,13 +9,25 @@
 
 --------------
 
-This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``,
-and ``entitydefs``. ``entitydefs`` is used to provide the :attr:`entitydefs`
+This module defines four dictionaries, :data:`html5`,
+:data:`name2codepoint`, :data:`codepoint2name`, and :data:`entitydefs`.
+:data:`entitydefs` is used to provide the :attr:`entitydefs`
 attribute of the :class:`html.parser.HTMLParser` class. The definition provided
 here contains all the entities defined by XHTML 1.0 that can be handled using
 simple textual substitution in the Latin-1 character set (ISO-8859-1).
 
 
+.. data:: html5
+
+   A dictionary that maps HTML5 named character references [#]_ to the
+   equivalent Unicode character(s), e.g. ``html5['gt;'] == '>'``.
+   Note that the trailing semicolon is included in the name (e.g. ``'gt;'``),
+   however some of the names are accepted by the standard even without the
+   semicolon: in this case the name is present with and without the ``';'``.
+
+   .. versionadded:: 3.3
+
+
 .. data:: entitydefs
 
    A dictionary mapping XHTML 1.0 entity definitions to their replacement text in
@@ -30,3 +42,8 @@ simple textual substitution in the Latin-1 character set (ISO-8859-1).
 .. data:: codepoint2name
 
    A dictionary that maps Unicode codepoints to HTML entity names.
+
+
+.. rubric:: Footnotes
+
+.. [#] See http://www.w3.org/TR/html5/named-character-references.html
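
The key convention documented for ``html5`` (the trailing semicolon is part of the name, and some names also exist without it) is easy to get wrong, so a minimal sketch of a lookup may help. It assumes Python 3.3 or later; ``resolve`` is a hypothetical helper introduced here for illustration and is not part of the patch or of the module.

```python
from html.entities import html5, name2codepoint, codepoint2name


def resolve(name):
    """Hypothetical helper: return the character(s) for an HTML5
    named character reference, or None if the name is unknown."""
    return html5.get(name)


# Keys normally carry the trailing semicolon.
gt_with = resolve('gt;')

# Some names are also accepted without the semicolon, so both
# spellings are present as keys.
gt_without = resolve('gt')

# Other names exist only in the semicolon form.
rarr_with = resolve('rarr;')
rarr_without = resolve('rarr')

# The older dictionaries use bare names and integer code points.
gt_codepoint = name2codepoint['gt']
gt_name = codepoint2name[gt_codepoint]
```

Using ``dict.get`` rather than indexing keeps the sketch from raising ``KeyError`` on names that are only defined with the semicolon.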