author: Antoine Pitrou <solipsis@pitrou.net> 2012-08-24 17:39:47 (GMT)
committer: Antoine Pitrou <solipsis@pitrou.net> 2012-08-24 17:39:47 (GMT)
commit: cbf9d5f6cd9c9f2fab1dee39561817476bcce944
tree: 2e941c7bbc0caffadf98c1312d52a476846ff90c
parent: 07c6e7168919c275e47fa35c741413270d3d80fd
parent: 331624b67d1b86187b13b4b0030dbca60003fa1e
Issue #14674: Add a discussion of the json module's standard compliance.
Patch by Chris Rebert.
-rw-r--r-- Doc/library/json.rst 117
-rw-r--r-- Misc/NEWS 3
2 files changed, 114 insertions, 6 deletions
diff --git a/Doc/library/json.rst b/Doc/library/json.rst
index 8686561..5f15926 100644
--- a/Doc/library/json.rst
+++ b/Doc/library/json.rst
@@ -6,8 +6,10 @@
.. moduleauthor:: Bob Ippolito <bob@redivi.com>
.. sectionauthor:: Bob Ippolito <bob@redivi.com>
-`JSON (JavaScript Object Notation) <http://json.org>`_ is a subset of JavaScript
-syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
+`JSON (JavaScript Object Notation) <http://json.org>`_, specified by
+:rfc:`4627`, is a lightweight data interchange format based on a subset of
+`JavaScript <http://en.wikipedia.org/wiki/JavaScript>`_ syntax (`ECMA-262 3rd
+edition <http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf>`_).
:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.
@@ -105,8 +107,10 @@ Using json.tool from the shell to validate and pretty-print::
.. note::
- The JSON produced by this module's default settings is a subset of
- YAML, so it may be used as a serializer for that as well.
+ JSON is a subset of `YAML <http://yaml.org/>`_ 1.2. The JSON produced by
+ this module's default settings (in particular, the default *separators*
+ value) is also a subset of YAML 1.0 and 1.1. This module can thus also be
+ used as a YAML serializer.
Basic Usage
@@ -185,7 +189,8 @@ Basic Usage
*object_hook* is an optional function that will be called with the result of
any object literal decoded (a :class:`dict`). The return value of
*object_hook* will be used instead of the :class:`dict`. This feature can be used
- to implement custom decoders (e.g. JSON-RPC class hinting).
+ to implement custom decoders (e.g. `JSON-RPC <http://www.jsonrpc.org>`_
+ class hinting).
*object_pairs_hook* is an optional function that will be called with the
result of any object literal decoded with an ordered list of pairs. The
@@ -230,7 +235,7 @@ Basic Usage
*encoding* which is ignored and deprecated.
-Encoders and decoders
+Encoders and Decoders
---------------------
.. class:: JSONDecoder(object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)
@@ -415,3 +420,103 @@ Encoders and decoders
for chunk in json.JSONEncoder().iterencode(bigobject):
mysocket.write(chunk)
+
+
+Standard Compliance
+-------------------
+
+The JSON format is specified by :rfc:`4627`. This section details this
+module's level of compliance with the RFC. For simplicity,
+:class:`JSONEncoder` and :class:`JSONDecoder` subclasses, and parameters other
+than those explicitly mentioned, are not considered.
+
+This module does not comply with the RFC in a strict fashion, implementing some
+extensions that are valid JavaScript but not valid JSON. In particular:
+
+- Top-level non-object, non-array values are accepted and output;
+- Infinite and NaN number values are accepted and output;
+- Repeated names within an object are accepted, and only the value of the last
+ name-value pair is used.
+
+Since the RFC permits RFC-compliant parsers to accept input texts that are not
+RFC-compliant, this module's deserializer is technically RFC-compliant under
+default settings.
+
+Character Encodings
+^^^^^^^^^^^^^^^^^^^
+
+The RFC recommends that JSON be represented using UTF-8, UTF-16, or UTF-32,
+with UTF-8 being the default.
+
+As permitted, though not required, by the RFC, this module's serializer sets
+*ensure_ascii=True* by default, thus escaping the output so that the resulting
+strings only contain ASCII characters.
+
+Other than the *ensure_ascii* parameter, this module is defined strictly in
+terms of conversion between Python objects and
+:class:`Unicode strings <str>`, and thus does not otherwise address the issue
+of character encodings.
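
As an illustration of the *ensure_ascii* behavior described above (a minimal sketch, not part of the committed patch), the following compares the default escaped output with the unescaped form. The sample string and the variable names are chosen here for illustration only.

```python
import json

# Sample text containing two non-ASCII characters (e-acute and a snowman).
text = "caf\u00e9 \u2603"

# Default: ensure_ascii=True, so non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(text)

# With ensure_ascii=False the characters are emitted unchanged.
literal = json.dumps(text, ensure_ascii=False)

# Both forms decode back to the same Python string.
roundtrip_ok = json.loads(escaped) == json.loads(literal) == text

# The module works on str objects; picking a byte encoding is up to the caller.
payload = literal.encode("utf-8")
```

Because the result of `json.dumps` is a `str`, encoding to bytes (here UTF-8, the RFC's default) remains a separate, explicit step.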
+
+
+Top-level Non-Object, Non-Array Values
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC specifies that the top-level value of a JSON text must be either a
+JSON object or array (Python :class:`dict` or :class:`list`). This module's
+deserializer also accepts input texts consisting solely of a
+JSON null, boolean, number, or string value::
+
+ >>> just_a_json_string = '"spam and eggs"' # Not by itself a valid JSON text
+ >>> json.loads(just_a_json_string)
+ 'spam and eggs'
+
+This module itself does not include a way to request that such input texts be
+regarded as illegal. Likewise, this module's serializer also accepts single
+Python :data:`None`, :class:`bool`, numeric, and :class:`str`
+values as input and will generate output texts consisting solely of a top-level
+JSON null, boolean, number, or string value without raising an exception::
+
+ >>> neither_a_list_nor_a_dict = "spam and eggs"
+ >>> json.dumps(neither_a_list_nor_a_dict) # The result is not a valid JSON text
+ '"spam and eggs"'
+
+This module's serializer provides no option to enforce the RFC's top-level
+object-or-array constraint.
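
Since the module offers no switch for this, a caller who needs the RFC's top-level constraint could check the value type around the standard calls. The sketch below is not part of the committed patch; `loads_strict_toplevel` and `dumps_strict_toplevel` are hypothetical helper names, and the check assumes default decoding (no custom hooks that change the returned types).

```python
import json


def loads_strict_toplevel(text):
    """Hypothetical helper: decode text, rejecting scalar top-level values."""
    value = json.loads(text)
    if not isinstance(value, (dict, list)):
        raise ValueError("top-level JSON value must be an object or array")
    return value


def dumps_strict_toplevel(obj):
    """Hypothetical helper: encode obj only if it is a dict, list, or tuple."""
    # Tuples are included because json.dumps serializes them as JSON arrays.
    if not isinstance(obj, (dict, list, tuple)):
        raise ValueError("top-level value must serialize to an object or array")
    return json.dumps(obj)
```

The decoding helper parses first and checks afterwards, which keeps it short at the cost of fully decoding input that is then rejected.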
+
+
+Infinite and NaN Number Values
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC does not permit the representation of infinite or NaN number values.
+Despite that, by default, this module accepts and outputs ``Infinity``,
+``-Infinity``, and ``NaN`` as if they were valid JSON number literal values::
+
+ >>> # Neither of these calls raises an exception, but the results are not valid JSON
+ >>> json.dumps(float('-inf'))
+ '-Infinity'
+ >>> json.dumps(float('nan'))
+ 'NaN'
+ >>> # Same when deserializing
+ >>> json.loads('-Infinity')
+ -inf
+ >>> json.loads('NaN')
+ nan
+
+This behavior can be altered with the *allow_nan* parameter in the serializer
+and the *parse_constant* parameter in the deserializer; for example,
+``allow_nan=False`` makes the serializer raise :exc:`ValueError` instead.
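
A minimal sketch (not part of the committed patch) of how the two parameters named above can be used to refuse these values in both directions. `reject_constant`, `strict_dumps`, and `strict_loads` are hypothetical names introduced for this example.

```python
import json


def reject_constant(name):
    """Hypothetical parse_constant callback.

    The decoder calls it with 'NaN', 'Infinity', or '-Infinity'.
    """
    raise ValueError("non-standard JSON constant: " + name)


def strict_dumps(obj):
    # allow_nan=False makes the encoder raise ValueError for nan and +/-inf.
    return json.dumps(obj, allow_nan=False)


def strict_loads(text):
    # The callback's exception propagates out of json.loads.
    return json.loads(text, parse_constant=reject_constant)
```

Ordinary finite numbers are unaffected; only the three non-standard constants trigger an error.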
+
+
+Repeated Names Within an Object
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC specifies that the names within a JSON object should be unique, but
+does not specify how repeated names in JSON objects should be handled. By
+default, this module does not raise an exception; instead, it ignores all but
+the last name-value pair for a given name::
+
+ >>> weird_json = '{"x": 1, "x": 2, "x": 3}'
+ >>> json.loads(weird_json)
+ {'x': 3}
+
+The *object_pairs_hook* parameter can be used to alter this behavior.
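
One way to use *object_pairs_hook* for this, sketched here under the assumption that repeated names should be treated as an error (the patch itself does not prescribe a policy). `reject_duplicates` is a hypothetical function name.

```python
import json


def reject_duplicates(pairs):
    """Hypothetical object_pairs_hook: build a dict, refusing repeated names."""
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("duplicate name in JSON object: " + repr(name))
        result[name] = value
    return result


# Objects without repeated names decode exactly as with the default settings.
decoded = json.loads('{"x": 1, "y": {"z": 2}}', object_pairs_hook=reject_duplicates)
```

The hook is invoked for every JSON object in the input, including nested ones, so duplicates are caught at any depth.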
diff --git a/Misc/NEWS b/Misc/NEWS
index e772a70..6d046ff 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -84,6 +84,9 @@ Tools/Demos
Documentation
-------------
+- Issue #14674: Add a discussion of the json module's standard compliance.
+ Patch by Chris Rebert.
+
- Create a 'Concurrent Execution' section in the docs, and split up the
'Optional Operating System Services' section to use a more user-centric
classification scheme (splitting them across the new CE section, IPC and