author | Antoine Pitrou <solipsis@pitrou.net> | 2012-08-24 17:37:23 (GMT) |
---|---|---|
committer | Antoine Pitrou <solipsis@pitrou.net> | 2012-08-24 17:37:23 (GMT) |
commit | 331624b67d1b86187b13b4b0030dbca60003fa1e (patch) | |
tree | cb81b59b3229c5a8daa7611afea99ecc0ceb7d28 /Doc | |
parent | a61b09f406fe5bada60af5d2e23a75b9de87ebb5 (diff) | |
Issue #14674: Add a discussion of the json module's standard compliance.
Patch by Chris Rebert.
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/json.rst | 117 |
1 files changed, 111 insertions, 6 deletions
diff --git a/Doc/library/json.rst b/Doc/library/json.rst
index 8686561..5f15926 100644
--- a/Doc/library/json.rst
+++ b/Doc/library/json.rst
@@ -6,8 +6,10 @@
 .. moduleauthor:: Bob Ippolito <bob@redivi.com>
 .. sectionauthor:: Bob Ippolito <bob@redivi.com>
 
-`JSON (JavaScript Object Notation) <http://json.org>`_ is a subset of JavaScript
-syntax (ECMA-262 3rd edition) used as a lightweight data interchange format.
+`JSON (JavaScript Object Notation) <http://json.org>`_, specified by
+:rfc:`4627`, is a lightweight data interchange format based on a subset of
+`JavaScript <http://en.wikipedia.org/wiki/JavaScript>`_ syntax (`ECMA-262 3rd
+edition <http://www.ecma-international.org/publications/files/ECMA-ST-ARCH/ECMA-262,%203rd%20edition,%20December%201999.pdf>`_).
 
 :mod:`json` exposes an API familiar to users of the standard library
 :mod:`marshal` and :mod:`pickle` modules.
@@ -105,8 +107,10 @@ Using json.tool from the shell to validate and pretty-print::
 
 .. note::
 
-   The JSON produced by this module's default settings is a subset of
-   YAML, so it may be used as a serializer for that as well.
+   JSON is a subset of `YAML <http://yaml.org/>`_ 1.2. The JSON produced by
+   this module's default settings (in particular, the default *separators*
+   value) is also a subset of YAML 1.0 and 1.1. This module can thus also be
+   used as a YAML serializer.
 
 
 Basic Usage
@@ -185,7 +189,8 @@ Basic Usage
    *object_hook* is an optional function that will be called with the result of
    any object literal decoded (a :class:`dict`). The return value of
    *object_hook* will be used instead of the :class:`dict`. This feature can be used
-   to implement custom decoders (e.g. JSON-RPC class hinting).
+   to implement custom decoders (e.g. `JSON-RPC <http://www.jsonrpc.org>`_
+   class hinting).
 
    *object_pairs_hook* is an optional function that will be called with the
    result of any object literal decoded with an ordered list of pairs. The
@@ -230,7 +235,7 @@ Basic Usage
    *encoding* which is ignored and deprecated.
 
 
-Encoders and decoders
+Encoders and Decoders
 ---------------------
 
 .. class:: JSONDecoder(object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)
@@ -415,3 +420,103 @@ Encoders and decoders
 
             for chunk in json.JSONEncoder().iterencode(bigobject):
                 mysocket.write(chunk)
+
+
+Standard Compliance
+-------------------
+
+The JSON format is specified by :rfc:`4627`. This section details this
+module's level of compliance with the RFC. For simplicity,
+:class:`JSONEncoder` and :class:`JSONDecoder` subclasses, and parameters other
+than those explicitly mentioned, are not considered.
+
+This module does not comply with the RFC in a strict fashion, implementing some
+extensions that are valid JavaScript but not valid JSON. In particular:
+
+- Top-level non-object, non-array values are accepted and output;
+- Infinite and NaN number values are accepted and output;
+- Repeated names within an object are accepted, and only the value of the last
+  name-value pair is used.
+
+Since the RFC permits RFC-compliant parsers to accept input texts that are not
+RFC-compliant, this module's deserializer is technically RFC-compliant under
+default settings.
+
+Character Encodings
+^^^^^^^^^^^^^^^^^^^
+
+The RFC recommends that JSON be represented using either UTF-8, UTF-16, or
+UTF-32, with UTF-8 being the default.
+
+As permitted, though not required, by the RFC, this module's serializer sets
+*ensure_ascii=True* by default, thus escaping the output so that the resulting
+strings only contain ASCII characters.
+
+Other than the *ensure_ascii* parameter, this module is defined strictly in
+terms of conversion between Python objects and
+:class:`Unicode strings <str>`, and thus does not otherwise address the issue
+of character encodings.
+
+
+Top-level Non-Object, Non-Array Values
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC specifies that the top-level value of a JSON text must be either a
+JSON object or array (Python :class:`dict` or :class:`list`). This module's
+deserializer also accepts input texts consisting solely of a
+JSON null, boolean, number, or string value::
+
+   >>> just_a_json_string = '"spam and eggs"' # Not by itself a valid JSON text
+   >>> json.loads(just_a_json_string)
+   'spam and eggs'
+
+This module itself does not include a way to request that such input texts be
+regarded as illegal. Likewise, this module's serializer also accepts single
+Python :data:`None`, :class:`bool`, numeric, and :class:`str`
+values as input and will generate output texts consisting solely of a top-level
+JSON null, boolean, number, or string value without raising an exception::
+
+   >>> neither_a_list_nor_a_dict = "spam and eggs"
+   >>> json.dumps(neither_a_list_nor_a_dict) # The result is not a valid JSON text
+   '"spam and eggs"'
+
+This module's serializer does not itself include a way to enforce the
+aforementioned constraint.
+
+
+Infinite and NaN Number Values
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC does not permit the representation of infinite or NaN number values.
+Despite that, by default, this module accepts and outputs ``Infinity``,
+``-Infinity``, and ``NaN`` as if they were valid JSON number literal values::
+
+   >>> # Neither of these calls raises an exception, but the results are not valid JSON
+   >>> json.dumps(float('-inf'))
+   '-Infinity'
+   >>> json.dumps(float('nan'))
+   'NaN'
+   >>> # Same when deserializing
+   >>> json.loads('-Infinity')
+   -inf
+   >>> json.loads('NaN')
+   nan
+
+In the serializer, the *allow_nan* parameter can be used to alter this
+behavior. In the deserializer, the *parse_constant* parameter can be used to
+alter this behavior.
+
+
+Repeated Names Within an Object
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The RFC specifies that the names within a JSON object should be unique, but
+does not specify how repeated names in JSON objects should be handled. By
+default, this module does not raise an exception; instead, it ignores all but
+the last name-value pair for a given name::
+
+   >>> weird_json = '{"x": 1, "x": 2, "x": 3}'
+   >>> json.loads(weird_json)
+   {'x': 3}
+
+The *object_pairs_hook* parameter can be used to alter this behavior.
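
The closing paragraphs of the patch name *allow_nan*, *parse_constant*, and *object_pairs_hook* as the parameters that can alter the non-compliant defaults, without showing how. The following is a minimal sketch of one way a caller might combine them. The helper names (``strict_loads``, ``strict_dumps``, ``_reject_constant``, ``_reject_duplicates``) are hypothetical and are not part of the :mod:`json` module or of this patch.

```python
import json


def _reject_constant(name):
    # json.loads calls parse_constant with 'NaN', 'Infinity' or '-Infinity'.
    raise ValueError("non-standard JSON constant: " + name)


def _reject_duplicates(pairs):
    # object_pairs_hook receives each decoded object as a list of
    # (name, value) pairs, in document order.
    result = {}
    for name, value in pairs:
        if name in result:
            raise ValueError("repeated name in JSON object: " + repr(name))
        result[name] = value
    return result


def strict_loads(text):
    """Hypothetical helper: decode, rejecting NaN/Infinity and repeated names."""
    return json.loads(text,
                      parse_constant=_reject_constant,
                      object_pairs_hook=_reject_duplicates)


def strict_dumps(obj):
    """Hypothetical helper: encode, raising ValueError on NaN or infinite floats."""
    return json.dumps(obj, allow_nan=False)
```

With these helpers, ``strict_loads('{"x": 1, "x": 2}')`` raises :exc:`ValueError` instead of silently returning ``{'x': 2}``. Top-level scalar values are still accepted; rejecting them would need a separate type check on the decoded result, since the patch notes that the module offers no parameter for that.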