diff options
Diffstat (limited to 'Doc/reference/lexical_analysis.rst')
-rw-r--r-- | Doc/reference/lexical_analysis.rst | 158 |
1 files changed, 140 insertions, 18 deletions
diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 37f25f1..a7c6a68 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -405,7 +405,8 @@ String literals are described by the following lexical definitions: .. productionlist:: stringliteral: [`stringprefix`](`shortstring` | `longstring`) - stringprefix: "r" | "u" | "R" | "U" + stringprefix: "r" | "u" | "R" | "U" | "f" | "F" + : | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF" shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"' longstring: "'''" `longstringitem`* "'''" | '"""' `longstringitem`* '"""' shortstringitem: `shortstringchar` | `stringescapeseq` @@ -464,6 +465,11 @@ is not supported. to simplify the maintenance of dual Python 2.x and 3.x codebases. See :pep:`414` for more information. +A string literal with ``'f'`` or ``'F'`` in its prefix is a +:dfn:`formatted string literal`; see :ref:`f-strings`. The ``'f'`` may be +combined with ``'r'``, but not with ``'b'`` or ``'u'``, therefore raw +formatted strings are possible, but formatted bytes literals are not. + In triple-quoted literals, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the literal. (A "quote" is the character used to open the literal, i.e. either ``'`` or ``"``.) @@ -554,6 +560,10 @@ is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals. + .. versionchanged:: 3.6 + Unrecognized escape sequences produce a DeprecationWarning. In + some future version of Python they will be a SyntaxError. + Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, ``r"\""`` is a valid string literal consisting of two characters: a backslash and a double quote; ``r"\"`` @@ -583,7 +593,106 @@ comments to parts of strings, for example:: Note that this feature is defined at the syntactical level, but implemented at compile time. The '+' operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting -styles for each component (even mixing raw strings and triple quoted strings). +styles for each component (even mixing raw strings and triple quoted strings), +and formatted string literals may be concatenated with plain string literals. + + +.. index:: + single: formatted string literal + single: interpolated string literal + single: string; formatted literal + single: string; interpolated literal + single: f-string +.. _f-strings: + +Formatted string literals +------------------------- + +.. versionadded:: 3.6 + +A :dfn:`formatted string literal` or :dfn:`f-string` is a string literal +that is prefixed with ``'f'`` or ``'F'``. These strings may contain +replacement fields, which are expressions delimited by curly braces ``{}``. +While other string literals always have a constant value, formatted strings +are really expressions evaluated at run time. + +Escape sequences are decoded like in ordinary string literals (except when +a literal is also marked as a raw string). After decoding, the grammar +for the contents of the string is: + +.. productionlist:: + f_string: (`literal_char` | "{{" | "}}" | `replacement_field`)* + replacement_field: "{" `f_expression` ["!" `conversion`] [":" `format_spec`] "}" + f_expression: (`conditional_expression` | "*" `or_expr`) + : ("," `conditional_expression` | "," "*" `or_expr`)* [","] + : | `yield_expression` + conversion: "s" | "r" | "a" + format_spec: (`literal_char` | NULL | `replacement_field`)* + literal_char: <any code point except "{", "}" or NULL> + +The parts of the string outside curly braces are treated literally, +except that any doubled curly braces ``'{{'`` or ``'}}'`` are replaced +with the corresponding single curly brace. A single opening curly +bracket ``'{'`` marks a replacement field, which starts with a +Python expression. After the expression, there may be a conversion field, +introduced by an exclamation point ``'!'``. A format specifier may also +be appended, introduced by a colon ``':'``. A replacement field ends +with a closing curly bracket ``'}'``. + +Expressions in formatted string literals are treated like regular +Python expressions surrounded by parentheses, with a few exceptions. +An empty expression is not allowed, and a :keyword:`lambda` expression +must be surrounded by explicit parentheses. Replacement expressions +can contain line breaks (e.g. in triple-quoted strings), but they +cannot contain comments. Each expression is evaluated in the context +where the formatted string literal appears, in order from left to right. + +If a conversion is specified, the result of evaluating the expression +is converted before formatting. Conversion ``'!s'`` calls :func:`str` on +the result, ``'!r'`` calls :func:`repr`, and ``'!a'`` calls :func:`ascii`. + +The result is then formatted using the :func:`format` protocol. The +format specifier is passed to the :meth:`__format__` method of the +expression or conversion result. An empty string is passed when the +format specifier is omitted. The formatted result is then included in +the final value of the whole string. + +Top-level format specifiers may include nested replacement fields. +These nested fields may include their own conversion fields and +format specifiers, but may not include more deeply-nested replacement fields. + +Formatted string literals may be concatenated, but replacement fields +cannot be split across literals. + +Some examples of formatted string literals:: + + >>> name = "Fred" + >>> f"He said his name is {name!r}." + "He said his name is 'Fred'." + >>> f"He said his name is {repr(name)}." # repr() is equivalent to !r + "He said his name is 'Fred'." + >>> width = 10 + >>> precision = 4 + >>> value = decimal.Decimal("12.34567") + >>> f"result: {value:{width}.{precision}}" # nested fields + 'result: 12.35' + +A consequence of sharing the same syntax as regular string literals is +that characters in the replacement fields must not conflict with the +quoting used in the outer formatted string literal. Also, escape +sequences normally apply to the outer formatted string literal, +rather than inner string literals:: + + f"abc {a["x"]} def" # error: outer string literal ended prematurely + f"abc {a[\"x\"]} def" # workaround: escape the inner quotes + f"abc {a['x']} def" # workaround: use different quoting + + f"newline: {ord('\n')}" # error: literal line break in inner string + f"newline: {ord('\\n')}" # workaround: double escaping + fr"newline: {ord('\n')}" # workaround: raw outer string + +See also :pep:`498` for the proposal that added formatted string literals, +and :meth:`str.format`, which uses a related format string mechanism. .. _numbers: @@ -612,20 +721,24 @@ Integer literals Integer literals are described by the following lexical definitions: .. productionlist:: - integer: `decimalinteger` | `octinteger` | `hexinteger` | `bininteger` - decimalinteger: `nonzerodigit` `digit`* | "0"+ + integer: `decinteger` | `bininteger` | `octinteger` | `hexinteger` + decinteger: `nonzerodigit` (["_"] `digit`)* | "0"+ (["_"] "0")* + bininteger: "0" ("b" | "B") (["_"] `bindigit`)+ + octinteger: "0" ("o" | "O") (["_"] `octdigit`)+ + hexinteger: "0" ("x" | "X") (["_"] `hexdigit`)+ nonzerodigit: "1"..."9" digit: "0"..."9" - octinteger: "0" ("o" | "O") `octdigit`+ - hexinteger: "0" ("x" | "X") `hexdigit`+ - bininteger: "0" ("b" | "B") `bindigit`+ + bindigit: "0" | "1" octdigit: "0"..."7" hexdigit: `digit` | "a"..."f" | "A"..."F" - bindigit: "0" | "1" There is no limit for the length of integer literals apart from what can be stored in available memory. +Underscores are ignored for determining the numeric value of the literal. They +can be used to group digits for enhanced readability. One underscore can occur +between digits, and after base specifiers like ``0x``. + Note that leading zeros in a non-zero decimal number are not allowed. This is for disambiguation with C-style octal literals, which Python used before version 3.0. @@ -634,6 +747,10 @@ Some examples of integer literals:: 7 2147483647 0o177 0b100110111 3 79228162514264337593543950336 0o377 0xdeadbeef + 100_000_000_000 0b_1110_0101 + +.. versionchanged:: 3.6 + Underscores are now allowed for grouping purposes in literals. .. _floating: @@ -645,23 +762,28 @@ Floating point literals are described by the following lexical definitions: .. productionlist:: floatnumber: `pointfloat` | `exponentfloat` - pointfloat: [`intpart`] `fraction` | `intpart` "." - exponentfloat: (`intpart` | `pointfloat`) `exponent` - intpart: `digit`+ - fraction: "." `digit`+ - exponent: ("e" | "E") ["+" | "-"] `digit`+ + pointfloat: [`digitpart`] `fraction` | `digitpart` "." + exponentfloat: (`digitpart` | `pointfloat`) `exponent` + digitpart: `digit` (["_"] `digit`)* + fraction: "." `digitpart` + exponent: ("e" | "E") ["+" | "-"] `digitpart` Note that the integer and exponent parts are always interpreted using radix 10. For example, ``077e010`` is legal, and denotes the same number as ``77e10``. The -allowed range of floating point literals is implementation-dependent. Some -examples of floating point literals:: +allowed range of floating point literals is implementation-dependent. As in +integer literals, underscores are supported for digit grouping. - 3.14 10. .001 1e100 3.14e-10 0e0 +Some examples of floating point literals:: + + 3.14 10. .001 1e100 3.14e-10 0e0 3.14_15_93 Note that numeric literals do not include a sign; a phrase like ``-1`` is actually an expression composed of the unary operator ``-`` and the literal ``1``. +.. versionchanged:: 3.6 + Underscores are now allowed for grouping purposes in literals. + .. _imaginary: @@ -671,7 +793,7 @@ Imaginary literals Imaginary literals are described by the following lexical definitions: .. productionlist:: - imagnumber: (`floatnumber` | `intpart`) ("j" | "J") + imagnumber: (`floatnumber` | `digitpart`) ("j" | "J") An imaginary literal yields a complex number with a real part of 0.0. Complex numbers are represented as a pair of floating point numbers and have the same @@ -679,7 +801,7 @@ restrictions on their range. To create a complex number with a nonzero real part, add a floating point number to it, e.g., ``(3+4j)``. Some examples of imaginary literals:: - 3.14j 10.j 10j .001j 1e100j 3.14e-10j + 3.14j 10.j 10j .001j 1e100j 3.14e-10j 3.14_15_93j .. _operators: |