author | Raymond Hettinger <rhettinger@users.noreply.github.com> | 2021-05-17 02:21:14 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-05-17 02:21:14 (GMT) |
commit | b3f65e819f552561294a66e350a9f5a3131f7df2 (patch) | |
tree | 406616b909c355daff709910584973dc946d4373 | |
parent | fdc7e52f5f1853e350407c472ae031339ac7f60c (diff) | |
Apply edits from Allen Downey's review of the linear_regression docs. (GH-26176)
-rw-r--r-- | Doc/library/statistics.rst | 26 |
-rw-r--r-- | Lib/statistics.py | 12 |
2 files changed, 15 insertions, 23 deletions
diff --git a/Doc/library/statistics.rst b/Doc/library/statistics.rst
index 117d2b6..a65c984 100644
--- a/Doc/library/statistics.rst
+++ b/Doc/library/statistics.rst
@@ -631,25 +631,25 @@ However, for reading convenience, most of the examples show sorted sequences.
    Return the intercept and slope of `simple linear regression
    <https://en.wikipedia.org/wiki/Simple_linear_regression>`_
    parameters estimated using ordinary least squares. Simple linear
-   regression describes relationship between *regressor* and
-   *dependent variable* in terms of linear function:
+   regression describes the relationship between *regressor* and
+   *dependent variable* in terms of this linear function:
 
       *dependent_variable = intercept + slope \* regressor + noise*
 
    where ``intercept`` and ``slope`` are the regression parameters that are
-   estimated, and noise term is an unobserved random variable, for the
+   estimated, and noise represents the
    variability of the data that was not explained by the linear regression
-   (it is equal to the difference between prediction and the actual values
+   (it is equal to the difference between predicted and actual values
    of dependent variable).
 
    Both inputs must be of the same length (no less than two), and regressor
-   needs not to be constant, otherwise :exc:`StatisticsError` is raised.
+   needs not to be constant; otherwise :exc:`StatisticsError` is raised.
 
-   For example, if we took the data on the data on `release dates of the Monty
+   For example, we can use the `release dates of the Monty
    Python films <https://en.wikipedia.org/wiki/Monty_Python#Films>`_, and used
-   it to predict the cumulative number of Monty Python films produced, we could
-   predict what would be the number of films they could have made till year
-   2019, assuming that they kept the pace.
+   it to predict the cumulative number of Monty Python films
+   that would have been produced by 2019
+   assuming that they kept the pace.
 
    .. doctest::
 
@@ -659,14 +659,6 @@ However, for reading convenience, most of the examples show sorted sequences.
       >>> round(intercept + slope * 2019)
       16
 
-   We could also use it to "predict" how many Monty Python films existed when
-   Brian Cohen was born.
-
-   .. doctest::
-
-      >>> round(intercept + slope * 1)
-      -610
-
    .. versionadded:: 3.10
 
 
diff --git a/Lib/statistics.py b/Lib/statistics.py
index 507a5b2..5d38f85 100644
--- a/Lib/statistics.py
+++ b/Lib/statistics.py
@@ -930,15 +930,15 @@ def linear_regression(regressor, dependent_variable, /):
     Return the intercept and slope of simple linear regression
     parameters estimated using ordinary least squares. Simple linear
     regression describes relationship between *regressor* and
-    *dependent variable* in terms of linear function::
+    *dependent variable* in terms of linear function:
 
         dependent_variable = intercept + slope * regressor + noise
 
-    where ``intercept`` and ``slope`` are the regression parameters that are
-    estimated, and noise term is an unobserved random variable, for the
-    variability of the data that was not explained by the linear regression
-    (it is equal to the difference between prediction and the actual values
-    of dependent variable).
+    where *intercept* and *slope* are the regression parameters that are
+    estimated, and noise represents the variability of the data that was
+    not explained by the linear regression (it is equal to the
+    difference between predicted and actual values of dependent
+    variable).
 
     The parameters are returned as a named tuple.
 
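
For readers who want to see the model described in the edited docstring worked through, the following is a minimal sketch of an ordinary least squares fit, not the code in Lib/statistics.py. The helper name `fit_line` is hypothetical, and the `year` and `films_total` values are an assumption: the diff hunks do not include the doctest's input data, so these values were chosen because they reproduce the results shown in the hunks (16 for the year 2019, and -610 in the removed doctest).

```python
from statistics import fmean


def fit_line(regressor, dependent_variable):
    """Hypothetical helper: ordinary least squares fit of a straight line.

    Returns (intercept, slope) for the model
        dependent_variable = intercept + slope * regressor + noise
    """
    if len(regressor) != len(dependent_variable) or len(regressor) < 2:
        raise ValueError("inputs must have the same length, at least two")
    x_bar = fmean(regressor)
    y_bar = fmean(dependent_variable)
    # Sum of squared deviations of the regressor; zero means it is constant.
    sxx = sum((x - x_bar) ** 2 for x in regressor)
    if sxx == 0:
        raise ValueError("regressor is constant")
    # Sum of cross-deviations between regressor and dependent variable.
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(regressor, dependent_variable))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Assumed data (not shown in the diff): film release years and running totals.
year = [1971, 1975, 1979, 1982, 1983]
films_total = [1, 2, 3, 4, 5]

intercept, slope = fit_line(year, films_total)

# "noise" in the docstring's sense: actual value minus the fitted value.
noise = [y - (intercept + slope * x) for x, y in zip(year, films_total)]

# Extrapolate to 2019, as the retained doctest does.
prediction_2019 = round(intercept + slope * 2019)
```

The sketch raises `ValueError` where the documented function raises `StatisticsError`, and it returns a plain tuple rather than a named tuple, to stay independent of the Python version in use.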