author: Raymond Hettinger <rhettinger@users.noreply.github.com> 2021-05-17 02:21:14 (GMT)
committer: GitHub <noreply@github.com> 2021-05-17 02:21:14 (GMT)
commit: b3f65e819f552561294a66e350a9f5a3131f7df2
tree: 406616b909c355daff709910584973dc946d4373
parent: fdc7e52f5f1853e350407c472ae031339ac7f60c
Apply edits from Allen Downey's review of the linear_regression docs. (GH-26176)
-rw-r--r-- Doc/library/statistics.rst 26
-rw-r--r-- Lib/statistics.py 12
2 files changed, 15 insertions, 23 deletions
diff --git a/Doc/library/statistics.rst b/Doc/library/statistics.rst
index 117d2b6..a65c984 100644
--- a/Doc/library/statistics.rst
+++ b/Doc/library/statistics.rst
@@ -631,25 +631,25 @@ However, for reading convenience, most of the examples show sorted sequences.
Return the intercept and slope of `simple linear regression
<https://en.wikipedia.org/wiki/Simple_linear_regression>`_
parameters estimated using ordinary least squares. Simple linear
- regression describes relationship between *regressor* and
- *dependent variable* in terms of linear function:
+ regression describes the relationship between *regressor* and
+ *dependent variable* in terms of this linear function:
*dependent_variable = intercept + slope \* regressor + noise*
where ``intercept`` and ``slope`` are the regression parameters that are
- estimated, and noise term is an unobserved random variable, for the
+ estimated, and noise represents the
variability of the data that was not explained by the linear regression
- (it is equal to the difference between prediction and the actual values
+ (it is equal to the difference between predicted and actual values
of dependent variable).
Both inputs must be of the same length (no less than two), and regressor
- needs not to be constant, otherwise :exc:`StatisticsError` is raised.
+ needs not to be constant; otherwise :exc:`StatisticsError` is raised.
- For example, if we took the data on the data on `release dates of the Monty
+ For example, we can use the `release dates of the Monty
Python films <https://en.wikipedia.org/wiki/Monty_Python#Films>`_, and used
- it to predict the cumulative number of Monty Python films produced, we could
- predict what would be the number of films they could have made till year
- 2019, assuming that they kept the pace.
+ it to predict the cumulative number of Monty Python films
+ that would have been produced by 2019
+ assuming that they kept the pace.
.. doctest::
@@ -659,14 +659,6 @@ However, for reading convenience, most of the examples show sorted sequences.
>>> round(intercept + slope * 2019)
16
- We could also use it to "predict" how many Monty Python films existed when
- Brian Cohen was born.
-
- .. doctest::
-
- >>> round(intercept + slope * 1)
- -610
-
.. versionadded:: 3.10
diff --git a/Lib/statistics.py b/Lib/statistics.py
index 507a5b2..5d38f85 100644
--- a/Lib/statistics.py
+++ b/Lib/statistics.py
@@ -930,15 +930,15 @@ def linear_regression(regressor, dependent_variable, /):
Return the intercept and slope of simple linear regression
parameters estimated using ordinary least squares. Simple linear
regression describes relationship between *regressor* and
- *dependent variable* in terms of linear function::
+ *dependent variable* in terms of linear function:
dependent_variable = intercept + slope * regressor + noise
- where ``intercept`` and ``slope`` are the regression parameters that are
- estimated, and noise term is an unobserved random variable, for the
- variability of the data that was not explained by the linear regression
- (it is equal to the difference between prediction and the actual values
- of dependent variable).
+ where *intercept* and *slope* are the regression parameters that are
+ estimated, and noise represents the variability of the data that was
+ not explained by the linear regression (it is equal to the
+ difference between predicted and actual values of dependent
+ variable).
The parameters are returned as a named tuple.