bpo-36018: Add another example for NormalDist() (GH-18191) (GH-18192)

author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com> 2020-01-26 05:24:13 (GMT)
committer: Raymond Hettinger <rhettinger@users.noreply.github.com> 2020-01-26 05:24:13 (GMT)
commit: eebcff8c071b38b53bd429892524ba8518cbeb98 (patch)
tree: d102f4d564bf5080d46fb55d3390ba5dd6b1d695
parent: eec7636bfd07412b5872c0683636e9e98bf79a8c (diff)
download: cpython-eebcff8c071b38b53bd429892524ba8518cbeb98.zip
cpython-eebcff8c071b38b53bd429892524ba8518cbeb98.tar.gz
cpython-eebcff8c071b38b53bd429892524ba8518cbeb98.tar.bz2
1 files changed, 36 insertions, 0 deletions
diff --git a/Doc/library/statistics.rst b/Doc/library/statistics.rst
index 4c7239c..09b02ca 100644
--- a/Doc/library/statistics.rst
+++ b/Doc/library/statistics.rst
@@ -772,6 +772,42 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
     >>> quantiles(map(model, X, Y, Z))       # doctest: +SKIP
     [1.4591308524824727, 1.8035946855390597, 2.175091447274739]
 
+Normal distributions can be used to approximate `Binomial
+distributions <http://mathworld.wolfram.com/BinomialDistribution.html>`_
+when the sample size is large and when the probability of a successful
+trial is near 50%.
+
+For example, an open source conference has 750 attendees and two rooms with a
+500 person capacity.  There is a talk about Python and another about Ruby.
+In previous conferences, 65% of the attendees preferred to listen to Python
+talks.  Assuming the population preferences haven't changed, what is the
+probability that the rooms will stay within their capacity limits?
+
+.. doctest::
+
+    >>> n = 750             # Sample size
+    >>> p = 0.65            # Preference for Python
+    >>> q = 1.0 - p         # Preference for Ruby
+    >>> k = 500             # Room capacity
+
+    >>> # Approximation using the cumulative normal distribution
+    >>> from math import sqrt
+    >>> round(NormalDist(mu=n*p, sigma=sqrt(n*p*q)).cdf(k + 0.5), 4)
+    0.8402
+
+    >>> # Solution using the cumulative binomial distribution
+    >>> from math import comb, fsum
+    >>> round(fsum(comb(n, r) * p**r * q**(n-r) for r in range(k+1)), 4)
+    0.8402
+
+    >>> # Approximation using a simulation
+    >>> from random import seed, choices
+    >>> seed(8675309)
+    >>> def trial():
+    ...     return choices(('Python', 'Ruby'), (p, q), k=n).count('Python')
+    >>> mean(trial() <= k for i in range(10_000))
+    0.8398
+
 Normal distributions commonly arise in machine learning problems.
 
 Wikipedia has a `nice example of a Naive Bayesian Classifier
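
Reviewer note (not part of the patch): the first two calculations in the added doctest can be reproduced outside the docs build with a small standalone script. This is a minimal sketch, assuming Python 3.8+ for `math.comb` and `statistics.NormalDist`. The helper names `binomial_cdf` and `normal_approx_cdf` are hypothetical, introduced only for illustration; they do not appear in the commit.

```python
from math import comb, fsum, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    q = 1.0 - p
    return fsum(comb(n, r) * p**r * q**(n - r) for r in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a +0.5 continuity correction."""
    q = 1.0 - p
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q)).cdf(k + 0.5)


# Values taken from the doctest in the patch.
n, p, k = 750, 0.65, 500
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(round(exact, 4), round(approx, 4))
```

Both values should round to the 0.8402 shown in the doctest. The `k + 0.5` term is the usual continuity correction applied when a discrete distribution is approximated by a continuous one.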