author: Raymond Hettinger <rhettinger@users.noreply.github.com> 2020-01-26 04:21:17 (GMT)
committer: GitHub <noreply@github.com> 2020-01-26 04:21:17 (GMT)
commit: 10355ed7f132ed10f1e0d8bd64ccb744b86b1cce
tree: 8fe80f251921aa769730e089c93ee4d4d5401b28 /Doc/library
parent: 4515a590a4a4c09231a66e81782f33b4bfcd5054
bpo-36018: Add another example for NormalDist() (#18191)
Diffstat (limited to 'Doc/library')
-rw-r--r-- Doc/library/statistics.rst 36
1 files changed, 36 insertions, 0 deletions
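
Reviewer's sketch (not part of the commit): the doctest added in the diff relies on names such as `NormalDist` being imported by the surrounding documentation. The standalone script below is one way to reproduce the comparison between the exact binomial sum and the normal approximation with a continuity correction of 0.5. The helper names `binomial_cdf` and `normal_approx_cdf` are hypothetical and chosen only for illustration. It assumes Python 3.8 or later, where `math.comb` and `statistics.NormalDist` are available.

```python
from math import comb, fsum, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    q = 1.0 - p
    return fsum(comb(n, r) * p**r * q**(n - r) for r in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a 0.5 continuity correction."""
    q = 1.0 - p
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q)).cdf(k + 0.5)


# Values taken from the example in the diff
n = 750    # attendees
p = 0.65   # share preferring the Python talk
k = 500    # room capacity

exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two printed values should agree closely; the diff reports 0.8402 for both after rounding to four places.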
diff --git a/Doc/library/statistics.rst b/Doc/library/statistics.rst
index 4c7239c..09b02ca 100644
--- a/Doc/library/statistics.rst
+++ b/Doc/library/statistics.rst
@@ -772,6 +772,42 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
     >>> quantiles(map(model, X, Y, Z))  # doctest: +SKIP
     [1.4591308524824727, 1.8035946855390597, 2.175091447274739]
+Normal distributions can be used to approximate `Binomial
+distributions <http://mathworld.wolfram.com/BinomialDistribution.html>`_
+when the sample size is large and when the probability of a successful
+trial is near 50%.
+
+For example, an open source conference has 750 attendees and two rooms with a
+500 person capacity. There is a talk about Python and another about Ruby.
+In previous conferences, 65% of the attendees preferred to listen to Python
+talks. Assuming the population preferences haven't changed, what is the
+probability that the rooms will stay within their capacity limits?
+
+.. doctest::
+
+    >>> n = 750      # Sample size
+    >>> p = 0.65     # Preference for Python
+    >>> q = 1.0 - p  # Preference for Ruby
+    >>> k = 500      # Room capacity
+
+    >>> # Approximation using the cumulative normal distribution
+    >>> from math import sqrt
+    >>> round(NormalDist(mu=n*p, sigma=sqrt(n*p*q)).cdf(k + 0.5), 4)
+    0.8402
+
+    >>> # Solution using the cumulative binomial distribution
+    >>> from math import comb, fsum
+    >>> round(fsum(comb(n, r) * p**r * q**(n-r) for r in range(k+1)), 4)
+    0.8402
+
+    >>> # Approximation using a simulation
+    >>> from random import seed, choices
+    >>> seed(8675309)
+    >>> def trial():
+    ...     return choices(('Python', 'Ruby'), (p, q), k=n).count('Python')
+    >>> mean(trial() <= k for i in range(10_000))
+    0.8398
+
Normal distributions commonly arise in machine learning problems.
Wikipedia has a `nice example of a Naive Bayesian Classifier