# Probability distributions in Python - example with the Binomial distribution

In this notebook we will work with the Binomial distribution, using the scipy.stats subpackage.

We will try out several of the distribution's methods: .pmf, .cdf, .ppf, .mean, .var, .std, .median and .rvs.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# For specific probability distributions (e.g., the Binomial distribution) we use a new library: SciPy (only the subpackage scipy.stats)
import scipy.stats as stats

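### Compute probabilities (pmf)

For a stochastic variable X following a binomial distribution with parameters n=6 and p=0.70, compute P(X = 6). Because the binomial distribution is discrete, the method is called .pmf (probability mass function) rather than .pdf: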

In [None]:
stats.binom.pmf(k=6, n=6, p=0.70)
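As a sanity check, the same value should follow from the binomial formula P(X = k) = C(n, k) * p^k * (1 - p)^(n - k). The sketch below is only an illustration: the helper name `binom_pmf_manual` is made up for this example and is not part of scipy.

```python
from math import comb

import scipy.stats as stats


def binom_pmf_manual(k, n, p):
    """P(X = k) computed directly from the binomial formula (illustrative helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


manual = binom_pmf_manual(6, 6, 0.70)            # reduces to 0.70**6
from_scipy = stats.binom.pmf(k=6, n=6, p=0.70)
print(manual, from_scipy)
```

For k = n = 6 the formula reduces to 0.70^6, so both numbers should be about 0.1176.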

We can also compute the probability of every possible outcome (P(X = 0), P(X = 1), P(X = 2), P(X = 3), P(X = 4), P(X = 5) and P(X = 6)):

In [None]:
print(stats.binom.pmf(k=[0,1,2,3,4,5,6], n=6, p=0.70))
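Since 0 to 6 are the only possible outcomes, these seven probabilities should add up to 1. A quick check, assuming we build the outcomes with np.arange instead of typing the list by hand (the variable names are only illustrative):

```python
import numpy as np
import scipy.stats as stats

k_values = np.arange(0, 7)                       # outcomes 0, 1, ..., 6
pmf_values = stats.binom.pmf(k=k_values, n=6, p=0.70)
total = pmf_values.sum()                         # should be 1 up to rounding error
print(total)
```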

In [None]:
# we can also visualise these probabilities (plot the pmf):
plt.bar([0,1,2,3,4,5,6], stats.binom.pmf(k=[0,1,2,3,4,5,6], n=6, p=0.70), width=0.1, color='red')
plt.show()

### Compute cdf, inverse cdf, mean, variance, etc

scipy.stats provides many other methods for every distribution.

We can also compute the cdf (.cdf), the inverse cdf (.ppf), the expectation value/the mean (.mean), the variance (.var), the standard deviation (.std) and the median (.median).

In [None]:
# compute the cdf; P(X <= x):
print(stats.binom.cdf(k=[0,1,2,3,4,5,6], n=6, p=0.70))
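For a discrete distribution the cdf is just the running sum of the pmf: P(X <= x) = P(X = 0) + ... + P(X = x). The sketch below checks this with np.cumsum; the variable names are chosen for illustration only.

```python
import numpy as np
import scipy.stats as stats

k_values = np.arange(0, 7)
pmf_values = stats.binom.pmf(k=k_values, n=6, p=0.70)
cdf_values = stats.binom.cdf(k=k_values, n=6, p=0.70)

cdf_from_pmf = np.cumsum(pmf_values)             # running sum of the pmf
print(np.allclose(cdf_from_pmf, cdf_values))
```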

In [None]:
# visualise the cdf:
plt.bar([0,1,2,3,4,5,6], stats.binom.cdf(k=[0,1,2,3,4,5,6], n=6, p=0.70), width=0.1, color='red')
plt.show()

In [None]:
# compute the inverse cdf, in Python called ".ppf" for percent point function.
# For instance we can compute the quartiles:
stats.binom.ppf(q=[0.25, 0.50, 0.75], n=6, p=0.70)

Can you visually verify these values from the cdf plot above?

hints:<br>
find 0.25 on the y-axis and then go to the first x-value whose bar reaches that height - this should be Q1<br>
find 0.50 on the y-axis and then go to the first x-value whose bar reaches that height - this should be Q2 = the median<br>
find 0.75 on the y-axis and then go to the first x-value whose bar reaches that height - this should be Q3
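The visual check can also be done numerically. For a discrete distribution, .ppf(q) returns the smallest outcome k for which the cdf is at least q. The sketch below mimics that lookup by hand with np.argmax, which returns the position of the first True value; the list name `quartiles_manual` is just for this example.

```python
import numpy as np
import scipy.stats as stats

k_values = np.arange(0, 7)
cdf_values = stats.binom.cdf(k=k_values, n=6, p=0.70)

quartiles_manual = []
for q in [0.25, 0.50, 0.75]:
    # position of the first outcome whose cdf value has reached q
    first_index = np.argmax(cdf_values >= q)
    quartiles_manual.append(int(k_values[first_index]))

quartiles_scipy = stats.binom.ppf(q=[0.25, 0.50, 0.75], n=6, p=0.70)
print(quartiles_manual, quartiles_scipy)
```

Both approaches should give Q1 = 3, Q2 = 4 and Q3 = 5.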

In [None]:
# compute the expectation value / the mean:
stats.binom.mean(n=6, p=0.70)

In [None]:
# compute the variance:
stats.binom.var(n=6, p=0.70)

In [None]:
# compute the standard deviation:
stats.binom.std(n=6, p=0.70)
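For the binomial distribution these three quantities have simple closed forms: the mean is n*p, the variance is n*p*(1-p), and the standard deviation is the square root of the variance. A short comparison with the scipy results (variable names are illustrative):

```python
import math

import scipy.stats as stats

n, p = 6, 0.70

mean_formula = n * p                             # 4.2
var_formula = n * p * (1 - p)                    # 1.26
std_formula = math.sqrt(var_formula)

print(mean_formula, stats.binom.mean(n=n, p=p))
print(var_formula, stats.binom.var(n=n, p=p))
print(std_formula, stats.binom.std(n=n, p=p))
```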

In [None]:
# compute the median:
stats.binom.median(n=6, p=0.70)

### Simulating random variates

In [None]:
# we can simulate a random variate - that is a single observation of the random variable - using .rvs:
print(stats.binom.rvs(size=1, n=6, p=0.70))

Try repeating the code above a few times.

What are we simulating?

In [None]:
# We can also simulate many observations in one go:
print(stats.binom.rvs(size=100, n=6, p=0.70))
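With many simulated observations, the sample mean and the relative frequency of each outcome should land close to the theoretical mean and pmf. The sketch below assumes a larger sample (size=10_000) and a fixed random_state so the result is reproducible; both choices are ours and not part of the exercise above.

```python
import numpy as np
import scipy.stats as stats

n, p = 6, 0.70
size = 10_000

# fixed seed (an arbitrary choice) so that repeated runs give the same sample
sample = stats.binom.rvs(size=size, n=n, p=p, random_state=0)

sample_mean = sample.mean()
# relative frequency of each outcome 0..6 in the simulated sample
rel_freq = np.bincount(sample.astype(int), minlength=n + 1) / size
pmf_values = stats.binom.pmf(k=np.arange(0, n + 1), n=n, p=p)

print(sample_mean, stats.binom.mean(n=n, p=p))
print(np.round(rel_freq, 3))
print(np.round(pmf_values, 3))
```

Removing random_state gives a different sample on every run, which is the behaviour seen when repeating the cells above.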