{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "3f7cd02e",
   "metadata": {},
   "source": [
    "## Area of plates simulation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4535e984",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import scipy.stats as stats"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "604816b9",
   "metadata": {},
   "source": [
    "We image producing many plates and meassuring the width and length of each plate.\n",
    "\n",
    "Instead of doing this IRL we simulate :)\n",
    "\n",
    "We store our simulated data as we would store real data (here in a pandas DataFrame)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9323a635",
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(2242)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "56c3f31a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# number of simulations:\n",
    "k = 10000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ed206437",
   "metadata": {},
   "outputs": [],
   "source": [
    "# simulating width and length from normal distributions:\n",
    "\n",
    "simulation_data = pd.DataFrame({\n",
    " 'length': stats.norm.rvs(size=k, loc=2, scale=0.01),\n",
    " 'width':  stats.norm.rvs(size=k, loc=3, scale=0.02)\n",
    " })\n",
    "\n",
    "print(simulation_data.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9fffdf58",
   "metadata": {},
   "outputs": [],
   "source": [
    "# for each simulated plate we calculate the area and store this (as a new column) in our DataFrame:\n",
    "simulation_data['area'] = simulation_data['length'] * simulation_data['width']\n",
    "\n",
    "print(simulation_data.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24e3d643",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compute mean and standard deviation from simulated plate areas:\n",
    "print(simulation_data['area'].mean())      # mean area\n",
    "print(simulation_data['area'].std(ddof=1)) # sample standard deviation of area\n",
    "print(simulation_data['area'].var(ddof=1)) # sample variance of area"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "45af0e55",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Lets plot the distribution of simulated plate areas:\n",
    "plt.hist(simulation_data['area'], bins=100)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eab013f9",
   "metadata": {},
   "source": [
    "What do you think about this distribution? (describe in own words)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ce8234f",
   "metadata": {},
   "outputs": [],
   "source": [
    "# how many values deviate by more than 0.10 from 6.00 m2 ?\n",
    "\n",
    "# plates that have an area below 5.90 are very \"small\" (indicated by a boolean variable)\n",
    "simulation_data['small'] = simulation_data['area'] < 5.90\n",
    "\n",
    "# plates that have an area above 6.10 are very \"large\" (indicated by a boolean variable)\n",
    "simulation_data['large'] = simulation_data['area'] > 6.10\n",
    "\n",
    "print(simulation_data.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49ec03b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Showing an example of a row that has a \"True\" value:\n",
    "print(simulation_data.iloc[43:48])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9c4b1c0a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# how many values deviate by more than 0.10 from 6m2 ? \n",
    "# these are the very \"small\" plus the very \"large\":\n",
    "\n",
    "# In Python we can simply add \"True\" (1) and \"False\" (0) values:\n",
    "print(np.sum(simulation_data['small']) + np.sum(simulation_data['large']))\n",
    "\n",
    "# same result as a fraction of total number os simulations:\n",
    "print((np.sum(simulation_data['small']) + np.sum(simulation_data['large']))/k)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b3c9b6d",
   "metadata": {},
   "source": [
    "approx 4-5% of plates deviate more than 0.10 from 6m2"
   ]
  },
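  {
   "cell_type": "markdown",
   "id": "7b2e4d1a",
   "metadata": {},
   "source": [
    "A rough check of this fraction, as a sketch under the assumption that the area $A$ is approximately normal with mean 6.00 and standard deviation close to 0.05 (the value the simulation gives with these parameters). The symbols $A$, $Z$ and $\\Phi$ are notation introduced here. A deviation of 0.10 is then two standard deviations:\n",
    "\n",
    "$$P(|A - 6.00| > 0.10) \\approx P(|Z| > 2) = 2\\,(1 - \\Phi(2)) \\approx 0.0455$$\n",
    "\n",
    "where $Z$ is standard normal and $\\Phi$ is its cumulative distribution function."
   ]
  },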
  {
   "cell_type": "markdown",
   "id": "c660078d",
   "metadata": {},
   "source": [
    "### Other probabilities (from unknown distributions)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5585e925",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Imagine the plates have a thickness between 0.95cm and 1.05cm\n",
    "\n",
    "# Simulate the thickness of each plate (assume the thickness is independent of width and length)\n",
    "simulation_data['thickness'] = stats.uniform.rvs(size=k, loc=0.0095, scale=0.001)\n",
    "\n",
    "print(simulation_data.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6d5f5620",
   "metadata": {},
   "source": [
    "KAHOOT! (x1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f9ab9360",
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.hist(simulation_data['thickness'])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "251654e5",
   "metadata": {},
   "source": [
    "What is the distribution of plate thicknesses? Is it as you expected?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5434aba1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# As a rule plates are discarded if the value = (thickness^2)/area is less than 0.000015\n",
    "\n",
    "# make a boolean (True/False) variable to indicate if each simulated plate will be discarded:\n",
    "\n",
    "simulation_data['value'] = simulation_data['thickness']**2/simulation_data['area']\n",
    "simulation_data['discard'] = simulation_data['value'] < 1.5e-5\n",
    "\n",
    "print(simulation_data.head())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1f1d98d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# visualise the distribution of the \"value\" (rule for discarding plates):\n",
    "plt.hist(simulation_data['value'])\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90f1a5fe",
   "metadata": {},
   "source": [
    "What do you think about this distribution? How would you describe it in your own words?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2322a336",
   "metadata": {},
   "outputs": [],
   "source": [
    "# how many plates (out of k) are discarded?\n",
    "print(simulation_data['discard'].sum())\n",
    "print(simulation_data['discard'].sum()/k)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb72a611",
   "metadata": {},
   "source": [
    "approximately 1% of plates are discarded"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c81450c5",
   "metadata": {},
   "source": [
    "Tilbage til slides!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b11783b0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# variance of the \"value\"\n",
    "print(simulation_data['value'].var(ddof=1))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aec67ae5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# variance of the plate thickness:\n",
    "simulation_data['thickness'].var(ddof=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5f3a36b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# theoretical variance of the plate thickness:\n",
    "0.001**2 / 12"
   ]
  },
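  {
   "cell_type": "markdown",
   "id": "c4d91e6f",
   "metadata": {},
   "source": [
    "A sketch of the first-order error propagation formula for the variance of the \"value\", assuming length $L$, width $W$ and thickness $T$ are independent. The symbols $f$, $L$, $W$, $T$, $\\mu$ and $\\sigma$ are notation introduced here, not names used in the code.\n",
    "\n",
    "$$f(L, W, T) = \\frac{T^2}{L W}$$\n",
    "\n",
    "$$\\frac{\\partial f}{\\partial L} = -\\frac{T^2}{L^2 W}, \\qquad \\frac{\\partial f}{\\partial W} = -\\frac{T^2}{L W^2}, \\qquad \\frac{\\partial f}{\\partial T} = \\frac{2T}{L W}$$\n",
    "\n",
    "$$\\sigma_f^2 \\approx \\left(\\frac{\\mu_T^2}{\\mu_L^2 \\mu_W}\\right)^2 \\sigma_L^2 + \\left(\\frac{\\mu_T^2}{\\mu_L \\mu_W^2}\\right)^2 \\sigma_W^2 + \\left(\\frac{2 \\mu_T}{\\mu_L \\mu_W}\\right)^2 \\sigma_T^2$$\n",
    "\n",
    "Here $\\mu_L = 2$, $\\mu_W = 3$, $\\mu_T = 0.01$, $\\sigma_L = 0.01$, $\\sigma_W = 0.02$ and $\\sigma_T^2 = 0.001^2 / 12$."
   ]
  },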
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "275ddd50",
   "metadata": {},
   "outputs": [],
   "source": [
    "# using error propagation to calculate variance of \"value\":\n",
    "(0.01**2/(2**2*3))**2 * 0.01**2 + (0.01**2/(2*3**3))**2 * 0.02**2 + (2*0.01/(2*3))**2 * (0.001**2/12)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pernille",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}