{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Model checking\n", "\n", "##### Keywords: model checking, posterior predictive, test statistic, discrepancy, bayesian p-values" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import numpy as np\n", "import scipy as sp\n", "import matplotlib as mpl\n", "import matplotlib.cm as cm\n", "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "pd.set_option('display.width', 500)\n", "pd.set_option('display.max_columns', 100)\n", "pd.set_option('display.notebook_repr_html', True)\n", "import seaborn as sns\n", "sns.set_style(\"whitegrid\")\n", "sns.set_context(\"poster\")\n", "import pymc3 as pm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll illustrate some of the concepts of Model Checking by considering an experiment.\n", "\n", "## The light speed experiment\n", "\n", "Simon Newcomb did an experiment in 1882 to measure the speed of light. These are the times required for light to travel 7442 metres. These are recorded as deviations from 24,800 nanoseconds." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "light_speed = np.array([28, 26, 33, 24, 34, -44, 27, 16, 40, -2, 29, 22, 24, 21, 25,\n", " 30, 23, 29, 31, 19, 24, 20, 36, 32, 36, 28, 25, 21, 28, 29,\n", " 37, 25, 28, 26, 30, 32, 36, 26, 30, 22, 36, 23, 27, 27, 28,\n", " 27, 31, 27, 26, 33, 26, 32, 32, 24, 39, 28, 24, 25, 32, 25,\n", " 29, 27, 28, 29, 16, 23])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "10.663610099255504" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "light_speed.std()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We use a normal models with reakly informative priors to model this experiment." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": true }, "outputs": [], "source": [ "with pm.Model() as light_model:\n", " mu = pm.Uniform('mu', lower=-1000,\n", " upper=1000.0)\n", " sigma = pm.Uniform('sigma', lower=0.1, upper=1000.0)\n", " obsv = pm.Normal('obsv', mu=mu, sd=sigma, observed=light_speed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And let us sample...." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Auto-assigning NUTS sampler...\n", "Initializing NUTS using jitter+adapt_diag...\n", "Multiprocess sampling (2 chains in 2 jobs)\n", "NUTS: [sigma_interval__, mu_interval__]\n", "100%|██████████| 10500/10500 [00:08<00:00, 1191.06it/s]\n", "The acceptance probability does not match the target. It is 0.881671068051, but should be close to 0.8. Try to increase the number of tuning steps.\n", "The acceptance probability does not match the target. It is 0.888306785997, but should be close to 0.8. Try to increase the number of tuning steps.\n" ] } ], "source": [ "with light_model:\n", " trace = pm.sample(10000)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", " | mean | \n", "sd | \n", "mc_error | \n", "hpd_2.5 | \n", "hpd_97.5 | \n", "n_eff | \n", "Rhat | \n", "
---|---|---|---|---|---|---|---|
mu | \n", "26.206077 | \n", "1.346849 | \n", "0.009393 | \n", "23.571715 | \n", "28.84782 | \n", "18789.0 | \n", "1.000004 | \n", "
sigma | \n", "10.950253 | \n", "0.969146 | \n", "0.006581 | \n", "9.158164 | \n", "12.90844 | \n", "19058.0 | \n", "1.000050 | \n", "