---
title: "Hypothesis Tests for Population Proportions"
author: "Evan L. Ray"
date: "November 6, 2017"
output: ioslides_presentation
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, cache = FALSE)
require(ggplot2)
require(scales)
require(dplyr)
require(tidyr)
require(readr)
require(mosaic)
```

## Is Paul the Octopus Psychic?

Recall our procedure for hypothesis testing:

1. Collect **data**: for each of 8 trials, was the prediction correct?
2. Calculate a **sample statistic** (called the test statistic):
    * $x =$ total number correct (8 in our case)
3. Obtain the **sampling distribution** of the test statistic, assuming a **null hypothesis** of no effect (in this case, assuming Paul is just guessing; see the simulation sketch below)
4. Calculate the **p-value**: probability of getting a test statistic "at least as extreme" as what we observed in step 2
5. If the p-value is low, reject the null hypothesis and conclude that Paul is psychic!
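
* One way to carry out steps 3 and 4 is by simulation; a minimal sketch, using the `rflip()` and `do()` functions from the mosaic package loaded in the setup chunk:

```{r, echo = TRUE}
# Simulate 10000 sets of 8 fair-coin guesses under the null hypothesis
# and see how often all 8 come out correct (approximates the p-value)
set.seed(1)
sims <- do(10000) * rflip(8, prob = 0.5)
mean(sims$heads >= 8)
```
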
## More Carefully...
* **Test Statistic**: $x = 8$ (observed number correct)
* **Null Hypothesis**: Paul was just guessing: $p = 0.5$
* **Alternative Hypothesis**: Paul is psychic: $p > 0.5$
* **Sampling Distribution**, assuming null hypothesis is true:
$$X \sim \text{Binomial}(8, 0.5)$$
(check assumptions!)
* **p-value**:
$$P(X \geq 8) = 0.0039$$
```{r, echo = FALSE, fig.height=2.25, fig.width=3.3}
Paul_success_probs <- data.frame(
  num_successes = seq(from = 0, to = 8),
  pv = factor(c(rep(0, 8), 1)),
  probability = dbinom(x = seq(from = 0, to = 8), size = 8, prob = 0.5))

ggplot() +
  geom_col(mapping = aes(x = num_successes, y = probability, fill = pv),
           data = Paul_success_probs) +
  xlab("Number of Successes") +
  scale_fill_manual("p-value", values = c("black", "red"))
```
* **Conclusion**: It's unlikely that Paul would get 8/8 right if he were just guessing, so we reject the null hypothesis and conclude that he is psychic!
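
* We can check this p-value directly; a quick verification using base R's `pbinom()`:

```{r, echo = TRUE}
# P(X >= 8) when X ~ Binomial(8, 0.5), i.e. all 8 guesses correct by chance
pbinom(7, size = 8, prob = 0.5, lower.tail = FALSE)
```
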
## IMPORTANT FACT!!!

* Hypothesis tests are **guaranteed** to tell you the wrong thing sometimes!
* This is similar to the fact that confidence intervals are **guaranteed** to miss the population parameter sometimes.
* More on this in a few days.

## More on Hypotheses

* **Null Hypothesis** (shorthand: $H_0$)
    * Nothing has changed since the past...
    * People are just guessing...
    * Nothing interesting is going on...
    * $p =$ (proportion from the past)/(chance of being right if just guessing)/(etc...)
* **Alternative Hypothesis** (shorthand: $H_A$)
    * Times have changed!
    * People know what they're doing!
    * The world is fascinating!
    * $p \neq$ (value from null hypothesis)
    * $p >$ (value from null hypothesis)
    * $p <$ (value from null hypothesis)

## Examples of Hypotheses

* Paul the Octopus, 8 right out of 8
    * Null Hypothesis ($H_0$): $p = 0.5$
    * Alternative Hypothesis ($H_A$): $p > 0.5$
* Proportion of M\&M's that are blue (concerned it's lower now!); 12 blue out of 100
    * Null Hypothesis ($H_0$): $p = 0.16$
    * Alternative Hypothesis ($H_A$): $p < 0.16$
* The National Center for Education Statistics released a report in 1996 saying that 66% of students had missed at least one day of school in the past month. A more recent survey of 8302 students found that 5562 of them had missed at least one day of school. Has the rate of absenteeism changed?
    * Null Hypothesis ($H_0$): $p = 0.66$
    * Alternative Hypothesis ($H_A$): $p \neq 0.66$

## More on P-Values

* **p-value**: probability of getting a test statistic "at least as extreme" as what we observed, assuming $H_0$ is true
* What counts as "at least as extreme" depends on the form of the alternative hypothesis

## P-Values for One-Sided Tests
* Paul predicts 8 of 8 correctly
    * $H_0$: $p = 0.5$
    * $H_A$: $p > 0.5$
    * **p-value**: $P(X \geq 8) = 0.0039$
      if $X \sim \text{Binomial}(8, 0.5)$
```{r, echo = FALSE, fig.height=3.5, fig.width=3.75}
Paul_success_probs <- data.frame(
  num_successes = seq(from = 0, to = 8),
  pv = factor(c(rep(0, 8), 1)),
  probability = dbinom(x = seq(from = 0, to = 8), size = 8, prob = 0.5))

ggplot() +
  geom_col(mapping = aes(x = num_successes, y = probability, fill = pv),
           data = Paul_success_probs) +
  xlab("Number of Successes") +
  scale_fill_manual("Included\nin p-value", labels = c("No", "Yes"), values = c("black", "red"))
```
* 12 Blue M\&M's out of 100
    * $H_0$: $p = 0.16$
    * $H_A$: $p < 0.16$
    * **p-value**: $P(X \leq 12) = 0.1703$
      if $X \sim \text{Binomial}(100, 0.16)$
```{r, echo = FALSE, fig.height=3.5, fig.width=3.75}
# Binomial(100, 0.16) probabilities for the blue M&M example
mm_success_probs <- data.frame(
  num_successes = seq(from = 0, to = 100),
  pv = factor(c(rep(1, 13), rep(0, 88))),
  probability = dbinom(x = seq(from = 0, to = 100), size = 100, prob = 0.16))

ggplot() +
  geom_col(mapping = aes(x = num_successes, y = probability, fill = pv),
           data = mm_success_probs) +
  xlab("Number of Successes") +
  scale_fill_manual("Included\nin p-value", labels = c("No", "Yes"), values = c("black", "red"))
```
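
* A quick numerical check of both one-sided p-values with base R's `pbinom()`:

```{r, echo = TRUE}
pbinom(7, size = 8, prob = 0.5, lower.tail = FALSE)   # P(X >= 8), Paul
pbinom(12, size = 100, prob = 0.16)                   # P(X <= 12), blue M&M's
```
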
## P-Values for Two-Sided Tests
* 5562 out of 8302 students missed at least one day of school.
* $H_0$: $p = 0.66$, $H_A$: $p \neq 0.66$
* If $H_0$ is true, $X \sim \text{Binomial}(8302, 0.66)$
* "At least as extreme": at least as far from the expected value
* $E(X) = np = 8302 \times 0.66 = 5479.32$
```{r, warning=FALSE, echo = FALSE, fig.height=2, fig.width=4}
# Binomial(8302, 0.66) probabilities for the absenteeism example
absence_success_probs <- data.frame(
  num_successes = seq(from = 0, to = 8302),
  pv = 0,
  probability = dbinom(x = seq(from = 0, to = 8302), size = 8302, prob = 0.66))

# Flag counts at least as far from the expected count (about 5479) as the
# observed 5562, in either direction
dist_from_mean <- 5562 - 5479
absence_success_probs$pv[absence_success_probs$num_successes >= 5562] <- 1
absence_success_probs$pv[absence_success_probs$num_successes <= 5479 - dist_from_mean] <- 1
absence_success_probs$pv <- factor(absence_success_probs$pv)

ggplot() +
  geom_col(mapping = aes(x = num_successes, y = probability, fill = pv),
           data = absence_success_probs) +
  geom_vline(mapping = aes(xintercept = xintercept), color = "red", linetype = 2,
             data = data.frame(xintercept = 5479.32)) +
  xlab("Number of Successes") +
  scale_fill_manual("Included\nin p-value", labels = c("No", "Yes"), values = c("black", "red"))
```
```{r, warning=FALSE, echo = FALSE, fig.height=2, fig.width=4}
# Same plot as above, zoomed in near the expected count
absence_success_probs <- data.frame(
  num_successes = seq(from = 0, to = 8302),
  pv = 0,
  probability = dbinom(x = seq(from = 0, to = 8302), size = 8302, prob = 0.66))

dist_from_mean <- 5562 - 5479
absence_success_probs$pv[absence_success_probs$num_successes >= 5562] <- 1
absence_success_probs$pv[absence_success_probs$num_successes <= 5479 - dist_from_mean] <- 1
absence_success_probs$pv <- factor(absence_success_probs$pv)

ggplot() +
  geom_col(mapping = aes(x = num_successes, y = probability, fill = pv),
           data = absence_success_probs) +
  geom_vline(mapping = aes(xintercept = xintercept),
             color = "red",
             linetype = 2,
             size = 1,
             data = data.frame(xintercept = 5479.32)) +
  xlab("Number of Successes") +
  xlim(c(5250, 5700)) +
  scale_fill_manual("Included\nin p-value", labels = c("No", "Yes"), values = c("black", "red"))
```
* R actually does something slightly different, but the results will usually be the same as what's described here.
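
* Following the description above (rather than R's exact rule), a sketch of the calculation with base R's `pbinom()`: the observed 5562 is about 83 above $E(X) = 5479.32$, so we include counts of 5562 or more and counts of 5396 or fewer.

```{r, echo = TRUE}
pbinom(5561, size = 8302, prob = 0.66, lower.tail = FALSE) +  # P(X >= 5562)
  pbinom(5396, size = 8302, prob = 0.66)                      # + P(X <= 5396)
```
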
## Calculation of P-Values in R
* Suppose we have a data frame with a variable indicating success/failure:
```{r, echo = FALSE}
paul_guesses <- data.frame(result = rep("correct", 8))
```
```{r, echo = TRUE}
paul_guesses
```
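
* With data in this raw form, one way to count the successes first is mosaic's `tally()` (mosaic is loaded in the setup chunk):

```{r, echo = TRUE}
tally(~ result, data = paul_guesses)
```
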
## Calculation of P-Values in R (Cont'd)
* One-sided: $H_A$: $p > 0.5$
```{r, echo = TRUE}
binom.test(paul_guesses$result,
           success = "correct",
           p = 0.5,
           alternative = "greater")
```
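
* The call above relies on the mosaic package to count successes from the raw data; specified from counts instead, the same test also works with base R's `binom.test()`:

```{r, echo = TRUE}
# 8 successes out of 8 trials, testing H_0: p = 0.5 against H_A: p > 0.5
binom.test(x = 8, n = 8, p = 0.5, alternative = "greater")
```
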
## Calculation of P-Values in R (Cont'd)
* One-sided: $H_A$: $p < 0.16$
* Suppose we know the number of trials ($n = 100$ M\&M's) and number of successes ($x = 12$ blue)
```{r, echo = TRUE}
binom.test(x = 12,
           n = 100,
           p = 0.16,
           alternative = "less")
```
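
* `binom.test()` returns an object we can store and query; for example (the name `mm_test` is just for illustration):

```{r, echo = TRUE}
mm_test <- binom.test(x = 12, n = 100, p = 0.16, alternative = "less")
mm_test$p.value   # just the p-value
```
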
## Calculation of P-Values in R (Cont'd)
* Two-sided: $H_A$: $p \neq 0.66$
* Suppose we know the number of trials ($n = 8302$ students) and number of successes ($x = 5562$ missed school)
```{r, echo = TRUE}
binom.test(x = 5562,
           n = 8302,
           p = 0.66,
           alternative = "two.sided")
```

## Drawing Conclusions

* **p-value**: probability of getting a test statistic at least as extreme as what we observed, assuming $H_0$ is true
    * e.g., probability of getting at least 8 predictions right if Paul is just guessing
* If the p-value is **small**, that is evidence that the null hypothesis may not be true

## Drawing Conclusions

* **p-value**: probability of getting a test statistic at least as extreme as what we observed, assuming $H_0$ is true
    * e.g., probability of getting at least 8 predictions right if Paul is just guessing
* If the p-value is **small**, that is evidence that the null hypothesis may not be true
* If we need to make a decision about whether or not the null hypothesis is true, we can see whether the p-value is smaller than a cutoff of our choosing
    * The cutoff is the **significance level** of the test
    * Denote the significance level by $\alpha$ (alpha)
    * A common significance level: $\alpha = 0.05$
        * But this choice is arbitrary

## Drawing Conclusions
* If the p-value $< \alpha$, we "reject" $H_0$: the data offer enough evidence to conclude that $H_0$ is not true at the significance level $\alpha$.
* If the p-value $\geq \alpha$, we "fail to reject" $H_0$: the data **don't** offer enough evidence to conclude that $H_0$ is not true at the significance level $\alpha$.
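
* A small sketch of this decision rule for the absenteeism example, using $\alpha = 0.05$ (the object name `absent_test` is just for illustration):

```{r, echo = TRUE}
absent_test <- binom.test(x = 5562, n = 8302, p = 0.66,
                          alternative = "two.sided")
absent_test$p.value < 0.05   # reject H_0 only if this is TRUE
```
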
## Note About the Book

* The procedure described in these slides is different from what's in the book.
* Our method uses
    * **sample statistic** = number of successes in the sample
    * **sampling distribution** = modeled with a Binomial
* The book's method uses
    * **sample statistic** = proportion of successes in the sample
    * **sampling distribution** = modeled with a Normal
* Everything else is the same (hypotheses, p-values, conclusions), and both methods are valid.
* Our procedure requires less work:
    * fewer assumptions to check (more broadly applicable)
    * fewer computations (e.g. no need to calculate $\sqrt{p(1-p)/n}$)
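
* For comparison, a rough sketch of the book's Normal-based approach applied to the absenteeism example (the variable names here are just for illustration):

```{r, echo = TRUE}
p0    <- 0.66                          # value of p under H_0
n     <- 8302
p_hat <- 5562 / n                      # sample proportion (test statistic)
se    <- sqrt(p0 * (1 - p0) / n)       # standard error assuming H_0 is true
z     <- (p_hat - p0) / se             # standardized test statistic
2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value (close to the Binomial-based answer)
```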