November 13, 2017

Outline of Hypothesis Tests (Again)

  1. Collect Data: (For each of 8 attempts, was Paul's prediction right?)
  2. Calculate a test statistic: \(x = 8\) (observed number correct)
  3. Write down hypotheses:
    • Null Hypothesis: Paul was just guessing: \(p = 0.5\)
    • Alternative Hypothesis: Paul is psychic: \(p > 0.5\)
  4. Find the sampling distribution of the test statistic, assuming the null hypothesis is true.
  5. p-value: probability of getting a test statistic at least as extreme as what we observed, assuming null hypothesis is true.
  6. Conclusion: Compare the p-value to the significance level \(\alpha\). If the p-value is smaller than \(\alpha\), it's unlikely that Paul would get 8/8 right if he were just guessing, so we reject the null hypothesis.
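
For the Paul example, steps 4 through 6 can be sketched in R. This is a minimal sketch that assumes the predictions are independent, so that the number correct follows a Binomial(8, 0.5) distribution under the null hypothesis. The variable names and the choice of \(\alpha = 0.05\) are illustrative and not from the slides.

```r
n_attempts <- 8   # number of predictions
x_obs <- 8        # observed number correct (the test statistic)
p_null <- 0.5     # probability of a correct prediction if Paul is just guessing

# One-sided p-value: P(X >= 8) when X ~ Binomial(8, 0.5)
p_value <- pbinom(x_obs - 1, size = n_attempts, prob = p_null, lower.tail = FALSE)
p_value

# The built-in exact binomial test reports the same p-value
binom.test(x_obs, n_attempts, p = p_null, alternative = "greater")$p.value

alpha <- 0.05     # illustrative significance level (an assumption)
p_value < alpha   # TRUE means we reject the null hypothesis
```

Because 8 is the largest possible count, the p-value here is just the probability of 8 correct out of 8, which is \(0.5^8\).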

Example: Body Temperatures

  • It's generally believed that the average body temperature is 98.6 degrees Fahrenheit (37 degrees Celsius).
  • Let's investigate with measurements of the temperatures of 130 adults.

  • Hypotheses:
    • \(H_0\): \(\mu = 98.6\)
    • \(H_A\): \(\mu \neq 98.6\)
  • What should our test statistic be?

A Key Result from Last Class

  • \(\bar{X} \sim \text{Normal}(\mu, \sigma / \sqrt{n})\)
    • Across all samples, on average the sample mean is equal to the population mean \(\mu\).
    • The standard deviation of \(\bar{X}\) is \(\frac{1}{\sqrt{n}}\) times the standard deviation \(\sigma\) of values in the population.
  • \[\frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim \text{Normal}(0, 1)\]
    • \(\frac{\bar{X} - \mu}{\sigma / \sqrt{n}}\) is the distance of \(\bar{X}\) from \(\mu\), in units of \(SD(\bar{X})\).
  • \[\frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1} \text{ (replace $\sigma$ with its estimate, $s$).}\]
    • \(\frac{\bar{X} - \mu}{s / \sqrt{n}}\) is the distance of \(\bar{X}\) from \(\mu\), in units of \(SE(\bar{X})\).
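
A small simulation can make the first result concrete. This is a sketch under assumed values: the population is taken to be Normal with a hypothetical mean of 98.6 and standard deviation of 0.73, and the number of simulated samples is arbitrary.

```r
set.seed(1)       # for reproducibility
mu <- 98.6        # hypothetical population mean (an assumption)
sigma <- 0.73     # hypothetical population standard deviation (an assumption)
n <- 130          # sample size
n_sims <- 10000   # number of simulated samples

# Draw many samples of size n and record each sample mean
x_bars <- replicate(n_sims, mean(rnorm(n, mean = mu, sd = sigma)))

mean(x_bars)      # should be close to mu
sd(x_bars)        # should be close to sigma / sqrt(n)
sigma / sqrt(n)
```

The spread of the simulated sample means should be much smaller than `sigma`, by roughly a factor of \(\sqrt{n}\).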

Test Statistic for a Mean

  • Let's define our test statistic to be \[t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \text{, where}\] \(\mu_0\) is the value of \(\mu\) specified in \(H_0\) (98.6 in this case)
  • How far was the sample mean from the hypothesized population mean, in units of our best guess at the standard deviation of \(\bar{X}\)?
  • If the null hypothesis is true, then \[t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \sim t_{n - 1}\]

Conditions to Check

  • Observations are independent
  • Population is nearly normal (unimodal, approximately symmetric)…
  • …and sample size \(n\) is large enough (how big depends on how asymmetric distribution is)
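
The shape condition is usually checked with plots. The sketch below uses simulated stand-in data rather than the actual measurements, and the skewness calculation is one rough numeric summary among several possible ones. Independence cannot be checked this way; it has to be judged from how the data were collected.

```r
set.seed(2)
# Stand-in data: simulated values, NOT the real body temperature measurements
x <- rnorm(130, mean = 98.2, sd = 0.7)

hist(x, main = "Histogram", xlab = "Value")  # look for a single, roughly symmetric peak
qqnorm(x)                                    # points close to the line suggest near-normality
qqline(x)

# Rough numeric check of symmetry: sample skewness (near 0 for symmetric data)
skewness <- mean((x - mean(x))^3) / sd(x)^3
skewness
```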

Back to Body Temperatures

Assumptions for hypothesis tests about means:

  • Independence
  • Data distribution is nearly normal (unimodal and symmetric)
  • Sufficient sample size

Hypotheses

  • Null Hypothesis (\(H_0\)): \(\mu = 98.6\) (where \(\mu\) is the population mean temperature)
  • Alternative Hypothesis (\(H_A\)): \(\mu \neq 98.6\)

Test Statistic

nrow(bodytemp)
## [1] 130
mean(bodytemp$temp)
## [1] 98.24923
sd(bodytemp$temp)
## [1] 0.7331832

\[ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{98.2492 - 98.6}{0.7332 / \sqrt{130}} \approx -5.455 \]

Test Statistic in R

\[ t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \]

n <- nrow(bodytemp)
x_bar <- mean(bodytemp$temp)
s <- sd(bodytemp$temp)
mu_0 <- 98.6

t <- (x_bar - mu_0) / (s / sqrt(n))
t
## [1] -5.454823

P-value

  • Probability of getting a test statistic at least as extreme as what we observed, assuming the null hypothesis is true.
  • "At least as extreme" in either direction, since \(H_A: \mu \neq 98.6\)
  • \(t \sim t_{129}\) (since \(n = 130\) and the degrees of freedom is \(n - 1\))

Calculation of p-value

pt(-5.455, df = 129) # probability to the left of -5.455
## [1] 1.204343e-07
1 - pt(5.455, df = 129) # probability to the right of 5.455
## [1] 1.204343e-07
  • Combined (two-sided) p-value is the sum of the two tail probabilities: about 0.000000241
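
Since the t distribution is symmetric, the two tail probabilities can also be combined in one step. This sketch plugs in the summary statistics printed earlier instead of the data frame, and uses the name `t_stat` (not from the slides) to avoid reusing the name of R's `t()` function.

```r
n <- 130           # sample size
x_bar <- 98.24923  # sample mean
s <- 0.7331832     # sample standard deviation
mu_0 <- 98.6       # value of mu under the null hypothesis

t_stat <- (x_bar - mu_0) / (s / sqrt(n))

# Two-sided p-value: twice the area in the tail beyond |t_stat|
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
p_value
```

The result should be about 2.41e-07. Using `-abs(t_stat)` makes the same line work whether the observed statistic is negative or positive.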

Alternative Calculation in R

t.test(bodytemp$temp, mu = 98.6, alternative = "two.sided")
## 
##  One Sample t-test
## 
## data:  bodytemp$temp
## t = -5.4548, df = 129, p-value = 2.411e-07
## alternative hypothesis: true mean is not equal to 98.6
## 95 percent confidence interval:
##  98.12200 98.37646
## sample estimates:
## mean of x 
##  98.24923
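
The 95 percent confidence interval in this output can be reproduced by hand with the usual one-sample t interval, \(\bar{X} \pm t^* \cdot s / \sqrt{n}\). The sketch below uses the summary statistics printed earlier; the variable names are illustrative.

```r
n <- 130           # sample size
x_bar <- 98.24923  # sample mean
s <- 0.7331832     # sample standard deviation

se <- s / sqrt(n)                 # standard error of the sample mean
t_crit <- qt(0.975, df = n - 1)   # critical value for 95% confidence

ci <- x_bar + c(-1, 1) * t_crit * se
ci
```

The hypothesized value 98.6 lies outside this interval, which is consistent with rejecting \(H_0\) in a two-sided test at \(\alpha = 0.05\).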

Conclusion

  • Compare the p-value to the significance level \(\alpha\). For example, if \(\alpha = 0.001\) then

\[0.000000241 < 0.001 \text{, so}\]

  • The data provide enough evidence to conclude that the mean temperature is not 98.6 degrees F, at the \(\alpha = 0.001\) significance level.

From Wikipedia

"The range for normal human body temperatures, taken orally, is 36.8 \(\pm\) 0.5 °C (98.2 \(\pm\) 0.9 °F). This means that any oral temperature between 36.3 and 37.3 °C (97.3 and 99.1 °F) is likely to be normal.

The normal human body temperature is often stated as 36.5-37.5 °C (97.7-99.5 °F). In adults a review of the literature has found a wider range of 33.2-38.2 °C (91.8-100.8 °F) for normal temperatures, depending on the gender and location measured."