Education and Mortality

The following data set has the mortality rate (deaths per 100,000 people) and the education level (average number of years in school) for 58 U.S. cities.

library(readr)
death <- read_csv("https://mhc-stat140-2017.github.io/data/sdm4/Education_and_mortality.csv")

## Parsed with column specification:
## cols(
##   Mortality = col_double(),
##   Education = col_double()
## )

head(death)

## # A tibble: 6 x 2
##   Mortality Education
##       <dbl>     <dbl>
## 1     921.9      11.4
## 2     997.9      11.0
## 3     962.4       9.8
## 4     982.3      11.1
## 5    1071.3       9.6
## 6    1030.4      10.2

nrow(death)

## [1] 58

1. Initial Questions

What are the observational units?
Are the variables quantitative or categorical?

SOLUTION:

2. Scatterplot

Make a scatterplot of the data, thinking of Education as the explanatory variable and Mortality as the response.

SOLUTION:
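
A minimal sketch of one way to draw the plot, using base R graphics (ggplot2 would work equally well). To stay runnable without a download, it uses only the six rows printed by head(death) above, stored in a hypothetical data frame named death_head. In the lab, use the full death data frame instead.

```r
# Illustration only: the six rows shown by head(death) above.
# `death_head` is a hypothetical name; use the full `death` data in the lab.
death_head <- data.frame(
  Mortality = c(921.9, 997.9, 962.4, 982.3, 1071.3, 1030.4),
  Education = c(11.4, 11.0, 9.8, 11.1, 9.6, 10.2)
)

# Explanatory variable on the x-axis, response on the y-axis.
plot(
  Mortality ~ Education,
  data = death_head,
  xlab = "Education (average years in school)",
  ylab = "Mortality (deaths per 100,000 people)",
  main = "Mortality vs. education"
)
```

When reading the plot, look at direction, form (roughly straight or curved), strength, and any unusual points.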

3. Are the assumptions for a linear model met?

Check all the assumptions you can check without actually fitting the model.

SOLUTION:

4. Regardless of your answer above, go ahead and fit the linear model.

SOLUTION:
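
One possible way to fit the model is with lm, sketched below. The data frame death_head (only the six rows printed by head(death) above) and the object name fit are assumptions made for illustration; in the lab, fit the model to the full death data frame.

```r
# Illustration only: the six rows shown by head(death) above.
# `death_head` and `fit` are hypothetical names; use `death` in the lab.
death_head <- data.frame(
  Mortality = c(921.9, 997.9, 962.4, 982.3, 1071.3, 1030.4),
  Education = c(11.4, 11.0, 9.8, 11.1, 9.6, 10.2)
)

# Response on the left of ~, explanatory variable on the right.
fit <- lm(Mortality ~ Education, data = death_head)

# Coefficient table: estimates, standard errors, t statistics, p-values.
summary(fit)
```

The row labelled Education in the coefficient table holds the slope estimate and its standard error, which the later questions use.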

5. Check any assumptions you couldn’t check before you fit the model. Should we go ahead with using this model?

You’ll need to make a plot of the residuals.

SOLUTION:
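
A residual plot could be sketched as follows, again with base R graphics. The data frame death_head (the six rows printed by head(death) above) and the object name fit are assumptions for illustration; with the full death data the same calls apply.

```r
# Illustration only: the six rows shown by head(death) above.
# `death_head` and `fit` are hypothetical names; use `death` in the lab.
death_head <- data.frame(
  Mortality = c(921.9, 997.9, 962.4, 982.3, 1071.3, 1030.4),
  Education = c(11.4, 11.0, 9.8, 11.1, 9.6, 10.2)
)
fit <- lm(Mortality ~ Education, data = death_head)

# Residuals against fitted values, with a reference line at zero.
plot(
  fitted(fit), resid(fit),
  xlab = "Fitted mortality",
  ylab = "Residual"
)
abline(h = 0, lty = 2)

# Histogram of the residuals, to judge whether they look roughly normal.
hist(resid(fit), xlab = "Residual", main = "Residuals")
```

In the first plot, look for curvature or a change in vertical spread across the fitted values. In the second, look for strong skewness or outliers.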

6. Explain in context what the regression says about the relationship between education levels and mortality. Interpret both the intercept and the slope in context.

SOLUTION:

7. Conduct a hypothesis test of the claim that there is no relationship between education levels and mortality. State all hypotheses, defining all involved population parameters. Use the p-value from the R output above, and verify that you know how to calculate the p-value using the pt function and the coefficient estimate and standard error. (When I did this, I got p-values of about 6.169e-08 to 6.228e-08, depending on exactly how I did the rounding.)

SOLUTION:
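
The hand calculation might look like the sketch below, assuming the usual two-sided test of \(H_0: \beta_1 = 0\) against \(H_A: \beta_1 \neq 0\). A p-value is a tail probability, so the sketch uses pt, which maps a t statistic to a probability (qt goes the other way, from a probability to a quantile). The data frame death_head (the six rows printed by head(death) above) and all object names are assumptions for illustration, so the numbers it produces will not match the full-data p-values quoted in the question.

```r
# Illustration only: the six rows shown by head(death) above.
# All object names here are hypothetical; use `death` in the lab.
death_head <- data.frame(
  Mortality = c(921.9, 997.9, 962.4, 982.3, 1071.3, 1030.4),
  Education = c(11.4, 11.0, 9.8, 11.1, 9.6, 10.2)
)
fit <- lm(Mortality ~ Education, data = death_head)

# Pull the slope estimate and its standard error from the coefficient table.
coefs <- coef(summary(fit))
est <- coefs["Education", "Estimate"]
se <- coefs["Education", "Std. Error"]

# t statistic for H0: beta_1 = 0, with n - 2 degrees of freedom.
t_stat <- (est - 0) / se
df <- df.residual(fit)

# Two-sided p-value: twice the area beyond |t| in the upper tail.
p_manual <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

# Compare with the p-value reported by summary().
p_manual
coefs["Education", "Pr(>|t|)"]
```

With the full data set of 58 cities, df.residual returns 58 - 2 = 56. Rounding the estimate and standard error before dividing changes the result slightly, which is why the question quotes a small range of p-values.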

8. Obtain a 90% confidence interval for the population slope, \(\beta_1\). Do this using the confint function, and verify that you also know how to do it by hand in R using the qt function together with the estimate and standard error. Interpret the confidence interval for \(\beta_1\) in context.

SOLUTION:
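
Both routes to the interval can be sketched as follows. The hand calculation uses the form estimate ± t* × SE, where t* is the 0.95 quantile of a t distribution with n - 2 degrees of freedom (5% in each tail for 90% confidence). The data frame death_head (the six rows printed by head(death) above) and all object names are assumptions for illustration; in the lab, use the model fit to the full death data.

```r
# Illustration only: the six rows shown by head(death) above.
# All object names here are hypothetical; use `death` in the lab.
death_head <- data.frame(
  Mortality = c(921.9, 997.9, 962.4, 982.3, 1071.3, 1030.4),
  Education = c(11.4, 11.0, 9.8, 11.1, 9.6, 10.2)
)
fit <- lm(Mortality ~ Education, data = death_head)

# Route 1: confint() with the confidence level set to 90%.
ci_confint <- confint(fit, parm = "Education", level = 0.90)

# Route 2: by hand from the estimate, standard error, and t quantile.
coefs <- coef(summary(fit))
est <- coefs["Education", "Estimate"]
se <- coefs["Education", "Std. Error"]
t_star <- qt(0.95, df = df.residual(fit))
ci_manual <- est + c(-1, 1) * t_star * se

ci_confint
ci_manual
```

The two intervals should agree up to rounding. Note that confint defaults to a 95% level, so the level argument is needed here.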

Stat 140: Inference for Simple Linear Regression Lab - Education and Mortality

Evan Ray

November 29, 2017
