Supreme Court Case Buck v. Davis

Introduction

On October 5th, 2016, the United States Supreme Court heard arguments in the case of Buck v. Davis. The case was about possible differences in how often appeals are granted in different circuits of the US Federal Appeals Courts (each circuit has its own court that makes decisions about appeals for a group of several states – there’s a map here: http://www.uscourts.gov/sites/default/files/u.s._federal_courts_circuit_map_1.pdf). Here are some excerpts from an article at NPR about the case (you can read the full article at http://www.npr.org/2016/10/05/496769580/supreme-court-hears-indefensible-death-penalty-case-where-race-linked-to-violenc).

“The Supreme Court heard arguments Wednesday in the case of Duane Buck, a convicted Texas murderer sentenced to die after a psychologist testified that he was more likely to commit violent crimes in the future because he is black.

When a similar case involving the same psychologist went to the Supreme Court in 2000, Texas conceded error. It also found six more cases in which Quijano [the psychologist] linked race to violence, and it pledged to allow all seven defendants to bring appeals for new sentencing hearings. The state delivered on that promise, except in Buck’s case.

Inside the Supreme Court chamber Wednesday, there seemed little doubt that would change. The question was how.

Would the justices just say that the 5th Circuit Court of Appeals was wrong to deny Buck a sentencing appeal, a decision that would only affect Buck? Or would they rule that the Fifth Circuit is an outlier in death penalty appeals and that its whole approach is wrong?”

In order to make an argument that the Fifth Circuit has unreasonable requirements for allowing death penalty appeals, Buck’s lawyers gathered data about the number of cases in each circuit where such appeals were denied over the last 5 years. These can be found in Appendix F of the petition to take the case to the Supreme Court, here: http://www.scotusblog.com/wp-content/uploads/2016/05/Filed-Buck-Cert.-Pet._3-2.pdf. I’ve summarized the data in the following table:

Circuit Number    Number of Appeal Requests    Number Denied
4th               12                           0
5th               129                          76
11th              111                          11

Let’s use these data to compare the rates at which requests for appeals are denied by the 5th and 11th circuits. How strong is the evidence for Justice Kagan’s conclusion that “one of these two circuits is doing something wrong” – i.e., the rate of appeal request denials is different in the two circuits? (In the end, Buck was granted an appeal.)
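
As a starting point, the counts in the table can be entered into R and the observed denial rates computed directly. This is only a minimal sketch: the data frame name `appeals` and its column names are illustrative choices, not part of the original analysis.

```r
# Counts taken from the table above.
# The object and column names are illustrative, not from the original analysis.
appeals <- data.frame(
  circuit  = c("4th", "5th", "11th"),
  requests = c(12, 129, 111),
  denied   = c(0, 76, 11)
)

# Observed proportion of appeal requests denied in each circuit
appeals$prop_denied <- appeals$denied / appeals$requests
print(appeals)
```

The observed rates for the 5th and 11th circuits are the two sample proportions that the formal comparison is built on.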

Check Assumptions

There are five (!) assumptions to check when comparing proportions from two groups:

1. Two outcomes

SOLUTION:

There are two outcomes: each appeal request was either denied or not.

2. Same probability of success for all trials within each group (but the probability may be different in different groups)

SOLUTION:

It seems reasonable to think there is some overall rate at which appeal requests are denied or granted within each circuit.

3. Independence

Within each group, the data should be based on results for independent individuals. (If I know the outcome for one individual, does that change the probability of denial for another individual in the same circuit?)

SOLUTION:

This assumption may not hold exactly. We might imagine that there could be some cases where the defendants filing for appeals were involved in the same case, or had the same lawyer or the same expert witness. Decisions on appeal requests in cases like that could be connected. In fact, we saw one example of this in the NPR article: there were 7 cases with the same expert witness, and the result of the 2000 decision was that appeals should be granted in all 7 of those cases. Of course, it didn’t exactly work out that way, but appeals were granted in 6 of those 7 cases. This sort of connection between cases indicates a lack of complete independence.

4. Independent Groups

The two groups we’re comparing must be independent of each other.

SOLUTION:

This assumption seems more plausible. Across the two different circuits, there is less likely to be a connection between different court cases.

5. Sample Size

Within each group, is the sample size large enough? We need to expect at least 10 “successes” and at least 10 “failures” within each group to perform inference based on a Normal distribution.

SOLUTION:

This assumption holds. In both the 5th and 11th circuits, at least 10 appeal requests were granted and at least 10 were denied. Note that this condition does not hold for the 4th circuit, where no requests were denied, so we could not use the methods we’re discussing here to analyze data from that circuit.
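
The sample size condition can also be checked mechanically. The sketch below counts denials and grants in each circuit and flags whether both are at least 10. The vector names are my own, and "denied" is treated as the "success" purely by convention.

```r
# Requests and denials per circuit, from the data table
requests <- c("4th" = 12, "5th" = 129, "11th" = 111)
denied   <- c("4th" = 0,  "5th" = 76,  "11th" = 11)
granted  <- requests - denied

# The condition holds for a circuit only if BOTH counts are at least 10
size_ok <- denied >= 10 & granted >= 10
print(data.frame(denied, granted, size_ok))
```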

Hypotheses

Regardless of your answers about the assumptions, we will conduct a hypothesis test for the claim that the 5th and 11th circuits deny appeal requests at different rates. Here, state the null and alternative hypotheses. Define all symbols that you use (what are the parameters?)

SOLUTION:

Let’s define the parameter \(p_1\) to be the proportion of appeal requests in the 5th circuit where the appeal was denied and \(p_2\) to be the proportion of appeal requests in the 11th circuit where the appeal was denied.

\(H_0: p_1 - p_2 = 0\)

\(H_A: p_1 - p_2 \neq 0\)

Computations in R

Use the prop.test function in R to calculate a p-value for the test and a 95% confidence interval.

SOLUTION:

# Two-sample test of equal proportions: x = number denied, n = number of requests (5th circuit, then 11th circuit)
prop.test(x = c(76, 11), n = c(129, 111), conf.level = 0.95)
## 
##  2-sample test for equality of proportions with continuity
##  correction
## 
## data:  c(76, 11) out of c(129, 111)
## X-squared = 60, df = 1, p-value = 1e-14
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  0.3802 0.5999
## sample estimates:
## prop 1 prop 2 
## 0.5891 0.0991

Hypothesis Test

Conduct a hypothesis test for the claim that the 5th and 11th circuits deny appeal requests at different rates. Use the \(\alpha = 0.05\) significance level. Based on the p-value you calculated above, make a decision and interpret it in context.

SOLUTION:

The p-value for the test that the proportion of appeal requests denied is the same in the 5th circuit and the 11th circuit is \(9.993e-15 = 9.993 \times 10^{-15} = 0.000000000000009993\). This is much less than \(\alpha = 0.05\), so we reject the null hypothesis: the data offer very strong evidence that the proportion of appeal requests denied is different in the 5th circuit and the 11th circuit.
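
The printed prop.test output rounds the p-value, so one way to see the unrounded value is to pull it out of the object that prop.test returns. The sketch below also recomputes the continuity-corrected chi-squared statistic by hand as a check. The variable names are illustrative, and the hand formula is my reading of the standard Yates-corrected statistic for a 2x2 table rather than something stated in this lab.

```r
x <- c(76, 11)     # denials in the 5th and 11th circuits
n <- c(129, 111)   # appeal requests in the 5th and 11th circuits

res <- prop.test(x = x, n = n, conf.level = 0.95)
print(res$p.value)   # unrounded p-value stored in the test object

# Hand computation of the continuity-corrected chi-squared statistic
p_pooled <- sum(x) / sum(n)                        # common denial rate under H_0
observed <- cbind(denied = x, granted = n - x)
expected <- cbind(n * p_pooled, n * (1 - p_pooled))
stat <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Upper-tail chi-squared probability with 1 degree of freedom
p_by_hand <- pchisq(stat, df = 1, lower.tail = FALSE)
print(c(statistic = stat, p_value = p_by_hand))
```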

Confidence Interval

Interpret the confidence interval you calculated above in the context of this problem. What does it mean that you are “95% confident”?

SOLUTION:

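We are 95% confident that the difference in the proportion of appeal requests that are denied (5th circuit minus 11th circuit) is between 0.380 and 0.600. In other words, the denial rate in the 5th circuit is estimated to be between 38.0 and 60.0 percentage points higher than in the 11th circuit.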
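
To see where the interval endpoints come from, the sketch below rebuilds the interval by hand: the difference in sample proportions, plus or minus a Normal margin of error with a continuity correction. The variable names are illustrative, and the formula is my understanding of what prop.test does for two groups, so the code compares the result against prop.test rather than taking it on faith.

```r
x <- c(76, 11)     # denials in the 5th and 11th circuits
n <- c(129, 111)   # appeal requests in the 5th and 11th circuits

p_hat    <- x / n
diff_hat <- p_hat[1] - p_hat[2]                  # 5th minus 11th
se       <- sqrt(sum(p_hat * (1 - p_hat) / n))   # unpooled standard error
z_star   <- qnorm(0.975)                         # critical value for 95% confidence
correction <- 0.5 * sum(1 / n)                   # continuity correction

ci_by_hand <- diff_hat + c(-1, 1) * (z_star * se + correction)
ci_r <- as.numeric(prop.test(x = x, n = n, conf.level = 0.95)$conf.int)

print(rbind(ci_by_hand, ci_r))
```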

If we imagined taking many different “random samples” of cases in each of these circuits and calculating a different 95% confidence interval from each sample, about 95% of those confidence intervals would contain the underlying difference in proportions of appeal requests denied in these two circuits. It’s not totally clear what it means to take a random sample of appeal requests, but I’m imagining a population of many slightly different “potential cases” that could have occurred in each circuit.
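
The repeated-sampling idea can be illustrated with a small simulation. This is a sketch under assumptions that are not part of the data: the "true" denial rates `p_true` are made up (chosen near the observed rates), the seed and number of simulations are arbitrary, and cases are simulated as independent.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
n <- c(129, 111)             # sample sizes from the two circuits
p_true <- c(0.59, 0.10)      # ASSUMED true denial rates, for illustration only
true_diff <- p_true[1] - p_true[2]
n_sims <- 2000               # arbitrary number of simulated data sets

covered <- replicate(n_sims, {
  x  <- rbinom(2, size = n, prob = p_true)   # simulated denial counts
  ci <- prop.test(x = x, n = n, conf.level = 0.95)$conf.int
  ci[1] <= true_diff && true_diff <= ci[2]   # did this interval capture the truth?
})

coverage <- mean(covered)    # share of intervals containing the true difference
print(coverage)
```

The share of simulated intervals that contain the assumed true difference should land near 0.95. It will not be exactly 0.95, because of simulation noise and because the continuity correction widens each interval slightly.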

Limitations

The evidence here is pretty compelling. Is it strong enough to bring to the Supreme Court? Discuss at least one reason why the analysis we’ve done here does NOT definitively prove that there is a problem with the way the Fifth Circuit makes decisions about appeals. What would you do to address those limitations if you were on Buck’s legal team?

SOLUTION:

Our findings certainly suggest that something is going on here. However, there are several potential limitations to this analysis:

  1. This is an observational data set, and with observational data there is always the possibility that observed differences are due to a lurking variable. Perhaps there is something fundamentally different between the types of cases that are appealed in the two circuits. For example, maybe defense attorneys are more aggressive in filing appeal requests in the 5th circuit, and so they file more appeal requests for cases that are likely to be denied. It might be worth looking into the details of the cases further, perhaps performing matching to compare outcomes in the different circuits for cases that had a similar legal basis.

  2. Our analysis yielded the conclusion that there is a difference in the proportion of appeal requests that are denied between these two circuits. It doesn’t address the question of why there is a difference. Someone could argue that there is a problem with the way the 11th circuit handles appeals, not the 5th circuit (although this seems less likely given the results for the 4th circuit, which were not included in this analysis).

  3. As we discussed above, there may be problems with the assumption of independence. That would mean that our effective sample size isn’t as large as we thought (maybe those 6 cases that were decided all at once should only count as one decision, not 6 separate decisions). If that’s the case, we would have less actual evidence than the sample size suggests. That would mean that our computations of the confidence interval and the p-value were not quite right. However, the p-value is so small that I doubt a more careful accounting for dependence would change the results much.

  4. We could always have made a Type I Error (although the very small p-value says that this is unlikely).

I would want to do a more detailed analysis to try to address the concerns in point 1 above if I were going to bring these data to the Supreme Court.
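
As a rough check on the independence concern in point 3, one could shrink the counts by an assumed factor and rerun the test. The sketch below is only a sensitivity exercise: the "design effect" values in `deff` are hypothetical, not estimated from the data, and dividing the counts and rounding is a crude stand-in for a proper model of dependence between cases.

```r
x <- c(76, 11)       # denials in the 5th and 11th circuits
n <- c(129, 111)     # appeal requests in the 5th and 11th circuits
deff <- c(1, 2, 4)   # HYPOTHETICAL design effects (1 = full independence)

p_values <- sapply(deff, function(d) {
  x_eff <- round(x / d)   # effective number of denials
  n_eff <- round(n / d)   # effective number of requests
  prop.test(x = x_eff, n = n_eff)$p.value
})
names(p_values) <- paste0("deff=", deff)
print(p_values)
```

For these assumed values the p-value grows as the effective sample size shrinks but stays below 0.05, which is consistent with the expectation stated in point 3.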