Sorry for the delay between posts—I’m in the process of migrating this blog from Jekyll to Hugo. In this post, we’re going to unpack a few more of Fisher’s ideas.

Our playground

We’re using the same setup as in Efron’s paper. In particular, we have the output of a randomized experiment given in the following table.

             yes   no   row sum
treatment      1   15        16
control       13    3        16
column sum    14   18        32

The sample log-odds ratio is

$$\hat\theta = \log\frac{n_{11}\, n_{22}}{n_{12}\, n_{21}} = \log\frac{1 \times 3}{15 \times 13} \approx -4.17.$$
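As a quick numeric check, here’s a short Python sketch (the variable names are mine) that recomputes the sample log-odds ratio from the table:

```python
import numpy as np

# The 2x2 table from the experiment:
#             yes   no
# treatment     1   15
# control      13    3
table = np.array([[1, 15],
                  [13, 3]])

# Sample log-odds ratio: log[(n11 * n22) / (n12 * n21)]
theta_hat = np.log((table[0, 0] * table[1, 1]) / (table[0, 1] * table[1, 0]))
print(theta_hat)  # about -4.17
```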

If we view the table as a multinomial model on its four cells, as Fisher did, we have three degrees of freedom. The would-be fourth degree of freedom gets absorbed by the constraint that the cell probabilities sum to 1.

Conditional inference and approximate ancillarity

Fisher wanted to make inferences about the log-odds ratio $\theta$, so he developed a clever workaround for the nuisance parameters: condition on the marginals of the table. The marginals bring some information to the estimation problem, but not much—this is why Fisher dubbed them approximate ancillary statistics.1 In other words, Fisher reduced the problem to something that depends only on $\theta$, not the other stuff:

$$f_\theta\bigl(\hat\theta \mid \text{marginals of the table}\bigr).$$

Said another way, Fisher replaced the unconditional density of the MLE with a conditional density, i.e.,

$$f_\theta\bigl(\hat\theta\bigr) \;\longrightarrow\; f_\theta\bigl(\hat\theta \mid A\bigr),$$

where $A$ denotes the approximately ancillary statistics—here, the marginals of the table.

Fisher’s final trick was to move from densities to likelihoods. Likelihoods are easy to calculate, so writing

$$f_\theta\bigl(\hat\theta \mid A\bigr) = c\,\frac{L(\theta)}{L(\hat\theta)},$$

where $c$ is a constant, gives us a compact way to do fully efficient estimation in translation families.

Caveat! The multinomial table is not a translation family. Obtaining the conditional distribution requires a less-than-pleasant calculation.
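To give a flavor of that calculation, here’s a minimal Python sketch using SciPy (variable names are my own): conditional on both margins, and with $\theta = 0$, the count in the top-left cell follows a hypergeometric distribution—exactly the distribution behind Fisher’s exact test. For $\theta \ne 0$ the conditional distribution is the noncentral version, which is where the less-than-pleasant algebra lives.

```python
import numpy as np
from scipy.stats import hypergeom, fisher_exact

table = np.array([[1, 15],    # treatment: yes, no
                  [13, 3]])   # control:   yes, no

n11 = table[0, 0]                    # top-left cell (treatment, yes)
treatment_total = table[0].sum()     # 16
yes_total = table[:, 0].sum()        # 14
grand_total = table.sum()            # 32

# With both margins fixed and theta = 0, n11 is hypergeometric:
# of the 14 "yes" outcomes, how many land in the 16 treatment slots?
cond = hypergeom(M=grand_total, n=treatment_total, N=yes_total)
print(cond.pmf(n11))  # conditional probability of the observed cell

# Fisher's exact test does its inference entirely in this conditional world.
print(fisher_exact(table))  # odds ratio and p-value
```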

The magic formula

The magic formula is a modern variant of Fisher’s ratio of likelihoods that includes a Fisher information term:

$$f_\theta\bigl(\hat\theta \mid A\bigr) \;\approx\; c\,\frac{L(\theta)}{L(\hat\theta)}\left[-\frac{\partial^2 \log L(\theta)}{\partial \theta^2}\bigg|_{\theta = \hat\theta}\right]^{1/2}.$$

The term in square brackets has a Fisher information feel to it (perhaps unsurprisingly): it’s the observed Fisher information evaluated at the MLE. Let’s unpack the formula, first as a bit of code, then with a worked example.
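Here’s the formula as a small computational sketch, assuming you can hand it a log-likelihood in the single parameter of interest (with any nuisance parameters already fixed or profiled out by the caller); the function name and the finite-difference step are my own choices:

```python
import numpy as np

def magic_formula(log_lik, theta, theta_hat, eps=1e-4):
    """Magic-formula approximation (up to the constant c) to the density of
    theta_hat at its observed value, viewed as a function of theta.

    log_lik: callable returning the log-likelihood at a scalar parameter value.
    """
    # Observed Fisher information: -d^2 log L / d theta^2 at theta_hat,
    # estimated here with a central finite difference.
    second_deriv = (log_lik(theta_hat + eps)
                    - 2.0 * log_lik(theta_hat)
                    + log_lik(theta_hat - eps)) / eps**2
    observed_info = -second_deriv

    # c * [L(theta) / L(theta_hat)] * [observed information]^(1/2), with c dropped.
    return np.exp(log_lik(theta) - log_lik(theta_hat)) * np.sqrt(observed_info)
```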

A normal example. Assume we have data $x_1, \dots, x_n$ from a normal distribution with mean $\mu$ and standard deviation $\sigma$. We’re interested in estimating the distribution’s mean when both $\mu$ and $\sigma$ are unknown. In this case $\sigma$ is a nuisance parameter, so let’s use the magic formula.

The density of a normal distribution is

$$f_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),$$

and the log-likelihood of the sample is

$$\ell(\mu, \sigma) = \log L(\mu, \sigma) = -n\log\sigma \;-\; \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 \;+\; \text{constant}.$$

The first and second derivatives of the log-likelihood with respect to $\mu$ are

$$\frac{\partial \ell}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu)
\qquad\text{and}\qquad
\frac{\partial^2 \ell}{\partial \mu^2} = -\frac{n}{\sigma^2}.$$

Now, back to the magic formula, with the MLE $\hat\sigma$ standing in for the nuisance parameter $\sigma$:

$$f_\mu\bigl(\hat\mu \mid A\bigr) \;\approx\; c\,\frac{L(\mu)}{L(\hat\mu)}\left[\frac{n}{\hat\sigma^2}\right]^{1/2}
= c\,\sqrt{\frac{n}{\hat\sigma^2}}\;\exp\!\left(-\frac{n\,(\hat\mu-\mu)^2}{2\hat\sigma^2}\right).$$

So, conditional on $A$, we now have something that depends only on $\mu$: up to the constant $c$, it’s a $\mathcal{N}(\mu,\, \hat\sigma^2/n)$ density for $\hat\mu$.
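And here’s a quick numerical check of that claim—a sketch with made-up values ($\mu = 2$, $\sigma = 3$, $n = 50$) and $\sigma$ held at its MLE, computing the information term analytically rather than reusing the generic helper above:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Made-up truth for the simulation.
mu_true, sigma_true, n = 2.0, 3.0, 50
x = rng.normal(mu_true, sigma_true, size=n)

mu_hat = x.mean()
sigma_hat = x.std()  # MLE of sigma (divides by n)

def log_lik(mu):
    """Normal log-likelihood in mu, with sigma held at its MLE."""
    return norm.logpdf(x, loc=mu, scale=sigma_hat).sum()

def magic_density(mu):
    """Magic-formula approximation (constant c dropped) to the density of mu_hat."""
    observed_info = n / sigma_hat**2  # -d^2 log L / d mu^2 at mu_hat
    return np.exp(log_lik(mu) - log_lik(mu_hat)) * np.sqrt(observed_info)

# Compare against a N(mu, sigma_hat^2 / n) density for mu_hat: the ratio
# should be constant in mu, i.e., the two agree up to the constant c.
grid = np.linspace(mu_hat - 1.5, mu_hat + 1.5, 7)
approx = np.array([magic_density(m) for m in grid])
exact = norm.pdf(mu_hat, loc=grid, scale=sigma_hat / np.sqrt(n))
print(approx / exact)  # roughly [2.5066, 2.5066, ...], i.e., sqrt(2*pi)
```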

Next up

… an introduction to empirical Bayes!

References

This post is related to material from:

  • “R.A. Fisher in the 21st Century” by Bradley Efron.
  • Computer Age Statistical Inference: Algorithms, Evidence, and Data Science by Bradley Efron and Trevor Hastie. A digital copy lives here: CASI.
  • An Introduction to the Bootstrap by Bradley Efron and Robert J. Tibshirani.

Footnotes

  1. According to Efron, Neyman put Fisher’s ideas on firmer, frequentist-style footing.