Comments on Kosuke Imai and David A. van Dyk
"Causal Inference with General Treatment Regimes: Generalizing the Propensity Score"

Philip A. Schrodt
Department of Political Science
Penn State University

Presented at the 20th Political Methodology Summer Conference
University of Minnesota

19 July 2003

What you need to have read to understand presentations at the summer conference

  • 1984-1987: Blalock, Social Statistics
  • 1988-1991: Johnston, Econometric Methods
  • 1992-1995: Greene, Econometric Analysis
  • 1996-1999: Hamilton, Time Series Analysis
  • 2000-2003: WinBUGS manual

Hogwarts School of Political Methodology - 1

  • History of Magic
  • Care of Magical Creatures
  • Charms
  • Arithmancy
  • Divination
  • Defense Against the Dark Arts

Hogwarts School of Political Methodology - 2

Being a methodologist is a lot like being Harry Potter, except that you spend the summer at Hogwarts* and the academic year at the Dursleys’**

* ICPSR, EITM, Political Methodology summer conference

** your department

History of Magic

Originally developed in biostatistical literature to deal with problem of non-random assignment to treatment groups in medical experiments

Original propensity score dealt only with dichotomous (binary) treatments

Care of Magical Creatures(1)

Propensity function = Sorting Hat

Sorting Hat takes a high dimensional vector of personality covariates and assigns a scalar surrogate indicator in the form of the school–Gryffindor, Slytherin, Ravenclaw, Hufflepuff.

This indicator can subsequently be used to control for the effects of treatments–e.g. encounters with mountain trolls, unicorns, centaurs–even if the treatment is assigned non-randomly (2)

Charms - 1(6)

Randomized assignment is nearly impossible to achieve when dealing with human subjects

  • Practical considerations
  • Ethical considerations
  • Self-selection

Charms - 2

Non-random assignment can cause estimation of treatment effect to be affected by

  • Collinearity: inflates standard errors and may cause coefficient sign flips
  • Specification error: omitted variable bias can inflate apparent treatment effect
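
A textbook way to see the second bullet, assuming a simple linear model (the symbols here are illustrative and are not taken from the paper): suppose the outcome y depends on the treatment T and on an omitted covariate z, and that z is itself related to T because assignment was not random.

```latex
\begin{aligned}
y &= \beta T + \gamma z + \varepsilon, \qquad z = \delta T + \nu \\
\Rightarrow\quad E\big[\hat{\beta}_{\text{z omitted}}\big] &= \beta + \gamma\,\delta
\end{aligned}
```

Under random assignment delta is zero in expectation and the extra term vanishes. When gamma and delta share a sign, the apparent treatment effect is inflated.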

Example: early epidemiological studies of effects of smoking

(be afraid, be very afraid…)

Charms - 3

Method has been extended by the authors and others to deal with

  • Categorical treatments of any size
  • Ordinal treatment groups
  • Interval treatment groups

The method can be extended to work with multiple treatments

Arithmancy

Given a propensity function, cases are sorted into sub-classes, then the treatment effect is estimated within each sub-class

Average effect is a weighted average of sub-class effects, with the weight equal to the sub-class size

In the case of ordinal and interval treatments, the cases within a sub-class should ideally be very similar on the propensity score while differing widely on the treatment
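
The sort, estimate, and average steps can be sketched as follows. This is a minimal illustration for a dichotomous treatment, not the authors' implementation: the function name, the equal-sized sub-classes, and the toy data are all assumptions made for the example.

```python
from statistics import mean

def subclass_effect(cases, n_subclasses):
    """Size-weighted average of within-sub-class treated-minus-control differences.

    cases: list of (propensity_score, treated, outcome) tuples (hypothetical layout).
    """
    ordered = sorted(cases, key=lambda c: c[0])  # sort on the propensity score
    size = len(ordered) // n_subclasses
    total, used = 0.0, 0
    for j in range(n_subclasses):
        # equal-sized blocks; the last sub-class absorbs any remainder
        if j == n_subclasses - 1:
            block = ordered[j * size:]
        else:
            block = ordered[j * size:(j + 1) * size]
        treated = [y for _, t, y in block if t]
        control = [y for _, t, y in block if not t]
        if not treated or not control:
            continue  # no within-class comparison possible; skipped in this sketch
        total += len(block) * (mean(treated) - mean(control))
        used += len(block)
    if used == 0:
        raise ValueError("no sub-class contains both treated and control cases")
    return total / used

# Toy data: the true effect is 2, but high-score cases are treated more often.
cases = [
    (0.2, 1, 4), (0.2, 0, 2), (0.2, 0, 2), (0.2, 0, 2),
    (0.8, 1, 10), (0.8, 1, 10), (0.8, 1, 10), (0.8, 0, 8),
]
naive = mean(y for _, t, y in cases if t) - mean(y for _, t, y in cases if not t)
print(naive)                      # 5.0: the raw comparison overstates the effect
print(subclass_effect(cases, 2))  # 2.0: the within-sub-class comparison recovers it
```

With real data the scores within a sub-class are only approximately equal, so some residual bias remains; the toy data are constructed so that it disappears entirely.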

Divination (3)

Use of propensity scores reduces MSE of the bias by about 75% in Monte Carlo tests even when the functional form differs somewhat from the form used to generate the data

Defense Against the Dark Arts - 1

This method assumes that the propensity function exists!

cf. The "economist and the can-opener" joke (4)

Defense Against the Dark Arts - 2

Suggested sources of the propensity function:

  • Search your soul (cf. Bayesian priors) (5)
  • Consult an M.D. or area studies expert (7)
  • Obey the Natural-Order-of-the-Universe and use a linear model
  • Reverse engineer the Sorting Hat algorithm
  • Wilcox multivariate Bayesian WAG
    (Wild-assed guess)

Defense Against the Dark Arts - 3

Choice of the propensity function could affect the results

big time...

One can test to confirm that a particular propensity function provides balancing scores...

but this balancing diagnostic does not provide for uniqueness
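
One common form of such a check, sketched here for a dichotomous treatment, compares covariate means between treated and control cases before and after sub-classification. The function names and data are hypothetical, and the paper's own diagnostic for general treatments may differ.

```python
from statistics import mean, pstdev

def standardized_difference(x_treated, x_control):
    """Difference in covariate means, scaled by the SD of the combined sample."""
    combined_sd = pstdev(list(x_treated) + list(x_control))
    if combined_sd == 0:
        return 0.0
    return (mean(x_treated) - mean(x_control)) / combined_sd

def split(rows):
    """Separate covariate values into treated and control lists."""
    return [x for x, t, _ in rows if t], [x for x, t, _ in rows if not t]

# Hypothetical cases: (covariate value, treated?, sub-class label)
cases = [
    (1, 1, 0), (3, 1, 0), (1, 0, 0), (3, 0, 0), (2, 0, 0), (2, 0, 0),
    (7, 1, 1), (9, 1, 1), (8, 1, 1), (8, 1, 1), (7, 0, 1), (9, 0, 1),
]

overall = standardized_difference(*split(cases))
within = {
    s: standardized_difference(*split([c for c in cases if c[2] == s]))
    for s in sorted({c[2] for c in cases})
}
print(overall)  # clearly non-zero: treated cases have higher covariate values
print(within)   # zero in both sub-classes for this constructed data
```

A pass on this check says only that the measured covariate is balanced within sub-classes. A different propensity function could pass the same check, which is the uniqueness problem noted above.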

The risks of correlating across sub-populations

[see PowerPoint presentation for graphics]

Propensity Classification 1

[see PowerPoint presentation for graphics]

Propensity Classification 2

[see PowerPoint presentation for graphics]

Clustering may be dependent on choice of co-variates

                Best subject            Familiar        Wand                subclass
Parents         wizard / potions        mixed / owl     muggle / phoenix    S1
Family income   high / charms           medium / cat    low / unicorn       S2
Nationality     English / divination    French / rat    German / dragon     S3
subclass        S1                      S2              S3

Questions - 1

Are there situations where the poor choice of a propensity function will make the bias worse?

Probably, and this would seem to be more of a risk when there is specification error in the propensity function

Questions - 2

Is a weighted average the optimal way to aggregate the information or could information on the standard errors of the coefficients also be incorporated?

Are shrinkage techniques appropriate here?
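
To make the first question concrete, here is a sketch comparing the size-weighted average with inverse-variance (precision) weighting, the standard fixed-effect pooling rule from meta-analysis. This is a swapped-in technique for illustration, not something the paper proposes, and the numbers are invented.

```python
def size_weighted(effects, sizes):
    """Average of sub-class effects, weighted by sub-class size."""
    return sum(n * e for e, n in zip(effects, sizes)) / sum(sizes)

def precision_weighted(effects, std_errors):
    """Average of sub-class effects, weighted by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Hypothetical sub-class estimates: same size, different precision.
effects = [1.0, 3.0]
sizes = [50, 50]
std_errors = [0.5, 1.0]

print(size_weighted(effects, sizes))            # 2.0
print(precision_weighted(effects, std_errors))  # pulled toward the more precise estimate
```

The two rules answer different questions when effects vary across sub-classes: size weighting targets the average effect in the sample, while precision weighting minimizes variance only if there is a single common effect.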

Questions - 3

In the case of the continuous treatment, why do this rather than using alternative methods of dealing with correlation among the independent variables?

Alternatively, could any covariate be considered a "treatment"–is this another general method of dealing with collinearity?

Final thoughts - 1

"A man working alone can be pretty dumb, but for sheer rampant stupidity there ain’t nothing to beat teamwork."(8)

Edward Abbey, The Monkey Wrench Gang

Final thoughts - 2

"A poorly chosen estimator can be pretty bad, but for sheer rampant bias there ain’t nothing to beat misspecification."

Gary King, 20th Political Methodology Summer Conference [paraphrase]

Thank you...

and have a nice day :-)

Footnotes

1. This should be required of every discussant (and/or presenter): we don't need a whole lot of detail about your method; just tell us what the magical part is.

2. Which, in fact, is much of the structure of the Harry Potter series: various stimuli are presented to Hogwarts as a whole, and individuals in the schools react differently. Rowling uses this as a literary method of illustrating the characteristics of the schools on the basis of the experiments -- the Sorting Hat is assumed to have made correct judgements and the effects of the "treatment" are unambiguous. An analysis using propensity scores, in contrast, assumes that those group characteristics are known and seeks to determine the effects of the experiment.

3. It is interesting to note that in the first four books in the Harry Potter series, the only form of magic that is treated with skepticism is divination. More generally, the principle driving all of Rowling's plots is that wizards and witches have nearly total control over their physical world, but lack extraordinary access to information critical to their lives. Harry can fly, transform his body, and become invisible, but must rely on the same techniques as Sherlock Holmes in determining who is friend and who is foe.

[Divination -- or at least "prophecy" -- gets slightly more respect in Order of the Phoenix, but even in that book accurate predictions are seen as rare and unusual things.]

4. If you don't already know this joke, it is unlikely that much of anything else on this site is making sense either.

5. An extended panel discussion on Bayesian methods the previous day had concluded that this was the most common, if not necessarily the most reliable, source of priors.

6. Okay, I'm cheating here and using the word "charms" in the sense of "charming" rather than in the wizarding context. Also I couldn't really figure out how to use "Potions" -- which would open many possibilities for Snape comments -- but one doubtlessly exists.

7. This joke was funny in the context of the conference. But seriously, it is my humble opinion (and experience) that when dealing with countries with a GDP/capita less than $1,000, a typical area studies expert learns more useful information on the taxi ride in from the airport than is contained in most available data sets. On the other hand, I'll probably stay with the characterization of M.D.s...

8. A nice, succinct summary of everything we know about bureaucratic politics. Arguably a nice, succinct summary of everything we know about political science.