Andrew Gelman and Guido Imbens recently posted a paper entitled “Why Ask Why? Forward Causal Inference and Reverse Causal Questions.” It completely made my day, primarily because it uses some statistical logic to deal succinctly with the way people naturally arrive at research questions. While I liked the models and the logic, what I think is more important is the authors’ process for explaining the value of ‘why’ questions.
Generally, statistical models are developed by identifying a dependent variable and seeing how it changes in response to changes in an independent variable. For example, we might make ‘war outbreak’ the dependent variable, and then see how the annual outbreak of wars changes with annual small arms production. Small arms production in this case would be the independent variable. Normally I would then go through an exhaustive round of controls to test the statistical significance of small arms production as an explanatory variable for war outbreak. The goal is to test how well arms production predicts war outbreak once other intervening variables are accounted for.
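To make that forward-causal workflow concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are simulated rather than real, the outcome is treated as a continuous index rather than a count of wars, and the names `tension`, `arms`, and `war` are hypothetical. It fits the regression twice, once with arms production alone and once with a single control added, so the effect of leaving out an intervening variable is visible.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Simulated data (assumption): regional tension drives both arms production
# and the war-outbreak index, so it confounds the naive comparison.
rng = random.Random(0)
n = 2000
tension = [rng.gauss(0, 1) for _ in range(n)]
arms = [t + rng.gauss(0, 1) for t in tension]
war = [1.0 + 0.5 * a + 2.0 * t + rng.gauss(0, 1) for a, t in zip(arms, tension)]

naive = ols([arms], war)                 # war ~ arms
controlled = ols([arms, tension], war)   # war ~ arms + tension

print("arms coefficient, no controls:  ", round(naive[1], 2))
print("arms coefficient, with control: ", round(controlled[1], 2))
```

In this simulation the arms coefficient was set to 0.5 by construction. The fit without the control overstates it, while the fit with the control should land close to the built-in value. A real analysis would use actual data, more controls, and significance tests, none of which are shown here.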
This method is often the favored one. Some of this is because of the scientific method, some because of path dependence (forward causality gets taught a lot, reverse causality doesn’t), and, in my opinion, some because social science is in a phase where it’s in vogue to be forward causal. What Gelman and Imbens point out, using both sound argument and mathematical logic, is that we need ‘why’ questions in order to ask better ‘what if’ questions. As someone who tends to be quantitative in nature but also values exploratory and qualitative methods, I found it refreshing to read, especially this paragraph:
“By formalizing reverse causal reasoning within the process of data analysis, we hope to make a step toward connecting our statistical reasoning to the ways that we naturally think and talk about causality. This is consistent with views such as Cartwright (2007) that causal inference in reality is more complex than is captured in any theory of inference. The basic idea expressed in this paper—that the search for causal explanation can lead to new models—is hardly new…. What we are really suggesting is a way of talking about reverse causal questions in a way that is complementary to, rather than outside of, the mainstream formalisms of statistics and econometrics.” -Pg. 6
While I think my friends who are harder-core quants than I am would benefit from giving this a read, I would also suggest it to my friends who either don’t have much statistical training or are suspicious of predictive approaches to research, since it makes the logic accessible. I think it should also be part of any serious discussion of mixed methods, since we usually find our ‘why’ questions out in the world of human and environmental interactions, where qualitative methods can be excellent for pointing us to new sources of data.