Resources for Causal Inference & Causality
Many of these introductory materials are FREE thanks to the generosity of their authors and community contributors.
- “Causal Inference: The Mixtape” by Scott Cunningham. The e-book can be accessed here. It includes implementation code in three languages: Stata, R, and Python.
- “Causal Inference” by Paul Rosenbaum. This book was recommended to me after I read the second edition of another book of his, “Design of Observational Studies”. It is good for curious members of the public, including non-technical audiences, much like “The Book of Why” (info) by Judea Pearl et al. Both can be accessed here via Audible if, like me, you prefer to listen to non-technical content during your daily routines or chores.
- Dr. Rosenbaum also wrote an introductory book, “Observation and Experiment: An Introduction to Causal Inference”. I have not read it but have heard that it is very accessible to lay readers.
- For professionals who are familiar with causal inference and simply want a handbook for quick reference, “Handbook of Matching and Weighting Adjustments for Causal Inference” is a useful addition to your toolkit. Here’s the link to a review of the handbook published by the ASA. The handbook came out last year and shares a co-author with the books above.
- E-book of “Causal Inference for the Brave and True”. This book is a nice Python companion.
- Link to an introductory textbook, “Causal Inference: What If” by Miguel Hernán and James Robins: https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
- “Causal Inference and Discovery in Python”. This one is not FREE, but if you are like me and prefer a paper copy that supports the author and publisher, grab a copy. For a list of popular open-source Python packages and toolboxes for causal analysis, check out Jane Huang’s post here, published in Data Science at Microsoft. The list, followed by comparison tables, appears towards the end of the post.
- The same goes for one of the best-selling books, which I have mentioned before, even when I was jotting down thoughts about non-career matters: the second edition of “Causality” by Judea Pearl. The contents of both editions can be found here. I assume most technical readers already know these two, but it doesn’t hurt to include them.
- A selected chapter, “Causal inference using regression on the treatment variable”. This link gave me a little nostalgia, recalling the “good old days” (or “good quiz days” back then) of a course I took in 2008, when Prof. Gelman taught it as one of the core courses for first-year PhD students, mainly on multilevel models: http://www.stat.columbia.edu/~gelman/arm/. (I was not on the PhD track, but I selected it among other doctoral courses because I was waived from the master’s-level core courses in statistics upon entry.) Time flies. I am so glad that we keep learning new things by standing on the shoulders of the fundamental teachings of the past. :)
- The Treatment Effect section from a more traditional econometrics course at MIT can be accessed here. Notes from the Spring 2022 version of a similar class taught at Stanford GSB are available here. The professor who recently taught this course, Stefan Wager, also co-taught a class with Susan Athey that introduces causal inference for applied social scientists, with a focus on machine learning. The lectures are free to watch on YouTube here. For more software details related to the topic, please check out the Machine Learning-based Causal Inference Tutorial.
- “The Effect: An Introduction to Research Design and Causality”. I came across it in a keyword search one day and thought the author’s accompanying video series might help beginners who are curious about what research design and causality touch on. Speaking of videos, I am NOT going to include more videos in this post, but I would be happy to share my currently private YouTube playlists sometime, after doing due diligence to sort them by purpose. There is content everywhere, but not all of it is accurate, and I truly appreciate that Dr. Ron Kohavi put in the effort to color-code what is misleading and somewhat misleading in A/B testing videos in one of his public posts, shared with a spreadsheet here.
- For those who are interested in time-dependent or treatment-dependent confounding but would like a light review of what has been done since Robins’ papers in 1986 and 1997, here’s “A Tutorial on Causal Inference in Longitudinal Data With Time-Varying Confounding Using G-Estimation”. For those who prefer more introductory lecture notes, I include Dr. Fan Li’s lecture notes on this topic here. Personally, I have only used these techniques, and even extended them by deriving mathematical formulae, during an early job of mine in a centralized statistician group in healthcare, and only for selected projects related to non-pharmaceutical, non-clinical trials (unless for Phase 0). Those were fun applications that required some academic rigor and were fulfilling as well.
- An R package “CausalImpact” for causal inference using Bayesian structural time-series models: https://google.github.io/CausalImpact/CausalImpact.html
- Identifying Causal Effects with the R Package causaleffect: Please feel free to check out the guide along with the links that it references here.
- There are more R packages out there, but given the introductory nature of this post, “Fundamentals of Causal Inference: With R” would be a friendly companion for R users, especially those in qualitative disciplines such as psychology, epidemiology, and economics.
- For those who are interested in applications in public policy and would like an introduction to causal inference and data analysis with R, here’s a book for you called “Demystifying Causal Inference: Public Policy Applications with R”. Besides an introduction to causality using the potential outcomes framework and causal graphs, it also covers specific causal inference methods, including experiments, matching, panel data, difference-in-differences, regression discontinuity design, instrumental variables and meta-analysis, with the help of empirical case studies of policy issues.
- Want a little philosophical thought on causality along with some visual fun? Try “Causation: A Very Short Introduction”. Don’t go through the book without checking out Dr. Ellie Murray’s cartoons, with a cup of your favorite drink and some laughter. Her tweets start here: Twitter @EpiElli. Yes, I really like her cartoons, especially those summarizing this book. :)
- A book that’s good as a comprehensive starter: “Causal Inference: What If” (2020, CRC Press) by Miguel A. Hernán and James M. Robins. The most recent (Jan 2024) revision of this e-book can be accessed here for free (from the same site as the earlier link). If you are intrigued by the third part of this book, causal inference from complex longitudinal data, and want a deeper dive into the sea of complex longitudinal studies (yes, I have been there), please feel free to check out the second in the “Targeted Learning” series (or duet?), “Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies” (2018) by Mark J. van der Laan and Sherri Rose.
- Last but not least, if you are already working on problems related to causality and experimentation and would like a reference, or something to read for entertainment with critical thinking, here are a few sample books I have recently been referring to: “Controlled Experiments”, “Experimental and Quasi-Experimental Designs for Generalized Causal Inference” (good for those who have foundational knowledge of basic experimentation at the level of “Quasi-Experimentation: A Guide to Design and Analysis”), and again, “Design of Observational Studies”, mentioned above alongside Paul Rosenbaum’s “Causal Inference”. Of course, causality and experimentation are closely related, because causation can only be determined from an appropriately designed experiment. Still, this post focuses more on causal inference than on experimentation, which I would summarize separately if there is interest in design of experiments (DOE) resources.
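Several of the books above list difference-in-differences among their methods. For readers who have not seen it before, here is a minimal sketch in plain Python, assuming the simplest two-group, two-period setup. The function name `diff_in_diff` and the toy numbers are hypothetical and made up for illustration; they are not taken from any of the listed books.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Return the basic 2x2 difference-in-differences estimate.

    It is the average change in the treated group minus the average
    change in the control group over the same two periods.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical toy outcomes (made-up numbers, for illustration only).
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [8.0, 9.0, 10.0]
control_post = [10.0, 11.0, 12.0]

estimate = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(estimate)  # prints 3.0
```

The estimate can be read as a causal effect only under the parallel-trends assumption, that is, that the treated group would have changed by the same amount as the control group had it not been treated. The books above discuss when that assumption is plausible.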
If you have anything you would like to add or share, please do not hesitate to leave a comment!
Update after completing this post:
Here are several FREE, interesting talks on experimentation for two-sided marketplaces.
Ed Merkle’s talk about “Bayesian Structural Equation Modeling & Causal Inference in Psychometrics” can be found here. I noticed this channel because I listened to the episode featuring the author of an entertaining book called “Bernoulli’s Fallacy: Statistical Illogic and the Crisis of Modern Science”.