Matching

Causal inference

Overview of matching techniques

Author

Chi Zhang

Published

August 23, 2024

This overview is based on the review paper by Stuart (2010), "Matching methods for causal inference: a review and a look forward".

Goal of matching: choose well-matched samples from the original groups to reduce confounding, i.e. obtain treatment and control groups with similar covariate distributions.

Alternatives to matching: adjusting for covariates in a regression model, instrumental variables, structural equation modeling, etc.

Benefits of matching:

complementary to regression adjustment; the two can and should be used together
has straightforward diagnostics, so its performance can be assessed

Two settings to use matching:

outcome values are NOT yet available, and matching is used to select subjects for follow-up
outcome values are available, and matching is used to reduce bias in treatment effect estimation

(Note that the outcomes are usually not used in the matching process, even when they are available.)

History of matching methods

Matching methods have been in use since the 1940s, but a theoretical basis was not developed until the 1970s (Rubin et al.).
With multiple covariates, it is difficult to find exact matches even on a small number of covariates (Chapin 1947). The introduction of the propensity score (Rosenbaum and Rubin 1983) helped, because it summarizes the covariates in a single scalar.

Steps to implement matching methods

Define closeness: choose the distance measure used to determine whether an individual is a good match for another
Implement a matching method, given that measure of closeness
Assess the quality of the resulting matched samples. This might require iterating steps 1 and 2
Analyse the outcome, given the matched data

Step 1: define closeness

Step 2: implement matching

Nearest neighbor matching
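A minimal sketch of greedy 1:1 nearest neighbor matching without replacement, to make the idea concrete. It assumes each unit already has a scalar score (for example an estimated propensity score). The function name, the dict-based input, and the optional `caliper` argument are illustrative choices and are not taken from any package.

```python
def nearest_neighbor_match(treated, control, caliper=None):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, control: dicts mapping unit id -> scalar score
    (assumed here to be something like an estimated propensity score).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls that have not been used yet
    pairs = []
    # Treated units are processed in the order given; greedy matching
    # depends on this order.
    for t_id, t_score in treated.items():
        if not available:
            break
        # Closest remaining control on the scalar score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if caliper is not None and abs(available[c_id] - t_score) > caliper:
            continue  # no acceptable match, so this treated unit is dropped
        pairs.append((t_id, c_id))
        del available[c_id]
    return pairs


# Made-up scores for illustration only.
treated = {"t1": 0.80, "t2": 0.55, "t3": 0.30}
control = {"c1": 0.78, "c2": 0.50, "c3": 0.52, "c4": 0.10}

pairs = nearest_neighbor_match(treated, control, caliper=0.1)
print(pairs)
```

With the caliper, `t3` has no control within 0.1 of its score and is left unmatched, which illustrates how matching can discard units.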

Subclassification, full matching, weighting
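Of the three approaches named above, weighting is the easiest to sketch. The example below assumes the propensity scores have already been estimated and computes the usual inverse probability weights for the ATE and the weighting-by-odds weights for the ATT. The function name and the example values are hypothetical.

```python
def propensity_weights(treatment, pscore, estimand="ATE"):
    """Weights derived from (already estimated) propensity scores.

    ATE: treated units get 1/e, controls get 1/(1 - e).
    ATT: treated units get 1, controls get e/(1 - e) (weighting by the odds).
    """
    weights = []
    for t, e in zip(treatment, pscore):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if estimand == "ATE":
            w = 1.0 / e if t == 1 else 1.0 / (1.0 - e)
        elif estimand == "ATT":
            w = 1.0 if t == 1 else e / (1.0 - e)
        else:
            raise ValueError("estimand must be 'ATE' or 'ATT'")
        weights.append(w)
    return weights


# Made-up treatment indicators and propensity scores.
treatment = [1, 1, 0, 0]
pscore = [0.8, 0.5, 0.5, 0.2]

ate_weights = propensity_weights(treatment, pscore, "ATE")
att_weights = propensity_weights(treatment, pscore, "ATT")
print(ate_weights)
print(att_weights)
```

Unlike nearest neighbor matching, weighting keeps every unit and changes only how much each one contributes.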

Step 3: diagnose matches

Step 4: analysis of the outcome

Vignette: Estimating effects after matching by Noah Greifer

Matching (old notes)

Exact matching:

perfect covariate balance: \(F(X_i \mid T_i = 1) = F(X_i \mid T_i = 0)\)
infeasible when a covariate is continuous or when there are many covariates
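A small sketch of exact matching on categorical covariates: units are grouped by their full covariate tuple, and only the groups containing both treated and control units are kept. The input format and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def exact_match(units):
    """Group units by identical covariate values.

    units: list of (unit_id, treated_flag, covariates) where covariates is
    a tuple of categorical values. Returns only the strata that contain at
    least one treated and one control unit.
    """
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for unit_id, treated, covariates in units:
        group = "treated" if treated else "control"
        strata[tuple(covariates)][group].append(unit_id)
    # Strata without both groups provide no exact match and are dropped.
    return {x: g for x, g in strata.items() if g["treated"] and g["control"]}


# Made-up units: (id, treated, (sex, age group)).
units = [
    ("a", 1, ("F", "young")),
    ("b", 0, ("F", "young")),
    ("c", 1, ("M", "old")),
    ("d", 0, ("M", "young")),
    ("e", 0, ("F", "young")),
]

matched = exact_match(units)
print(matched)
```

Here units `c` and `d` find no exact counterpart. With more covariates the number of distinct tuples grows quickly, so more units end up unmatched.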

Propensity score: the probability of receiving treatment given the covariates, \(\pi(X_i) = P(T_i = 1 \mid X_i)\)

Matching based on distance measures

Mahalanobis distance
Estimated propensity score: \(D(X_i, X_j) = |P(T_i = 1 \mid X_i) - P(T_j = 1 \mid X_j)|\), i.e. the absolute difference between the two units' propensity scores
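Both distance measures can be sketched in a few lines. The Mahalanobis distance is written here as \(\sqrt{(X_i - X_j)^\top \Sigma^{-1} (X_i - X_j)}\); some texts use the squared form, which gives the same ordering of candidate matches. To stay within the standard library, the sketch is restricted to two covariates and takes the covariance matrix \(\Sigma\) as given. The propensity scores are also assumed to be already estimated. All function names are hypothetical.

```python
import math


def mahalanobis_2d(x_i, x_j, cov):
    """Mahalanobis distance between two units with two covariates each.

    cov is the 2x2 covariance matrix [[s11, s12], [s21, s22]], assumed
    to be supplied by the caller.
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    d1, d2 = x_i[0] - x_j[0], x_i[1] - x_j[1]
    # Quadratic form d' * inverse(cov) * d with the 2x2 inverse written out.
    quad = (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det
    return math.sqrt(quad)


def propensity_distance(e_i, e_j):
    """Absolute difference between two (already estimated) propensity scores."""
    return abs(e_i - e_j)


# With an identity covariance matrix the distance reduces to Euclidean distance.
print(mahalanobis_2d((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]))
# A covariate with larger variance contributes less per unit of difference.
print(mahalanobis_2d((2.0, 0.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 1.0]]))
print(propensity_distance(0.7, 0.4))
```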

Check covariate balance

ideally, compare the joint distribution of all covariates between the treated and control groups
in practice, check lower-dimensional summaries (standardized mean difference, variance ratio, empirical CDF difference)
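Two of these summaries are simple to compute for a single covariate. The sketch below standardizes the mean difference by the pooled standard deviation of the two groups. That denominator is an assumption; other conventions, such as using the treated group's standard deviation, are also common. The data are made up.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation.

    The pooled SD is one possible choice of denominator, used here as an
    assumption for illustration.
    """
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


def variance_ratio(x_treated, x_control):
    """Ratio of sample variances; values near 1 indicate similar spread."""
    return variance(x_treated) / variance(x_control)


# Made-up values of one covariate in each group.
x_treated = [1, 2, 3, 4, 5]
x_control = [2, 3, 4, 5, 6]

smd = standardized_mean_difference(x_treated, x_control)
vr = variance_ratio(x_treated, x_control)
print(round(smd, 3), vr)
```

In this example the two groups have the same spread (variance ratio of 1) but differ in location, which the standardized mean difference picks up.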

Balance test

Matching typically reduces the number of observations, because unmatched units are dropped.

Software

MatchIt package

Propensity score (PS) matching is only ONE of many matching techniques; it uses the difference in PS as the distance measure.

Links

https://stats.stackexchange.com/questions/492218/should-the-choice-of-propensity-score-matching-versus-weighting-depend-on-the-de

https://stats.stackexchange.com/questions/553853/understanding-propensity-score-matching?rq=1

https://aetion.com/evidence-hub/understanding-propensity-score-weighting-methods-rwe/