Weighting

Causal inference

Propensity score, matching, weighting

Author

Chi Zhang

Published

September 3, 2023

Matching

Exact matching:

perfect covariate balance; \(F(X_i|T_i = 1) = F(X_i|T_i=0)\)
infeasible when a covariate is continuous or when there are many covariates, because few units then share identical covariate values.
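A minimal sketch of exact matching on discrete covariates, to make the idea concrete. The data layout (tuples of id, treatment, covariates), the function name `exact_match`, and the toy values are all assumptions for illustration, not something from a particular package.

```python
from collections import defaultdict

def exact_match(units):
    """Group units by identical covariate values; keep strata with both arms.

    `units` is a list of (unit_id, treated, covariates) tuples, where
    `covariates` is a hashable tuple of discrete values (hypothetical layout).
    """
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for unit_id, treated, covariates in units:
        arm = "treated" if treated == 1 else "control"
        strata[covariates][arm].append(unit_id)
    # Only strata that contain both treated and control units are usable.
    return {x: arms for x, arms in strata.items()
            if arms["treated"] and arms["control"]}

# Toy data: (id, T, (sex, age_group)); values are made up for illustration.
toy_units = [
    ("a", 1, ("F", "30-39")),
    ("b", 0, ("F", "30-39")),
    ("c", 0, ("F", "30-39")),
    ("d", 1, ("M", "40-49")),  # no control with the same covariates
    ("e", 0, ("M", "30-39")),  # no treated unit with the same covariates
]
matched = exact_match(toy_units)
print(matched)
```

Units `d` and `e` are dropped because no unit in the other arm shares their covariate values. With continuous or many covariates, most units end up in that situation.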

Probability of receiving treatment (the propensity score), \(\pi(X_i) = P(T_i = 1 | X_i)\)

Matching based on distance measures

Mahalanobis distance
Estimated propensity score, \(D(X_i, X_j) = |P(T_i = 1|X_i) - P(T_j=1 | X_j)|\)
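The propensity score distance can be used in a simple nearest-neighbor scheme. The sketch below assumes greedy 1:1 matching without replacement and an optional caliper. The function name, the caliper argument, and the scores are hypothetical, and this is not how any particular package implements matching.

```python
def nearest_neighbor_match(ps_treated, ps_control, caliper=None):
    """Greedy 1:1 matching without replacement on the propensity score.

    ps_treated, ps_control: dicts mapping unit id -> estimated propensity score.
    caliper: optional maximum allowed distance; treated units with no control
    inside the caliper stay unmatched.
    Returns a dict {treated_id: control_id}.
    """
    available = dict(ps_control)  # controls not yet used
    pairs = {}
    for t_id, t_ps in ps_treated.items():
        if not available:
            break
        # Distance D(X_i, X_j) = |ps_i - ps_j|; pick the closest control.
        c_id = min(available, key=lambda c: abs(t_ps - available[c]))
        if caliper is not None and abs(t_ps - available[c_id]) > caliper:
            continue  # closest control is still too far away
        pairs[t_id] = c_id
        del available[c_id]  # without replacement
    return pairs

# Made-up propensity scores for illustration.
ps_treated = {"t1": 0.80, "t2": 0.55}
ps_control = {"c1": 0.50, "c2": 0.78, "c3": 0.20}
pairs = nearest_neighbor_match(ps_treated, ps_control, caliper=0.1)
print(pairs)
```

Greedy matching depends on the order in which treated units are processed. Optimal matching avoids this at a higher computational cost.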

Check covariate balance

ideally compare joint distribution of all covariates
practically check lower-dimensional summaries (standardized mean difference, variance ratio, empirical CDF difference)
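The three summaries above can be computed for a single covariate as follows. This is a sketch under stated assumptions: the standardized mean difference uses the pooled standard deviation \(\sqrt{(s_1^2 + s_0^2)/2}\) (other denominators are also used in practice), and the function names and toy ages are made up.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd

def variance_ratio(x_treated, x_control):
    """Ratio of sample variances; values near 1 suggest similar spread."""
    return variance(x_treated) / variance(x_control)

def max_ecdf_difference(x_treated, x_control):
    """Largest absolute gap between the two empirical CDFs."""
    def ecdf(sample, v):
        return sum(1 for s in sample if s <= v) / len(sample)
    points = sorted(set(x_treated) | set(x_control))
    return max(abs(ecdf(x_treated, v) - ecdf(x_control, v)) for v in points)

# Toy covariate (age) in each group; values are illustrative only.
age_treated = [30, 40, 50]
age_control = [20, 30, 40]

smd = standardized_mean_difference(age_treated, age_control)
vr = variance_ratio(age_treated, age_control)
ecdf_gap = max_ecdf_difference(age_treated, age_control)
print(smd, vr, ecdf_gap)
```

In this toy example the variances are identical while the means differ by one pooled standard deviation, so the variance ratio alone would miss the imbalance. This is why several summaries are usually checked together.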

Balance test

Matching typically reduces the number of observations, because unmatched units are discarded.

MatchIt package

PS matching is ONE of many matching techniques; it uses the difference in PS as the distance measure.

Weighting

Weighting can be viewed as a generalization of matching

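Weights for each target estimand, where \(Z_i\) is the treatment indicator (written \(T_i\) above) and \(p_i\) is the propensity score \(\pi(X_i)\):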

ATE (population) \(w_{ATE} = \frac{Z_i}{p_i} + \frac{1-Z_i}{1-p_i}\)

ATT \(w_{ATT} = \frac{p_i Z_i}{p_i} + \frac{p_i(1-Z_i)}{1-p_i}\)

ATC \(w_{ATC} = \frac{(1 - p_i) Z_i}{p_i} + \frac{(1 - p_i)(1-Z_i)}{1-p_i}\)
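The three weight formulas can be written as one small function. This is a sketch: the function name `balancing_weight` and the example propensity score of 0.25 are assumptions for illustration. The ATT and ATC lines are the algebraically simplified forms of the formulas above.

```python
def balancing_weight(z, p, estimand="ATE"):
    """Weight for one unit with treatment indicator z (0 or 1) and propensity score p."""
    if not 0 < p < 1:
        raise ValueError("propensity score must be strictly between 0 and 1")
    if estimand == "ATE":
        return z / p + (1 - z) / (1 - p)
    if estimand == "ATT":
        # p*z/p + p*(1-z)/(1-p): treated get weight 1, controls get the odds p/(1-p)
        return z + p * (1 - z) / (1 - p)
    if estimand == "ATC":
        # (1-p)*z/p + (1-p)*(1-z)/(1-p): controls get weight 1, treated get (1-p)/p
        return (1 - p) * z / p + (1 - z)
    raise ValueError("estimand must be 'ATE', 'ATT' or 'ATC'")

# A treated and a control unit, both with an illustrative propensity score of 0.25.
for z in (1, 0):
    print(z, [balancing_weight(z, 0.25, e) for e in ("ATE", "ATT", "ATC")])
```

Under ATT weighting the treated units keep weight 1 and the controls are reweighted to resemble them. ATC weighting does the reverse.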

Inverse probability weighting IPW

(In my use case, IPW is used for survey sampling rather than causal inference. My project has no treatment group, and treatment effects are not of interest.)

Propensity score

(Rosenbaum and Rubin, 1983)

In observational studies, conditioning on the propensity score can yield unbiased estimates of the exposure effect, provided that:
there are no unmeasured confounders
every subject has a probability of exposure strictly between 0 and 1 (positivity)

Fit a logistic regression:

binary outcome: exposure (1,0)
covariates: all measured covariates, excluding the exposure itself

The predicted probabilities from the fitted model are the propensity scores.

The PS can also be estimated with any other method that produces probabilities, not only logistic regression: for example, random forest or lasso logistic regression.
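The logistic regression route can be sketched end to end without any external library. The code below fits the model by plain gradient descent on a toy data set. The function names, learning rate, iteration count, and data are all assumptions for illustration. In practice a statistics package would be used instead.

```python
from math import exp

def sigmoid(v):
    """Logistic function, written to avoid overflow for large |v|."""
    if v >= 0:
        return 1 / (1 + exp(-v))
    e = exp(v)
    return e / (1 + e)

def linear_predictor(beta, x_i):
    # beta[0] is the intercept, beta[1:] are the covariate coefficients.
    return beta[0] + sum(b * x for b, x in zip(beta[1:], x_i))

def fit_logistic(X, t, lr=0.1, n_iter=5000):
    """Fit P(T=1 | X) by gradient descent on the mean negative log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x_i, t_i in zip(X, t):
            err = sigmoid(linear_predictor(beta, x_i)) - t_i
            grad[0] += err
            for j, x in enumerate(x_i):
                grad[j + 1] += err * x
        beta = [b - lr * g / n for b, g in zip(beta, grad)]
    return beta

def predict_propensity(beta, X):
    """Predicted probabilities of exposure, i.e. the propensity scores."""
    return [sigmoid(linear_predictor(beta, x_i)) for x_i in X]

# Toy data: one covariate, exposure as the binary outcome (values made up).
X = [[-2], [-1], [-1], [0], [0], [1], [1], [2]]
t = [0, 0, 1, 0, 1, 0, 1, 1]

beta = fit_logistic(X, t)
ps = predict_propensity(beta, X)
print([round(p, 3) for p in ps])
```

Every fitted score lies strictly between 0 and 1, and the scores increase with the covariate because exposure is more common at higher values in the toy data.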