Weighting
Propensity score, matching, weighting
Matching
Exact matching:
- perfect covariate balance; \(F(X_i|T_i = 1) = F(X_i|T_i=0)\)
- infeasible when covariate is continuous, and when there are many covariates.
Probability of receiving treatment, \(\pi(X_i) = P(T_i = 1 | X_i)\)
Matching based on distance measures
- Mahalanobis distance
- Estimated propensity score, \(D(X_i, X_j) = |P(T_i = 1|X_i) - P(T_j=1 | X_j)|\)
Check covariate balance
- ideally compare joint distribution of all covariates
- practically check lower-dimensional summaries (standardized mean difference, variance ratio, empirical CDF difference)
Balance test
Matching would reduce number of observations
MatchIt
package
PS matching is ONE of the many matching techniques that uses PS as the difference.
Weighting
Weighting can be viewed as a generalization of matching
Used in weighting
ATE (population) \(w_{ATE} = \frac{Z_i}{p_i} + \frac{1-Z_i}{1-p_i}\)
ATT \(w_{ATT} = \frac{p_i Z_i}{p_i} + \frac{p_i(1-Z_i)}{1-p_i}\)
ATC \(w_{ATC} = \frac{(1 - p_i) Z_i}{p_i} + \frac{(1 - p_i)(1-Z_i)}{1-p_i}\)
Inverse probability weighting IPW
(In my use case it is for survey sampling, rather than causal inference. My project does not have treatment group, and do not care about treatment effects. )
Propensity score
(Rosenbaum and Rubin, 1983)
- in observational studies, conditioning on propensity scores can lead to unbiased estimates of the exposure effect
- given that there are no unmeasured confounders
- every subject has a non-zero probability of receiving exposure
Fit a logistic regression:
- binary outcome: exposure (1,0)
- covariates: all but exposure
Predict the values (probability), they are the propensity scores.
PS can also be estimated using other methods that produce probabilities, not just logistic regression: random forest, lasso logistic regression etc.