Notes from book: What If (Part 1)

Causal inference

Causal inference notes: chapter 1 to 10

Author

Chi Zhang

Published

August 18, 2024

Objective for reading the first 10 chapters: build an overview of the concepts at a high level, so that details can fit in the big picture more easily.

Chapter 1 - 3

(Notes are written on paper)

Keywords:

  • exchangeability, conditional exchangeability
  • marginally randomized experiment, conditional and unconditional randomization
  • the equivalence of causal risk difference (or ratio) to associational risk difference
  • effect, and difference between standardization and IP weighting

Chapter 4 Effect Modification

A null average causal effect in the population does not imply a null average causal effect in a particular subset of the population, such as males and females.

Effect modifier: V is a modifier of the effect of A on Y, when the average causal effect on A on Y varies across levels of V.

Stratification is the natural way to identify effect modification. Effect modification is also called effect heterogeneity across strata of V.

Why care about it? If V modifies the effect of A on Y, then the average effect will differ between populations with different prevalence of V, e.g. the effect (size and direction) is more pronounced in the larger group. Also, it helps to identify the group that would benefit the most from an intervention.

When counterfactual outcomes \(Y^{a=1}, Y^{a=0}\) are unavailable, only observed outcomes are available: the use of stratification to detect effect modification will depend on the study design.

  • in an ideal marginally randomized experiment with unconditional randomization, the causal RD is the associational RD. Simply carry out a stratified analysis.

Summary

  • IP weighting and standardisation can be used to compute either marginal or conditional effects;
  • stratification / restriction and matching can only be used to compute conditional effects in certain subsets of the population.

All four approaches require exchangeability and positivity.

When there is no effect modification, the effect measures (RR, RD) computed via these four approches will be equal.

Chapter 6 Graphical representation

DAG: directed acyclic graphs. No cycles: a variable can not cause itself.

Causal diagram guides the data analysis. The opposite direction is called ‘causal discovery’.

For marginally randomized experiment, variable L is not in the DAG as L does not affect the assignment. Example: aspirin use A has preventative causal effect on risk of heart disease Y.

flowchart LR
  A[A] --> Y{Y}

For conditionally randomized experiment, L is a confounder so needs to be in the DAG (has arrow to both A and Y), as L affects assignment. Example: carrying a lighter A has no causal effect on risk of lung cancer Y; cigarette smoking L has a causal effect on both carrying a lung cancer Y and carrying a light A. If one examines the association between Y and A, there is an association linked by L.

flowchart LR
  L(L) --> A[A]
  A[A] --> Y{Y}
  L(L) --> Y{Y}
flowchart LR
  L(L) --> A[A]
  L(L) --> Y{Y}

DAG when L is a confounder (common cause)

Example: genetic haplotype A has no causal effect on becoming cigarette smoker Y. Both the haplotype A and cigarette smoking Y has a causal effect on the risk of heart disease L. L here is a collider. A and Y are independent.

flowchart LR
  A[A] --> L(L)
  Y{Y} --> L(L)

DAG when L is a collider (common effect)

Marginally association or not

Two variables are marginally associated if

  • one causes the other; or
  • they share common causes (linked by common causes)

Otherwise they are marginally independent (collider, linked by common effect). Note that if now you condition on levels of the collider, it opens the path and brings in association.

Conditional independence

d-separation: two variable are d-separated if all paths between them are blocked. d stands for directional.

flowchart LR
  A[Aspirin] --> B(Platelet aggregation)
  B(Platelet aggregation) --> Y{Heart disease} 

A is marginally associated with Y. A affects Y only through B; hence knowing levels of B is enough to predict Y, there no information added by knowing A. A and Y are conditionally independent given B.

flowchart LR
  L(Smoker) --> A[Carrying a lighter]
  L(Smoker) --> Y{Lung cancer}

L here is a confounder. A and Y are marginally associated, however A and Y are conditionally independent given L (i.e. within each level of L). For non-smokers (L=0), knowing that he carries a lighter or not does not help predict the risk of lung cancers; as we assume people carrying lighter tend to be smokers and we already know the smoking status, no information needed for whether they carry a lighter (opposite direction).

flowchart LR
  A[haplotype] --> L(heart disease)
  Y{smoker} --> L(heart diseas)

The path from A to Y is blocked by collider L, however A and Y will be conditionally associated within levels of their common effect L. Whether two variables (causes) are associated can not be influenced by an event in the future (effect), but they generally become associated once stratifying on the common effect. This is related to selection bias.

Chapter 7 Confounding

Confounding is the main shortcoming of observational studies.

Backdoor path: a non causal path between A and Y. \(A \leftarrow L \rightarrow Y\) that links A to Y through their common cause L is a backdoor path.

flowchart LR
  L(L) --> A[A]
  A[A] --> Y{Y}
  L(L) --> Y{Y}
  • When exchangeability holds, meaning that \(Y^a\) is independent from A, all individuals have the same probability of receiving treatment as in a marginally randomized experiment, the average causal effect \(E[Y^{a=1}] - E[Y^{a=10}]\) is equivalent to \(E[Y|A=1] - E[Y|A=0]\).
  • When exchangeability does not hold, but conditional exchangeability holds (conditional on L), need to adjust for L via standardization or IP weighting to get the population causal effect.

### Backdoor criterion

Minimal adjustment sets

Confounding adjustment

Adjustment or controlling for: any technique that removes the effect of the variables we are not interested in.

  • G-methods (generalized): standardization, IP weighting, g-estimation
  • Stratification based methods: stratification, matching