Clinical trial design


Notes related to clinical trial design. Constantly updating.


Chi Zhang


June 1, 2023


Types of trials designs

Phase 1: 10-30, identify tolerable dose, information on drug metabolism, extretion and toxicity. Often not controlled

Phase 2: 30-100, efficacy, safety and side effects,

Phase 3: 100+, often randomized

Phase 4: demonstration

Types of design

Population have the disease outcome of interest; not healthy voluteers vs diseased.

Randomisation unit: persons, two eyes of a person, or groups of persons

Comparison structure: parallel, crossover, group allocation

  • Parallel: simultaneous treatment and control groups, subjects randomly assigned to one group.
  • Crossover: randomize of order in which treatments are received; TC or CT. Each patient is his/her own control. Washout period: time between two treatments.
    • Variability reduced because less variability within patient than between patients. Fewer patients needed.
    • Disadvantages: only certain treatments can use crossover design, treatment can’t have permanent effects. Carry-over effects from first period; washout needs to be long enough. Dropouts more significant, analysis may be more difficult: correlated outcomes.
    • Constant intensity of underlying disease: chronic diseases (e.g. asthma, hypertension, arthritis) + short-term treatment effects (relief of signs or symptoms)
    • e.g. morning dose vs evening dose
  • Group allocation: a group of subjects (community, school, clinic).

Extensions of the parallel design: factorial, large simple

  • Factorial: two interventions tested simultaneously. Can be presented in a 2 by 2 table (treatment A +-, treatment B +-); or 3 by 2 etc.
    • Interested in main effect (if no interaction expected). A vs no A; B vs no B. The other treatment doesn’t matter.
  • Large simple: large number of patients, possibly from many study sites.

Tests other than superiority

  • Equivalency: intervention response is close to control group response

  • Non-inferiority: Treatment A (new) is at least as good as B (established). One-sided test, if A is worse than B, one can be rejected. Does not require as big sample size.

Adaptive design

Possible adaptations

  • randomization probabilities
  • sample size (e.g. group sequential methods)
  • visit schedule: shorten/lengthen follow-up time, change number of timing of visits, treatments (dose/duration, concomitant meds)
  • hypothesis tested

Randomisation and masking


  • avoid selection bias: prognostic factors related to treatment assignment
  • tends to produce comparable treatment groups


Simple randomization, restricted randomization, adaptive randomization

Simple rz

Each assignment is unpredictable, number of patients in each group should be equal in the long run.

Risks: imbalances in number assigned to treatment groups, or confounding factors (gender, disease severity) -> reduced power

Restricted rz

Schemes with constraints to produce expected assignment ratio

  • blocking
  • stratification

Blocking. Block of size 2 with treatment allocation ratio 1:1: A,B. Size 4: 2As, 2Bs. Need to be permuted: AABB, ABAB, … in total 6 combinations. Then choose one of the permutations.

Stratification. Ensure balance in treatment assignments with subgroups defined before rz. Limit to a few variables (highly related to outcome and/or logistical): e.g. clinic in a multicenter trial, surgeon (skills, procedures), stage of disease, demographic such as gender and age.

Use these two together.

Adaptive rz

Probability of assignment does not remain constant, but determined by the current balance and composition of the groups.

  • minimization: choose the design that gives the smallest imbalance.
  • play the winner: change allocation ratio or favor the better treatment based on the primary outcome. Need to evaluate outcomes relatively quickly.

Masking (blinding)

Treatment assignment is not known after rz.

  • patient, clinical personnel, evaluators, data processors, …
  • single (only participant), double (+ investigator), triple (+ data processors, …), quadruple …

Purpose: remove bias related to treatment effects.

Different levels of masking protects to different extent against bias in different aspects

  • data reporting
  • data collection / follow-up
  • testing, behaviors
  • outcome assessment

Decision to mask treatments

  • ethical?
  • possible? can you make the treatment seem identical so the participants do not know?
  • trial design features: more important to mask subjective ones (e.g. alive or dead is the least subjective, hence wouldn’t benefit much; however if participants need to report effects that are not objectively measureable, they might report that treatment is better in contrast to placebo group)
  • feasible? cost-benefit, practicality (adherence)

Sometimes investigators in a double blind study might know which treatment is being assigned to participants, if the effect of drug is very obvious (both good or bad).


  • Planned: inform participants once the trial finished
  • Unplanned (discouraged): in the event of adverse event

Outcomes and analysis

Outcome: endpoint. It is a quantitaive measure.

Objectives of the trial

  • efficacy / effectiveness
  • safety
  • process
  • costs

Example: evaluate treatment for asthma

Outcomes: exhaled nitrous oxide, lung function (spirometry measures), asthma symptoms (wheezing, night awakenings), …

Example: evaluate a procedure to reduce perioperative morbidity

Outcome considerations: time window (what is postoperative), specific events to be considered an outcome, procedures to establish outcomes, …

Metrics for events as outcomes

  • dichotomous: 1/0 for presence absense, normal abnormal; clinical state or cut-off value
  • time-to-event: in addition to dichotomous, add time dimension; allow for censoring. More powerful than dichotomous.
  • rates: 1/0 but allow for repeats, analyze count or rate. Events within a person are usually not independent, need to account for it.
  • continuous variables: value or change from baseline; standard units (lab values, scores). Need to define an important difference. Distributional assumptions more important.
  • ordinal scale: ranked categories (e.g. adverse event grading, 1-5). Difference between categories is usually qualitative.

Patients opinions are subjective

  • health status / change in status, e.g. pain relief, quality of life
  • masking is more important
  • hawthorne / placebo effect: effect of being studies, usually positive
  • quantify with standardized scales

Influence of outcomes on design

Efficacy vs effectiveness:

In a vaccine trial, efficacy is the clinical case with lab confirmation; effectivenenss is the clinical case of influenza in a larger population, may or may not be confirmed.

In asthma, efficacy is FEV1, effectiveness is the decrease of the hospitalizations/steroid courses.

Considerations (3Bs)

  • biology: does outcome reflect a clinically relevant fact/change
  • biostatistics: detectable difference between groups is plausible and practical
  • budget: afford total N and can measure it reliably in every participant

Example: HIV trial outcomes

  • survival (deaths; AIDS status)
  • immunologic response
  • virologic response
  • change in patient status (e.g QoL)
  • specified toxicity
  • other side effects

Choice of primary outcome depends on the objectives or stage of research

  • phase 1, emphasis on safety
  • phase 2, short-term efficacy
  • phase 3, long-term efficacy
  • phase 4, long-term effectiveness

Intention to treat ITT

Cross-overs after rz: some patients might have a treatment (yes or no) beyond what they were assigned to, e.g. refuse surgery or medical treatment

Non-adherence during followup: some in treatment group refuse of can not tolerate certain treatment; while some in placebo group require medication or take on their own

Subgroup analysis

Stratified analysis: estimate treatment effect separately in subgroups. Does not tell difference across different subgroups

Test for interaction: use of main effect and interaction.

Issue of multiple testing when doing a series of analyses

  • inflate sample size to plan for subgroup analysis
  • report number of subgroup analyses performed
  • possibly adjust for multiple comparisons
  • report CI instead of just p-values

Reporting results


Consolidated Standards of Reporting Trials

25 item checklist. Not a guideline of how to do the trial; but a guideline for writers on how to report clearly.


  • key design items
  • treatment(s) evaluated
  • disease or population studied


  • design, method, results, conclusions


  • backgrounds, rationale, establish equipoise (balance), systematic review, objectives / hypothesis


  • IRB review and approvals
  • trial design, allocation ratio
  • eligibility criteria
  • setting and location of the trial
  • intervention - detailed enough to allow replication
  • outcomes - primary, secondary, how assessed and defined
  • sample size - how determined, interim analyses
  • important changes during trial
  • randomization, allocation concealment, implementation
  • masking - who, how
  • statistical methods - primary and secondary; subgroup analyses

Convenient to use a flow-chart

Baseline characteristics by treatment group

p-values (common to not report in rct?)

Evaluate literature

  • Legitimacy, is it a fair comparison
  • Trustworthy investigators? conflict of interest
  • Adequate protections against bias
    • randomization
    • masking
    • follow-up design and execution
  • ITT analysis
    • have all events (outcomes) observed been counted in the treatment group assigned?
    • variations in denominators explained and consistent with good practice?
  • appropriate subgroup analysis interpretation
    • ad hoc or post hoc status