Parallel or cross-over? Choice of control - placebo, active control, historical control? Timing and frequency of measurements? Randomization method and blinding?
My works have an exploratory nature - the RCT works I’ve done all seems to be phase II (neither determining the safe dose, nor find deterministic evidence in the wider population) with small to medium sample size.
I have never followed an RCT from start to end, so I’ve never been involved in the design of the studies.
For the designs: mostly parallel designs with single measurement or longitudinal measurements, where subjects are measured multiple times. I’ve done one cross-over study (same patient, first tx then ct).
I have not done any factorial design (multiple treatments at the same time).
Selection criteria: clinicians have their own criteria for relevant subjects, usually age groups, sex and underlying conditions
Randomization: the ones I’ve done are mostly simple randomization, nothing with blocks or stratification
Masking (blinding): my works seem to have open labels, not blinded
Peanut allergy
(Reier-Nilsen 2019)
2 year follow up (T0, T1, T2)
Endpoint: 3 or 5-point likert scale for kids, 7 point likert scale for parents. Points are summed and reported
Selection: Enroll positive food challenge 5-15 yo, 96 in total. Exclude asthma and other severe chronic diseases
Randomization: Remove 19 ineligible, 77 2:1 block OIT vs controls (observation only). Possible reason for 2:1: many would drop out
Blinding: NA
Methadone
(Opdal 2023)
Selection: 18+, not pregnant, no serious illness, willingness to comply
Design: Cross over, 14 days washoff
Randomization: Cross over, subjects are their own controls
Blinding: Cardiologist is blinded to calculate QT measurement
Comparison: on-methadone QTc vs off
Sample size: n=9, clinically relevant difference 20, SD 15 (unit: microsecond), b = 0.9. Can be computed with pwr::pwr.t.test(d=20/15). See more in the next section.
Artery blood flow
(Callender 2023) n = 26; within-subject two legs; three levels of intervention (stimulus)
Selection: 13M, 13F, healthy, between age 18-50, free from medication influencing cardiovascular system
Design: Parallel
Randomization: two legs on the same patient, tx ct
Blinding: NA
Sample size: a=0.05, b=0.8; effect size based on previous works, computed by clinicians themselves
Power and sample size
clinicians provide information on the anticipated effect size, statistician calculate sample size needed to detect effect with adequate power while minimizing type I and II error.
Sample size computation:
Require:
test: t-test (one, two samples), tails
significance level
power
cohen’s d or f - effect size (e.g. standardized mean difference). If unavaiable, can use the Cohen’s guideline for small, medium and large effect sizes: (f = 0.1, 0.25, 0.4)
QoL stigma
(Simonsen 2023) n = 32; T0, T1, T2; 1:1; justify SS computation
If the study has a paired design with effect size (standardized after substracting the standard deviation) 0.5,
pwr::pwr.t.test(d =0.5, sig.level =0.05, power =0.8, type ='one.sample', alternative ='two.sided')
One-sample t test power calculation
n = 33.36713
d = 0.5
sig.level = 0.05
power = 0.8
alternative = two.sided
Methadone
(Opdal 2023) n=9; effect size provided (as clinically relevant difference and standard deviation); compute SS
In this case study, the computed sample size is as follow,
pwr::pwr.t.test(d =20/15, sig.level =0.05, power =0.9, type ='one.sample', alternative ='two.sided')
One-sample t test power calculation
n = 8.07235
d = 1.333333
sig.level = 0.05
power = 0.9
alternative = two.sided
The effect size is huge, hence the small sample size required.
Post-hoc power computation
It is not recommended, as the information is the same as what p-value provides: a small p-value that indicates a large effect size, that will give a high power.
Power analysis is a design tool, not a diagnostic tool. It is used to make sure that the study has enough samples to detect an effect of a given magnitude and the study has high probability of detecting a true effect if it exists. Once the study is done, analyzing power adds little beyond the p-value.
Observational
AHUS
Retrospective observational study, linked clinical data from EHR.
Selected as subjects have been in the hospital during a certain time (June 2018-19), above a certain age (18) and have been in certain wards (four selected).
Limitation
generalizability: four wards only, pre-covid data
error in data as EHR data goes
unmeasured confounders
COVITA
(Sarjomaa 2024) Case control with population
Aim is to assess risk factors for Covid infection:
compare PCR positive (400) vs negative (719) (case control)
them compare with population control (14509) (pre-pandemic)
Selection: adults who did PCR tests between 2020.2 to 12. PCR+- identified from test centers and hospitals, then contacted by telephone and invited to participate in research project.
Population controls are age matched. Pre-existing Telemark study data of 21-55 yo in 2018.
Results: sex (M) in TND (test negative)
Limitation (bias)
generalizability: location is limited, age group is limited
recall bias as it is questionnaire data; despite minimal missing
bias in healthcare seeking behavior
unmeasured confounders
Alcohol
(Kabashi 2019) Cross sectional, multi-site
Survey
Norkost
Analysis
Tests
t-test, wilcoxon
Multiple testing adjustment
NAV paper
Aim: compare sick leave and primary healthcare consultations in 2023 with pre-pandemic trends
Combined Norsyss (norwegian syndromic surveillance system) and NAV (norwegian labour and welfare administration on sick leave)
Results: general causese were 70% higher than expected, half was explained by the biggest factor is A04 (weakness/tiredness), and R(respiratory). Economic loss is 1.5 billion USD per year.
NAV data: number of working person-days lost to sick leave; weekly new cases
Norsyss data: number of consultation per ICPC2 code
Model for baseline computation
cases per 100k ~ year
cases per 100k ~ year as cubic spline (2 or 3 degrees of freedome)
Lowest AIC selected, then refit and predict
(Multiple testing handling)
compare the excess ratio with 1, produce p-value
then p.adjust(pval_list, method = 'fdr')
only report those below 0.05
Regression
Linear regression / regularization
HAI paper
Early warning system at a collective level, using predictors that are potentially interveneable.
This was a simulation study.
FHI Excess mortality
Bayesian linear regression
count data, but after doing EDA it can be approximated with linear with a single predictor: year
the task is not to explain (i.e. regression coefficient and intervals), but to predict
sometimes more convenient to use the per 100k number of death (or cases)
Logistic regression / ordinal
COVITA
Different results (in terms of risk factors) can come from
study design
different selection for case and control
geographical location (that affect the population)
Comparisons
Study I: 400 PCR+ vs 719 PCR-
Study II: 286 PCR+ (age 18-55) vs 14509 pop
Study III: 509 PCR- (age 18-55) vs 14509 pop -> other infections other than COVID
Missing data handling
low rate of missing (0-6.2% for most questions, roughly 15% for smoking) in questionnaire subjects.
no imputation was done
sensitivity analysis: excluded the missing, do the analysis again and see whether the conclusion differ
more detailed knowledge of highly connected hospital hubs and trajectories for infection prevention and control
Transfer chain analysis, 15258 patient stays that have more than one location. 1118 unique transfer chains (sequence of locations)
Skewed: 75% of patients followed one of the top 20 types: notably, ED to neurology; ED to gastrosurgery …. Many have the pattern of bed - ORblock - bed.
Transfer rate with Poisson regression to identify risk factors contributing to higher number of transfers
higher age, less transfers
higher mean first 48h NEWS, more transfers. (used categories 0-2, 3-4, 5-6 instead of numbers to capture the non-linear relationship)
no gender difference
after adjusting for HDU/ICU and OR, effect of age and NEWS becomes insignificant
interactions between department, surgery and antibiotics use are not significant
Missing data
Complete case only? Imputation? LOCF (last observation carried forward)? IIT vs per protocol