This is a course provided by Genentech (part of Roche) on Coursera.
Course link
Introduction to clinical trial
Clinical trial: aim to demonstrate that drug is safe and effective (safety, efficacy)
- phase 1: 10-20 people, focus on safety (healthy volunteers)
- phase 2: 100, study of side effects, determine best dose
- phase 3: 1000, demonstrate drug efficacy, fuller safety profile (common across multiple regions, ethnicities)
collecting data from different hospitals, hence important to ensure standards are being followed
- evidence must be submitted to health authorities (FDA, EMA European medicines agency)
- health authorities determine whether the drug is submitted to market
Submit the analysis plan in advance
Why share data
- regulatory requirements
- scientific community interest
- company internal research interest
- marketing materials
Data and results sharing
- Regulatory req (e.g. EMA require sharing clinical trial results to gain marketing authorization for pharma products, FDA require sharing data)
- scientific community (peer review check accuracy, perform additional analyses, derive new hypothesis)
- CDISC standards
- CDASH (clinical data acquisition standards harmonization)
- SDTM (study data tabulation model)
- format for ‘raw’ data, define datasets, structures, contents, variable attributes
- SEND (standard for exchange of non clinical data)
- ADaM (analysis data model)
- data format for data processed for analysis (e.g. converted, imputed, derived)
- Dictionary
- e.g. nose congestion, stuffy nose, … need to be standardized
- MedDRA: standard dictionary for medical conditions, events and procedures
- WHO drug dictionary (for pharma agents)
- SAP statistical analysis plan
- based on study protocol, focus on statistical methodology, is regulated
- Programming specification
- based on SAP, provides additional details on datasets and tables, listing and figures (TLFs) required for statistical analysis. focus on programming details. Not regulated
Quality assurance
- Good clinical practice (GCP), issued by ICH
- purpose: prevent mistakes, reduce inefficiencies/waste in a process, increase reliability/trustworthiness of the product of a process
- Clinical monitoring: performed by a clinical research assistant (CRA) at investigator sites, checks that study protocols are executed as intended, and site processes result in accurate data capture. Focus on trial subjects’ safety. Traditional goal: 100% source data verification
- Data quality checks (more relevant for data scientists). Checks data for technical conformance, and data plausibility. Focus on data quality. Traditional goal: 100% accurate and format compliant data
- Code review
- Double programming
Data access restrictions
- reasons
- data collected is very sensitive (health data), need data protection
- clinical trial data is a key asset and revenue predictor for pharma companies, high confidentiality levels
- scientific validity, data is ideally double blinded, no-one should know whether a subjecttreatment is as long as the data is still being collected
- pseudonymization: data de-identification
- use pseudonym (ID), link is recorded to allow re-identification
- anonymization: limit the risk of re-identification
- remove variables, remove values, replace more precise values with more general categories, replace personal identifiers with random identifiers
- FSP, CRO (out-sourcing), personnels require data access at different levels
- Unblinding