CDISC SDTM and ADaM: The Standard Structure for CRF Data and Their Essential Role in Clinical Trials

Christian Baghai
May 17, 2023

Clinical trials are complex endeavors that generate vast amounts of data. Organizing, interpreting, and presenting that data in a coherent and traceable manner is paramount. To that end, the Clinical Data Interchange Standards Consortium (CDISC) has developed two structures, the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM), to standardize and streamline the process.

CDISC SDTM: The Standard Structure for Collected CRF Data

The CDISC SDTM is rapidly becoming the standard structure for collected Case Report Form (CRF) data. It contains relatively few derived fields, such as baseline flags, study day, and subject reference start date. Biostatisticians and statistical programmers, however, often need a much wider range of derived fields for analysis, so an early pilot project attempted to incorporate all of the derived fields required for analysis into the SDTM structure.
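To make one of these derived fields concrete, the study day can be computed from an event date and the subject reference start date (RFSTDTC). The sketch below assumes complete ISO 8601 dates (YYYY-MM-DD) and follows the SDTM convention that there is no study day 0. The function name `study_day` is hypothetical and chosen only for illustration.

```python
from datetime import date


def study_day(event_date: str, rfstdtc: str) -> int:
    """Return the study day of event_date relative to rfstdtc.

    Both arguments are assumed to be complete ISO 8601 dates;
    partial dates are not handled in this sketch.
    """
    delta = (date.fromisoformat(event_date) - date.fromisoformat(rfstdtc)).days
    # On or after the reference start date, numbering starts at 1.
    # Before it, days are negative. Day 0 never occurs.
    return delta + 1 if delta >= 0 else delta


print(study_day("2023-05-17", "2023-05-17"))  # 1
print(study_day("2023-05-16", "2023-05-17"))  # -1
print(study_day("2023-05-27", "2023-05-17"))  # 11
```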

That pilot effort faced several challenges, which led to the creation of a separate structure, ADaM, to serve as the final source for the analysis metadata.

The Relationship Between SDTM and ADaM

The relationship between SDTM and ADaM is what makes results traceable from statistical output back to the analysis datasets and then to the raw data. Several SDTM variables are carried directly into the ADaM datasets used for analysis. These variables must be copied without modification: the variable name, the variable attributes, and the variable values all remain the same.

ADaM builds on the nomenclature of SDTM but adds attributes and variables needed for statistical analyses.
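The copy-without-modification rule lends itself to an automated check. The following is a minimal sketch, assuming each record is available as a plain Python dictionary. The function name `check_carried_variables` and the sample records are hypothetical, and the check covers values only, not attributes such as labels or lengths.

```python
def check_carried_variables(sdtm_row: dict, adam_row: dict, carried: list[str]) -> list[str]:
    """Return the carried variables that are missing or changed in the ADaM row."""
    problems = []
    for var in carried:
        if var not in adam_row or adam_row[var] != sdtm_row.get(var):
            problems.append(var)
    return problems


# Hypothetical records: one DM row from SDTM and the matching ADSL row.
dm_row = {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "SEX": "F", "AGE": 54}
adsl_row = {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "SEX": "Female", "AGE": 54, "SAFFL": "Y"}

# SEX was recoded on the way into ADaM, which breaks the rule.
print(check_carried_variables(dm_row, adsl_row, ["STUDYID", "USUBJID", "SEX", "AGE"]))
# ['SEX']
```

If a recoded value is needed for analysis, it belongs in a new variable rather than in the carried one.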

Differences Between SDTM and ADaM

Although SDTM and ADaM are complementary, they differ in several key aspects:

  1. Verticality: ADaM datasets are not always vertical. This is more common under ADaMIG v0.7 and less so under ADaMIG v1.0.
  2. Redundancy for Analysis: ADaM datasets use redundancy for easy analysis. Common variables, such as population flags and subject identifiers, are found across all analysis datasets.
  3. Numeric Variables: ADaM datasets have a greater number of numeric variables, such as SAS-formatted dates and numeric representation of a character grouping variable from SDTM.
  4. Combination of Variables: ADaM datasets may combine variables across multiple domains.
  5. Naming: ADaM dataset names start with AD, followed by up to six characters (AD<xxxxxx>), as in ADSL.
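The numeric-variable point can be illustrated with a small derivation. This is only a sketch: it assumes a complete ISO 8601 start date in the SDTM variable AESTDTC, relies on the fact that SAS date values count days from January 1, 1960, and uses a hypothetical numeric coding for severity. The function names and the sample record are assumptions, not part of either standard.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS date values are days since this date

# Hypothetical numeric coding of a character grouping variable.
AESEV_TO_AESEVN = {"MILD": 1, "MODERATE": 2, "SEVERE": 3}


def to_sas_date(iso_date: str) -> int:
    """Convert a complete ISO 8601 date to a numeric SAS date value."""
    return (date.fromisoformat(iso_date) - SAS_EPOCH).days


def add_numeric_variables(sdtm_row: dict) -> dict:
    """Return a copy of the row with numeric companion variables added."""
    out = dict(sdtm_row)  # original SDTM values are kept unchanged
    out["ASTDT"] = to_sas_date(sdtm_row["AESTDTC"])
    out["AESEVN"] = AESEV_TO_AESEVN[sdtm_row["AESEV"]]
    return out


ae_row = {"USUBJID": "ABC-001-0001", "AESTDTC": "2020-01-01", "AESEV": "MODERATE"}
adae_row = add_numeric_variables(ae_row)
print(adae_row["ASTDT"], adae_row["AESEVN"])  # 21915 2
```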

Metadata Components

The documentation of the analysis datasets provides a concise link from the CRF data to the analyses defined within the statistical analysis plan.

Analysis Dataset Metadata

The analysis dataset metadata provides information describing each analysis dataset, including its name (always beginning with “AD” as a prefix), a description, its location, its structure, its purpose, key variables, and documentation.
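One way to picture this metadata is as a simple record type. The sketch below is an assumption-laden illustration rather than a CDISC-defined schema: the class name, field names, and sample values are hypothetical, and the name check assumes the AD prefix followed by one to six uppercase letters or digits.

```python
import re
from dataclasses import dataclass, field

# Assumption: "AD" plus one to six uppercase letters or digits.
ADAM_NAME = re.compile(r"AD[A-Z0-9]{1,6}")


@dataclass
class AnalysisDatasetMetadata:
    name: str
    description: str
    location: str
    structure: str
    purpose: str
    key_variables: list[str] = field(default_factory=list)
    documentation: str = ""

    def __post_init__(self):
        # Reject names that do not carry the AD prefix.
        if not ADAM_NAME.fullmatch(self.name):
            raise ValueError(f"{self.name!r} does not follow the AD naming convention")


adsl_meta = AnalysisDatasetMetadata(
    name="ADSL",
    description="Subject-Level Analysis Dataset",
    location="adsl.xpt",            # hypothetical file name
    structure="One record per subject",
    purpose="Analysis",
    key_variables=["STUDYID", "USUBJID"],
    documentation="adsl_spec.pdf",  # hypothetical document name
)
print(adsl_meta.name, adsl_meta.key_variables)
# ADSL ['STUDYID', 'USUBJID']
```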

Analysis Variable Metadata

Analysis variable metadata describes the variables within the analysis dataset. It includes the dataset name, variable name, variable label, variable type, variable length, decodes (format name and values if applicable), origin, and the variable’s role in the analysis.
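Variable-level metadata can also drive simple conformance checks on the data itself. The following is a minimal sketch under stated assumptions: the class `AnalysisVariableMetadata`, the helper `conforms`, and the type labels "Char" and "Num" are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AnalysisVariableMetadata:
    dataset: str
    name: str
    label: str
    var_type: str            # "Char" or "Num" in this sketch
    length: int
    decodes: Optional[dict]  # allowed values and their decoded text, if any
    origin: str
    role: str


def conforms(value, meta: AnalysisVariableMetadata) -> bool:
    """Check a single value against the type, length, and decodes in the metadata."""
    if meta.var_type == "Char":
        ok = isinstance(value, str) and len(value) <= meta.length
    else:
        ok = isinstance(value, (int, float))
    if ok and meta.decodes:
        ok = value in meta.decodes
    return ok


saffl = AnalysisVariableMetadata(
    dataset="ADSL", name="SAFFL", label="Safety Population Flag",
    var_type="Char", length=1, decodes={"Y": "Yes", "N": "No"},
    origin="Derived", role="Population flag",
)

print(conforms("Y", saffl))    # True
print(conforms("Yes", saffl))  # False, longer than the declared length
print(conforms("X", saffl))    # False, not among the decodes
```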

Analysis Results Metadata

The analysis results metadata gives reviewers a link from a result in the report to metadata describing the analysis, the reason for the analysis, the analysis dataset(s), and the program(s) used.
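In practice this link amounts to a lookup from a display in the report to the datasets and programs behind it. The sketch below illustrates the idea. The class, the `trace` helper, the table number, and the dataset and program names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AnalysisResultMetadata:
    display: str         # table, listing, or figure identifier in the report
    result: str          # short description of the analysis
    reason: str          # why the analysis was performed
    datasets: list[str]  # analysis dataset(s) used
    programs: list[str]  # program(s) that produced the result


catalog = [
    AnalysisResultMetadata(
        display="Table 14.2.1",
        result="Primary efficacy analysis",
        reason="Pre-specified in the statistical analysis plan",
        datasets=["ADSL", "ADEFF"],
        programs=["t_eff_primary.sas"],
    ),
]


def trace(display: str, entries: list[AnalysisResultMetadata]) -> list[AnalysisResultMetadata]:
    """Return every metadata entry recorded for the given display."""
    return [entry for entry in entries if entry.display == display]


for entry in trace("Table 14.2.1", catalog):
    print(entry.datasets, entry.programs)
# ['ADSL', 'ADEFF'] ['t_eff_primary.sas']
```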

ADSL: The Minimum Requirement for ADaM

The minimum requirement for ADaM is that a subject-level dataset named ADSL exists. The purpose of ADSL is to provide a single location for key information for each subject in the trial. It contains a range of required and common variables, including study identifiers, subject demographics, population indicators, treatment variables, and trial dates.
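Because ADSL holds one record per subject, a basic structural check is straightforward. The sketch below assumes rows are plain dictionaries. The function name `check_adsl` and the sample data are hypothetical, and the list of required variables is an illustrative subset rather than the full set defined by ADaM.

```python
from collections import Counter

# Illustrative subset only, not the complete list of required ADSL variables.
REQUIRED = ["STUDYID", "USUBJID"]


def check_adsl(rows: list[dict]) -> list[str]:
    """Report duplicate subjects and rows with missing required variables."""
    issues = []
    counts = Counter(row.get("USUBJID") for row in rows)
    for subject, n in counts.items():
        if n > 1:
            issues.append(f"{subject}: {n} records, expected 1")
    for i, row in enumerate(rows):
        missing = [var for var in REQUIRED if not row.get(var)]
        if missing:
            issues.append(f"row {i}: missing {', '.join(missing)}")
    return issues


adsl = [
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AGE": 54, "SAFFL": "Y"},
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0002", "AGE": 61, "SAFFL": "Y"},
    {"STUDYID": "ABC-001", "USUBJID": "ABC-001-0001", "AGE": 54, "SAFFL": "N"},
]
print(check_adsl(adsl))
# ['ABC-001-0001: 2 records, expected 1']
```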

In conclusion, the CDISC SDTM and ADaM structures are integral to the organization and analysis of clinical trial data. While SDTM standardizes the structure of collected CRF data, ADaM adds the attributes and variables needed for statistical analyses. Together, they ensure that data from clinical trials is organized, traceable, and ready for analysis.
