Merging Children's Oncology Group Data with an External Administrative Database Using Indirect Patient Identifiers: A Report from the Children's Oncology Group.

Title: Merging Children's Oncology Group Data with an External Administrative Database Using Indirect Patient Identifiers: A Report from the Children's Oncology Group.
Publication Type: Journal Article
Year of Publication: 2015
Authors: Li Y, Hall M, Fisher BT, Seif AE, Huang Y-S, Bagatell R, Getz KD, Alonzo TA, Gerbing RB, Sung L, Adamson PC, Gamis A, Aplenc R
Journal: PLoS One
Volume: 10
Issue: 11
Pagination: e0143480
Date Published: 2015
ISSN: 1932-6203
Abstract

PURPOSE: Clinical trials data from National Cancer Institute (NCI)-funded cooperative oncology group trials could be enhanced by merging with external data sources. Merging without direct patient identifiers would provide additional patient privacy protections. We sought to develop and validate a matching algorithm that uses only indirect patient identifiers.

METHODS: We merged the data from two Phase III Children's Oncology Group (COG) trials for de novo acute myeloid leukemia (AML) with the Pediatric Health Information Systems (PHIS) database. We developed a stepwise matching algorithm that used indirect identifiers including treatment site, gender, birth year, birth month, enrollment year, and enrollment month. Results from the stepwise algorithm were compared against the direct merge method that used date of birth, treatment site, and gender. The indirect merge algorithm was developed on AAML0531 and validated on AAML1031.
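The abstract names the indirect identifiers but does not spell out the individual steps of the algorithm. The following is only a minimal sketch of what a stepwise match could look like, under stated assumptions: the field names, the order in which keys are relaxed (`STEPS`), and the rule of accepting only one-to-one matches at each step are all hypothetical, as is the use of an admission date in the administrative data as a stand-in for enrollment year and month.

```python
from collections import defaultdict

# Hypothetical field names; the abstract lists the identifiers but gives no schema.
ALL_KEYS = ("site", "gender", "birth_year", "birth_month",
            "enroll_year", "enroll_month")

# Assumed relaxation order, strictest first. The published steps may differ.
STEPS = (
    ALL_KEYS,
    ("site", "gender", "birth_year", "birth_month", "enroll_year"),
    ("site", "gender", "birth_year", "birth_month"),
)


def stepwise_match(trial_rows, admin_rows, steps=STEPS):
    """Return {trial_id: admin_id}, keeping only one-to-one matches per step."""
    matches = {}
    used_admin = set()
    for keys in steps:
        # Index the still-unmatched records on both sides by the current key set.
        trial_index = defaultdict(list)
        admin_index = defaultdict(list)
        for row in trial_rows:
            if row["id"] not in matches:
                trial_index[tuple(row[k] for k in keys)].append(row["id"])
        for row in admin_rows:
            if row["id"] not in used_admin:
                admin_index[tuple(row[k] for k in keys)].append(row["id"])
        # Accept a key only when exactly one record on each side carries it.
        for key, trial_ids in trial_index.items():
            admin_ids = admin_index.get(key, [])
            if len(trial_ids) == 1 and len(admin_ids) == 1:
                matches[trial_ids[0]] = admin_ids[0]
                used_admin.add(admin_ids[0])
    return matches


# Toy records (invented for illustration, not study data).
trial = [
    {"id": "T1", "site": "A", "gender": "F", "birth_year": 2005,
     "birth_month": 3, "enroll_year": 2008, "enroll_month": 6},
    {"id": "T2", "site": "A", "gender": "M", "birth_year": 2001,
     "birth_month": 11, "enroll_year": 2009, "enroll_month": 1},
]
admin = [
    {"id": "P9", "site": "A", "gender": "F", "birth_year": 2005,
     "birth_month": 3, "enroll_year": 2008, "enroll_month": 6},
    {"id": "P7", "site": "A", "gender": "M", "birth_year": 2001,
     "birth_month": 11, "enroll_year": 2009, "enroll_month": 2},
]
print(stepwise_match(trial, admin))  # {'T1': 'P9', 'T2': 'P7'}
```

In this toy run, T1 matches on all six identifiers in the first step, while T2 matches only in the second step, once enrollment month is dropped. Ambiguous keys (more than one candidate on either side) are left unmatched rather than guessed.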

RESULTS: Of 415 patients enrolled on the AAML0531 trial at PHIS centers, the indirect stepwise algorithm matched 378 (91.1%). Of these 378 matches, 362 (95.7%) were concordant with the direct merge result. In validation on the AAML1031 trial, the algorithm matched 157 of 165 patients (95.2%), and 150 of those 157 matches (95.5%) were concordant with the direct merge.
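The two figures reported for each trial use different denominators: the match rate is taken over enrolled patients, and concordance is taken over the indirect matches. A small sketch of both calculations follows; the function names and the toy mappings are assumptions made for illustration, not part of the study.

```python
def match_rate(indirect, n_enrolled):
    """Percent of enrolled patients that the indirect merge matched."""
    return 100.0 * len(indirect) / n_enrolled


def concordance(indirect, direct):
    """Percent of indirect matches that agree with the direct merge."""
    if not indirect:
        return 0.0
    agree = sum(1 for trial_id, admin_id in indirect.items()
                if direct.get(trial_id) == admin_id)
    return 100.0 * agree / len(indirect)


# Toy mappings of trial ID -> administrative ID (invented for illustration).
indirect = {"T1": "P9", "T2": "P7", "T3": "P4"}
direct = {"T1": "P9", "T2": "P7", "T3": "P5", "T4": "P2"}

print(match_rate(indirect, n_enrolled=4))  # 75.0
print(round(concordance(indirect, direct), 1))
```

Here three of four enrolled patients are matched indirectly, and two of those three matches point to the same administrative record as the direct merge.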

CONCLUSIONS: These data demonstrate that patients enrolled on COG clinical trials can be successfully merged with PHIS administrative data using a stepwise algorithm based on indirect patient identifiers. The merged data sets can be used as a platform for comparative effectiveness and cost effectiveness studies.

DOI: 10.1371/journal.pone.0143480
Alternate Journal: PLoS ONE
PubMed ID: 26606521
PubMed Central ID: PMC4659568
Grant List: 1R01 CA16527 / CA / NCI NIH HHS / United States
U10 CA098543 / CA / NCI NIH HHS / United States
U10 CA180886 / CA / NCI NIH HHS / United States
U10 CA180899-02 / CA / NCI NIH HHS / United States
U10 CA98543-08 / CA / NCI NIH HHS / United States