
Using the Literature to Construct Causal Models for Pharmacovigilance

EasyChair Preprint no. 158

6 pages. Date: May 23, 2018


Causal discovery methods provide a means to ascertain causal attribution from observational data. Causal modeling at scale requires a method to populate models with relevant domain knowledge. We propose to use the biomedical literature to perform feature selection for drug/adverse drug event (ADE) models, with clinical observational data derived from electronic health records (EHR) as our primary input data source. We reason that spurious (non-causal) drug-ADE associations from co-occurrence-based analyses should diminish conditional on sets of validated confounders identified in the literature. To evaluate this hypothesis, we used a publicly available reference data set to test the proposed methodology with 4 ADEs and 399 drug-ADE pairs. We calculated baseline scores using the rank order of the regression coefficients for each drug-ADE pair. We then identified confounding variable candidates for each drug-ADE pair using relationship constraints based on normalized predicates to search knowledge extracted from the literature in the publicly available SemMedDB repository. To determine eligibility for inclusion, we checked whether there were directed edges pointing to both the drug and the ADE. Finally, we tested whether associations from co-occurrence in the clinical data are diminished conditional on sets of permutations of confounders identified in the literature. The confounder yield rate was approximately 90%, indicating that our method successfully identified confounders in the observational data. Causal models attained an aggregate performance improvement of approximately 0.07 in area under the curve (AUC) and reduced the False Discovery Rate from 0.50 to 0.38 relative to purely statistical models using unadjusted logistic regression.
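The eligibility check described in the abstract (a candidate qualifies when directed edges point from it to both the drug and the ADE) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the set of predicates treated as causal, and the placeholder concept names are all assumptions made for illustration, and the triples are toy data rather than SemMedDB content.

```python
from collections import defaultdict

# ASSUMPTION: an illustrative set of normalized predicates treated as
# directed, causal-type edges. The preprint's actual constraints may differ.
CAUSAL_PREDICATES = {"CAUSES", "PREDISPOSES"}


def find_confounder_candidates(predications, drug, ade,
                               causal_predicates=CAUSAL_PREDICATES):
    """Return concepts with a directed edge to BOTH the drug and the ADE.

    `predications` is an iterable of (source, predicate, target) triples,
    assumed to be already oriented from cause to effect.
    """
    targets = defaultdict(set)
    for source, predicate, target in predications:
        if predicate in causal_predicates:
            targets[source].add(target)

    # Keep concepts that point to both endpoints, excluding the pair itself.
    return sorted(
        concept
        for concept, reached in targets.items()
        if concept not in (drug, ade) and drug in reached and ade in reached
    )


# Toy, hypothetical triples (placeholder names, not real literature findings).
predications = [
    ("condition_A", "PREDISPOSES", "drug_X"),
    ("condition_A", "CAUSES", "ade_Y"),
    ("condition_B", "CAUSES", "ade_Y"),          # reaches the ADE only
    ("condition_C", "COEXISTS_WITH", "drug_X"),  # predicate not in the causal set
    ("condition_C", "COEXISTS_WITH", "ade_Y"),
    ("drug_X", "CAUSES", "ade_Y"),               # the pair itself is excluded
]

candidates = find_confounder_candidates(predications, "drug_X", "ade_Y")
print(candidates)
```

In this toy example only `condition_A` qualifies, because it is the only concept with causal-type edges to both `drug_X` and `ade_Y`. Candidates selected this way would then serve as adjustment covariates in the regression model for that drug-ADE pair.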

Keyphrases: adverse drug event, Adverse Drug Reaction, causal modeling, causality, clinical data, confounding, discovery pattern, domain knowledge, drug ade, drug ade relationship, Electronic Health Record, Electronic Health Records, feature selection, literature-based discovery, Natural Language Processing, observational clinical data, observational data, observational medical outcome partnership, observational studies, Pharmacovigilance, Predication-based Semantic Indexing, statistical model, Unified Medical Language System

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:158,
  author = {Scott Malec and Assaf Gottlieb and Elmer Bernstam and Trevor Cohen},
  title = {Using the Literature to Construct Causal Models for Pharmacovigilance},
  howpublished = {EasyChair Preprint no. 158},
  doi = {10.29007/3rfr},
  year = {EasyChair, 2018}}