Comparison of SDRs detected both in Social Media and in the VigiBase spontaneous reporting system

Natalie Gavrielov, Data2Life, Tel Aviv, Israel
Marie-Laure Kurzinger, Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France
Chihiro Nishikawa, Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France
Chunshen Pan, Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, US
Julie Pouget, Information Technology and Solutions, R&D, Sanofi, Lyon, France
Limor BH Epstein, Data2Life, Tel Aviv, Israel
Yan Golant, Data2Life, Tel Aviv, Israel
Stephanie Tcherny-Lessenot, Epidemiology and Benefit Risk Evaluation, Sanofi, Chilly-Mazarin, France
Stephen Lin, Global Pharmacovigilance, Sanofi, Bridgewater, US
Bernard Hamelin, Medical Evidence Generation, Sanofi, Chilly-Mazarin, France
Juhaeri Juhaeri, Epidemiology and Benefit Risk Evaluation, Sanofi, Bridgewater, US

Introduction: While pharmaceutical companies and regulators use various data sources to monitor drug safety, Spontaneous Reporting Systems (SRS) have been the main source for Signal Detection (SD) since the 1960s. The limitations of SRS include underreporting, a delay between adverse event (AE) reporting and analysis, and under-representation of the patient perspective [1]. In recent years, interest has grown in additional data sources, such as electronic health records and social media (SM). SM allows real-time AE analysis and contains large amounts of patient-generated data. Accordingly, SM could potentially be used for SD and provide additional value. AEs may be extracted from SM by various text-processing methods, which differ in processing complexity and in their outputs.

Objectives: 1) To compare the outputs of two text-processing methods, i.e. drug-event co-occurrence and feature-based statistical learning; 2) To detect signals of disproportionate reporting (SDRs) based on these two outputs and compare the results with those obtained from a traditional SRS.

Methods: Data were extracted for zolpidem and insulin glargine over the period 2005-2015. SM was represented by posts extracted from a list of more than 1000 open online patient forums and support groups, and SRS was represented by the WHO VigiBase data. SM analysis included two steps. First, AEs were identified in SM using two distinct methods, namely drug-event co-occurrence [2] and a feature-based statistical learning method (hereafter designated NLP-processing) [2]. Next, SDRs were detected in these two AE populations using the proportional reporting ratio (PRR), the reporting odds ratio (ROR) and the empirical Bayes geometric mean (EBGM).
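For readers unfamiliar with disproportionality measures, the following is a minimal sketch of how PRR and ROR are commonly computed from a 2x2 table of report counts. It is not the authors' implementation: the class and function names, the example counts, and the signal criterion (at least 3 cases and a lower 95% confidence bound above 1) are illustrative assumptions. EBGM is omitted because it requires fitting a prior distribution across all drug-event pairs, which does not reduce to a closed-form expression on a single table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """2x2 table of report counts for one drug-event pair."""
    a: int  # drug of interest, event of interest
    b: int  # drug of interest, all other events
    c: int  # all other drugs, event of interest
    d: int  # all other drugs, all other events

    def __post_init__(self):
        # Assumption: every cell is positive, so no ratio divides by zero.
        if min(self.a, self.b, self.c, self.d) <= 0:
            raise ValueError("all four cells must be positive")


def prr(t: Contingency) -> float:
    """Proportional reporting ratio: [a/(a+b)] / [c/(c+d)]."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: Contingency) -> float:
    """Reporting odds ratio: (a*d) / (b*c)."""
    return (t.a * t.d) / (t.b * t.c)


def _log_ci(estimate: float, se: float, z: float = 1.96) -> tuple[float, float]:
    """Confidence interval computed on the log scale."""
    centre = math.log(estimate)
    return math.exp(centre - z * se), math.exp(centre + z * se)


def prr_ci(t: Contingency) -> tuple[float, float]:
    se = math.sqrt(1 / t.a - 1 / (t.a + t.b) + 1 / t.c - 1 / (t.c + t.d))
    return _log_ci(prr(t), se)


def ror_ci(t: Contingency) -> tuple[float, float]:
    se = math.sqrt(1 / t.a + 1 / t.b + 1 / t.c + 1 / t.d)
    return _log_ci(ror(t), se)


def is_sdr(t: Contingency, min_cases: int = 3) -> bool:
    """Hypothetical criterion: enough cases and lower PRR bound above 1."""
    return t.a >= min_cases and prr_ci(t)[0] > 1.0


# Made-up counts, for illustration only.
example = Contingency(a=20, b=80, c=100, d=9800)
print(f"PRR={prr(example):.2f}  ROR={ror(example):.2f}  SDR={is_sdr(example)}")
```

Different thresholds and different measures can flag different drug-event pairs on the same data, which is one reason SDR counts vary by detection method.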

Results: 5686 AEs were identified in VigiBase, 2500 potential AEs were identified in SM by the co-occurrence method, and 435 were identified by NLP-processing. A large proportion of AEs was identified both in SM and in VigiBase. Specifically, 57% (1433/2500) of the co-occurrence AEs and 93% (403/435) of the NLP-processed AEs were also found among VigiBase AEs. Depending on the signal detection method, the number of SDRs identified both in SM and in VigiBase ranged from 31 to 166 for co-occurrence AEs and from 6 to 40 for NLP-processed AEs.
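The overlap percentages above are shares of the SM-derived AE terms that also appear in VigiBase. A minimal sketch of that calculation follows, assuming AEs from both sources are coded to a common terminology; the function name and the toy term sets are hypothetical, and only the counts in the final lines come from the results above.

```python
def overlap_share(source_terms, reference_terms):
    """Return (shared, total, share) for source terms also found in the reference."""
    source = set(source_terms)
    if not source:
        raise ValueError("source_terms must not be empty")
    shared = source & set(reference_terms)
    return len(shared), len(source), len(shared) / len(source)


# Toy term sets, for illustration only.
sm_terms = {"Abnormal dreams", "Visual acuity reduced", "Headache"}
reference_terms = {"Abnormal dreams", "Headache", "Nausea"}
shared, total, share = overlap_share(sm_terms, reference_terms)
print(f"{shared}/{total} = {share:.0%}")

# Reported counts: share of SM AEs also present in VigiBase.
co_occurrence_pct = round(100 * 1433 / 2500)
nlp_pct = round(100 * 403 / 435)
print(co_occurrence_pct, nlp_pct)
```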

For some SDRs, the number of observations was higher in SM than in VigiBase. Two such SDRs were examined using SM posts, namely “Visual acuity reduced” identified for insulin glargine and “Abnormal dreams” identified for zolpidem (see Table 1).

Conclusions: High consistency was observed between SM and VigiBase data, with some SDRs more frequently captured in SM than in VigiBase. The number of SDRs identified in SM depends on the method used to extract SM AEs and on the signal detection methods.


1. Alatawi YM, Hansen RA. Empirical estimation of under-reporting in the U.S. Food and Drug Administration Adverse Event Reporting System (FAERS). Expert Opin Drug Saf. 2017; 16(7):761-767.
2. Liu J, Zhao S, Zhang X. An ensemble method for extracting adverse drug events from social media. Artif Intell Med. 2016; 70:62-76.

Short Personal Biography

I am a Digital Health and Public Health scientist. For over 10 years I have specialized in medical informatics, which implements Real World Evidence stemming from Big Data repositories. As an epidemiologist, I aspire to find the appropriate use for passively collected datasets (such as Social Media, wearable devices and others) and convert them into actionable insights, from which both Patients and Clinicians will benefit. My work materializes the Patient-Centric approach through Social Media application. I advocate for using Social Media-derived data to improve Patient Safety.
