Presentations, Reports & White Papers

This page is regularly updated with new videos, presentations, documents, articles, white papers and other information from Replica Analytics. Check back often to get the latest resource downloads.  

Reports & White Papers


Practical Synthetic Data Generation

Accelerating AI with Synthetic Data

Evaluating the Utility of Synthetic Data

Executive Update on Synthetic Data

Presentations & Webinars


Experiences Implementing Data Synthesis in a Global Life Sciences Company

June 16, 2021

A webinar from Stephen Bamford at Janssen describing the company's experience implementing synthetic data generation and where it fits among other privacy-enhancing technologies (PETs).

What We Learned About Data Synthesis in 2020

May 20, 2020

In this webinar, Khaled El Emam gives his perspective on how data synthesis changed in 2020: the technology, the market, adoption, practical applications, governance, and the potential of the technology.

Empirical Assessment of Privacy Risks in Data

March 25, 2020

A presentation on our experience attempting to identify individuals in de-personalized datasets, especially health data, with an overview of motivated intruder attacks.

An Introduction to Synthetic Clinical Trial Data

October 4, 2019

How data synthesis provides reliable sharing of clinical trial data for secondary analysis while protecting participant privacy.

Data Synthesis: A Tool for Responsible Data Sharing

June 16, 2021

This presentation by Khaled El Emam provides a general overview of synthetic data generation and its applications. 

Optimal Synthesis of Clinical Trial Data

November 4, 2020

Presentation describing a method for the synthesis of complex clinical trial data and explaining how to tune its hyperparameters, with results presented on multiple oncology clinical trial datasets.

Ten Things I Have Learned About Health Data Monetization

February 12, 2020

Sharing some key experiences working with organizations to monetize their data for commercial, academic, and public interest purposes. A practical journey with many lessons learned.

June 17, 2021

Presentation of Replica Analytics' joint offering with Aridhia: an integrated and managed FAIR data sharing platform.

Synthetic Clinical Trial Data: Use Cases, Methods, and Experiences

December 9, 2020

Sharing experiences with the synthesis of clinical trial data and how that data has been used by a pharmaceutical company.

Stephen Bamford,
Janssen R&D

Lucy Mosquera,
Replica Analytics

Ten Things I Have Learned About De-identification

January 15, 2020

An overview of lessons learned and key observations from more than a decade of developing de-identification and other privacy enhancing technologies and applying them in practice globally.

Replica Analytics Summits & Conferences


2021 Summit - Synthetic Data: The Future of Data Sharing

The Alberta Synthetic Data Project

July 7, 2021

Dean Eurich
Professor, School of Public Health, University of Alberta

Lucy Mosquera
Director of Data Science, Replica Analytics

Uses of Synthetic Data by the Life Sciences Industry

July 7, 2021

Virginie Giroux
Director, Patient Access & HEOR, Merck

Stephen Bamford
Head of Clinical Data Standards & Transparency, Janssen 

Janice Branson
Global Head of Advanced Methodology & Data Science, Novartis

How Synthetic Data will Transform Health Research and Innovation

July 7, 2021

Karen Cuenco
Bill & Melinda Gates Foundation

2020 Summit - Getting Access to COVID-19 Data

Responsibly Sharing COVID-19 Data

July 15, 2020

Dr. Khaled El Emam

The COVID-19 Research Database

July 15, 2020

Jason LaBonte, Ph.D., Datavant

GIS and COVID-19

July 15, 2020

Alex Miller, President, Esri Canada

The Pandemic Menace and the Future of Humanity Through the Eyes of a Realist

July 15, 2020

Gerry Stegmaier, Reed Smith

Technical Reports


Can Synthetic Data Be A Proxy For Real Clinical Trial Data? A Validation Study

April 2021

A detailed case study demonstrating the high utility of synthetic data generation for a colon cancer clinical trial dataset.

Optimizing the Synthesis of Clinical Trial Data Using Sequential Trees

November 2020

This paper describes the basic method for generating synthetic data from small source datasets, such as clinical trial datasets.

Evaluating the Utility of Synthetic COVID-19 Case Data

March 2021

A detailed case study demonstrating the high utility and low privacy risks of synthetic data generation for the Ontario COVID-19 case dataset.

Evaluating Identity Disclosure Risk in Fully Synthetic Health Data

November 2020

Introduces and demonstrates a privacy model for synthetic data that measures attribute disclosure conditional on identity disclosure, combining the two types of disclosure risk within a single framework.