Replica Analytics - An Aetion Company

Session 5: Applications in Complex Healthcare Settings

Title: Generating Synthetic Data for the NHS

Speaker: Jonathan Pearson, Lead Data Scientist, NHS England

Abstract: Synthetic data generation (SDG) is most powerful when it is carried out alongside a defined use case and with clarity around the fidelity, privacy, and fairness of the data generated. However, the right balance of these factors is unclear (especially for healthcare data), and synthetic data projects often fail because one or more of these factors is ill-defined, leaving low confidence in using the generated data. This talk will discuss the range of healthcare use cases where synthetic data could have the most impact and the need to build explainability into the generation process to ensure confidence in the data. We’ll highlight our current successes and failures in generating synthetic data and the lessons learned from both. Finally, we’ll highlight areas that require further development and research, and the associated opportunities.

Title: Generating Synthetic Longitudinal Data

Speaker: Lucy Mosquera, Senior Director of Data Science, Replica Analytics

Abstract: Real-world data provides immense opportunity to facilitate innovative health research; however, data access remains challenging due to privacy constraints. One strategy to facilitate responsible sharing of health data is synthetic data generation. This presentation will describe the challenges of synthesizing complex longitudinal health data and introduce a deep learning model designed to address these challenges. The performance of this model will be illustrated using a recent case study of synthesizing real-world data from a single-payer system. The suitability of the synthetic data will be assessed using privacy metrics, generic utility metrics, and by comparing the analytic results for a specific analysis.

Following the acquisition of Replica Analytics by Aetion, the generative AI technology previously known as Replica Synthesis is now Aetion® Generate and continues to create privacy-enhancing synthetic data.