Synthetic data in finance: Opportunities and challenges
Nicholas Neary, Data Scouting Analyst (London)
Growing data protection and regulation for consumers has changed how alternative data is collected and sourced and, in some cases, has affected the commercialisation of datasets. In this literature review, we summarise a paper that examines how effectively synthetic data mitigates data privacy risks, the increasing business need for more data, and the challenges and pitfalls of synthetic data.
QUICK VIEW
- In the study, synthetic data is defined as data obtained from a generative process that learns the properties of the real data.
- Such processes are strictly different from commonly used data obfuscation techniques (e.g. anonymisation or removing certain sensitive attributes): the intention is to synthesise new samples that replicate the properties of the real data without raising privacy concerns.
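
The distinction drawn in the points above can be illustrated with a minimal sketch. It is not the method used in the paper: the "real" dataset is made up, the column meanings (income, spend) are hypothetical, and a fitted multivariate Gaussian stands in for the generative process purely for simplicity. Obfuscation keeps values from real records, whereas the generative approach learns summary properties and samples entirely new rows. Sampling from a fitted model does not by itself guarantee privacy.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical "real" dataset: 1,000 customers with two correlated attributes
# (say, monthly income and monthly spend). All values are invented.
assumed_mean = np.array([3000.0, 1200.0])
assumed_cov = np.array([[250000.0, 90000.0],
                        [90000.0, 62500.0]])
real = rng.multivariate_normal(assumed_mean, assumed_cov, size=1000)

# Obfuscation: drop the sensitive first column. The remaining column
# still contains values taken directly from real records.
obfuscated = real[:, [1]]

# Generative approach (stand-in): learn properties of the real data,
# here just the mean vector and covariance matrix...
fitted_mean = real.mean(axis=0)
fitted_cov = np.cov(real, rowvar=False)

# ...then sample new rows from the fitted distribution.
synthetic = rng.multivariate_normal(fitted_mean, fitted_cov, size=1000)

# The synthetic sample should roughly preserve the relationship between columns.
real_corr = np.corrcoef(real, rowvar=False)[0, 1]
synthetic_corr = np.corrcoef(synthetic, rowvar=False)[0, 1]
print(f"correlation in real data:      {real_corr:.2f}")
print(f"correlation in synthetic data: {synthetic_corr:.2f}")
```

In practice the generative process would be a richer model than a single Gaussian, but the workflow is the same: fit on real data, then release only samples drawn from the fitted model.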