Navigating data for AI training (UPDATED)

Paris Tung, Associate (London)

Post feature

The quality and diversity of training data can determine whether an AI model succeeds or fails. This article aims to make data sourcing easier by exploring the growing market for AI training data, identifying providers and examining the dynamics of commoditisation. We list several key players that offer, or are working on, off-the-shelf datasets for investment and general AI-building use cases.

Traditional data

We highlight providers that have a track record of selling traditional data for AI model training, that provide enterprise AI, or that responded positively or neutrally to the request. Traditional data includes, but is not limited to, financial products, fundamentals, events (e.g. earnings calls and calendars), fund flows, ESG and macro data.

The following tables are ordered by “availability” from high to low. Availability is our discretionary assessment of how readily institutional investors can acquire and use the data itself. For example, entries that lack actual use cases or have not been confirmed by the vendor are scored “low”.