List of synthetic data startups and companies — 2021

Synthetic data company ecosystem — Updated October 2021

1. Synthetic data providers for structured data

The companies listed below offer synthetic data that is generated from tabular data. It mimics real-life data stored in tables and can be used for behavior, predictive, or transactional analysis.

  • Betterdata: vendor of a privacy-preserving synthetic data solution for AI, data sharing, or product development.
  • Datomize: vendor of a synthetic data solution for developing, training, and testing AI/ML models and applications.
  • Diveplane: vendor of Geminai, a solution to generate synthetic ‘twin’ datasets with the same statistical properties as the original data.
  • Facteus: vendor of Mimic™, a synthetic data engine to synthesize data assets that protect consumer privacy.
  • Generatrix: an AI-based privacy-preserving data synthesizing platform.
  • Gretel: vendor of a synthetic data generation library and APIs for developers and data practitioners.
  • Hazy: vendor of a synthetic data platform for financial institutions that want to conduct data analysis.
  • Instill AI: vendor of a solution for synthetic data generation leveraging Generative Adversarial Networks and differential privacy.
  • Kymera Labs: vendor of Synthetic Data Fabrication Software, a solution that generates new data without relying on the ML/GAN approach.
  • vendor of a synthetic data platform for generating synthetic data using GANs, available in Community, Cloud or Enterprise editions.
  • Mostly AI: vendor of Mostly Generate, a synthetic data generator that provides as-good-as-real, yet fully anonymous data.
  • vendor of synthetic data API solution which is deployed internally.
  • Replica Analytics: vendor of Replica Synthesis, a software solution that ingests data and builds synthesis models to generate synthetic datasets.
  • Sarus technologies: vendor of ML software to help data practitioners leverage sensitive data assets for innovation with privacy guarantees.
  • Sogeti: vendor of Artificial Data Amplifier (ADA), a solution by the Sogeti Testing AI team that generates realistic data based on real data sets.
  • Statice: vendor of a software solution that generates privacy-preserving synthetic data that can be used as a drop-in replacement for an original dataset.
  • Syndata AB: vendor of a synthetic data generator to generate data sets that match the statistical attributes of real data but are entirely synthetic.
  • Synthesized: vendor of a DataOps platform enabling data sharing and collaboration across internal groups, remote teams, and external partners.
  • Syntheticus: Swiss vendor of a platform dedicated to generating synthetic data.
  • Syntho: vendor of AI software for generating synthetic data.
  • Tonic: vendor of a synthetic data generator to mimic production data.
  • Ydata: vendor of a synthesizer that mimics the statistical information of real data in new datasets without transforming the original data.
  • Kerus Cloud: software from Exploristics to generate synthetic datasets for use in various analytics applications in the life sciences sector.
  • MD Clone: vendor of MDClone ADAMS, a self-service data analytics environment enabling healthcare collaboration, research, and innovation.
  • Octopize MD: vendor of “The Avatar solution”, a synthetic data software and service for healthcare data.
  • Pionic: vendor of software to transform medical data into tradable assets without compromising patient privacy.
  • Syntegra: vendor of a synthetic data engine, purpose-built for healthcare, to synthesize replicas of medical data.
  • vendor of an anonymization engine that can be used for data de-identification.
  • BizDatax’s Synthetic Data Generator: a data masking solution for production data with a synthetic data generator by Ekobit.
  • Test Data Manager: a test data management solution from Broadcom with synthetic data capabilities for production data.
  • Chatterbox’s Synthetic Data Generator: AI software with a synthetic data generator to validate AI models.
  • Curiosity software synthetic test data: a test data automation solution to generate test data.
  • ExactData: vendor of Smart Data, a solution to reduce cost and time to develop, test, deploy, train, and maintain data processing systems.
  • GenRocket: vendor of enterprise synthetic test data solutions.
  • iData: a data quality tool from IDS that incorporates data generation and obfuscation capabilities.
  • Informatica: vendor of a test data solution with synthetic data capabilities.
  • Synth: a tool from OpenQuery for generating realistic data using a declarative data model.
  • TrialTwin: service provider for synthetic health data for patient and clinical trial analysis.
  • Smart noise: an open-source toolkit designed to be a layer between queries and data systems, relying on differential privacy.
  • Twinify: a software package for privacy-preserving generation of a synthetic twin to a given sensitive data set.
  • Synner: an open-source tool to generate real-looking synthetic data by visually specifying the properties of the dataset.
  • Synthea: an open-source, synthetic patient generator that models the medical history of synthetic patients.
  • Synthetig: an open-source platform where you can generate synthetic data.

2. Synthetic data providers for unstructured data

The companies listed below work with unstructured data, offering synthetic data products and services for training vision and recognition algorithms.

  • AI Reverie: provider of synthetic simulated 3D environments.
  • Anyverse: vendor of a synthetic data solution for perception models.
  • Autonoma IA: provider of AI training data.
  • Bifrost: provider of a synthetic data API that generates 3D worlds.
  • CVEDIA: vendor of synthetic computer vision solutions for object recognition.
  • Cognata: provider of simulations for ADAS and autonomous vehicle developers.
  • Coohom Cloud: synthetic data service provider and developer of EUS, a synthetic data platform for training indoor agent cognition.
  • Datagen: 3D simulated training data provider for Visual AI learning and development.
  • Deep Vision Data: provider of synthetic training data for supervised and unsupervised training of machine learning systems.
  • EdgeCase: synthetic data provider for AI & image recognition.
  • Lexset: training data provider for computer vision systems.
  • Mindtech: vendor of a synthetic data platform for training data creation for AI vision systems.
  • Neurolabs: vendor of a synthetic data platform for Computer Vision.
  • Neuromation: vendor of a distributed synthetic data platform for deep learning applications.
  • OneView: vendor of a synthetic geospatial data generation and optimization platform.
  • Parallel Domain: vendor of a synthetic data platform for autonomous system training and testing use cases.
  • Reinvent Systems: provider and vendor of synthetic data solutions for image generation, text data, and 3D objects.
  • Rendered AI: vendor of a common application framework to produce physics-based synthetic datasets for AI training and validation.
  • Scale Synthetic: the synthetic data offering from Scale AI for ML model training.
  • Simerse: seller of annotated, synthetic, labeled training sets for AI learning.
  • Sky Engine: a platform for synthetic data generation to create data streams for deep learning in computer vision.
  • Synthesis AI: vendor of a data generation platform for computer vision.
  • Synthetaic: provider of AI training data.
  • Synthetic Data Pty Ltd: vendor of simulation software that generates synthetic images.
  • Synthetik: AI service provider with computer vision and machine learning expertise, including synthetic data generation.
  • uSearch: a web search engine built entirely from AI-generated data.
  • Vypno: vendor of a software solution to recognize and classify objects in images from drones, cameras, satellites or mobile phones.
  • provider of synthetic image and video for computer vision.
  • Zumo Labs: Training Data as a Service provider of synthetic training data for computer vision.

Market trends



Elise Devaux

Tech enthusiast, digital marketing manager, interested in synthetic data, data privacy, and startups.