[New] List of synthetic data vendors, 2022

Elise Devaux
Oct 6, 2022
  1. This article is the updated version of the “List of synthetic data company 2021”. It lists the companies providing structured or unstructured synthetic data products and services as of October 2022.
  2. For insights into market developments, check this other post, “Everything that happened on the synthetic data market in 2022”.
  3. For a list of open-source synthetic data solutions, head to the second part of “Synthetic data tools: Open source or commercial?”
  4. For an up-to-date directory of synthetic data solutions, check https://syntheticdata.carrd.co/

The first section lists structured synthetic data providers (tabular and test data). The second part presents providers of synthetic data for unstructured data (image, sound, and video).

Structured synthetic data vendors

  • Accelario: a DataOps software vendor that offers rule- or schema-based synthetic test data generation capabilities.
  • Aindo: an Italian company providing a data platform for synthetic data generation.
  • Avo Automation: a test data management software that offers synthetic test data capabilities.
  • BizDatax’s Synthetic Data Generator: a data masking solution for production data with a synthetic data generator by Ekobit.
  • Betterdata: vendor of a privacy-preserving synthetic data solution for AI, data sharing, or product development.
  • Broadcom: vendor of Test Data Management software that comes with synthetic test data generation capabilities.
  • Bulian AI: vendor of a synthetic data API.
  • Chatterbox’s Synthetic Data Generator: an AI software with a synthetic data generator to validate AI models.
  • Clearbox AI: a synthetic data solutions provider for Analytics and AI projects.
  • Curiosity software synthetic test data: a test data automation solution to generate synthetic test data.
  • DataCebo: the spin-off from the MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) that owns the Synthetic Data Vault.
  • DatProf: provider of data masking and test data software solutions with synthetic test data capabilities.
  • Datomize: vendor of a synthetic data solution for the development, training and testing of AI/ML models, and applications.
  • Diveplane: vendor of Geminai, a solution to generate synthetic ‘twin’ datasets with the same statistical properties as the original data.
  • Esito: vendor of the privacy product g9 that offers synthetic test data generation capabilities.
  • ExactData: vendor of Smart Data, a solution to reduce cost and time to develop, test, deploy, train, and maintain data processing systems.
  • Facteus: vendor of Mimic™, a synthetic data engine to synthesize data assets that protect consumer privacy.
  • FinCrime Dynamics: vendor of Synthesizer®, a synthetic data generation tool for the finance industry and fraud detection use cases.
  • Generatrix: an AI-based privacy-preserving data synthesizing platform.
  • GenRocket: vendor of enterprise synthetic test data solutions.
  • Gretel: vendor of a synthetic data generation library and APIs for developers and data practitioners.
  • Hazy: vendor of a synthetic data platform for financial institutions that want to conduct data analysis.
  • iData: a data quality tool from IDS that incorporates data generation and obfuscation capabilities.
  • Informatica: vendor of a test data solution with synthetic data capabilities.
  • Instill AI: vendor of a solution for synthetic data generation leveraging Generative Adversarial Networks and differential privacy.
  • IvySys: vendor of a synthetic data generation tool for synthetic threat transactions.
  • Kerus Cloud: software from Exploristics to generate synthetic datasets for use in various analytics applications in the life sciences sector.
  • Kymera Labs: vendor of Synthetic Data Fabrication Software, a solution that generates new data without relying on the ML/GAN approach.
  • MDClone: vendor of MDClone ADAMS, a self-service data analytics environment enabling healthcare collaboration, research, and innovation.
  • Mirry.ai: vendor of a synthetic data platform for generating synthetic data using GANs, available in Community, Cloud or Enterprise editions.
  • Mostly AI: vendor of Mostly Generate, a synthetic data generator that provides as-good-as-real, yet fully anonymous data.
  • Neutigers: an AI analytics platform vendor offering synthetic data generation capabilities relying on TUTOR Deep Neural Networks.
  • Nuvanitic: vendor of Nuvanitic IntelliHealth™, a solution for the pharma industry specializing in synthetic clinical trial data.
  • Octopize MD: vendor of “The Avatar solution”, a synthetic data software and service for healthcare data.
  • Oscillate.ai: vendor of a synthetic data API solution that is deployed internally.
  • Particle Health: an API platform offering synthetic patient records.
  • Replica Analytics: vendor of Replica Synthesis, a software solution that ingests data and builds synthesis models to generate synthetic datasets.
  • Sarus technologies: vendor of ML software to help data practitioners leverage sensitive data assets for innovation with privacy guarantees.
  • Sogeti: vendor of Artificial Data Amplifier, a solution to generate realistic data based on real data sets.
  • Smart Data Foundry: a vendor using Agent Simulation to generate synthetic financial data for UK financial institutions.
  • Statice: vendor of a software solution that generates privacy-preserving synthetic data that can be used as a drop-in replacement in ML applications, analytics or data sharing.
  • Syndata AB: vendor of a synthetic data generator to generate data sets that match the statistical attributes of real data but are entirely synthetic.
  • Syntegra: vendor of a synthetic data engine, purpose-built for healthcare, to synthesize replicas of medical data.
  • Synthesized: vendor of a DataOps platform enabling data sharing and collaboration across internal groups, remote teams, and external partners.
  • Syntheticus: Swiss vendor of a platform dedicated to generating synthetic data.
  • Synth: a tool from OpenQuery for generating realistic data using a declarative data model.
  • Syntho: vendor of AI software for generating synthetic data.
  • Syntonym: technology company offering a synthetic faces solution.
  • Test Data Manager: a test data management solution from Broadcom with synthetic data capabilities for production data.
  • Tonic: vendor of a synthetic data generator to mimic production data.
  • TrialTwin: service provider for synthetic health data for patient and clinical trial analysis.
  • Ydata: vendor of a synthesizer that mimics the statistical information of real data in new datasets without transforming the original data.
  • Validata: vendor of ConnectIQ, a DataOps platform for financial services and banking with synthetic test data generation capabilities.
  • Veil.ai: vendor of an anonymization engine that can be used for data de-identification.

Unstructured synthetic data vendors

The companies listed below work with unstructured data, offering synthetic data products and services for vision and recognition algorithm training.

  • AI Reverie: provider of synthetic simulated 3D environments.
  • Alethea AI: editor of a decentralized synthetic content network for AI-generated media.
  • Anyverse: vendor of a synthetic data solution for perception models.
  • Autonoma IA: provider of AI training data.
  • Bifrost: provider of a synthetic data API that generates 3D worlds.
  • Bitext: an NLP data software vendor focusing on synthetic language training data.
  • CNAI: a South Korean artificial intelligence (AI) data company that commercializes synthetic data software for synthetic images.
  • CVEDIA: vendor of synthetic computer vision solutions for object recognition.
  • Cognata: provider of simulations for ADAS and Autonomous Vehicle developers.
  • Coohom Cloud: synthetic data service provider and developer of EUS, a synthetic data platform for training indoor agent cognition.
  • CNAI: a South Korean company that offers synthetic data to train corporate AI systems.
  • Datagen: 3D simulated training data provider for Visual AI learning and development.
  • Datagrid: a Japanese software company, editor of Synthetic AI, a synthetic AI platform.
  • Deci: a synthetic data platform for AI model development (no website).
  • Deepsync: a synthetic audio software company providing digital voice/synthetic audio for podcasts and advertisements.
  • Deep Vision Data: a provider of synthetic training data for supervised and unsupervised training of machine learning systems.
  • EdgeCase: a synthetic data provider for AI & image recognition.
  • Indika AI: a Mumbai-based startup working on a platform to generate synthetic data to test and train AI models.
  • Infinite AI: a spin-out vendor from Edge Analytics that created Pixelate API, a synthetic data API and associated marketplace for computer vision teams.
  • Kroop AI: a company offering audio-visual synthetic data generation capabilities.
  • Lexset: training data provider for computer vision systems.
  • Mindtech: vendor of a synthetic data platform for training data creation for AI vision systems.
  • Mirage Vision: vendor of annotated synthetic image data.
  • Neurolabs: vendor of a synthetic data platform for Computer Vision.
  • Neuromation: vendor of a distributed synthetic data platform for deep learning applications.
  • Omniverse™ Replicator: NVIDIA’s 3D synthetic data generation SDK.
  • OneView: vendor of a synthetic geospatial data generation and optimization platform.
  • Parallel Domain: vendor of a synthetic data platform for autonomous system training and testing use cases.
  • Reinvent Systems: vendor of synthetic data solutions for image generation, text data, and 3D objects.
  • Rendered AI: vendor of a common application framework to produce physics-based synthetic datasets for AI training and validation.
  • SBX Robotics: a vendor of synthetic training data for computer vision models providing on-demand synthetic data.
  • Scale Synthetic: the synthetic data offering from Scale AI for ML model training.
  • Simerse: seller of annotated, synthetic, labeled training sets for AI learning.
  • Sky Engine: a platform for synthetic data generation to create data streams for deep learning in computer vision.
  • Synthesis AI: vendor of a data generation platform for computer vision.
  • Synthetaic: provider of AI training data.
  • Synthetic Data Pty Ltd: vendor of a simulation software that generates synthetic images.
  • Synthetik: AI service provider with computer vision and machine learning expertise, including synthetic data generation.
  • Syntric AI: solution to generate hyper-realistic Synthetic Data for biometrics use cases.
  • uSearch: a web search engine built entirely from AI-generated data.
  • vAIsual: a technology company selling algorithms and solutions to generate synthetic licensed stock media.
  • Vypno: vendor of a software solution to recognize and classify objects in images from drones, cameras, satellites or mobile phones.
  • Yet Analytics: develops Profilesynt, a commercial platform for event-based synthetic data modeling and simulation for learning and training systems, expected to be released in 2023.
  • Yuva.ai: provider of synthetic images and videos for computer vision.
  • Zumo Labs: Training Data as a Service provider of synthetic training data for computer vision.

Elise Devaux

Personal blog of a tech enthusiast, digital marketer interested in synthetic data, data privacy, and climate tech. Currently works at cozero.io