Latest from Vast Data aims to simplify, speed AI development

SyncEngine has the potential to be a differentiator for the vendor, combining capabilities usually performed by specialized tools to provide pipelines with relevant information.

Vast Data on Thursday launched SyncEngine, a new feature that combines cataloging, migrating and preparing data to make it faster and easier to feed AI pipelines with relevant data.

The tool is now part of the Vast Data AI Operating System (OS) at no additional cost to customers.

While model complexity and lack of GPUs are reasons many AI projects never make it into production, the readiness of data is another. Data is often distributed across numerous systems, isolated in SaaS applications, and neither cleansed nor validated for use.

Vast Data aims to ensure that an enterprise's data is prepared for AI with SyncEngine. Given that the new feature combines data integration and preparation for AI pipelines, it is valuable for Vast Data customers, according to Michael Ni, an analyst at Constellation Research.

"SyncEngine tackles one of the most overlooked barriers in enterprise AI, which is fragmented, inaccessible data," he said. "By collapsing cataloging, migration and pipeline prep into the Vast AI OS, it gives enterprises one vendor -- versus multiple -- to deliver scattered files and SaaS apps into AI-ready intelligence."

In addition, because SyncEngine combines capabilities that normally require more than one tool, it is differentiated from traditional data catalog offerings, Ni continued.

"SyncEngine looks like a data catalog, but it's more than that," he said. "Most catalogs stop at metadata. Vast folds in high-speed migration and vectorization, so you don't just discover data, you make it AI-ready in one step. That tight integration with the Vast AI OS sets it apart from standalone catalogs."

New capabilities

Despite enterprises continuing to increase their investments in AI development, AI projects still fail more than 80% of the time, according to some estimates. While Vast Data's new feature doesn't eradicate hindrances such as bias and lack of infrastructure resources, it does address problems like poor data quality and lack of relevant data.

Motivation for developing SyncEngine came from observing the trouble customers were having discovering and feeding distributed data into AI pipelines, according to Aaron Chaisson, Vast Data's vice president of product and solutions marketing.

"The impetus was a direct response to the 'last mile' problem hindering customer AI strategies," he said. "While the Vast AI OS provides automated AI data pipeline services, customers were challenged with finding and then moving all of their distributed data sources into the Vast pipeline."

Like Ni, Stephen Catanzano, an analyst at Enterprise Strategy Group, now part of Omdia, called SyncEngine a compelling addition to Vast Data's AI OS because it directly addresses one of the main barriers to successful AI development.

"Vast's SyncEngine is a significant addition for customers that want to eliminate the 'last mile' problem of data fragmentation that has become one of the biggest constraints in AI deployment," he said. "Allowing organizations to unify scattered data … without additional costs or complex third-party tools is critical."

SyncEngine includes the following features:

  • Data migration capabilities built for the massive files and data sets that AI models and applications require to reduce the occurrence of AI hallucinations.
  • Metadata indexing, enabling enterprises to catalog hundreds of trillions of files.
  • Throughput levels that are limited only by the source and target systems.
  • Parallel processing for input/output workloads to reduce bottlenecks and improve performance.

Combined, the features enable users to build a catalog that connects data from disparate systems such as object storage and enterprise applications, migrate and synchronize data at scale, and ultimately speed the process of feeding AI pipelines.
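As a rough illustration of the general pattern described here (index file metadata into a catalog, then copy the cataloged files in parallel), the following is a minimal Python sketch. It is not Vast Data's implementation or API: the function names `build_catalog` and `migrate`, the catalog record format and the use of a local thread pool are all assumptions made for illustration, and it works on ordinary directories rather than the object stores and SaaS sources SyncEngine targets.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def build_catalog(root):
    """Walk `root` and record basic metadata for each file.

    The record layout (path, size, mtime) is a hypothetical catalog
    format chosen for this sketch.
    """
    catalog = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            catalog.append({
                "path": os.path.relpath(full, root),
                "size": info.st_size,
                "mtime": info.st_mtime,
            })
    return catalog


def copy_one(entry, src_root, dst_root):
    """Copy a single cataloged file, preserving its relative path."""
    src = os.path.join(src_root, entry["path"])
    dst = os.path.join(dst_root, entry["path"])
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copy2(src, dst)  # copy2 also copies timestamps where possible
    return entry["path"]


def migrate(catalog, src_root, dst_root, workers=8):
    """Copy every cataloged file using a pool of worker threads.

    File copies are I/O-bound, so threads can overlap waits on the
    source and target systems.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda e: copy_one(e, src_root, dst_root), catalog))
```

In a sketch like this, the catalog doubles as the work list for the migration step, which is one way to read the article's point about combining discovery and movement of data in a single tool.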

That combination, rather than any one or two individual capabilities, is what is most significant about SyncEngine, according to Ni.

"Enterprises continue to struggle to find, contextualize and feed their petabytes of siloed data into [AI] workflows," he said. "Vast folds those steps into the OS, eliminating the hand-coded, often brittle, tool chains and making [data] searchable and AI-ready."

By folding the different steps into its AI OS, a choice that yields high performance but could also raise fears of vendor lock-in, Vast Data is taking a different approach to feeding AI pipelines than many other data management vendors, Ni continued.

For example, Snowflake and Databricks lay governance and intelligence on top of data while separating compute from storage. Collibra and Informatica, meanwhile, excel at metadata management but don't specialize in data migration and preparation capabilities.

"Vast is deliberately separating itself from the pack by collapsing categories … into one AI operating system," Ni said. "This consolidation delivers high performance and low-latency pipelines for real-time and agentic AI, but it also challenges buyers to embrace a single vendor spanning so much ground."

Catanzano likewise suggested that Vast Data is taking a unique approach to data management and AI pipeline development.

"Vast differentiates by offering a unified platform that combines storage, database and compute capabilities specifically optimized for AI workloads, rather than retrofitting traditional storage systems for modern AI requirements," he said.

Meanwhile, regarding SyncEngine, Catanzano highlighted the significance of metadata indexing.

"Metadata indexing stands out as particularly valuable because it enables cataloging and searching hundreds of trillions of files within the Vast DataBase," he said.

Looking ahead

Over the remainder of 2025, Vast Data plans to continue building an OS for AI that includes InsightEngine and AgentEngine in addition to SyncEngine, according to Chaisson.

Specific initiatives include adding a Model Context Protocol tool set for AgentEngine, he said.

Other initiatives Vast Data could take to improve its AI OS include developing industry-specific tools with prebuilt models, as well as adding more partnerships to create a broader ecosystem, according to Catanzano.

"Developing deeper integrations with popular AI development frameworks and cloud services [could] create a more seamless experience across hybrid environments," he said.

Ni, meanwhile, suggested that to remain competitive with other data platform vendors, Vast Data needs to expand beyond focusing on fast AI pipelines to provide semantic modeling and governance capabilities.

"Enterprises don't just want data in motion," he said. "They want data whose meaning is understood, trusted and governed to be able to scale to drive real business decisions."

Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.