Everpure Data Stream preps data pipeline for enterprise AI

The new platform, released at Pure Accelerate 2026, is part of a broader feature set aimed at helping customers prepare data so it can be better used by AI.

Consumer tech has raised the bar for the enterprise with ChatGPT and other GenAI services answering user questions and completing tasks.

Workers and their managers are asking why their organizations can't enjoy similar services, customized for the company's own internal data. But running AI requires a constant flow of correctly formatted data for models to digest. That's a huge ask for the IT department.

Everpure Data Stream connects data storage to GPUs, providing a crucial first step in building out a data pipeline, as demonstrated at Everpure's user conference last week. It is part of the storage giant's evolving AI Data Platform.

Data Stream conquers the challenge of "allowing end users to actually consume from an AI perspective all this messy unstructured data that they have across all their sources in their environment," said Kaycee Lai, Everpure vice president of AI and analytics.

The software sets up a base for running an AI factory, one capable of feeding a wide variety of data to GPUs for retrieval-augmented generation (RAG) and other tasks.

ETL for the AI factory

The bottleneck for running AI in the enterprise has been the state of its data, most of which is unstructured and varies widely in format, Lai said. Data content also varies widely by industry. Formatting the data has traditionally been the work of ETL (extract, transform and load) processes, but the universe of data types is now much broader, and time requirements are much more stringent.

Lai said the job requires a multi-modal approach, where different models are used to interpret different formats. "A PDF or text document in manufacturing is very, very different from one for, say, insurance, so I have to find the right AI model. Otherwise, the output won't make sense," Lai said.
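The idea of matching each document to a suitable model can be sketched roughly as a lookup keyed on format and industry. This is only an illustration, not Everpure's implementation: the `Document` type, the model names and the fallback behavior are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    path: str
    fmt: str       # e.g. "pdf" or "text"
    industry: str  # e.g. "manufacturing" or "insurance"


# Hypothetical model names keyed by (format, industry); not real products.
MODEL_REGISTRY = {
    ("pdf", "manufacturing"): "pdf-manufacturing-model",
    ("pdf", "insurance"): "pdf-insurance-model",
    ("text", "insurance"): "text-insurance-model",
}

# Assumed fallback: a generic model per format when no industry match exists.
FORMAT_DEFAULTS = {
    "pdf": "generic-pdf-model",
    "text": "generic-text-model",
}


def pick_model(doc: Document) -> str:
    """Return the most specific model registered for this document."""
    specific = MODEL_REGISTRY.get((doc.fmt, doc.industry))
    if specific is not None:
        return specific
    if doc.fmt in FORMAT_DEFAULTS:
        return FORMAT_DEFAULTS[doc.fmt]
    raise LookupError(f"no model registered for format {doc.fmt!r}")


if __name__ == "__main__":
    claim = Document("claims/2026-001.pdf", "pdf", "insurance")
    print(pick_model(claim))
```

A real system would likely classify documents automatically rather than rely on pre-labeled fields, but the routing step itself would look similar.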

Data Stream is built on NVIDIA NIM (NVIDIA Inference Microservices). Introduced a year ago, NIM is a set of microservices bundled into containers that provide the supporting infrastructure for running specialized models.

Some help from NVIDIA

NVIDIA offers a set of NIM blueprints to assemble these microservices for specific tasks, such as setting up agents for enterprise use, running RAG pipelines, and deploying an LLM router to pick the best model for the job. It also has reference architectures for individual industries such as retail and security.

Data Stream provides the orchestration engine for NIM packages. It relieves developers of setting up each component individually, and it ensures that data flows correctly and that actions are executed in the correct order.
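What "executed in the correct order" means for an orchestration engine can be shown with a small dependency-ordering sketch. The step names below are hypothetical and do not come from Everpure or NVIDIA; the example simply uses the Python standard library's `graphlib.TopologicalSorter` to derive a valid run order from declared dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline steps. Each key maps to the set of steps
# that must finish before it can start.
STEPS = {
    "extract": set(),
    "chunk": {"extract"},
    "embed": {"chunk"},
    "index": {"embed"},
}


def execution_order(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


def run_pipeline(steps, handlers):
    """Call each step's handler in dependency order and return the order used."""
    order = execution_order(steps)
    for name in order:
        handlers[name]()
    return order


if __name__ == "__main__":
    handlers = {name: (lambda n=name: print(f"running {n}")) for name in STEPS}
    run_pipeline(STEPS, handlers)
```

Declaring dependencies and letting the engine compute the order is what spares a developer from wiring each component by hand.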

NIM relies on NVIDIA GPUs with Tensor Cores for pipeline acceleration. The type of AI workload determines the type of GPU needed. Pricing for Data Stream is calculated from the output tokens created by each job, whether that job is a semantic search or some other task.
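As a rough illustration of output-token pricing, the arithmetic might look like the sketch below. The rate and the per-1,000-token billing unit are assumptions made purely for the example; the company's actual prices and units are not given here.

```python
from decimal import Decimal

# Hypothetical rate and billing unit, chosen only for illustration.
ASSUMED_RATE_PER_1K_OUTPUT_TOKENS = Decimal("0.02")


def job_cost(output_tokens, rate_per_1k=ASSUMED_RATE_PER_1K_OUTPUT_TOKENS):
    """Cost of one job, based only on the output tokens it produced."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return Decimal(output_tokens) / Decimal(1000) * rate_per_1k


def total_cost(jobs, rate_per_1k=ASSUMED_RATE_PER_1K_OUTPUT_TOKENS):
    """Sum the cost of several jobs, given a mapping of job name to output tokens."""
    return sum((job_cost(t, rate_per_1k) for t in jobs.values()), Decimal("0"))


if __name__ == "__main__":
    jobs = {"semantic_search": 2500, "summarization": 500}
    print(total_cost(jobs))
```

Under this model, input volume does not affect the bill directly; only what each job emits does.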

"Everpure Data Stream is designed to provide organizations with a 'fast track' option to get their AI efforts up and running quickly, but also with the confidence that it meets their corporate security and governance requirements," said Simon Robinson, Omdia principal analyst for storage and data infrastructure.

Robinson noted that given the complexities of both the data and needed infrastructure, Everpure "has done much of the architectural heavy-lifting in the background -- such as integration with multiple NVIDIA components -- so customers can focus on getting up and running."

Surprisingly, you don't need an Everpure storage array or cloud service to run Data Stream, Lai said. But there are some good reasons for doing so. Running on the Everpure FlashBlade//SS array, Data Stream can easily parse billions of documents, with the processing being executed quickly on non-volatile RAM (NVRAM).

Start at the storage layer

"Other vendors want you to copy all your data into their system. This means AI is always working on a copy. And a copy is always behind," said Shawn Rosemarin, vice president of customer engineering, in his keynote talk during Everpure's annual Pure Accelerate user conference last week.

Long known for selling high-performance flash storage arrays, Everpure changed its name from Pure Storage in February, and with the new name came a change in strategy. Now it is addressing a broader set of concerns around how to prepare data so it can be used by AI.

In this view, storage becomes a "unified, virtualized cloud of data, governed by an intelligent control plane," asserted a company news post.

To this end, Everpure is building out a unified data platform for AI. In addition to Data Stream, Everpure acquired data management start-up 1Touch in May, quickly revising and rebranding that company's technology as Data Intelligence.

A complement to Data Stream, Data Intelligence preps the data for AI ingestion, executing tasks such as cleansing, curation, and semantic tagging. It also ensures that data access rules are strictly enforced.

"Everpure Data Intelligence is the engine to accelerate the 'understanding' of all enterprise data, so the right data can be fed into the AI data platform in an efficient and trusted manner. This could be transformational for many organizations," Robinson said.
