Date: 2024-05-07

DENVER -- Red Hat's OpenShift AI has its sights set on hybrid cloud infrastructure management for generative AI apps with a series of updates this week.

OpenShift AI, launched at Red Hat Summit last year, replaced the cloud-only OpenShift Data Science platform for MLOps with a hybrid cloud version that supports edge and on-premises infrastructure as well as distributed application management. It will also act as a base layer of infrastructure automation for the newly released RHEL AI product.

With OpenShift AI version 2.9 this week, Red Hat strengthened the connections between two sets of components: generative AI frameworks and large language model (LLM) serving tools on one side, and the Kubernetes substrate and application development tools that make up OpenShift AI on the other. Now, customers can manage predictive AI and generative AI together, according to Steven Huels, vice president and general manager of Red Hat's AI Business Unit.

"OpenShift Data Science targeted predictive AI workloads and was a cloud service. … We heard from a lot of our customers that they needed the ability to deploy on-prem and in disconnected environments and have more control over the systems themselves when we launched OpenShift AI," Huels said during a press briefing May 1. "That's when we also got into the generative AI space. … What we put into OpenShift Data Science and the core [OpenShift AI] platform has been able to roll straight forward into the generative AI [version]."

OpenShift AI 2.9 updates include the following:

Support for multiple model servers to run generative AI and predictive AI/ML apps together in the same OpenShift AI infrastructure. This feature is built on updates to the KServe open source custom resource definitions (CRD) package of Knative, Istio and Kubernetes that supports container-based AI workloads. New integrations for KServe include support for vLLM, an open source library for LLM serving; Text Generation Inference Server, an IBM-led fork of the Hugging Face TGI toolkit for LLM inferencing; and Caikit, a set of Python AI application development tools.

Distributed AI app orchestration via the integration of KubeRay, a Kubernetes operator for the open source Ray Python AI distributed application framework and CodeFlare, an OpenShift-specific software stack that handles queueing, resource quotas, management of batch jobs and on-demand cluster resource scaling for distributed applications.

Expanded support for AI model development and visualization tools in technical preview, such as VS Code, RStudio and Nvidia's Compute Unified Device Architecture. Red Hat also plans to integrate OpenShift AI with Nvidia's NeMo Inference Microservices through future updates to KServe.

Single-node OpenShift support for edge deployments of OpenShift AI, in technical preview.

An accelerator profiles feature in the OpenShift AI management interface for configuring hardware accelerators such as Intel's Gaudi 3 and AMD's Instinct. Red Hat will also integrate with Intel's Arc and AMD GPUs.

New partner tools integrations such as a certified OpenShift Operator for Run.ai's Kubernetes GPU scheduling utility; support for Elastic Inc.'s Elasticsearch Relevance Engine vector search and transformer models for retrieval augmented generation; and support for models from Stability AI.
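To make the multiple-model-server item more concrete, the following is a minimal sketch of how a predictive model and an LLM might be described side by side as KServe InferenceService resources. It is an illustration under stated assumptions, not Red Hat's configuration: the `inference_service` helper, the namespace, the runtime names, the `vLLM` model format label and the storage locations are all hypothetical placeholders. Only the `serving.kserve.io/v1beta1` resource shape comes from KServe itself.

```python
import json


def inference_service(name, namespace, model_format, runtime, storage_uri):
    """Build a KServe InferenceService manifest as a plain dict."""
    return {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "predictor": {
                "model": {
                    # Must match a format declared by the chosen serving runtime.
                    "modelFormat": {"name": model_format},
                    "runtime": runtime,
                    "storageUri": storage_uri,
                }
            }
        },
    }


# Hypothetical values throughout: the namespace, runtime names, format labels
# and storage URIs are placeholders, not taken from Red Hat documentation.
services = [
    # A predictive model served by one runtime...
    inference_service("churn-classifier", "demo-project", "sklearn",
                      "example-sklearn-runtime", "s3://example-bucket/churn/"),
    # ...and a generative model served by a different runtime in the same project.
    inference_service("chat-llm", "demo-project", "vLLM",
                      "example-vllm-runtime", "s3://example-bucket/llm/"),
]

if __name__ == "__main__":
    for svc in services:
        print(json.dumps(svc, indent=2))
```

Each printed manifest could, in principle, be applied to a cluster where matching serving runtimes exist. The point of the sketch is only that both workload types share one resource type and one project while naming different runtimes.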