Date: 2024-05-07

DENVER -- Red Hat execs proclaimed that open source AI is too difficult for most companies to contribute to and incorporate into their specific applications. RHEL AI, a new open source AI platform, is a bid to change that.

Red Hat Enterprise Linux AI includes Red Hat's RHEL operating system packaged as a bootable container image using the bootc Linux utility, which makes it portable across infrastructures. RHEL AI also folds in a newly open-sourced series of IBM Granite large language models (LLMs), a subset of the LLMs that underpin IBM's Watsonx and Red Hat's Ansible Lightspeed coding and chat assistants, along with InstructLab AI alignment tools. InstructLab, also released to open source under an Apache license by IBM Research and Red Hat this week, lets users fine-tune pre-trained Granite models using a knowledge and skills taxonomy that generates a synthetic data set.

"You can now teach a foundation model a new skill … with five examples that before might have taken 5,000," said Red Hat President and CEO Matt Hicks, during a keynote presentation to kick off Red Hat Summit 2024 this week. "With the ability to teach smaller models the skills relevant to your use case, everything gets better -- training costs are lower; inference costs are lower; deployment options expand."

These updates represent a shift in stance from a year ago, when Hicks said at Red Hat Summit 2023 that Red Hat did not plan to get into AI models -- a shift acknowledged at a press session here this week.

"The context was quite different," Red Hat CTO Chris Wright said of last year's remarks. "The industry was really rallied around proprietary models and we don't deliver proprietary solutions like that. The transition over the last year has been more openness in that model space."

Red Hat still isn't developing its own open source AI models; instead, it is supporting IBM Research projects along with other open source AI development and deployment tools, Wright said.

RHEL AI takes on open source AI issues

RHEL AI aims to address some of the common problems with open source AI as it has emerged alongside proprietary LLMs such as OpenAI's GPT. While opportunities for collaboration are richer with open source models, the data sets used to train them are often not available along with the model's source code. With this week's update, IBM Research is also releasing Granite Code Instruct models and disclosing the data sets used to train them, which include metadata from IBM's CodeNet, according to a company blog post.

Training custom open source AI on internal infrastructure also typically requires massive resources that most mainstream companies can't afford or manage. Cloud providers have filled in this gap so far with hosted LLM services, although early adopters have had to exercise caution to avoid cost overruns. RHEL AI, by contrast, targets large organizations that have AI workloads at the edge, on premises and in multiple clouds with portable RHEL container images and OpenShift hybrid cloud automation tools, while InstructLab open source is meant to make fine-tuning AI models more accessible to the masses by requiring fewer data inputs than other hosted LLMs.

"InstructLab being open source is unique," said Rob Strechay, lead analyst at enterprise tech media company TheCube, in an interview with TechTarget Editorial this week. "It's hard to get simulated data to train on -- InstructLab can connect the dots between data science tooling and data. Organizations want that easy button."

Red Hat officials said they believe that open source AI can also overcome some common problems with LLMs in general, such as meeting standards of objectivity and appropriateness of results through community collaboration. As with proprietary LLMs and associated services from Microsoft and GitHub, Red Hat will keep users' data sets private and indemnify users of open source Granite LLMs.

RHEL AI is in developer preview and it remains unclear at this early stage how far IBM and Red Hat plan to take indemnification. Asked whether Red Hat will indemnify users of Granite open source AI tools against prompt injection attacks, company officials said they'd follow IBM's Watsonx indemnification policies; these are described as protections against copyright and IP infringement in a Sept. 2023 IBM press release.

This could be problematic with AI training tools out in the open, said one industry analyst during this week's press session.

"From the stage today, there was this talk about aggregating a whole bunch of information from patients, from customers, in order to train models," said Bret Ellis, an analyst at Forrester Research, during the session. "So what happens when you have this aggregation point as a target for a cyber gang, and they can find ways to push the model to disclose information -- how are you putting guard rails around that specifically?"

The industry as a whole, whether proprietary or open source, is still figuring out the answers to that question, Wright said.

"The way we've done work in Linux with a defense-in-depth model, providing mandatory access controls all the way down the operating system with SELinux or ACS, integrating StackRox directly into Kubernetes … will play out here," he said. "It's just that the tools are less well understood at this point."