The widespread popularity of generative AI applications such as ChatGPT and Midjourney is partly driven by their broad knowledge bases. For businesses, however, the next step in generative AI adoption looks to be smaller, more flexible systems specialized for enterprise IT architectures and use cases.
Organizations' rush to adopt generative models was evident earlier this week at MIT Technology Review's conference, EmTech Digital 2023. But the success of those enterprise deployments requires understanding how to integrate generative AI into existing IT architectures in a secure, customized way.
Access to infrastructure limits broader adoption of AI
As interest in generative AI skyrockets, supporting model training and operations has become challenging at the hardware and infrastructure level.
Massive models with billions of parameters, such as GPT-4, require a highly optimized underlying infrastructure that can be costly and difficult to build. "The really expensive and hard things to build with new systems [are] these memory systems that are both big and can feed data at high speed," said Danner Stodolsky, senior vice president of cloud at enterprise AI platform vendor SambaNova Systems.
For many organizations, the challenge is balancing security and compliance needs with the compute necessary to run generative AI at scale. Stodolsky said SambaNova's current customers are primarily interested in hosting systems in their own data centers, typically for privacy and security reasons, but the company has recently seen growing interest in cloud deployments.
"Right now, getting the hardware is surprisingly hard," said Dror Weiss, co-founder and CEO at AI coding assistant vendor Tabnine. "Everybody wants the GPUs. They want hundreds. ... For us, it's easiest and fastest to roll everything on our cloud and have everybody use it, but that doesn't meet the security requirements of all of our customers that we work with."
Private cloud deployments could be an option for companies with strict compliance requirements that prohibit sending data to external parties. For example, Weiss said users can run Tabnine's AI prediction server in their virtual private clouds and inside the corporate network, or even on their own hardware if they have the right GPUs.
For certain use cases, lower latency could also be a strong argument in favor of local deployments. In the session "Off-the-Shelf AI," Prabhdeep Singh, vice president of software products and engineering at SambaNova, gave the example of analyzing data from 4K cameras for defect detection on an assembly line.
"The amount of data that comes in and the inference that needs to happen, almost in real time, puts a lot of strain on these systems," Singh said in his presentation. "You just don't have enough bandwidth to send the data into the cloud, and that's why you need on-prem systems to be able to do this at scale and speed."
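A quick back-of-the-envelope calculation shows why cloud bandwidth becomes the bottleneck Singh describes. The figures below assume uncompressed 24-bit RGB frames at 30 fps -- illustrative assumptions, since real deployments vary in frame rate and compression:

```python
# Back-of-the-envelope bandwidth for one uncompressed 4K camera stream.
# Assumes 24-bit RGB at 30 frames per second (illustrative assumptions).

WIDTH, HEIGHT = 3840, 2160      # 4K UHD resolution
BYTES_PER_PIXEL = 3             # 24-bit RGB
FPS = 30

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
bytes_per_second = bytes_per_frame * FPS
gbits_per_second = bytes_per_second * 8 / 1e9

print(f"{bytes_per_second / 1e6:.0f} MB/s, about {gbits_per_second:.1f} Gbit/s per camera")
```

At roughly 6 Gbit/s per camera before compression, even a handful of assembly-line cameras can saturate a typical uplink, which is why near-real-time inference tends to stay on premises.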
Evaluating open source vs. proprietary generative AI
Due in part to these infrastructure challenges, an increasing number of developers and researchers support open sourcing AI models.
In line with the classic argument for open source software, proponents of open AI hope that crowdsourced knowledge and input will lead to better models. In addition, large generative AI models are expensive to train and refine, putting them out of reach for many smaller organizations. An open source approach could let users without the resources to train customized models themselves take advantage of generative AI's capabilities.
With open source, "you don't have to manage any of the infrastructure for these complex generative models," said Bill Marino, principal product manager at Stability AI, an open source generative AI company, in the session "One Future for AI." Instead, users could tap into baseline models via an API endpoint and fine-tune a customized model on their own data.
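In practice, that workflow usually means packaging an organization's own examples as prompt/completion pairs and submitting them against a shared baseline model. The sketch below is purely illustrative -- the field names and model identifier are assumptions, not any vendor's actual API:

```python
import json

# Hypothetical sketch: preparing company data for fine-tuning a hosted
# baseline model. The payload fields and model name are illustrative
# assumptions, not a real vendor API.

def build_finetune_payload(records, base_model="baseline-model-v1"):
    """Convert (prompt, completion) pairs into a fine-tuning request body."""
    training_data = [
        {"prompt": prompt, "completion": completion}
        for prompt, completion in records
    ]
    return {
        "base_model": base_model,        # the shared baseline model
        "training_data": training_data,  # the organization's own examples
    }

# Domain-specific examples drawn from an organization's own data
records = [
    ("Summarize ticket #4821", "Customer reports login timeout on SSO."),
    ("Summarize ticket #4822", "Invoice export fails for EU accounts."),
]

payload = build_finetune_payload(records)
print(json.dumps(payload, indent=2))
```

The point of the pattern is the division of labor: the provider maintains the baseline model and its infrastructure, while the customer supplies only the training examples.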
Having a diverse range of open models could democratize access to generative AI and, in doing so, reduce bias in AI systems. "Having a diversity of opinions on what things to prioritize and what values should be at play is really critical to figuring out which way we should be headed," said Margaret Mitchell, chief ethics scientist at Hugging Face, in the same session.
Providing open access to AI systems' code and training data can clarify the reasons for -- and, ideally, prevent -- discriminatory or otherwise harmful model output. "Doing this out in the open gives everyone more insight into why the models are making the decisions that they're making," Marino said in an interview.
But despite the advantages of open source models, proprietary models might be necessary for enterprises. Because generative AI is so new to many businesses, the ability to experiment with minimal risk is important for successful adoption.
"Let's be honest here: None of us actually know how these things are going to be useful in our individual companies," Singh said in his presentation. "So, you need a safe environment and a sandbox first to play with these things, to see where can these things actually be useful."
Using consumer-oriented tools to conduct that kind of exploration entails security and privacy risks. In interactions with ChatGPT, for example, inputs can become part of the model's training data. This is an unacceptable risk for enterprise users, who have already seen the fallout from using proprietary information in prompts.
"We see all these problems with data leakage and provenance," Stodolsky said. "There's a lot of utility in controlling the risk and provenance and understandability of your system." Making a private coding assistant using code that the enterprise wrote, for example, gives users much more confidence that model output won't violate copyright, he said.
Smaller models fine-tuned to enterprise use cases
In addition to addressing security and compliance concerns, fine-tuned models for individual companies could also address the problem of inadequate infrastructure to support generative AI deployments.
Rather than use huge models with broad knowledge bases such as GPT-4, organizations are increasingly beginning to consider smaller, lighter systems trained on and specialized to specific domains. "We think the big opportunity in front of everyone is in smaller models that are fine-tuned, really laser-focused on particular use cases," Marino said in his presentation.
More generalized models have their advantages, such as doing initial exploratory work in a new domain. But in many instances, they aren't the right fit for enterprise use cases, such as answering detailed questions in a contact center or generating content for marketing campaigns.
In part, this is due to businesses' more stringent requirements for accuracy in model output. Whereas an individual using ChatGPT for personal purposes might find inaccurate responses irritating, an enterprise model that fails on tasks such as customer service or quality control could cause real financial and reputational repercussions.
By building narrowly tailored models that are easier to fine-tune and evaluate, "we can give people and enterprises things that are more reliable [and] understandable and have lower risk," Stodolsky said.
With tools such as ChatGPT, factual or logical errors might be more difficult for users to recognize because generative AI "produces wrong answers in a different way than we're used to," he said. But if a smaller system with a clearly defined purpose starts to give responses outside its intended domain, it's much easier for users to see that the system is "hallucinating."
In addition, narrower models are more agile and easier to fit into existing enterprise infrastructure, which makes them easier to specialize to organizations' data -- a key factor for enterprise adoption. "For bigger customers, this degree of control, of extensibility, of flexibility is important," Weiss said.
Thus, a promising option for enterprises is to customize a baseline model to fit their needs and workflows. SambaNova, for example, gives customers access to open source models, which companies can then train on their own data to create a model more closely aligned to their use case.
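The first step in that kind of customization is typically curating the organization's own data into training and evaluation sets before fine-tuning begins. The sketch below is a generic illustration of that preparation step -- the split ratio, config keys, and model name are assumptions, not SambaNova's actual workflow:

```python
import random

# Hypothetical sketch: splitting an organization's documents into train
# and evaluation sets ahead of fine-tuning an open source baseline model.
# Split ratio and config keys are illustrative choices.

def train_eval_split(documents, eval_fraction=0.1, seed=42):
    """Shuffle domain documents and hold out a slice for evaluation."""
    docs = list(documents)
    random.Random(seed).shuffle(docs)
    n_eval = max(1, int(len(docs) * eval_fraction))
    return docs[n_eval:], docs[:n_eval]

corpus = [f"internal doc {i}" for i in range(50)]
train_docs, eval_docs = train_eval_split(corpus)

finetune_config = {
    "base_model": "open-source-baseline",  # assumed identifier
    "epochs": 3,
    "learning_rate": 2e-5,
    "train_size": len(train_docs),
    "eval_size": len(eval_docs),
}
print(finetune_config)
```

Holding out an evaluation slice matters here because, as the speakers note, enterprises need to verify that the specialized model stays within its intended domain before deploying it.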
Similarly, Weiss said, Tabnine is working on functionalities to let customers connect their codebase to Tabnine to build a more customized version of its AI coding assistant for enterprise users. "They ingest the private code to Tabnine, and we provide this way more specialized code suggestion," he said. "For us, the private code front is particularly exciting."
Moving forward, adopting enterprise generative AI safely and effectively will require looking closely at each business's use cases and risks. "Everyone's excited about these generative models now, and there's no doubt about that," Marino said. "But what I think lies ahead now for a lot of enterprises and other integrators of these models is this process of undergoing a deep evaluation."