What do large language models do in AI?
To capitalize on generative AI, business IT leaders must understand the features of large language models.
Advancements in AI and machine learning technology show how far the world of tech has come since the days of basic chatbots on company websites. A deep dive into large language models exemplifies this changing industry.
The big buzz this year is about ChatGPT, a chatbot from AI research company OpenAI, built on its GPT family of large language models (LLMs). ChatGPT's underlying models are far from the only LLMs, however. A large but lesser-known open source initiative is BLOOM from the BigScience project, a consortium of roughly 1,000 volunteer AI researchers. Other LLMs include Google's Bard and LaMDA and Nvidia's NeMo.
In sum, 2023 could be the year of the LLM, though the term itself isn't widely known yet. As the concept gains traction, potential users should understand what constitutes an LLM, its key features and how to use it effectively.
What is an LLM?
An LLM performs natural language processing (NLP) tasks. It might answer conversational questions, translate from one language to another, or generate or classify textual (as opposed to visual or mathematical) data. Google Translate is an application of NLP, for instance, as are the tools that predict your next word when texting on a smartphone.
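The next-word prediction mentioned above can be caricatured with a toy bigram model, which simply counts which word most often follows which. This is a simplified sketch for illustration only; real LLMs use neural networks rather than word counts, and the training sentences below are made up for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which across the training sentences."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = [
    "see you later today",
    "see you soon",
    "see you later tonight",
]
model = train_bigrams(corpus)
print(predict_next(model, "you"))  # "later" follows "you" twice, "soon" once
```

A real language model plays the same guessing game, but over billions of learned numeric weights instead of a frequency table.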
The large descriptor refers to the number of parameters the model adjusts as it learns. Parameters are the internal values that define the model's skill. In general, more parameters make for a more capable model, though size alone doesn't guarantee quality. For perspective, consider OpenAI's Generative Pre-trained Transformer (GPT) releases: GPT-1 had roughly 117 million parameters, GPT-2 had 1.5 billion, and both GPT-3 and GPT-3.5 had 175 billion. OpenAI has not disclosed the parameter count for GPT-4.
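To make "parameters" concrete: a single fully connected neural network layer has one weight per input-output pair plus one bias per output. The layer sizes below are arbitrary illustrations, not the dimensions of any real GPT model, but they show how counts climb into the millions from just two layers.

```python
def dense_layer_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

# A toy two-layer feed-forward block, sized arbitrarily for illustration.
hidden = dense_layer_params(512, 2048)   # 512*2048 + 2048 = 1,050,624
output = dense_layer_params(2048, 512)   # 2048*512 + 512  = 1,049,088
total = hidden + output
print(f"{total:,} parameters")  # 2,099,712 -- millions from two small layers
```

A model with 175 billion parameters stacks many such layers, each far wider than this toy example.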
Like many other forms of AI, an LLM requires training. The LLM creates text, reviews it (often with human oversight) and accepts corrections. This process repeats until the output reaches an acceptable level of semantic and factual quality. Once trained, an LLM can perform a variety of tasks, including generating text, classifying text, answering questions (as ChatGPT does), responding to email or social media posts and translating from one language to another.
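The generate-review-correct loop can be sketched with a lookup table standing in for the model. Real training adjusts billions of numeric parameters via gradient descent rather than overwriting dictionary entries, so this is purely an illustration of the feedback cycle, with made-up prompts and answers.

```python
# Toy "model": a lookup table standing in for a neural network.
model = {"capital of france": "Lyon", "largest planet": "Jupiter"}

# Reviewer-approved answers, standing in for human oversight.
approved = {"capital of france": "Paris", "largest planet": "Jupiter"}

def training_round(model, approved):
    """One generate-review-correct pass; returns the number of corrections."""
    corrections = 0
    for prompt, expected in approved.items():
        if model.get(prompt) != expected:
            model[prompt] = expected  # accept the reviewer's correction
            corrections += 1
    return corrections

rounds = 0
while training_round(model, approved):  # repeat until no corrections remain
    rounds += 1
print(model["capital of france"], rounds)  # Paris 1
```

The key idea carried over from real training is the loop itself: generate, compare against feedback, correct, and repeat until the output passes review.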
Key features of LLMs
There are a few key shared attributes of LLMs regardless of vendor or level of maturity. Those include the following:
- LLMs create text or outputs across a range of functions. LLMs can generate code, including scripting and automation for operating infrastructure. They create text, such as for documenting code or processes, and translate languages.
- LLMs are not reliably accurate. Much has been made of the fact that LLMs like ChatGPT have passed bar and medical licensure exams. Less has been made of the fact that ChatGPT is often confidently and wildly incorrect, a failure mode commonly called hallucination.
- LLMs can devolve into unacceptable dialogue. Without safeguards, LLM output can slide quickly into misogyny, racism and other forms of hate speech.
- LLMs are only as good as their training and data sources. At present, LLMs are typically trained by developers who then release the trained models to users.
- LLMs require mature and sophisticated governance. Deploying an LLM within your organization demands a focus on digital ethics: the policies and processes for managing and governing AI.
- LLMs are maturing rapidly. Developers are upgrading their LLMs multiple times per year. Significant upgrades mean major expansions in both parameter count and capabilities.
Recommendations regarding LLMs for businesses
Regardless of industry or sector, there are a handful of steps businesses should take to optimize their rollout of LLMs, if and when it occurs. First, compile a list of potential use cases, both in and outside of IT. In IT-specific use cases, businesses might do the following:
- Generate code or scripts.
- Generate documentation for code or other IT functions.
- Serve as a chatbot for user help desks.
- Create messages for critical processes.
- Create presentations to inform management and users of IT initiatives.
Second, document risks and create risk mitigation strategies. Each use case for LLMs should be categorized by risk. For example, a mistake in code documentation may be low-risk. A chatbot communicating in an unacceptable fashion may be medium-risk. Using an LLM-generated script to automate crucial processes -- particularly those affecting machinery or people -- would be high-risk.
Based on these categorizations, businesses should deploy LLMs for low-risk functions such as code documentation but hold off on medium- and high-risk functions until quality-control issues are resolved. Additionally, any LLM deployment should include risk mitigation strategies. For example, the company needs a strategy to catch and correct errors in LLM-generated documentation.
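The risk triage described above can be encoded as a simple gating check. The use cases and risk tiers below mirror the examples in the text, and the default threshold (deploy only low-risk functions) reflects the recommendation here, not a universal rule.

```python
RISK = {"low": 0, "medium": 1, "high": 2}

# Example use cases categorized as in the text.
use_cases = {
    "code documentation": "low",
    "help desk chatbot": "medium",
    "automating crucial processes": "high",
}

def deployable(use_cases, max_risk="low"):
    """Return the use cases at or below the allowed risk tier."""
    limit = RISK[max_risk]
    return [name for name, tier in use_cases.items() if RISK[tier] <= limit]

print(deployable(use_cases))  # only the low-risk use case passes the gate
```

As quality control matures, raising `max_risk` to "medium" widens the gate deliberately rather than by accident, which is the point of categorizing use cases up front.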
Finally, a business should consider launching a digital ethics team comprising technologists as well as legal, HR, risk management and regulatory specialists. The team should immediately wargame potential scenarios to determine how the organization would respond. Such scenarios might include an LLM exposing a customer to unacceptable speech or generating a flawed automation process. Planning for these scenarios should be one output of the digital ethics team; another is a digital ethics policy defining what the company can and cannot do with LLMs.
The bottom line is that an LLM is both a boon and a risk for enterprises. It's time to take both aspects seriously when planning to implement LLMs while simultaneously developing risk-mitigation strategies.