Optimize AI models to generate more bang for your buck

AI models must be fully optimized to align with business goals, provide valuable insights and produce positive ROI. Check out these proven, common-sense model optimization methods.

Business decision-makers responsible for the success of their company's AI initiatives are leaving no stone unturned in their search for a positive ROI. For many, that search should start and end with their AI models and the overlooked business value they can produce.

AI models influence numerous business workflows, including customer interactions, payment processing, and warehouse and supply chain automation. They drive the generative AI (GenAI) technologies that help businesses manage, search and summarize content and power agentic AI applications and services.

AI models optimized to achieve their full potential can result in faster, more accurate and more cost-effective processes. If these models aren't fully optimized to address a company's unique business needs, they can hallucinate, generate inaccurate information and struggle to make context-aware decisions. As a result, business operations can be disrupted and the company can be exposed to security or compliance risks. To optimize their AI technologies and achieve a positive ROI, businesses apply various optimization techniques depending on the application.

Fully optimized AI models improve customer and employee interactions

Some chatbots use "generic" third-party AI models that can search a product database and make recommendations in response to customer inquiries. But it's unlikely that an out-of-the-box chatbot would have access to a customer's purchasing history and other data to generate more effective recommendations. Using retrieval-augmented generation (RAG), for example, could give the model access to additional customer data.

Employees could benefit from AI models trained to guide them through a company's internal business processes. Such a model would train on the organization's process documentation to understand processes across all departments, from finance to R&D. But consider a situation in which an employee asks a question specific to HR. To provide the best answer, the chatbot might spend excessive time and consume more compute resources drawing on everything it learned about the company instead of focusing on what directly pertains to HR. In this case, an optimization technique called model pruning removes the parts of the model that contribute little to the target task, resulting in a trimmer model that's smaller, faster and requires fewer compute resources to operate.

Graphic showing the business benefits of generative AI.
Many of GenAI's business benefits originate from fully optimized AI models.

Methods to optimize AI models

Whether a company uses self-built models, third-party models or both, there are optimization techniques that can align AI with business needs and improve the model's effectiveness.

1. Retrieval-augmented generation

RAG is a technique that provides trained models with access to additional data that they didn't have during training. This additional data can help improve model accuracy, especially for use cases that require more contextual awareness than a model's training data provides.
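
A minimal sketch of the retrieval step in RAG, assuming a simple keyword-overlap score in place of the embedding search that production systems typically use. The `DOCUMENTS` list, the customer records and the function names are hypothetical and chosen only for illustration. The resulting prompt would then be passed to whichever model the business uses.

```python
import re

# Hypothetical records the model never saw during training.
DOCUMENTS = [
    "Customer 1042 bought a trail running shoe in March.",
    "Customer 1042 returned a rain jacket in April.",
    "Customer 2077 bought a yoga mat in January.",
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share the most words with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap at all.
    scored = [pair for pair in scored if pair[0] > 0]
    # Stable sort: documents with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents):
    """Prepend the retrieved records to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


prompt = build_prompt("What should customer 1042 buy next?", DOCUMENTS)
print(prompt)
```

The model itself is unchanged in this approach; only the prompt grows to include the retrieved records.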

2. Compression

Model compression reduces the size of a trained model by reducing the number of parameters it stores or the numerical precision used to store them. Pruning is one method of compression: it removes parameters that contribute little to the model's output, which reduces operational costs and, for a relatively narrow set of tasks, can do so with little loss of accuracy. Quantization is another compression method that stores parameters at lower numerical precision. It speeds up processing and reduces costs, but it can also reduce accuracy.
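
As a rough illustration of the two compression methods, the sketch below applies magnitude pruning and 8-bit quantization to a short list of numbers standing in for a model's weights. The weight values and function names are hypothetical, and real frameworks apply these steps to whole tensors with more sophisticated schemes.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute values."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


# Hypothetical weights for illustration only.
weights = [0.82, -0.03, 0.41, 0.002, -0.67, 0.09]

pruned = prune_by_magnitude(weights, 0.5)
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The gap between original and restored values is the precision lost.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(pruned, quantized, max_error)
```

The rounding error per weight is bounded by half the scale factor, which is the precision trade-off that quantization accepts in exchange for smaller, faster models.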

3. Retraining

Like RAG, retraining provides models access to data that wasn't initially available during training. Retraining, however, digs deeper. It enables the model to discover new data relationships and patterns. This capability differs from RAG, which enables a model to interpret additional data based on the patterns it already recognized within its training data. Though more costly and complicated than RAG, retraining is more flexible when a business process has fundamentally changed and the model requires updated data. It's also useful when a model initially trained on low-quality data is upgraded with higher-quality data.

4. Rehosting and redeploying

In some cases, the root cause of suboptimal model behavior is neither the model itself nor the data the model accesses. Instead, the model might lack the necessary computational resources to perform effectively. Rehosting or redeploying the model on new infrastructure can improve model performance. AI accelerators, for example, could speed up the model's inference, resulting in faster responses to prompts.

5. Filtering input and output

Users can optimize a model without modifying the model itself. Filtering techniques can intercept and modify prompts from users or AI agents, as well as the model's responses. If AI operational costs are high because users submit lengthy prompts, filtering can strip out irrelevant parts of the prompt, reducing the amount of data the model needs to process and thereby lowering processing costs.
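
A minimal sketch of input filtering, assuming prompts arrive as plain text and that greetings, sign-offs and mobile signatures count as irrelevant. The patterns, the `max_words` budget and the function name are hypothetical. A real deployment would tune them to its own traffic and could apply a similar step to the model's responses.

```python
import re

# Hypothetical patterns for lines that carry no information for the model.
NOISE_PATTERNS = [
    # Greeting-only lines such as "Hello team,"
    re.compile(r"^(hi|hello|hey)\b[\w\s]*[,!.]?$", re.IGNORECASE),
    # Sign-off lines such as "Thanks," or "Regards"
    re.compile(r"^(thanks|thank you|regards)\b[\w\s]*[,!.]?$", re.IGNORECASE),
    # Mobile signatures such as "Sent from my phone"
    re.compile(r"^sent from my\b.*$", re.IGNORECASE),
]


def filter_prompt(prompt, max_words=50):
    """Drop noise lines, collapse whitespace and cap the prompt length."""
    kept = []
    for line in prompt.splitlines():
        line = " ".join(line.split())  # collapse runs of whitespace
        if not line:
            continue
        if any(pattern.match(line) for pattern in NOISE_PATTERNS):
            continue
        kept.append(line)
    words = " ".join(kept).split()
    return " ".join(words[:max_words])


raw = "Hello team,\n\nHow do I   reset my VPN password?\n\nThanks,\nSent from my phone"
print(filter_prompt(raw))
```

Truncating at a fixed word count is a blunt instrument that can cut off relevant detail, so the budget should be set generously or replaced with a smarter summarization step.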

Schematic showing the inner workings of retraining an AI model.
Using retraining, an AI model can discover new data relationships and patterns to adjust for changing business processes.

AI model optimization best practices

Improving AI models can be complex and challenging, but the following best practices can mitigate risk and reduce the resources necessary to achieve model optimization:

  • Choose the right optimization technique. Optimization techniques such as retraining can be complicated and costly, while RAG and input filtering are simpler and less expensive. When time or resources are limited, filtering might be the best option.
  • Ensure adequate technical resources. Model optimization requires specialized expertise. Businesses without AI engineers on staff should consider working with a vendor to meet their optimization needs.
  • Support experimentation. The first attempt to optimize an AI model doesn't always achieve the desired outcome. It might be necessary to retrain a model multiple times before it reaches a target accuracy rate.
  • Define processes for regular model assessment. Make continuous model improvement a systemic part of an overall AI project strategy. Establish guidelines for periodic reviews of an AI model to determine if optimization is necessary.
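
A periodic review like the one described in the last practice could be reduced to a simple check, sketched here under the assumption that the business keeps a labeled evaluation set and has agreed on a target accuracy. The function names, the sample labels and the 0.9 target are hypothetical.

```python
def accuracy(predictions, labels):
    """Return the share of predictions that match the expected labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def needs_optimization(predictions, labels, target=0.9):
    """Flag the model for optimization when accuracy falls below the target."""
    return accuracy(predictions, labels) < target


# Hypothetical review: model outputs compared with expected answers.
labels = ["refund", "shipping", "refund", "billing"]
predictions = ["refund", "shipping", "refund", "shipping"]

print(accuracy(predictions, labels))
print(needs_optimization(predictions, labels))
```

Running the same check on a schedule, with a fresh evaluation set each time, gives the review process a consistent trigger for deciding when another round of optimization is worth the cost.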

 Chris Tozzi is a freelance writer, research adviser, and professor of IT and society. He has previously worked as a journalist and Linux systems administrator.
