June 27, 2024

Google on Thursday opened new fronts in the generative AI technology war, releasing grounding and context caching capabilities for its multimodal Gemini large language models to better ensure accuracy of results and use less compute power.

The tech giant also launched Imagen 3, the latest version of its diffusion imaging model, in early preview for users of the Vertex AI platform, with faster processing, better prompt comprehension and digital watermarking. In addition, it made the Gemini 1.5 Flash model, with a 1 million-token context window, generally available.

The moves come amid a frenzy of technology advances as Google and its rivals, including Microsoft and its partner OpenAI, Meta, AWS and smaller independent AI vendors, vie for supremacy in the surging GenAI market.

LLM output evidence

Grounding, or providing citations or links to sources underlying LLM outputs, has become a buzzword in the GenAI universe as vendors and users look for ways to reduce or eliminate the hallucinations, or inaccuracies, LLMs are prone to producing.

On the grounding front, Google has jumped ahead of its main GenAI competitors, said Andy Thurai, an analyst at Constellation Research.

"With grounding, context caching and size, they announced some things that others aren't thinking about," Thurai said. "They are urging others to catch up."

Google's grounding approach begins with Google Search. With the grounding feature, Google provides a percentage-based accuracy score.

"This assumes that your Google Search results are accurate, which Google says they are," Thurai said. "But if the search results are bad, then your model output will be bad."

Thurai said he has higher hopes for third-party grounding, expected to be available on Vertex AI later this year with Moody's for financial data, Thomson Reuters for news and ZoomInfo for company data.

High-fidelity mode grounding, now in experimental preview and powered by a version of Gemini 1.5 Flash, will let users pick their own data confirmation source.

It's likely that grounding will become an industry-standard method for reducing or eliminating LLMs' inaccuracies, according to some observers.

"If we don't ground and try to fix the hallucinations, then AI will not be successful," said Sanjeev Mohan, principal and founder of SanjMo, a data trend advisory firm.

GenAI competition

Bursting into public view with OpenAI's introduction of ChatGPT in November 2022, the GenAI race has become a monthly leapfrog exercise, with the vendors striving to outdo each other on LLM features, size, power and other attributes.

AWS has an event in New York City on July 10 at which it's expected to roll out GenAI releases to try to catch up with Google and OpenAI.

OpenAI made its own splash last month with GPT-4o and last week with the acquisition of streaming database vendor Rockset, and is expected to make another big move soon.

Meanwhile, smaller AI vendors are touting the virtues of non-compute-intensive, highly customizable small language models.

At a media and analyst briefing on June 26, Google Cloud CEO Thomas Kurian touted the Gemini 1.5 Flash model -- aimed at midmarket enterprises looking for speed, affordability and a large context window -- as superior to OpenAI's GPT-3.5. Google's Gemini 1.5 Pro model has what is accepted to be the industry's biggest context window for entering prompt information into an LLM: 2 million tokens.

"The generally available Gemini 1.5 Flash is the fastest model at the best price-to-performance option on the market," Kurian said.