Published: 2024-02-21

Google Cloud on Wednesday unveiled Gemma, its newest large language model, becoming the latest tech giant to jump into the open source generative AI arena.

Gemma is a family of open source models built from the same research and technology used to create Google's Gemini foundation models. It comes in two sizes: Gemma 2B and Gemma 7B.

Each size comes with pre-trained and instruction-tuned variants. They can run on developers' laptops, workstations or Google Cloud Platform.

Both sizes are small compared to Gemini Ultra, with its reported 1.5 trillion parameters.

Small language models

Google's new models show that 2024 is the year of small language models (SLMs) as well as large language models (LLMs), Gartner analyst Chirag Dekate said.

"In a GenAI era, when enterprises need to be able to create value not just from LLMs but also from SLMs that can be contextualized in their data context, at a lower price point, which matters to them, that becomes incredibly important," Dekate said.

Smaller language models enable enterprises to get higher levels of precision, said R "Ray" Wang, founder of Constellation Research.

"It's going to be in these smaller language models where people are actually going to start putting stuff together," Wang said. "The large language models are going to capture all the stuff that's publicly available, but there's going to be data in private networks that's going to require smaller language models."

Since Gemma is a small model, it could become more popular than Meta's open source model family, Llama, AI analyst Mark Beccue said.

"They've published a two billion-parameter model; that's the smallest one I've seen," Beccue said.

Despite its competitive relationship with Meta, Google still offers Llama 2 on Google Cloud. That shows how Google is committed to offering generative AI model options to enterprises, Dekate added.

"What Google is trying to communicate here is you get access to not just open models, but lower cost models, and more importantly access to better innovation faster through a Google Cloud ecosystem," he said.

Iris.AI and Gemma

Iris.AI is an AI startup interested in Google Gemma. Iris provides a platform and creates models that use generative AI technology to analyze scientific documents for researchers.

Iris' current generative model is based on the Llama 13 billion-parameter architecture and pre-trained on scientific documents.

"What we find very nice about the Llama 2 models and Llama architecture is that it's quite easy to operate," said co-founder and CTO Victor Botev.

However, based on Google's benchmark results for its new Gemma models, Iris plans to evaluate how the 7B model compares to the Llama 13 billion model.

If it performs as promised, Gemma 7B will enable Iris to use only one GPU, driving down costs for the startup.

"Anything above that requires you to actually set up multiple GPUs, which is not easy to operate," Botev added.

Working with multiple GPUs requires specialized infrastructure, libraries and software stacks.

"It's not going to be just, let's say, two times cheaper, but multiple times cheaper, if it shows the results," Botev said. "It can fit into one GPU, it can allow us to generate multiple answers because the memory will be enough and that will drive the cost significantly ... if the reasoning capabilities are exactly as they say and if it's easy enough to fine-tune and modify for [our] particular use case."

The Gemma 2B model, by comparison, seems comparable to the Llama 7 billion model, which does not have good reasoning capabilities, Botev said.

"That would be interesting the moment they actually reach the reasoning capabilities of the current 13 billion-parameter models," he said.
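
Botev's single-GPU point comes down largely to memory arithmetic. The rough sketch below estimates the memory needed to hold model weights alone. It assumes 16-bit weights (2 bytes per parameter), nominal parameter counts of 7 billion and 13 billion, and a hypothetical 24 GB GPU; none of these figures come from Iris or Google. It also ignores activations and the attention cache, so real requirements would be higher.

```python
# Back-of-the-envelope estimate of the memory needed just to hold model weights.
# Assumptions (not from the article): 16-bit weights, nominal parameter counts,
# and a hypothetical 24 GB GPU. Activations and the attention cache are ignored.

BYTES_PER_PARAM_FP16 = 2  # assumed 16-bit precision
GPU_MEMORY_GB = 24        # hypothetical single-GPU capacity


def weight_memory_gb(num_params: float, bytes_per_param: int = BYTES_PER_PARAM_FP16) -> float:
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits_on_one_gpu(num_params: float, gpu_memory_gb: float = GPU_MEMORY_GB) -> bool:
    """Return True if the weights alone fit within the given GPU memory."""
    return weight_memory_gb(num_params) <= gpu_memory_gb


if __name__ == "__main__":
    # Nominal sizes only; actual parameter counts of specific models may differ.
    for name, params in [("7B model", 7e9), ("13B model", 13e9)]:
        size = weight_memory_gb(params)
        print(f"{name}: about {size:.0f} GB of weights, fits on one GPU: {fits_on_one_gpu(params)}")
```

Under these assumptions, the 7B weights come to about 14 GB and the 13B weights to about 26 GB, which would be in line with Botev's remark that anything above the smaller model needs multiple GPUs.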