Date: 2024-10-21

IBM on Monday released its new family of Granite language models under a fully permissive open source Apache 2.0 license.

The Granite 3.0 models include general-purpose language models such as Granite-3.0-8B-Instruct, Granite-3.0-2B-Instruct, Granite-3.0-8B-Base and Granite-3.0-2B-Base; guardrail and safety models such as Granite-Guardian-3.0-8B and Granite-Guardian-3.0-2B; and mixture-of-experts models including Granite-3.0-3B-A800M-Instruct, Granite-3.0-1B-A400M-Instruct, Granite-3.0-3B-A800M-Base and Granite-3.0-1B-A400M-Base.

The language models were trained on more than 12 trillion tokens of data spanning 12 natural languages and 116 programming languages, according to IBM. By the end of the year, the 8B and 2B models will support an extended 128K context length and be able to understand multimodal documents.

The open source Granite Guardian 3.0 models enable developers to implement safety guardrails by checking how an AI model responds to risks such as social bias, hate, violence and hacking. The models were trained on Nvidia H100 GPUs, according to IBM.

Granite 3.0 models will support applications such as customer service, IT automation and cybersecurity.

The open source approach

The new Granite line comes as more vendors shift toward small language models and focus on open source.

"Over the last 25 years, the gold standard for open source is an Apache license," IBM senior vice president and chief commercial officer Rob Thomas said during a media briefing about the new models. "We chose that for a very good reason."

IBM is betting that the future of AI is open, Constellation Research analyst Andy Thurai said.

"They offer smaller, more efficient, transparent models that are trained ethically and created responsibly to be the differentiator," Thurai said.

While IBM is not trying to make money by licensing the models, it wants organizations to use its Watsonx platform to run the models, fine-tune them or build new derivative models, Thurai added.

Compared with previous generations, the Granite 3.0 models appear to be more efficient and accurate, Moor Insights & Strategy analyst Patrick Moorhead said.

"This makes sense to me as the models weren't trained on 'world data,'" Moorhead said.

World data includes the internet, entertainment and consumer video. Instead, IBM used enterprise data, such as data from documents and spreadsheets.