Microsoft this week made its AI foundation model Florence publicly available in preview.
The tech giant revealed on Tuesday that Florence, trained with text-to-image pairs, is now integrated as a production-ready computer vision service in Azure Cognitive Service for Vision, a Microsoft service that provides computer vision capabilities such as analyzing images, reading text and detecting faces with prebuilt image tagging.
Microsoft introduced Florence in 2021 as a foundation model that expands representations across computer vision, changing scenes from coarse to fine and movement from static to dynamic.
With the availability of Florence in preview, Microsoft also revealed that Reddit will use the newly improved Vision Services to generate captions for images on its social media platform. In addition, the tech giant plans to apply the new Vision Services to its Microsoft 365 suite of business productivity applications.
"It's really an example of how Microsoft, because of their broad and successful reach into the enterprise for productivity applications, can take something like an image processor and make it universally available across their entire suite," said Karl Freund, an analyst at Cambrian AI.
In the past several months, Microsoft has applied OpenAI's large language model across several applications, adopting foundation and large language models such as GPT more quickly than its chief rival in AI technology, Google.
"[Google] can't get out of their own way to get product to market and get out of research mode," Freund said.
Google has a more extensive search brand and more to lose if it were to take the same risks Microsoft takes in applying these experimental models to all its applications, Freund said.
Microsoft's integration of the foundation model into Azure Cognitive Services also benefits developers, said Arun Chandrasekaran, an analyst at Gartner. Since the services are available as an API, developers can easily consume them and embed them into whatever applications they are building.
Risks to mitigate
However, foundation models like these come with risks that enterprises must mitigate.
One of those hazards is presented by the training data, Chandrasekaran said.
"The training data sets for these AI services need to be free of copyrighted and other forms of proprietary content," he said.
Moreover, the output delivered by these services must be monitored through extensive pilots, Chandrasekaran added.
Such monitoring will help with accuracy, adaptation and privacy.
Nonetheless, AI foundation models such as Florence will appeal to enterprises in industries that use large amounts of images and videos, such as automotive, media, retail and manufacturing, Chandrasekaran said.