Box releases Box Extract, its AI metadata agent

Line-of-business Box users can now tag contracts, reports and other commonly used documents by writing plain-language instructions that an AI agent carries out.

In a vast sea of unstructured enterprise forms, records and documents, there is valuable information that drives business processes in legal, sales, marketing and HR scenarios, among others -- if one can only find it.

In many organizations, the task of structuring those files falls to content managers, business analysts, data scientists and similar experts. Box has released a generative AI agent, Box Extract, that could change that dynamic by letting workers closer to the front lines discover and tag data -- the ones who are sometimes first to spot the use cases for specific pieces of data in their workflows.

The agent could also streamline processes such as data discovery, adding proper structure to new document types as they are launched, and updating analytics dashboards for expert users.

Box Extract uses large language models such as Anthropic Claude Opus 4.5, Google Gemini 3 and OpenAI GPT 5.2 to turn plain-language declarations into document extraction rules. The resulting metadata can be either searched in Box or integrated with workflows in other apps, including Microsoft Office, Oracle Fusion Cloud HCM, Salesforce, ServiceNow, Slack and Workday. The AI metadata agent is a good first step toward empowering customers to sort unstructured data, said Alan Pelz-Sharpe, founder of Deep Analysis, an independent technology research firm.

"It's all very well storing stuff, but I want to get value from [unstructured data]," Pelz-Sharpe said. "Box users who actually use it in day-to-day work -- as opposed to using it as a storage system -- they'll get a lot of value from Box Extract."

[Screenshot: Box Extract can handle plain-language instructions to customize how it creates metadata from business files for discoverability and collaboration.]

An opportunity for Box, Pelz-Sharpe added, could be custom versions of Box Extract that cater to common customer verticals such as insurance, healthcare and finance. Each industry has its own data model and specific needs; Box could reduce friction for large customers by creating accelerators that take into account the complexities of, for example, medical or auto insurance claims.

"There's a huge growth area. People want something really specific that understands their niche," Pelz-Sharpe said. "Box is, by design, quite horizontal. But growth will come from at least building some of the company out into these rich verticals."

Box Extract was previewed at last fall's BoxWorks user conference. It grew in part from technology acquired from Alphamoon in 2024. Box customers will be able to manage security policies to ensure that employees cannot build Box Extract agents that gain unauthorized access to protected information in places such as financial, health and personnel records.

The tool, along with generative AI content creation and summarization tools in Box, represents the company's tactical application of AI to perform specific operations -- as opposed to the large AI platform approach of other vendors such as ServiceNow, Salesforce and Microsoft.

"If you try to do too many crazy things with AI, and if you try too hard to make it do everything, oftentimes, just like a person, you would struggle," said Ben Kus, chief technology officer at Box. "But if you have a practical set of hard problems, problems you weren't able to solve before, that's a great way to have successful AI projects."

Don Fluckinger is a senior news writer for Informa TechTarget. He covers customer experience, digital experience management and end-user computing.
