Humans and AI tools go hand in hand in analytics applications

Companies are keeping data analysts and other workers in the loop with AI applications to check the results generated by automated algorithms for accuracy, relevance and missing info.

AI tools may be intelligent, but they aren't all-knowing -- and they can learn a thing or two from people.

That's the view of analytics and engineering managers whose teams apply a human touch to the work of machine learning algorithms and other forms of AI. Pairing up humans and AI software provides information that the technology can't deliver on its own -- and it prevents organizations from blindly following algorithms down the wrong business paths.

Referred to by proponents as human in the loop, the idea is to tap knowledgeable data analysts or business users to give feedback on the findings of AI tools, particularly in cases where there's uncertainty about the validity of those findings. The resulting feedback loop supports so-called active learning approaches, which are designed to eliminate errors or fill in missing info and then use the corrections to train algorithms to produce better results in the future.
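As a rough illustration of that feedback loop, the Python sketch below routes low-confidence predictions to a human review queue and collects the reviewers' corrections as new training examples. It is a minimal sketch, not a description of any system mentioned in this article: the `Prediction` class, the function names and the 0.8 confidence threshold are all hypothetical, and a real active learning setup would likely use a more careful uncertainty measure.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    item_id: str
    label: str
    confidence: float  # model's probability for its top label, in [0, 1]


def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted ones and ones needing human review.

    The threshold is an illustrative assumption, not a recommended value.
    """
    accepted, review_queue = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            review_queue.append(p)
    return accepted, review_queue


def apply_feedback(review_queue, human_labels):
    """Return (item_id, corrected_label) pairs for items a reviewer has labeled.

    These pairs would be added to the training set before the next retraining run.
    """
    return [
        (p.item_id, human_labels[p.item_id])
        for p in review_queue
        if p.item_id in human_labels
    ]


if __name__ == "__main__":
    preds = [
        Prediction("clip-1", "python (language)", 0.95),
        Prediction("clip-2", "python (language)", 0.55),  # ambiguous case
    ]
    accepted, queue = route_predictions(preds)
    corrections = apply_feedback(queue, {"clip-2": "python (snake)"})
    print(len(accepted), len(queue), corrections)
```

The design choice here is to keep the human effort focused on the uncertain cases, so reviewers don't have to check every result.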

For example, O'Reilly Media Inc. uses a combination of humans and AI to label and categorize the content of the videos recorded at the technology conferences it runs. Data analysts can't handle that task themselves, according to Paco Nathan, director of the company's online learning unit. In a presentation at this month's Strata Data Conference in San Jose, Calif., Nathan noted that O'Reilly was recording about 200 hours of video there -- and the event was just one of the 20 or so conferences it puts on each year.

Word games lead to AI uncertainty

However, the natural language processing (NLP) algorithms that the Sebastopol, Calif., company uses to parse the video content often get confused by words with multiple meanings or ambiguous contexts, Nathan said. When such cases are identified, a person from his team must step in to figure out the correct meaning and to label the content accurately.

The output of the NLP-driven machine learning models is stored as a log file in a Jupyter Notebook, which the analysts can review and update. "It's really a two-way street, and you end up with documents that are collaborative, partly done by machines and partly done by people," Nathan said.

The value of augmenting AI with human intelligence was one of the big discussion topics at the 2018 Strata Data Conference in San Jose, Calif.

He added that O'Reilly, which jointly organizes the Strata conference with big data vendor Cloudera, is seeing more than 90% accuracy with content labeling due to the combined AI and human efforts.

Pinterest Inc. uses a group of machine learning applications to drive the operations of its image search and bookmarking website, including the search process and things like ad placement and content labeling. But the San Francisco-based company relies on human evaluators to check what the algorithms produce for relevance and accuracy.

They also take part in A/B testing of algorithm-generated user interface designs, rating the different options based on personal preference, said Veronica Mapes, a technical program manager at Pinterest who is in charge of the human evaluation effort.

After initially doing all of the evaluation work on third-party crowdsourcing platforms, Pinterest built its own human evaluation system in 2016, and it used that to significantly expand the rating efforts last year, according to Mapes. The company still uses outside platforms, too, but it now has full-time employees and internal contractors involved in the process, a step designed to both reduce costs and improve the quality of the evaluations, Mapes said in a Strata session.

The intermingling of humans and AI applications is part of a broader effort pushed by Pinterest executives to ensure that the website provides useful info to visitors, said Garner Chung, engineering manager for the Pinterest human evaluation team. The human input acts as a counterbalance to standard engagement metrics that track what users click on and how much time they spend on pages, he said.

Engagement is the signal that the machine learning algorithms are based on, "but that's not always the best metric to use," Chung said after the session. "We really don't just want to be serving up clickbait and turning our users into zombies."

Chung cited the example of links to content on lowering body fat showing up in the results of a search for chicken recipes. That might seem like a logical connection to an algorithm, but it's one that a human evaluator should flag as not directly relevant to the search, he said.
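One simple way to picture human ratings acting as a counterbalance to engagement is a blended ranking score, sketched below in Python. This is an assumption made for illustration only and not Pinterest's actual method: the `blended_score` and `rank_results` functions, the 0.5 weight and the sample numbers are all hypothetical.

```python
def blended_score(engagement, human_relevance, weight=0.5):
    """Weighted average of an engagement score and a human relevance rating.

    Both inputs are assumed to be normalized to [0, 1]. `weight` is the share
    given to the human rating; 0.5 is an arbitrary illustrative default.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (1 - weight) * engagement + weight * human_relevance


def rank_results(results, weight=0.5):
    """Sort search results from highest to lowest blended score."""
    return sorted(
        results,
        key=lambda r: blended_score(r["engagement"], r["human_relevance"], weight),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical results for a "chicken recipes" search; the numbers are made up.
    results = [
        {"title": "Lowering body fat", "engagement": 0.9, "human_relevance": 0.2},
        {"title": "Roast chicken recipe", "engagement": 0.6, "human_relevance": 0.9},
    ]
    for r in rank_results(results):
        print(r["title"])
```

With the weight set to 0, the ranking falls back to engagement alone and the off-topic link comes first; giving the human rating some weight moves the recipe back to the top.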

AI's limits leave humans in the loop

In general, AI software isn't close to being ready to fully take over the work that people do, said Michael Chui, a partner at the McKinsey Global Institute who leads research on how technology innovation affects businesses and society. "There are real limitations to AI now," Chui said at the Strata conference. "Don't think these technologies can do everything."

Ebates Inc., which runs a shopping rewards program for consumers, is in the early stages of AI adoption. The San Francisco-based company uses semi-autonomous machine learning models to rank member preferences to better target cash-back offers and to help detect odd buying behavior or other potentially fraudulent activities, said Mark Stange-Treager, its vice president of analytics. In the next 12 to 18 months, he expects to start running full-fledged AI algorithms against the company's Hadoop data lake, which has AtScale's data management platform layered on top.

Even then, though, Ebates will likely continue to combine the work of humans and AI in analytics applications, Stange-Treager said in a post-Strata interview. For example, he said he sees a continuing need for manual reviews by workers at the company to determine whether member behavior flagged as anomalous by an algorithm is problematic.
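The kind of manual review described here can be sketched as a two-step process: an algorithm flags unusual purchases, and a person decides which flags are actually problematic. The Python example below is a minimal sketch under stated assumptions, not Ebates' implementation: it uses a basic z-score test as a stand-in for whatever anomaly detection the company runs, and the function names, the threshold of 3.0 and the sample amounts are hypothetical.

```python
import statistics


def flag_anomalies(amounts, z_threshold=3.0):
    """Return the indexes of amounts whose z-score exceeds the threshold.

    A z-score test is a deliberately simple stand-in for a real anomaly model.
    """
    mean = statistics.fmean(amounts)
    stdev = statistics.pstdev(amounts)
    if stdev == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, a in enumerate(amounts) if abs(a - mean) / stdev > z_threshold]


def confirmed_problems(flagged, reviewer_decisions):
    """Keep only the flagged indexes that a human reviewer marked as problematic.

    Flags with no decision yet are left out, so nothing is acted on unreviewed.
    """
    return [i for i in flagged if reviewer_decisions.get(i) == "problematic"]


if __name__ == "__main__":
    # Made-up purchase amounts: twenty ordinary orders and one very large one.
    amounts = [20.0] * 20 + [500.0]
    flagged = flag_anomalies(amounts)
    # A reviewer looks at the flagged order and judges it to be legitimate.
    decisions = {20: "legitimate"}
    print(flagged, confirmed_problems(flagged, decisions))
```

The algorithm narrows the field, but the final call on each flagged case stays with a person.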

"I envision a scenario where we're using algorithms to do a lot of the work, but not just letting them go off and do their own thing," Stange-Treager said. "I'm not saying we won't get there in the future, but I think that's well off in the future from our point of view."
