
IBM Project Debater crowdsources to boost its AI, NLP skills

Project Debater, an IBM AI system, creates viewpoints with the help of people who type in arguments on topics. That may also help drive future business uses.

Machine versus man. It's a trope that has long powered science-fiction novels and movies, and one that has often played out in such works in a sinister way.

Yet, at today's IBM, the contest between man and machine isn't evil or creepy, and it isn't the stuff of fiction. With IBM Project Debater, the match is a war of words, a battle of ideas and a real-time intellectual competition between humans and a collection of AI algorithms. The ultimate winner has yet to be decided. But, either way, IBM envisions the technology eventually being used by businesses for analytics and decision-support applications.

Unveiled during a live debate in June 2018, IBM Project Debater is an AI system that can debate humans in real time on almost any topic. Project Debater will be prominently featured at the tech giant's flagship conference, IBM Think 2019, to be held Feb. 12 to 15 in San Francisco. The system will take on a professional debater the night before Think officially starts, and it's scheduled to be used during the conference for debates that attendees and other people can participate in online to help it produce pro-and-con viewpoints on topics.

Possessing a corpus of millions of articles and papers, IBM Project Debater has been developed over the past six years at the IBM Research lab in Haifa, Israel. It's trained to use AI and natural language processing (NLP) technology to automatically cluster relevant pieces of information together and extract core points from the data.

Arguing algorithms

Using sophisticated machine learning and natural language generation algorithms, the system is then able to construct coherent and original sentences from the information it has analyzed. To compete in a debate setting, researchers trained the system on some of the basic principles of rhetoric, including how to make arguments both factual and emotionally compelling.

"This is a fascinating application of NLP and tackles fairly complex challenges," said Zunaid Kazi, a former IBM data scientist who now is co-founder and CTO at AI software vendor Infolytx Inc., based in New York.

"One of the most exciting parts of this project is where IBM has taken natural language understanding," he continued. "Ingesting huge amounts of text and then distilling nuanced opinions is impressive."

In the 1990s, Kazi was based at IBM's Thomas J. Watson Research Center in Yorktown Heights, N.Y., where he worked on NLP, focusing mainly on information extraction and retrieval and question-answering systems.

This was before IBM's Jeopardy-playing Watson system and the full internet boom, Kazi said.

At that time, "NLP was rapidly moving away from handcrafted grammar and linguistic-based rules to relying more on statistical processing," he said. "New statistical models and machine learning tools were starting to come to the fore."

Of course, much has changed since then, including the growth of the web and what Kazi called the "deep learning revolution."

"More data powered by more compute and application of deep learning is revolutionizing NLP both at IBM and globally," he said.

Over the past several months, IBM Project Debater has toured tech and trade shows, competing against professional and award-winning debaters on a wide range of subjects.

However, at the big CES 2019 consumer technology show that was held in Las Vegas in early January, the system debated a quite different type of opponent -- the general public -- using a new program that IBM calls Project Debater -- Speech by Crowd.

"This was a way to engage people not only in the debating machine, but also for providing arguments for Debater," said Aya Soffer, director of AI tech at IBM's Haifa lab.

IBM's Project Debater engaged the public in debates at CES 2019

Bringing in the crowd

Throughout the conference, attendees and the general public were asked to provide textual, online arguments for or against various questions, such as whether gambling should be illegal or whether the government should ban the sale of violent video games to minors.

Hundreds of responses came in through the website the team had set up, Soffer said. "Taking more or less the same technology that we employed before," she said, the IBM Project Debater system autonomously compiled and clustered the responses, and then used them to create two unique speeches: one in support of a question and the other against it.

The level of human input in the process, Soffer noted, was kept to a bare minimum; humans simply checked for inappropriate responses and removed any they found. The system has a way to automatically do that, as well, but she said the team didn't want to risk introducing errors with such a big audience.

Soffer explained that, generally, the Project Debater system automatically does a "quality check," removing poorly constructed or irrelevant responses from its pool of total responses. Then, it removes redundant responses before extracting the text from any remaining responses.
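To make the quality check and redundancy removal Soffer describes more concrete, here is a minimal Python sketch. It is not IBM's implementation: the word-count threshold, the Jaccard word-overlap measure and the function names are all assumptions chosen for illustration, standing in for whatever models Project Debater actually uses.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and keep only word tokens, for comparison."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two normalized strings (0.0 to 1.0)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def filter_responses(responses, min_words=4, max_similarity=0.8):
    """Drop very short responses, then drop near-duplicates of kept ones.

    Both thresholds are illustrative assumptions, not values from IBM.
    """
    kept, kept_norm = [], []
    for response in responses:
        norm = normalize(response)
        # Crude "quality check": too few words to be an argument.
        if len(norm.split()) < min_words:
            continue
        # Redundancy check: too similar to something already kept.
        if any(jaccard(norm, seen) >= max_similarity for seen in kept_norm):
            continue
        kept.append(response)
        kept_norm.append(norm)
    return kept
```

In this sketch, a response such as "no" would be discarded as too short, and a second copy of an argument that differs only in capitalization or punctuation would be discarded as redundant. A production system would likely score quality and similarity with trained models rather than word counts.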

Project Debater then clusters that information into different supporting or opposing points and automatically chooses a supporting and opposing theme to argue. It picks some of the clusters it created to expound on those themes before constructing the page-long arguments.
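The clustering and theme-selection step can be illustrated with a short Python sketch. It is a simplified stand-in, not IBM's method: it assumes each argument already carries a "pro" or "con" label from the person who submitted it, and it assigns themes with a hypothetical keyword table (`THEMES`) rather than a learned clustering model.

```python
from collections import Counter, defaultdict

# Hypothetical theme keywords, invented for this example.
THEMES = {
    "addiction": {"addiction", "addictive", "addicted"},
    "freedom": {"freedom", "free", "choice"},
    "economy": {"jobs", "tax", "revenue", "economy"},
}


def theme_of(text: str) -> str:
    """Return the theme whose keywords overlap the text most, else 'other'."""
    words = set(text.lower().replace(".", " ").replace(",", " ").split())
    best = max(THEMES, key=lambda theme: len(words & THEMES[theme]))
    return best if words & THEMES[best] else "other"


def cluster(arguments):
    """Group (stance, text) pairs into clusters keyed by (stance, theme)."""
    clusters = defaultdict(list)
    for stance, text in arguments:
        clusters[(stance, theme_of(text))].append(text)
    return clusters


def top_themes(clusters, stance, n=2):
    """Pick the n largest themes for one side of the debate."""
    sizes = Counter(
        {theme: len(texts) for (s, theme), texts in clusters.items() if s == stance}
    )
    return [theme for theme, _ in sizes.most_common(n)]
```

Given a handful of labeled arguments about gambling, `top_themes(cluster(arguments), "pro", n=1)` would return the theme that the most supporting arguments touch on, which a speech generator could then expand into a paragraph.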

The arguments, though generally fairly well-constructed, sometimes have a few errors, Soffer said, as some of the points are "taken verbatim and some mined from collective arguments."

For example, in one case, the system wrote about a "metal disease," rather than a "mental disease," drawing on a typo in one of the human responses.

The process is laid out fairly clearly on the IBM Project Debater site, where the arguments contain footnotes pointing to where certain points came from.

Using debates in the nonartificial world

Beyond serving as an elegant marketing vehicle, crowdsourcing debates could have a few practical applications, Soffer asserted.

For one, the technology helps participants think critically about a problem and craft their own responses to it, as well as view perhaps previously unthought-of responses that the system compiles. This could have applications in schools, for example, Soffer said.

IBM said another aim of the endeavor is to eventually open IBM Project Debater to business organizations, which could use the underlying technology to, say, survey employees in depth or help corporate executives debate difficult decisions.

Project Debater "has taken question answering to a whole different level," Kazi said, adding that he is "impressed with how well it does in generating a coherent and compelling narrative without being trained on specific topics."

Besides Project Debater, the IBM Think event will feature demonstrations of other IBM AI products, like IBM Watson, and keynotes from high-profile figures and executives. Scheduled speakers include IBM CEO Ginni Rometty, AT&T Communications CEO John Donovan and skateboarding legend Tony Hawk, who owns the skateboarding company Birdhouse and has worked with Activision on numerous skateboarding video games bearing his name.
