IBM Watson supercomputer

Watson is an IBM supercomputer that combines artificial intelligence (AI) and sophisticated analytical software for optimal performance as a "question answering" machine. The supercomputer is named for IBM's founder, Thomas J. Watson.

The Watson supercomputer processes at a rate of 80 teraflops (trillion floating point operations per second). To replicate (or surpass) a high-functioning human's ability to answer questions, Watson accesses 90 servers with a combined data store of over 200 million pages of information, which it processes against six million logic rules. The system and its data are self-contained in a space that could accommodate 10 refrigerators.

Watson's key components include:

  • Apache Unstructured Information Management Architecture (UIMA) frameworks, infrastructure and other elements required for the analysis of unstructured data.
  • Apache Hadoop, a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.
  • SUSE Linux Enterprise Server 11, the fastest operating system available for Power7 processors.
  • 2,880 processor cores.
  • 15 terabytes (TB) of RAM.
  • 500 gigabytes (GB) of preprocessed information.
  • IBM's DeepQA software, which is designed for information retrieval that incorporates natural language processing (NLP) and machine learning.
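
As a rough back-of-the-envelope check, the hardware figures quoted above can be divided out per server and per core. The sketch below assumes the 2,880 cores and 15 TB of RAM are spread evenly across the 90 servers and uses decimal units (1 TB = 1,000 GB). The even split is an assumption made for illustration, not a statement about how IBM configured the machine.

```python
# Figures quoted in this article. The even split across servers and the
# decimal unit conversion are assumptions made for illustration.
SERVERS = 90
CORES = 2_880
RAM_TB = 15
TERAFLOPS = 80

cores_per_server = CORES / SERVERS            # cores on each server
ram_gb_per_server = RAM_TB * 1_000 / SERVERS  # RAM per server, in GB
gflops_per_core = TERAFLOPS * 1_000 / CORES   # throughput per core, in gigaflops

print(f"{cores_per_server:.0f} cores per server")
print(f"{ram_gb_per_server:.1f} GB of RAM per server")
print(f"{gflops_per_core:.1f} gigaflops per core")
```

Under those assumptions, each server would hold a few dozen cores and well over 100 GB of RAM, which gives a sense of scale for the system as a whole.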

The applications for Watson's underlying cognitive computing technology are wide-ranging. Because the system can perform text mining and complex analytics on huge volumes of unstructured data, it can support a search engine or an expert system with capabilities far beyond those of earlier systems.

In May 2016, BakerHostetler, an Ohio-based law firm, signed a contract for a legal expert system based on Watson to work with its 50-person bankruptcy team. That system, called Ross, can mine data from about a billion text documents, analyze the information and provide precise responses to complicated questions in less than three seconds. Natural language processing allows the system to translate legalese to respond to the lawyers' questions.

While Ross' creators add more legal modules, similar Watson-based expert systems are being applied to medical research.

Watson in healthcare

Healthcare was one of the first industries to which Watson technology was applied. The first commercial implementation of Watson came in 2013 when the Memorial Sloan Kettering Cancer Center began using the system to recommend treatment options for lung cancer patients to ensure they received the right treatment while reducing costs. Since that time, providers such as Cleveland Clinic, Maine Center for Cancer Medicine and Westmed Medical Group have also implemented Watson tools.

However, not every implementation has gone smoothly. The MD Anderson Cancer Center in Houston launched a project in 2013 to build a decision support system powered by Watson technology to help doctors determine the best treatment options. But after spending more than $62 million on the project over the course of four years, hospital administrators canceled the project, saying it had failed to meet its goals.

Healthcare remains a primary focal point for IBM as it tries to prove Watson technology's worth, and the company continues to forge partnerships with healthcare organizations. In May 2018, for example, Apollo, one of India's largest specialty healthcare systems, agreed to adopt Watson for Oncology and Watson for Genomics. The two IBM cognitive computing platforms are intended to help doctors make decisions for personalized cancer care.

If IBM can use Watson to solve some of the biggest problems in patient care, and to recommend treatment options based on data-driven insights, it would demonstrate the value of Watson technologies.

Watson Analytics

Watson Analytics is one of the primary implementations of Watson technology. It is a platform for exploring, visualizing and presenting data that utilizes Watson's cognitive capabilities to automatically surface data-driven insights and recommend ways of presenting the data.

The platform is made up of an exploration component, which allows users to upload their data, automatically recommends potentially correlated variables and builds comparisons; a prediction tool that allows users to get answers to complex questions based on their data; and a reporting tool that supports dashboard and report development.

Each component is accessed through a graphical user interface (GUI), which minimizes the need for advanced data science training. The platform is intended to make advanced analytics accessible to workers with limited technical knowledge.

The cost of Watson Analytics depends on the edition. A free edition lets users upload spreadsheets, get visualizations and insights, and build dashboards. The "Plus" edition adds 2 GB of storage and additional data sources, including databases, and starts at $30 per user, per month. The "Professional" edition includes all of the above features, as well as a multiuser tenant for collaboration, 100 GB of storage and more data, and costs $80 or more per user, per month (2018 pricing, sourced from the IBM Watson Analytics website).
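
The exploration component's suggestion of potentially correlated variables can be illustrated with a small sketch. This is not IBM's algorithm, which is not described here; it simply ranks every pair of columns in an uploaded table by the absolute value of their Pearson correlation. The function names and the sample data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def rank_correlated_pairs(table):
    """Return (col_a, col_b, r) tuples, strongest absolute correlation first."""
    pairs = [
        (a, b, pearson(table[a], table[b]))
        for a, b in combinations(table, 2)
    ]
    return sorted(pairs, key=lambda pair: abs(pair[2]), reverse=True)


# Hypothetical spreadsheet data, one list of values per column.
sales = {
    "ad_spend": [10, 20, 30, 40, 50],
    "revenue": [12, 24, 33, 46, 55],
    "returns": [5, 3, 6, 2, 4],
}

for col_a, col_b, r in rank_correlated_pairs(sales):
    print(f"{col_a} vs {col_b}: r = {r:+.2f}")
```

A tool built this way would surface the strongest pair first and let the user decide whether the relationship is meaningful; correlation alone does not establish cause.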

Watson APIs let businesses build AI applications

IBM has published a range of application programming interfaces (APIs) on its cloud that allow users to build their own AI applications that use Watson's core technology on the back end. The APIs can be called from popular programming languages, including Java and Python.

IBM also has API connectors to pretrained deep learning models that allow users to build applications for tasks such as natural language processing, image recognition and tone analysis. One API supports the development of smart assistants that use Watson technology on the back end.
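
To make the idea concrete, the sketch below builds (but does not send) an HTTP request of the kind an application might make to a Watson text-analysis service. It assumes the Watson Natural Language Understanding REST interface, which accepts a JSON body at a `/v1/analyze` path with a `version` date parameter and HTTP basic authentication using the literal user name `apikey`. The service URL, API key, version date and function name are placeholders; check IBM's current API reference before relying on any of them.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholders: real values come from the service credentials in IBM Cloud.
SERVICE_URL = "https://example.invalid/instances/YOUR_INSTANCE_ID"
API_KEY = "YOUR_API_KEY"
API_VERSION = "2022-04-07"  # assumed version date


def build_analyze_request(text, service_url=SERVICE_URL, api_key=API_KEY):
    """Build a POST request asking for the sentiment and keywords of `text`."""
    body = {"text": text, "features": {"sentiment": {}, "keywords": {}}}
    credentials = base64.b64encode(f"apikey:{api_key}".encode()).decode()
    query = urlencode({"version": API_VERSION})
    return Request(
        url=f"{service_url}/v1/analyze?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
        method="POST",
    )


request = build_analyze_request("Watson answered the clue correctly.")
print(request.get_method(), request.full_url)
# With real credentials, urllib.request.urlopen(request) would send it.
```

Keeping request construction separate from sending makes the call easy to inspect and test without network access or live credentials.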

IBM Watson's history

In a fall 2010 AI Magazine article, IBM researchers reported on their three-year journey to build a computer system that could compete with humans in answering questions correctly in real time on the TV show Jeopardy! This project led to the design of IBM's DeepQA architecture and Watson.

In 2011, Watson challenged two top-ranked players on Jeopardy! -- champions Ken Jennings and Brad Rutter -- and famously beat them. The Watson avatar sat between the two contestants, as a human competitor would, while its considerable bulk sat on a different floor of the building. Like the other contestants, Watson didn't have internet access.

IBM Watson on 'Jeopardy!' in 2011

In the practice round, Watson demonstrated a human-like ability for complex wordplay, correctly responding, for example, to the answer clue, "Classic candy bar that's a female Supreme Court justice," with, "What is Baby Ruth Ginsburg?" Rutter noted that although the retrieval of information is "trivial" for Watson and difficult for a human, the human is still better at the complex task of comprehension. Nevertheless, machine learning allows Watson to examine its mistakes against the correct answers to see where it erred and inform future responses.

IBM researchers concluded that DeepQA had proved to be an effective and extensible architecture that could be used to combine, deploy, evaluate and advance a wide range of algorithmic techniques in the field of question answering.
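
The idea of combining many algorithmic techniques can be sketched in miniature. The toy pipeline below follows the general shape IBM's researchers described for DeepQA: generate candidate answers, score each one with several independent evidence scorers, then merge the scores into a single confidence value and rank the candidates. Everything concrete here is hypothetical. The scorers, the hand-set weights (DeepQA learned its weights from training data) and the tiny corpus are illustrations only, not IBM's code.

```python
import re

# Hypothetical mini-corpus: candidate answer -> supporting passage.
CORPUS = {
    "Baby Ruth": "Baby Ruth is a classic American candy bar made with peanuts.",
    "Ruth Bader Ginsburg": "Ruth Bader Ginsburg was a female Supreme Court justice.",
    "Snickers": "Snickers is a candy bar with nougat, caramel and peanuts.",
}


def tokens(text):
    """Lowercase alphabetic words in `text`, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def passage_overlap(question, candidate):
    """Fraction of question words that appear in the candidate's passage."""
    question_words = tokens(question)
    return len(question_words & tokens(CORPUS[candidate])) / len(question_words)


def brevity(question, candidate):
    """Weak prior that favors short answers, independent of the question."""
    return 1 / len(candidate.split())


# Hand-set weights for illustration only.
SCORERS = [(passage_overlap, 0.8), (brevity, 0.2)]


def answer(question):
    """Return (candidate, confidence) pairs, most confident first."""
    ranked = [
        (candidate, sum(w * scorer(question, candidate) for scorer, w in SCORERS))
        for candidate in CORPUS
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


question = "Which female justice served on the Supreme Court?"
for candidate, confidence in answer(question):
    print(f"{candidate}: {confidence:.2f}")
```

Because each scorer is an independent function, new techniques can be added to the list without changing the rest of the pipeline, which is the kind of extensibility the researchers emphasized.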

This was last updated in January 2023
