History and evolution of machine learning: A timeline

Call it what you like: AI's offshoot, AI's second banana, AI's sidekick, AI's lesser-known twin. Machine learning lacks the cachet bestowed upon artificial intelligence, yet just about every aspect of our lives and livelihoods is influenced by this "ultimate statistician" and what it hath wrought since it was merely a twinkle in the eyes of logician Walter Pitts and neuroscientist Warren McCulloch eight decades ago. Their mathematical modeling of a neural network in 1943 marks machine learning's consensus birth year.

What is machine learning?

Machine learning is the development and use of computer systems that learn and adapt without following explicit instructions. It uses algorithms and statistical models to analyze patterns in data and yield predictive outcomes.

In some regards, machine learning may well be AI's puppet master. Much of what propels generative AI comes from machine learning in the form of large language models that analyze vast amounts of input data to discover patterns in words and phrases.

Many of AI's unprecedented applications in business and society are supported by machine learning's wide-ranging capabilities, whether it's analyzing mammograms or digesting Instagrams, assessing risks or predicting failures, navigating the roadways or thwarting the cyberattacks we never hear about. Machine learning's omnipresence affects the daily business operations of almost any industry, including e-commerce, manufacturing, finance, insurance services and pharmaceuticals.

Walk along the machine learning timeline

Through the decades after the 1940s, the evolution of machine learning includes some of the more notable developments:

  • Pioneers named Turing, Samuel, McCarthy, Minsky, Edmonds and Newell dotted the machine learning landscape in the 1950s, when the Turing test, the first artificial neural network, and the terms artificial intelligence and machine learning were conceived.
  • The Stanford cart video-controlled remote vehicle, Eliza the first chatbot, Shakey the first mobile intelligent robot, and the foundations of deep learning were developed in the 1960s.
  • Programs that recognize patterns and handwritten characters, solve problems based on natural selection, seek appropriate actions to take, create rules to discard unimportant information, and learn to pronounce words the way a baby does highlighted the 1970s and 1980s.
  • Programs capable of playing backgammon and chess threatened the domains of top-tier backgammon players and the reigning world chess champion in the 1990s.
  • IBM Watson defeated the all-time Jeopardy! champion (do you recognize a pattern here?), and personal assistants, generative adversarial networks, facial recognition, deepfakes, motion sensing, autonomous vehicles, and content and image creation have emerged so far in the 21st century.


Logician Walter Pitts and neuroscientist Warren McCulloch published the first mathematical modeling of a neural network to create algorithms that mimic human thought processes.


Donald Hebb published The Organization of Behavior: A Neuropsychological Theory, a seminal book in machine learning development that related behavior and thought, in terms of brain activity, to neural networks.


Alan Turing published "Computing Machinery and Intelligence," which introduced the Turing test and opened the door to what would be known as AI.


Marvin Minsky and Dean Edmonds developed the first artificial neural network (ANN) called SNARC using 3,000 vacuum tubes to simulate a network of 40 neurons.


Arthur Samuel created the Samuel Checkers-Playing Program, the world's first self-learning program to play games.


John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon coined the term artificial intelligence in a proposal for a workshop widely recognized as a founding event of the AI field.

Allen Newell, Herbert Simon and Cliff Shaw wrote Logic Theorist, the first AI program deliberately engineered to perform automated reasoning.


Frank Rosenblatt developed the perceptron, an early ANN that could learn from data and became the foundation for modern neural networks.
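
To make "learn from data" concrete, here is a minimal sketch of the classic perceptron learning rule in plain Python. It is an illustration only, not a reconstruction of Rosenblatt's original system: the function names, the learning rate, the epoch count and the choice of the logical AND function as training data are all assumptions made for this example.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Fit weights and a bias with the perceptron rule (illustrative sketch)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            error = target - predicted  # -1, 0 or +1
            # Nudge the weights only when the prediction was wrong.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the weighted sum crosses the threshold, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Hypothetical training data: the logical AND of two binary inputs.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_LABELS = [0, 0, 0, 1]

weights, bias = train_perceptron(AND_INPUTS, AND_LABELS)
predictions = [predict(weights, bias, x) for x in AND_INPUTS]
print(predictions)
```

AND is linearly separable, so a single perceptron can learn it. A function such as XOR is not, which is the kind of limitation that later fueled criticism of simple neural networks.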


Arthur Samuel coined the term machine learning in a seminal paper explaining that the computer could be programmed to outplay its programmer.

Oliver Selfridge published "Pandemonium: A Paradigm for Learning," a landmark contribution to machine learning that described a model capable of adaptively improving itself to find patterns in events.


Mechanical engineering graduate student James Adams constructed the Stanford Cart to support his research on the problem of controlling a remote vehicle using video information.


Donald Michie developed a program called MENACE (Matchbox Educable Noughts and Crosses Engine), which learned how to play a perfect game of tic-tac-toe.


Edward Feigenbaum, Bruce G. Buchanan, Joshua Lederberg and Carl Djerassi developed the first expert system, DENDRAL, which assisted organic chemists in identifying unknown organic molecules.


Joseph Weizenbaum created the computer program Eliza, capable of engaging in conversations with humans and making them believe the software has human-like emotions.

Stanford Research Institute developed Shakey, the world's first mobile intelligent robot that combined AI, computer vision, navigation capabilities and natural language processing (NLP). It became known as the grandfather of self-driving cars and drones.

Graphic of the evolution of chatbots from Eliza to Bard.
Natural language processing has come a long way since Eliza's first conversations with humans.


The nearest neighbor algorithm provided computers with the capability for basic pattern recognition and was used by traveling salespeople to plan the most efficient routes via the nearest cities.
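
The route-planning idea can be sketched as a greedy heuristic: from the current city, always travel to the closest city not yet visited. The snippet below is a minimal illustration under assumed inputs; the city names, coordinates and the function name `nearest_neighbor_route` are hypothetical. Note that this heuristic tends to produce a reasonable route but is not guaranteed to find the shortest one.

```python
import math


def nearest_neighbor_route(cities, start):
    """Greedy route: repeatedly visit the closest unvisited city."""
    unvisited = set(cities) - {start}
    route = [start]
    current = start
    while unvisited:
        # Pick the unvisited city with the smallest straight-line distance.
        closest = min(
            unvisited,
            key=lambda name: math.dist(cities[current], cities[name]),
        )
        route.append(closest)
        unvisited.remove(closest)
        current = closest
    return route


# Hypothetical cities placed along a line, given as (x, y) coordinates.
CITIES = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (2, 0)}

route = nearest_neighbor_route(CITIES, "A")
print(route)
```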


Arthur Bryson and Yu-Chi Ho described a backpropagation learning algorithm to enable multilayer ANNs, an advancement over the perceptron and a foundation for deep learning.

Marvin Minsky and Seymour Papert published Perceptrons, which described the limitations of simple neural networks and caused neural network research to decline and symbolic AI research to thrive.


James Lighthill released the report "Artificial Intelligence: A General Survey," which led to the British government significantly reducing support for AI research.


Kunihiko Fukushima released work on neocognitron, a hierarchical, multilayered ANN used for pattern recognition tasks.


Gerald Dejong introduced explanation-based learning in which a computer learned to analyze training data and create a general rule for discarding information deemed unimportant.


Terry Sejnowski created a program called NetTalk, which learned to pronounce words in the way a baby learns.


Yann LeCun, Yoshua Bengio and Patrick Haffner demonstrated how convolutional neural networks (CNNs) can be used to recognize handwritten characters, showing that neural networks could be applied to real-world problems.

Christopher Watkins developed Q-learning, a model-free reinforcement algorithm that sought the best action to take in any current state.
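
As a rough illustration of how Q-learning seeks the best action in each state, the sketch below applies the standard tabular update to a toy problem. Everything about the environment is an assumption made for this example: a five-cell corridor where the agent starts at the left end and earns a reward of 1 for reaching the right end, with hypothetical values for the learning rate (`alpha`), discount factor (`gamma`) and episode count. The agent explores with purely random moves, which works here because Q-learning learns about the greedy policy regardless of how it explores.

```python
import random

N_STATES = 5          # corridor cells 0..4 (assumed toy environment)
GOAL = N_STATES - 1   # reaching the last cell ends the episode
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Apply a move, clamp to the corridor, and report reward and termination."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random exploration policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(len(ACTIONS))
            next_state, reward, done = step(state, ACTIONS[a])
            # Model-free update: no knowledge of step() is used, only samples.
            best_next = 0.0 if done else max(q[next_state])
            q[state][a] += alpha * (reward + gamma * best_next - q[state][a])
            state = next_state
    return q


q_table = q_learning()
# Greedy action per non-terminal state: +1 means "move right".
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:GOAL]]
print(greedy_policy)
```

After training, the greedy policy in this toy corridor is to move right from every cell, since that is the shortest path to the reward.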

Axcelis released Evolver, the first commercially available genetic algorithm software package for personal computers.

Graphic comparing CNN vs. GAN neural networks.
The cornerstone of machine learning, neural networks have differing characteristics.


Gerald Tesauro invented a program capable of playing backgammon called TD-Gammon, based on an ANN and rivaling top-tier backgammon players.


Sepp Hochreiter and Jürgen Schmidhuber proposed the Long Short-Term Memory recurrent neural network, which could process entire sequences of data like speech or video.

IBM's Deep Blue defeated Garry Kasparov in a historic chess rematch, the first defeat of a reigning world chess champion by a computer under tournament conditions.


A team led by Yann LeCun released a data set known as the MNIST (Modified National Institute of Standards and Technology) database, which became widely adopted as a handwriting recognition evaluation benchmark.


University of Montreal researchers published "A Neural Probabilistic Language Model," which suggested a method to model language using feed-forward neural networks.


The first open source machine learning library, Torch, was released, providing interfaces to deep learning algorithms implemented in C.


Psychologist and computer scientist Geoffrey Hinton coined the term deep learning to describe algorithms that help computers recognize different types of objects and text characters in pictures and videos.

Fei-Fei Li started to work on the ImageNet visual database (introduced in 2009), which became a catalyst for the AI boom and the basis of an annual competition for image recognition algorithms.

Netflix launched the Netflix Prize competition with the goal of creating a machine learning algorithm more accurate than Netflix's proprietary user recommendation software.

IBM Watson originated with the initial goal of beating a human on the Jeopardy! quiz show. In 2011, the question-answering computer system defeated the show's all-time (human) champion Ken Jennings.

A rose by any other name ...

The term machine learning may not trigger the same kind of excitement as AI, but ML has been handed some sexy synonyms that rival artificial intelligence -- among them, cybernetic mind, electrical brain and fully adaptive resonance theory. And countless machine learning algorithms that shape ML models and their predictive outcomes run the gamut of the alphabet from Apriori to Z-array.


Microsoft released the Kinect motion-sensing input device for its Xbox 360 gaming console, which could track 20 different human features 30 times per second.

Anthony Goldbloom and Ben Hamner launched Kaggle as a platform for machine learning competitions.


Jürgen Schmidhuber, Dan Claudiu Ciresan, Ueli Meier and Jonathan Masci developed the first CNN to achieve "superhuman" performance by winning the German Traffic Sign Recognition competition.


Geoffrey Hinton, Ilya Sutskever and Alex Krizhevsky introduced a deep CNN architecture that won the ImageNet challenge and triggered the explosion in deep learning research and implementation.


DeepMind introduced deep reinforcement learning, using a CNN that learned based on rewards and played games through repetition, surpassing human expert levels.

Google researcher Tomas Mikolov and colleagues introduced word2vec to identify semantic relationships between words automatically.


Ian Goodfellow and colleagues invented generative adversarial networks, a class of machine learning frameworks used to generate photos, transform images and create deepfakes.

Google unveiled the Sibyl large-scale machine learning project for predictive user recommendations.

Diederik Kingma and Max Welling introduced variational autoencoders to generate images, videos and text.

Facebook developed the deep learning facial recognition system DeepFace, which can identify human faces in digital images with near-human accuracy.

Graphic of machine learning's contributions to AI applications.
AI's business applications are supported by machine learning technologies.


Uber started a self-driving car pilot program in Pittsburgh for a select group of users.


Google researchers developed the concept of transformers in the seminal paper "Attention Is All You Need," inspiring subsequent research into tools that could automatically parse unlabeled text into large language models (LLMs).


OpenAI released GPT (Generative Pre-trained Transformer), paving the way for subsequent LLMs.


Microsoft launched the Turing Natural Language Generation generative language model with 17 billion parameters.

A deep learning algorithm developed by Google AI and Langone Medical Center outperformed radiologists in detecting potential lung cancers.


OpenAI introduced the Dall-E multimodal AI system that can generate images from text prompts.


DeepMind unveiled AlphaTensor "for discovering novel, efficient and provably correct algorithms."

OpenAI released ChatGPT in November to provide a chat-based interface to its GPT-3.5 LLM.


OpenAI announced the GPT-4 multimodal LLM that receives both text and image prompts.

Elon Musk, Steve Wozniak and thousands more signatories urged a six-month pause on training "AI systems more powerful than GPT-4."

Beyond 2023

Machine learning will continue to both ride the coattails of and support advances in its overarching parent, artificial intelligence. Generative AI in the near term, and eventually AI's ultimate goal of artificial general intelligence in the long term, will create even greater demand for data scientists and machine learning (ML) practitioners.

Machine learning will make further inroads into creative AI, distributed enterprises, autonomous systems, hyperautomation and cybersecurity. In the process, business models and job roles could change on a dime.

Expect to see continued advances in the following areas as machine learning becomes more democratized and its models more sophisticated:

  • AutoML for better data management and faster model building.
  • Embedded ML, or TinyML, for more efficient use of edge computing in real-time processing.
  • MLOps for streamlining the development, training and deployment of machine learning systems.
  • Low-code/no-code platforms for developing and implementing ML models without extensive coding or technical expertise.
  • Unsupervised learning for data labeling and feature engineering without human intervention.
  • Reinforcement learning for dishing out rewards or penalties to algorithms based on their actions.
  • NLP for more fluent conversational AI in customer interactions and application development.
  • Computer vision for more effective healthcare diagnostics and greater support for augmented and virtual reality technologies.

In addition, neuromorphic processing shows promise in mimicking human brain cells, enabling computer programs to work simultaneously instead of sequentially.

In the midst of all these developments, business and society will continue to encounter issues with bias, trust, privacy, transparency, accountability, ethics and humanity that can positively or negatively impact our lives and livelihoods.

Editor's note: Linda Tucci and Wesley Chai contributed to the machine learning timeline.
