What is a perceptron?

A perceptron is a simple model of a biological neuron used in an artificial neural network. Frank Rosenblatt introduced the concept in 1957, when he demonstrated how it could be a building block in a single-layer neural network. The perceptron is considered one of the earliest algorithms created for the supervised learning of binary classifiers.

The perceptron algorithm was designed to classify visual inputs, grouping them into one of two categories. The algorithm assumes the data is linearly separable, that is, a single straight line (or, with more than two attributes, a flat plane) can divide it into two categories. This concept can be visualized as a two-dimensional plane with two sets of data points. If those data sets can be separated by a straight line, they are said to be linearly separable. Otherwise, they are nonlinearly separable, as shown in Figure 1.

Figure 1: Plots of linearly separable and nonlinearly separable data.
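
Linear separability can also be checked in code. The following is a minimal sketch, not part of the original algorithm description: the helper name separates, the four sample points, and the AND and XOR labelings are all assumptions chosen for illustration. It tests whether a given line puts every point labeled 1 on one side and every point labeled 0 on the other.

```python
from itertools import product

def separates(w, b, points, labels):
    """Return True if the line w[0]*x + w[1]*y + b = 0 places every
    label-1 point on the positive side and every label-0 point elsewhere."""
    for (x, y), label in zip(points, labels):
        side = w[0] * x + w[1] * y + b
        if (side > 0) != (label == 1):
            return False
    return True

# Four corner points with two illustrative labelings (assumed data).
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]  # logical AND: linearly separable
xor_labels = [0, 1, 1, 0]  # logical XOR: not linearly separable

# One line that separates the AND labeling: x + y - 1.5 = 0.
print(separates((1, 1), -1.5, points, and_labels))  # True

# Try a coarse grid of candidate lines for XOR; none of them works.
grid = [i / 2 for i in range(-8, 9)]
xor_found = any(
    separates((w1, w2), b, points, xor_labels)
    for w1, w2, b in product(grid, repeat=3)
)
print(xor_found)  # False
```

The grid search only samples a finite set of lines, so it illustrates rather than proves the XOR result, although it is well established that no straight line separates that labeling.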

Classification is an important part of machine learning and image processing. Machine learning algorithms use different methods to find and categorize data set patterns. One of those methods is the perceptron algorithm. It performs binary classification by finding the linear separation between the data points received through the perceptron's input.

How does the perceptron work?

The perceptron is an artificial neuron that attempts to approximate a biological neuron. The perceptron takes one or more weighted inputs and returns a single binary output, either 1 or 0. Figure 2 shows the perceptron algorithm, in which x is the input value, w is the input's weight, and b is the specified bias (threshold).

Figure 2: How the perceptron algorithm works.

The perceptron algorithm can be represented in various ways, often using different naming conventions, but the basic concepts are the same. The perceptron is composed of these components:

  • Inputs. The perceptron receives data from one or more inputs. Each input has a numerical value that represents a data attribute.
  • Weights. Each input value is assigned a numerical weight. The weight determines the input's relative strength, as it pertains to the perceptron as a whole. The input value and weight are multiplied together to come up with a weighted value for that input.
  • Summation function. The function adds together the weighted values from all the inputs. For this reason, the summation function is sometimes referred to as a net input function.
  • Bias (threshold). A specific numeric value is assigned to the perceptron to control the output independently of the inputs, resulting in greater flexibility.
  • Activation function. The function performs a calculation on the input sum and bias to determine whether to return a binary 1 or 0. The exact approach will depend on the type of function. A step function is commonly used for the activation stage, although there are multiple other types of activation functions.
  • Output. The binary result of the activation function.
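
The components listed above can be sketched as a single function. This is a minimal illustration, assuming a step activation that returns 1 only when the weighted sum exceeds the bias (threshold); the function name perceptron and the sample values are hypothetical.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted input sum exceeds the bias (threshold), else 0."""
    # Weights: multiply each input value by its weight.
    weighted = [x * w for x, w in zip(inputs, weights)]
    # Summation (net input) function: add the weighted values together.
    z = sum(weighted)
    # Activation function (step) producing the binary output.
    return 1 if z > bias else 0

# Hypothetical two-input example with equal weights and a threshold of 0.75.
print(perceptron([1, 1], [0.5, 0.5], 0.75))  # 1, because 1.0 > 0.75
print(perceptron([1, 0], [0.5, 0.5], 0.75))  # 0, because 0.5 is not > 0.75
```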

To better understand how these components work together, consider a situation in which a decision-maker wants to determine whether to purchase a product based on four attributes. Each attribute represents a binary input, in the form of true or false (1 or 0, respectively). Figure 3 lists the attributes, their input values and their weights.

Figure 3: Calculating a product rating with weighted values for four attributes.

In this scenario, the warranty attribute (the third one in Figure 3) is given the greatest weight and the sale attribute (the last one) is given the least. Not surprisingly, the way in which the inputs are weighted can greatly impact the outcome. The summation function adds the weighted inputs together with this formula:

z = x1w1 + x2w2 + x3w3 + x4w4

The summation function returns a single value, which the activation function then uses. To determine the input sum for the four attributes, plug their input values and weights into the formula:

z = x1w1 + x2w2 + x3w3 + x4w4

z = (1 × 0.7) + (0 × 0.6) + (1 × 0.8) + (1 × 0.3)

z = 0.7 + 0 + 0.8 + 0.3

z = 1.8
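
The same arithmetic can be reproduced in a few lines of Python. This sketch uses the input values and weights from the calculation above; the variable names are assumptions, and the attributes are referred to only by position.

```python
# Input values and weights for the four attributes, in order.
inputs = [1, 0, 1, 1]
weights = [0.7, 0.6, 0.8, 0.3]

# Multiply each input by its weight, then add the products together.
terms = [x * w for x, w in zip(inputs, weights)]
z = sum(terms)

print(terms)        # [0.7, 0.0, 0.8, 0.3]
print(round(z, 1))  # 1.8
```

The result is rounded for display because floating-point addition is not always exact.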

Here, the summation function (z) returns a value of 1.8. This value can be used in the activation function (a), along with the bias (b), which in this case has a value of 2. Activation functions use different logic to calculate the binary output value. This example applies the following logic to the summation and bias:

a = 1 if z > b else 0

The formula states that the input sum must be greater than the bias for the function to return a 1. Otherwise, it returns a 0. You can apply this formula to the example above:

a = 1 if z > b else 0

a = 1 if 1.8 > 2 else 0

a = 0

The input sum does not exceed the threshold, so the perceptron returns a value of 0. This simple example demonstrates the basic concepts of how the perceptron works and applies straightforward logic.
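
To see how the bias controls the outcome, the step logic can be written as a small function and evaluated with more than one threshold. This is a sketch: the name step is an assumption, and the second threshold of 1.5 is a hypothetical value that does not appear in the example.

```python
def step(z, b):
    """Step activation: return 1 if the input sum z exceeds the bias b, else 0."""
    return 1 if z > b else 0

z = 1.8  # input sum from the worked example

print(step(z, 2))    # 0: 1.8 does not exceed the bias of 2
print(step(z, 1.5))  # 1: the same sum exceeds a hypothetical lower bias
```

With the inputs and weights unchanged, lowering the threshold alone flips the decision, which is the flexibility the bias is meant to provide.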

A perceptron can also be trained via a learning algorithm. During training, the perceptron undergoes supervised learning on labeled data in which the categories are already known. Based on how its outputs compare with those labels, the perceptron's weights and bias are adjusted to improve its accuracy.
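
The text does not spell out a specific training procedure, so the following is a sketch of one common choice, the classic perceptron learning rule, rather than a definitive implementation. The function names, the learning rate, the epoch count, and the logical OR training data are all assumptions. Because this article treats the bias as a threshold (a = 1 if z > b), the bias is moved in the opposite direction to the weights.

```python
def predict(x, w, b):
    """Step-activated perceptron: 1 if the weighted sum exceeds threshold b."""
    z = sum(xi * wi for xi, wi in zip(x, w))
    return 1 if z > b else 0

def train(samples, labels, lr=0.1, epochs=20):
    """Adjust weights and threshold whenever a labeled sample is misclassified."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - predict(x, w, b)  # -1, 0, or 1
            # Nudge each weight toward the correct answer.
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            # Lower the threshold when the output should have been 1,
            # raise it when the output should have been 0.
            b -= lr * error
    return w, b

# Assumed training set: logical OR, which is linearly separable.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 1]

w, b = train(samples, labels)
print([predict(x, w, b) for x in samples])  # [0, 1, 1, 1]
```

This rule is only guaranteed to settle on a correct set of weights when the training data is linearly separable.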

A brief history of the perceptron

The idea of an artificial neuron predates Rosenblatt's work. In 1943, neurophysiologist Warren McCulloch and logician Walter Pitts published a seminal paper, "A logical calculus of the ideas immanent in nervous activity," which provided the first mathematical model of a neural network. Their artificial neuron, dubbed the McCulloch-Pitts (MCP) neuron, receives binary inputs and produces a binary output.

Machine learning has had a long, interesting evolution.

When developing the perceptron, Rosenblatt built on the concepts of the MCP neuron, but he enhanced the model to increase its flexibility. He worked on the perceptron in 1957 at Cornell Aeronautical Laboratory, where he received funding from the United States Office of Naval Research. Rosenblatt first ran the algorithm on an IBM 704 computer that weighed five tons and filled an entire room. The project represented the first step toward a machine implementation for image recognition.

The following year, Rosenblatt and his colleagues built their own machine: the Mark I Perceptron. The computer contained an array of 400 photocells connected to perceptrons. The perceptron weights were stored in potentiometers, which electric motors adjusted. The Mark I Perceptron was one of the first artificial neural networks ever created. It now resides at the National Museum of American History, which is part of the Smithsonian Institution.

When the perceptron was introduced, many believed it represented a significant milestone in the push toward AI. However, the technical limitations of the perceptron soon became apparent because single-layer perceptrons can group data only if it is linearly separable. Later, data scientists discovered that by using multilayered perceptrons, they could classify nonlinearly separable data, allowing them to solve problems that single-layer algorithms could not.
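
As an illustration of that last point, the sketch below wires three step-activated units into two layers to compute XOR, a function that no single perceptron can represent. The weights and thresholds are chosen by hand for this example rather than learned, and the names unit and xor are assumptions.

```python
def unit(inputs, weights, bias):
    """A single perceptron: 1 if the weighted sum exceeds the bias, else 0."""
    z = sum(x * w for x, w in zip(inputs, weights))
    return 1 if z > bias else 0

def xor(x1, x2):
    """Two-layer arrangement of perceptrons with hand-picked parameters."""
    # Hidden layer: one unit behaves like OR, the other like AND.
    h_or = unit((x1, x2), (1, 1), 0.5)
    h_and = unit((x1, x2), (1, 1), 1.5)
    # Output layer: fires when OR is on and AND is off.
    return unit((h_or, h_and), (1, -1), 0.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor(a, b))
```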

This was last updated in July 2024
