What is a neural network?

A neural network is a machine learning (ML) model designed to mimic the function and structure of the human brain. Neural networks are intricate networks of interconnected nodes, or neurons, that collaborate to tackle complicated problems.

Also referred to as artificial neural networks (ANNs) or deep neural networks, neural networks represent a type of deep learning technology that's classified under the broader field of artificial intelligence (AI).

Neural networks are widely used in a variety of applications, including image recognition, predictive modeling and natural language processing (NLP). Examples of significant commercial applications since 2000 include handwriting recognition for check processing, speech-to-text transcription, oil exploration data analysis, weather prediction and facial recognition.

How do artificial neural networks work?

An artificial neural network usually involves many processors operating in parallel and arranged in tiers or layers. The first tier -- analogous to optic nerves in human visual processing -- receives the raw input information. Each successive tier receives the output from the tier preceding it rather than the raw input -- the same way neurons further from the optic nerve receive signals from those closer to it. The last tier produces the output of the system.

Each processing node has its own small sphere of knowledge, including what it has seen and any rules it was originally programmed with or developed for itself. The tiers are highly interconnected, which means each node in Tier N is connected to many nodes in Tier N-1 -- its inputs -- and to many nodes in Tier N+1, for which it in turn provides input data. The output layer can contain one or more nodes, from which the answer the network produces can be read.
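
To make the tiered structure concrete, the following minimal sketch in Python with NumPy passes an input through two fully connected tiers. The layer sizes, random weights and sigmoid activation are illustrative assumptions, not details from any particular system.

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Weights connecting each tier to the next: one row per source node,
    # one column per destination node.
    w_hidden = rng.normal(size=(4, 3))   # 4 input nodes -> 3 hidden nodes
    w_output = rng.normal(size=(3, 1))   # 3 hidden nodes -> 1 output node

    x = rng.normal(size=(1, 4))          # raw input received by the first tier

    hidden = sigmoid(x @ w_hidden)       # each hidden node weights its inputs
    output = sigmoid(hidden @ w_output)  # the last tier produces the answer
    print(output)

Each node's output depends only on the weighted outputs of the tier before it, exactly as described above.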

[Figure: Layers of neural networks. Each layer in a neural network consists of small individual neurons.]

Artificial neural networks are noted for being adaptive: they modify themselves as they learn from initial training, and subsequent runs provide more information about the world. The most basic learning model centers on weighting the input streams, which is how each node measures the importance of the data coming from each of its predecessors. Inputs that contribute to getting the right answers are weighted higher.

Applications of artificial neural networks

Image recognition was one of the first areas in which neural networks were successfully applied. But the technology's uses have expanded to many more areas:

  • Chatbots.
  • NLP, translation and language generation.
  • Stock market predictions.
  • Delivery driver route planning and optimization.
  • Drug discovery and development.
  • Social media.
  • Personal assistants.

Prime uses involve any process that operates according to strict rules or patterns and has large amounts of data. If the data involved is too large for a human to make sense of in a reasonable amount of time, the process is likely a prime candidate for automation through artificial neural networks.

How do neural networks learn?

Typically, an ANN is initially trained or fed large amounts of data. Training consists of providing input and telling the network what the output should be. For example, to build a network that identifies the faces of actors, the initial training might be a series of pictures, including actors, non-actors, masks, statues and animal faces. Each input is accompanied by matching identification, such as actors' names or "not actor" or "not human" information. Providing the answers allows the model to adjust its internal weightings to do its job better.

For example, if nodes David, Dianne and Dakota tell node Ernie the current input image is a picture of Brad Pitt, but node Durango says it's George Clooney, and the training program confirms it's Pitt, Ernie decreases the weight it assigns to Durango's input and increases the weight it gives to David, Dianne and Dakota.
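
In gradient-based training, one of the principles discussed below, that adjustment falls out of the error between the node's output and the confirmed answer. The following is a hedged sketch of that idea for a single node; the input values, squared-error loss and learning rate are all illustrative assumptions.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Signals from David, Dianne, Dakota and Durango; positive values vote
    # "Pitt", the negative value votes against (Durango says Clooney).
    inputs = np.array([0.9, 0.8, 0.85, -0.7])
    weights = np.array([0.5, 0.5, 0.5, 0.5])
    target = 1.0        # the training program confirms the picture is Pitt
    learning_rate = 0.5

    for _ in range(10):
        prediction = sigmoid(inputs @ weights)
        error = prediction - target
        # Gradient of the squared error with respect to each weight:
        # inputs that pushed toward the right answer gain weight,
        # the dissenting input loses it.
        weights -= learning_rate * error * prediction * (1 - prediction) * inputs

    print(weights)      # Durango's weight ends up lower than the others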

In defining the rules and making determinations -- the decisions of each node on what to send to the next tier based on inputs from the previous tier -- neural networks use several principles. These include gradient-based training, fuzzy logic, genetic algorithms and Bayesian methods. They might be given some basic rules about object relationships in the data being modeled.

For example, a facial recognition system might be instructed, "Eyebrows are found above eyes," or, "Moustaches are below a nose. Moustaches are above and/or beside a mouth." Preloading rules can speed up training and produce a more powerful model sooner. But it also builds in assumptions about the nature of the problem, which could prove either irrelevant and unhelpful or incorrect and counterproductive, making the decision about what rules, if any, to build in an important one.

Further, the assumptions people make when training algorithms can cause neural networks to amplify cultural biases. Biased data sets are an ongoing challenge in training systems that find answers on their own through pattern recognition in data. If the data feeding the algorithm isn't neutral -- and almost no data is -- the machine propagates bias.

[Figure: Bias in AI systems. The problem of biased data sets arises in the training of neural systems.]

Types of neural networks

Neural networks are sometimes described in terms of their depth, meaning how many layers sit between input and output -- the model's so-called hidden layers. This is why the term neural network is used almost synonymously with deep learning. They can also be described by the number of hidden nodes the model has, or by how many inputs and outputs each node has. Variations on the classic neural network design enable various forms of forward and backward propagation of information among tiers.

Specific types of artificial neural networks include the following:

Feed-forward neural networks

One of the simplest variants of neural networks, these pass information in one direction, through various input nodes, until it reaches the output node. The network might or might not have hidden node layers, which makes its functioning more interpretable. Feed-forward networks are also well equipped to handle large amounts of noise in the data.

This type of ANN computational model is used in technologies such as facial recognition and computer vision.

Recurrent neural networks (RNNs)

More complex in nature, RNNs save the output of their processing nodes and feed the result back into the model. This is how the model learns to predict the outcome of a layer. Each node in the RNN model acts as a memory cell, carrying forward what it has computed as operations continue.

This neural network starts with the same forward propagation as a feed-forward network but then goes on to remember all processed information so it can reuse it in the future. If the network's prediction is incorrect, the system self-corrects during backpropagation and continues working toward the correct prediction.
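
As a minimal sketch of that feedback loop, the following assumes a plain RNN cell (not an LSTM) with random weights and illustrative sizes; the node's previous state is fed back in alongside each new input.

    import numpy as np

    rng = np.random.default_rng(1)

    w_input = rng.normal(size=(3, 5))    # input features -> hidden state
    w_state = rng.normal(size=(5, 5))    # previous state -> hidden state (the feedback)

    state = np.zeros(5)                  # the "memory cell" starts empty
    sequence = rng.normal(size=(4, 3))   # four time steps of 3-feature input

    for x_t in sequence:
        # The same weights are reused at every step; the state carries
        # previously processed information forward for reuse.
        state = np.tanh(x_t @ w_input + state @ w_state)

    print(state)                         # a summary of the whole sequence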

This type of ANN is frequently used in text-to-speech conversions.

Convolutional neural networks (CNNs)

CNNs are among the most popular models in use today. This computational model uses a variation of multilayer perceptrons and contains one or more convolutional layers that can be either entirely connected or pooled. These convolutional layers create feature maps that record a region of the image, which is ultimately broken into rectangles and sent out for nonlinear processing.
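
The following hedged sketch shows what one such feature map looks like: a small filter slides over each rectangular region of an image and a nonlinearity is applied to the result. The 6x6 image and the 3x3 vertical-edge filter are illustrative assumptions.

    import numpy as np

    image = np.zeros((6, 6))
    image[:, 3:] = 1.0                      # right half bright: a vertical edge

    kernel = np.array([[-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0]])   # responds to vertical edges

    h = image.shape[0] - kernel.shape[0] + 1
    w = image.shape[1] - kernel.shape[1] + 1
    feature_map = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            region = image[i:i + 3, j:j + 3]           # one rectangle of the image
            feature_map[i, j] = np.sum(region * kernel)

    print(np.maximum(feature_map, 0))       # nonlinear processing (ReLU) of the map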

The CNN model is particularly popular in the realm of image recognition. It has been used in many of the most advanced applications of AI, including facial recognition, text digitization and NLP. Other use cases include paraphrase detection, signal processing and image classification.

Deconvolutional neural networks

Deconvolutional neural networks use a reversed CNN model process. They try to find lost features or signals that might have originally been considered unimportant to the CNN system's task. This network model can be used in image synthesis and analysis.
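
A hedged sketch of that reversal, assuming a simple transposed convolution: each value in a small feature map paints a kernel-sized patch back onto a larger output, which is one common way spatial detail is reconstructed for image synthesis.

    import numpy as np

    feature_map = np.array([[1.0, 0.0],
                            [0.0, 2.0]])
    kernel = np.ones((3, 3))                # illustrative 3x3 kernel

    out = np.zeros((4, 4))                  # output grows: 2 + 3 - 1 = 4 per side
    for i in range(feature_map.shape[0]):
        for j in range(feature_map.shape[1]):
            # Each feature value spreads back out through the kernel.
            out[i:i + 3, j:j + 3] += feature_map[i, j] * kernel

    print(out)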

Modular neural networks

These contain multiple neural networks working separately from one another. The networks don't communicate or interfere with each other's activities during the computation process. Consequently, complex or big computational processes can be performed more efficiently.
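
As a sketch of that separation, the two sub-networks below have private weights, never exchange information during computation, and only their final outputs are combined. The module sizes and the simple averaging step are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def make_module(n_in, n_hidden):
        # Each module is an independent small feed-forward network.
        w1 = rng.normal(size=(n_in, n_hidden))
        w2 = rng.normal(size=(n_hidden, 1))
        return lambda x: sigmoid(sigmoid(x @ w1) @ w2)

    module_a = make_module(4, 8)
    module_b = make_module(4, 3)   # different internal structure, same task

    x = rng.normal(size=(1, 4))
    # The modules never communicate; only their outputs meet at the end.
    prediction = (module_a(x) + module_b(x)) / 2.0
    print(prediction)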

Advantages of artificial neural networks

Artificial neural networks offer the following benefits:

  • Parallel processing abilities. ANNs have parallel processing abilities, which means the network can perform more than one job at a time.
  • Information storage. ANNs store information on the entire network, not just in a database. This ensures that even if a small amount of data disappears from one location, the entire network continues to operate.
  • Non-linearity. The ability to learn and model nonlinear, complex relationships helps model the real-world relationships between input and output.
  • Fault tolerance. ANNs come with fault tolerance, which means the corruption or fault of one or more cells of the ANN won't stop the generation of output.
  • Gradual corruption. This means the network slowly degrades over time instead of degrading instantly when a problem occurs.
  • Unrestricted input variables. No restrictions are placed on the input variables, such as how they should be distributed.
  • Observation-based decisions. Machine learning means the ANN can learn from events and make decisions based on its observations.
  • Unorganized data processing. Artificial neural networks are exceptionally good at organizing large amounts of data by processing, sorting and categorizing it.
  • Ability to learn hidden relationships. ANNs can learn the hidden relationships in data without being given any fixed relationship in advance. This means ANNs can better model highly volatile data and non-constant variance.
  • Ability to generalize data. The ability to generalize and infer unseen relationships on unseen data means ANNs can predict the output of unseen data.

Disadvantages of artificial neural networks

Along with their numerous benefits, neural networks also have some drawbacks, including the following:

  • Lack of rules. The lack of rules for determining the proper network structure means the appropriate artificial neural network architecture can only be found through trial, error and experience.
  • Hardware dependency. The requirement of processors with parallel processing abilities makes neural networks dependent on hardware.
  • Numerical translation. The network works with numerical information, meaning all problems must be translated into numerical values before they can be presented to the ANN.
  • Lack of trust. The lack of explanation behind the solutions the network produces is one of the biggest disadvantages of ANNs. The inability to explain the why or how behind a solution generates a lack of trust in the network.
  • Inaccurate results. If not trained properly, ANNs can often produce incomplete or inaccurate results.
  • Black box nature. Because of their black box AI model, it can be challenging to grasp how neural networks make their predictions or categorize data.

History and timeline of neural networks

The history of neural networks spans several decades and has seen considerable advancements. The following examines the important milestones and developments in the history of neural networks:

  • 1940s. In 1943, mathematicians Warren McCulloch and Walter Pitts built a circuitry system that ran simple algorithms and was intended to approximate the functioning of the human brain.
  • 1950s. In 1958, Frank Rosenblatt, an American psychologist who's also considered the father of deep learning, created the perceptron, a form of artificial neural network capable of learning and making judgments by modifying its weights. The perceptron featured a single layer of computing units and could handle problems that were linearly separable.
  • 1970s. Paul Werbos, an American scientist, developed the backpropagation method, which facilitated the training of multilayer neural networks. It made deep learning possible by enabling weights to be adjusted across the network based on the error calculated at the output layer.
  • 1980s. Cognitive psychologist and computer scientist Geoffrey Hinton, along with computer scientist Yann LeCun, and a group of fellow researchers began investigating the concept of connectionism, which emphasizes the idea that cognitive processes emerge through interconnected networks of simple processing units. This period paved the way for modern neural networks and deep learning.
  • 1990s. Jürgen Schmidhuber and Sepp Hochreiter, both computer scientists from Germany, proposed the Long Short-Term Memory (LSTM) recurrent neural network framework in 1997.
  • 2000s. Geoffrey Hinton and his colleagues pioneered restricted Boltzmann machines (RBMs), a sort of generative artificial neural network that enables unsupervised learning. RBMs opened the path for deep belief networks and deep learning algorithms.

It wasn't until around 2010 that research in neural networks picked up great speed. The big data trend, in which companies amass vast troves of data, and advances in parallel computing gave data scientists the training data and computing resources needed to run complex artificial neural networks. In 2012, a neural network named AlexNet won the ImageNet Large Scale Visual Recognition Challenge, an image classification competition. Since then, interest in artificial neural networks has soared and the technology has continued to improve.
