
normal distribution

What is normal distribution?

A normal distribution is a type of continuous probability distribution in which most data points cluster toward the middle of the range, while the rest taper off symmetrically toward either extreme. The middle of the range is also known as the mean of the distribution.

The normal distribution is also known as a Gaussian distribution or probability bell curve. It is symmetric about the mean and indicates that values near the mean occur more frequently than the values that are farther away from the mean.

Normal distribution explained

Graphically, a normal distribution is a bell curve because of its flared shape. The precise shape can vary according to the distribution of the values within the population. The population is the entire set of data points that are part of the distribution.

Regardless of its exact shape, a normal distribution bell curve is always symmetrical about the mean. A symmetrical distribution means that a vertical dividing line drawn through the maximum/mean value will produce two mirror images on either side of the line, in which half the population is less than the mean and half is greater. However, the reverse is not always true; that is, not all symmetrical distributions are normal. In the bell curve, the peak is always in the middle, and the mean, mode and median are all the same.

Figure: A normal distribution bell curve is always symmetrical about the mean.

Basic examples of normal distribution: Height and weight

Height is one simple example of a value that approximately follows a normal distribution. Most people are close to average height -- whatever that may be for a given population. If the heights of everyone in the population are plotted on a graph, the resulting distribution will be approximately normal: people of near-average height cluster around the middle, while those who are much taller or much shorter lie farther away.

Further, these latter groups contain fewer people, and the number of people who are extremely tall or extremely short is smaller still, so they lie the farthest away from the mean.

Similarly, weight can be approximated by a normal distribution centered on the average weight of the population under consideration. As with height, the weight outliers are those who weigh much more or much less than the average. The bigger the deviation from the average, the farther away those data points will be on the distribution graph.

Importance of normal distribution

The normal distribution is one of the most important probability distributions for independent random variables for three main reasons.

First, the normal distribution closely describes the distribution of values for many natural phenomena in a wide range of areas, including biology, physical science, mathematics, finance and economics.

In addition to height and weight, normal distributions are also used to represent many other values, including the following:

  • measurement error
  • blood pressure
  • IQ scores
  • asset prices
  • price action

Second, the normal distribution is important because it can be used to approximate other types of probability distribution, such as binomial, hypergeometric, inverse (or negative) hypergeometric, negative binomial and Poisson distribution.
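As an illustration of this second point, the sketch below compares an exact binomial probability with its normal approximation. The function names and the values of n, p and k are chosen only for this example, and the continuity correction (adding 0.5) is a common convention rather than something the article prescribes.

```python
import math
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Normal approximation to P(X <= k), using a continuity correction."""
    mu = n * p                          # mean of the binomial
    sigma = math.sqrt(n * p * (1 - p))  # standard deviation of the binomial
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Illustrative values only (assumed for this example).
n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact={exact:.4f}  normal approximation={approx:.4f}")
```

The approximation tends to work best when n is large and p is not too close to 0 or 1.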

Third, the normal distribution is central to the central limit theorem, or CLT, which states that averages calculated from independent, identically distributed random variables have approximately normal distributions when the sample size is sufficiently large. This is true regardless of the type of distribution from which the variables are sampled, as long as it has finite variance.
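The CLT can be checked with a small simulation. The sketch below draws repeated samples from a uniform distribution (which is flat, not bell-shaped) and looks at how the sample averages behave. The sample size, number of samples and random seed are arbitrary assumptions made for the example.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30    # observations averaged in each sample (assumed)
NUM_SAMPLES = 5000  # number of sample averages collected (assumed)

# Each entry is the average of SAMPLE_SIZE draws from Uniform(0, 1).
sample_means = [
    statistics.fmean(random.random() for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

# Uniform(0, 1) has mean 0.5 and variance 1/12, so the CLT predicts
# a spread of sqrt(1/12) / sqrt(SAMPLE_SIZE) for the averages.
expected_spread = math.sqrt(1 / 12) / math.sqrt(SAMPLE_SIZE)

# Share of averages within one standard deviation of the center;
# for a normal distribution this is roughly 68%.
within_one = sum(abs(m - center) <= spread for m in sample_means) / NUM_SAMPLES

print(f"center={center:.4f} spread={spread:.4f} expected={expected_spread:.4f}")
print(f"share within 1 standard deviation: {within_one:.3f}")
```

Although the individual draws are uniform, the averages cluster around 0.5 in a way that closely matches a normal distribution.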

Normal distribution formula and empirical rule

The formula for the normal distribution is expressed below.

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

Here, x is the value of the variable; f(x) represents the probability density function; μ (mu) is the mean; and σ (sigma) is the standard deviation.
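The density can be written directly in code. The following is a minimal sketch using the standard form of the normal density; the function name normal_pdf and its default arguments are assumptions for this example, and the result is compared against Python's built-in statistics.NormalDist as a sanity check.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density f(x) = exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2 * pi))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Compare the hand-written formula with the standard library at one point.
x, mu, sigma = 1.5, 1.0, 2.0  # illustrative values
print(normal_pdf(x, mu, sigma), NormalDist(mu, sigma).pdf(x))
```

Note that f(x) is a density, not a probability; probabilities come from the area under the curve over a range of x values.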

The empirical rule for normal distributions describes where most of the data in a normal distribution will appear, and it states the following:

  • 68.3% of the observations will appear within +/-1 standard deviation of the mean;
  • 95.4% of the observations will fall within +/-2 standard deviations; and
  • 99.7% of the observations will fall within +/-3 standard deviations.

Data points that fall more than three standard deviations (3σ) from the mean are rare occurrences, accounting for roughly 0.3% of observations.
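These percentages can be reproduced from the normal distribution itself. For a normal variable, the share of values within k standard deviations of the mean equals erf(k / √2), where erf is the error function. The sketch below uses that identity; the helper name fraction_within is an assumption for this example.

```python
import math


def fraction_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} standard deviations: {fraction_within(k):.2%}")
```

The result does not depend on the particular mean or standard deviation, which is why the rule applies to every normal distribution.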

Parameters of normal distribution

Since the mean, mode and median are the same in a normal distribution, there's no need to calculate them separately. These values mark the distribution's highest point, or the peak. All other values in the distribution then fall symmetrically around the mean. The width of the curve is defined by the standard deviation.

In fact, only two parameters are required to describe a normal distribution: the mean and the standard deviation.

1. The mean

The mean marks the center of the bell curve, where the curve reaches its highest point. All other values in the distribution either cluster around it or are at some distance away from it. Changing the mean on a graph will shift the entire curve along the x-axis, either toward the left or toward the right. However, its symmetry will still be maintained.

2. The standard deviation

In general, standard deviation is a measure of variability in a distribution. In a bell curve, it defines the width of the distribution and shows how far away from the mean the other values fall. In addition, it represents the typical distance between the average and the observations.

Changing the standard deviation will change the distribution of values around the mean. A smaller deviation will reduce the spread -- tightening the distribution -- while a larger deviation will increase the spread and produce a wider distribution. As the distribution gets wider, it becomes more likely that values will be farther away from the mean.
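The effect of the two parameters can be seen with Python's statistics.NormalDist. In this sketch, the three distributions and the helper prob_beyond are illustrative assumptions: one curve is a baseline, one has a larger standard deviation, and one has a shifted mean.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0, sigma=1)   # baseline curve
wide = NormalDist(mu=0, sigma=2)     # same mean, larger standard deviation
shifted = NormalDist(mu=3, sigma=1)  # same spread, mean moved to the right


def prob_beyond(dist: NormalDist, distance: float) -> float:
    """P(|X - mean| > distance): chance of landing far from the mean."""
    return 2 * (1 - dist.cdf(dist.mean + distance))


# Shifting the mean moves the peak but does not change its height.
print("peak heights:", narrow.pdf(0), shifted.pdf(3))

# A larger standard deviation flattens the peak and fattens the spread.
print("wide peak height:", wide.pdf(0))
print("P(more than 2 units from mean):", prob_beyond(narrow, 2), prob_beyond(wide, 2))
```

The wider distribution assigns a noticeably higher probability to values more than 2 units from the mean, which matches the description above.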

Skewness and kurtosis in a normal distribution

Skewness measures a distribution's degree of asymmetry. Since the normal distribution is perfectly symmetric, it has a skewness of zero. In other distributions with a skewness less than or greater than zero, the left tail (left skewness) or the right tail (right skewness) will be longer, respectively.

Kurtosis measures the thickness of each tail end of a distribution vis-à-vis the tails of a normal distribution. For a normal distribution, kurtosis is always equal to 3. In a distribution with kurtosis greater than 3, the tail data will exceed the tails of the normal distribution, resulting in a phenomenon called fat tails. In financial markets, fat tails describe tail risk -- the chance of a loss due to some rare event. Distributions with kurtosis less than 3 show tails that are skinnier than the tails of a normal distribution.
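Both measures can be estimated from data as the average third and fourth powers of the standardized values. The sketch below applies these moment-based definitions to simulated normal data; the function names, sample size and seed are assumptions for the example, and other estimators (for instance, bias-corrected ones) exist.

```python
import random
import statistics


def skewness(data: list[float]) -> float:
    """Average cubed standardized value (population form)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return statistics.fmean(((x - mean) / sd) ** 3 for x in data)


def kurtosis(data: list[float]) -> float:
    """Average fourth power of the standardized value (not excess kurtosis)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return statistics.fmean(((x - mean) / sd) ** 4 for x in data)


random.seed(0)  # fixed seed so the run is repeatable
data = [random.gauss(0, 1) for _ in range(50_000)]

print(f"skewness ~ {skewness(data):.3f}, kurtosis ~ {kurtosis(data):.3f}")
```

For normally distributed data, the estimates should land close to 0 and 3. Some tools report excess kurtosis instead, which subtracts 3 so that a normal distribution scores 0.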

See also: statistical analysis, histogram, dependent variable, data, data scientist, big data, data classification, data mining, data context and time-series analysis in IT environments.

This was last updated in December 2022
