Zipf's Law

Zipf’s Law is a statistical regularity observed in certain data sets, such as the words in a linguistic corpus, in which the frequency of an item is inversely proportional to its rank. It is named for linguist George Kingsley Zipf, who around 1935 was the first to draw attention to the phenomenon. Applied to natural language, the law says that the most common word occurs about twice as often as the second most frequent word, three times as often as the third most frequent word and so on down to the least frequent word. In other words, the word at rank n appears about 1/n times as often as the most frequent one.
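
The 1/n relationship can be illustrated with a few lines of Python. This is a minimal sketch that assumes an ideal Zipfian distribution with an exponent of exactly 1; the function name zipf_expected_counts and the top count of 6,000 are made up for illustration and do not come from any real corpus.

```python
def zipf_expected_counts(top_count, n_ranks):
    """Expected count at ranks 1..n_ranks under an ideal Zipf's Law.

    Assumes the count at rank n is exactly top_count / n, which real
    text only approximates.
    """
    return [top_count / rank for rank in range(1, n_ranks + 1)]


# Hypothetical corpus in which the most frequent word occurs 6,000 times.
for rank, count in enumerate(zipf_expected_counts(6000, 4), start=1):
    print(rank, count)
```

Under these assumptions the second-ranked word is expected about 3,000 times, the third about 2,000 times and the fourth about 1,500 times.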

When words are ranked by frequency in a large enough collection of texts and frequency is plotted against rank, the result is a steeply falling curve in which frequency is proportional to 1/rank. (If both axes use a logarithmic scale, the result is approximately a straight line.)
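
One way to check the straight-line behavior is to rank word counts and fit a line to log(frequency) against log(rank). The sketch below is illustrative only: the helper names rank_frequency and loglog_slope are hypothetical, the tokenizer simply splits on whitespace and ignores punctuation handling, and the least-squares fit is a rough diagnostic rather than a rigorous statistical test.

```python
import math
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted from most to least frequent.

    Assumption: words are lowercased and split on whitespace only.
    """
    counts = Counter(text.lower().split())
    return sorted(counts.values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) versus log(rank).

    Needs at least two frequencies. A slope near -1 is what an ideal
    Zipfian distribution would give.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic, perfectly Zipfian counts (not real data): count = 1200 / rank.
ideal = [1200 / rank for rank in range(1, 51)]
print(round(loglog_slope(ideal), 3))
```

For real text, rank_frequency(text) would supply the counts. A short passage will usually deviate noticeably from a slope of -1, since the pattern emerges only in large collections.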

The most common word in English is “the,” which appears about one-tenth of the time in a typical text; the next most common word (rank 2) is “of,” which appears about one-twentieth of the time. In this type of distribution, frequency declines sharply as the rank number increases, so a small number of items appear very often, and a large number rarely occur.

A Zipfian distribution of words is universal in natural language: It can be found in the speech of children less than 32 months old as well as in the specialized vocabulary of university textbooks. Studies show that this phenomenon also applies in nearly every language.

Neither syntax nor semantics is sufficient on its own to induce a Zipfian distribution. Working together, however, the two produce one.

Only recently has Zipf’s Law been tested rigorously on databases large enough to ensure statistical validity. Researchers at the Centre de Recerca Matemàtica, part of the Government of Catalonia's CERCA network, who are attached to the Universitat Autònoma de Barcelona Department of Mathematics, analyzed the full collection of English-language texts in Project Gutenberg, a free database with more than 30,000 works. When the rarest words were left out, Zipf’s Law applied to more than half of the words.

The law can be applied to fields other than literature. Zipfian distributions have been found in the population ranks of cities in various countries, corporation sizes, income rankings and rankings of TV channels by audience size.

This was last updated in January 2018
