What is tokenization?

Tokenization is the process of replacing sensitive data with unique identification symbols that retain all the essential information about the data without compromising its security. Tokenization, which seeks to minimize the amount of sensitive data a business needs to keep on hand, has become a popular way for small and midsize businesses to bolster the security of credit card and e-commerce transactions while minimizing the cost and complexity of compliance with industry standards and government regulations.

Examples of tokenization

Tokenization technology can, in theory, be used with sensitive data of all kinds, including bank transactions, medical records, criminal records, vehicle driver information, loan applications, stock trading and voter registration. For the most part, any system in which surrogate, nonsensitive information can act as a stand-in for sensitive information can benefit from tokenization.

Tokenization is often used to protect credit card data, bank account information and other sensitive data handled by payment processors. Payment processing use cases that tokenize sensitive credit card information include the following:

  • mobile wallets, such as Google Pay and Apple Pay;
  • e-commerce sites; and
  • businesses that keep customers' cards on file.

How tokenization works

Tokenization substitutes sensitive information with equivalent nonsensitive information. The nonsensitive, replacement information is called a token.

Tokens can be created in the following ways:

  • using a mathematically reversible cryptographic function with a key;
  • using a nonreversible function, such as a hash function; or
  • using an index function or randomly generated number.

As a result, the token becomes the exposed information, and the sensitive information that the token stands in for is stored safely in a centralized server known as a token vault. The token vault is the only place where the original information can be mapped back to its corresponding token.
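Two of the token-creation options listed above can be sketched with the Python standard library. This is an illustration under stated assumptions, not a production design: the function names are hypothetical, and the hash variant uses a keyed hash (HMAC-SHA256) instead of a bare hash function, because card numbers are drawn from a small enough space that an unkeyed hash could be reversed by enumeration. The reversible keyed option is left out here because the standard library does not ship a cipher.

```python
import hashlib
import hmac
import secrets


def hash_token(value: str, key: bytes) -> str:
    """Nonreversible token: a keyed hash (HMAC-SHA256) of the input value."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def random_token(num_bytes: int = 16) -> str:
    """Random token: carries no information about the input at all."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # hypothetical secret held by the tokenization service
    pan = "4111111111111111"       # commonly used test card number, not a real account
    print(hash_token(pan, key))    # same value and key always give the same token
    print(random_token())          # differs on every call
```

A random token can only be resolved through a stored mapping, whereas a hash-based token can be recomputed from the original value but never inverted.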

Here is one real-world example of how tokenization with a token vault works.

  • A customer provides their payment details at a point-of-sale (POS) system or online checkout form.
  • The details, or data, are substituted with a randomly generated token, which is generated in most cases by the merchant's payment gateway.
  • The tokenized information is then encrypted and sent to a payment processor. The original sensitive payment information is stored in a token vault in the merchant's payment gateway. This is the only place where the token can be mapped to the information it represents.
  • The tokenized information is encrypted again by the payment processor before being sent for final verification.
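The vault part of the flow above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any particular payment gateway implements its vault: the class name TokenVault and its methods are hypothetical, a real vault would be a hardened and access-controlled data store, and the encryption applied in transit is omitted.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a token vault (hypothetical, for illustration only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so that one card maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_urlsafe(16)
        while token in self._token_to_value:  # collisions are extremely unlikely
            token = secrets.token_urlsafe(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the original value.
        return self._token_to_value[token]


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("4111111111111111")  # commonly used test card number
    print(token)                                # safe to pass downstream
    print(vault.detokenize(token))              # requires access to the vault
```

Because the token is random, anyone who intercepts it learns nothing about the card number unless they can also query the vault.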

On the other hand, some tokenization is vaultless. Instead of storing the sensitive information in a secure database, vaultless tokenization derives each token with an algorithm. If the token is reversible, the original sensitive information can be recovered by running the algorithm in reverse with the right key, so it is generally not stored in a vault.
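As a rough sketch of the vaultless idea, the following Python code derives a 16-digit token from a 16-digit number with a keyed, reversible function and stores nothing. It is a toy Feistel construction chosen for illustration and assumed here, not a vetted algorithm; real deployments would rely on a standardized format-preserving encryption mode. The function names, the round count and the key are all hypothetical.

```python
import hashlib
import hmac

HALF = 10 ** 8  # each half of a 16-digit value holds 8 decimal digits
ROUNDS = 8      # arbitrary round count for this toy example


def _round_value(key: bytes, round_no: int, half: int) -> int:
    """Keyed pseudorandom value in [0, HALF) derived from one half."""
    msg = f"{round_no}:{half:08d}".encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % HALF


def _check(value: str) -> None:
    if len(value) != 16 or not (value.isascii() and value.isdigit()):
        raise ValueError("expected exactly 16 decimal digits")


def tokenize(pan: str, key: bytes) -> str:
    """Map a 16-digit string to a 16-digit token without storing anything."""
    _check(pan)
    left, right = int(pan[:8]), int(pan[8:])
    for r in range(ROUNDS):
        left, right = right, (left + _round_value(key, r, right)) % HALF
    return f"{left:08d}{right:08d}"


def detokenize(token: str, key: bytes) -> str:
    """Undo tokenize() by running the rounds in reverse order."""
    _check(token)
    left, right = int(token[:8]), int(token[8:])
    for r in reversed(range(ROUNDS)):
        left, right = (right - _round_value(key, r, left)) % HALF, left
    return f"{left:08d}{right:08d}"
```

The trade-off relative to a vault is that security rests entirely on the key: whoever holds it can recover every original value, so key management takes the place of vault protection.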

Tokenization and PCI DSS

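Payment card industry (PCI) standards tightly restrict how retailers may store credit card numbers on POS terminals or in their databases after customer transactions: stored card numbers must be rendered unreadable, and sensitive authentication data, such as the card verification code, may not be kept after authorization.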

To be PCI compliant, merchants must either install expensive, end-to-end encryption systems or outsource their payment processing to a service provider who offers a tokenization option. The service provider handles the issuance of the token's value and bears the responsibility for keeping the cardholder data secure.

In such a scenario, the service provider issues the merchant a driver for the POS system that converts credit card numbers into randomly generated values (tokens). Since the token is not a primary account number (PAN), it can't be used outside the context of a unique transaction with a specific merchant.

In a credit card transaction, for instance, the token typically contains only the last four digits of the actual card number. The rest of the token consists of alphanumeric characters that represent cardholder information and data specific to the transaction underway.
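The token layout described above can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the leading characters here are purely random rather than an encoding of cardholder or transaction data, the function name is hypothetical, and the mapping back to the card number would still have to be kept by the service provider.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def make_display_token(pan: str) -> str:
    """Build a token of the same length as the card number that keeps its last four digits."""
    digits = pan.replace(" ", "").replace("-", "")
    if len(digits) < 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a card number of at least 12 digits")
    prefix = "".join(secrets.choice(ALPHABET) for _ in range(len(digits) - 4))
    return prefix + digits[-4:]


if __name__ == "__main__":
    # Commonly used test card number, not a real account.
    print(make_display_token("4111 1111 1111 1111"))
```

Keeping the last four digits lets receipts and customer service screens identify the card without exposing the full number.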

Benefits of tokenization

Tokenization makes it more difficult for hackers to gain access to cardholder data, as compared with older systems in which credit card numbers were stored in databases and exchanged freely over networks.

The main benefits of tokenization include the following:

  • It is more compatible with legacy systems than encryption.
  • It is a less resource-intensive process than encryption.
  • It reduces the fallout risks in a data breach.
  • It makes the payment industry more convenient by propelling new technologies like mobile wallets, one-click payment and cryptocurrency. This, in turn, enhances customer trust because it improves both the security and convenience of a merchant's service.
  • It reduces the steps involved in complying with PCI DSS requirements for merchants.

History of tokenization

Tokenization has existed since the earliest currency systems, with coin tokens long being used as a replacement for actual coins and banknotes. Subway tokens and casino tokens are examples of this, as they serve as substitutes for actual money. This is physical tokenization, but the concept is the same as in digital tokenization: to act as a surrogate for a more valuable asset.

Digital tokenization saw use as early as the 1970s. In the databases of the time, it was used to separate certain sensitive data from other data being stored.

More recently, tokenization was used in the payment card industry as a way to protect sensitive cardholder data and comply with industry standards. The organization TrustCommerce is credited with creating the concept of tokenization to protect payment card data in 2001.

Types of tokens

Numerous ways to classify tokens exist.

Three main types of tokens -- as defined by the Securities and Exchange Commission and the Swiss Financial Market Supervisory Authority -- differ based on their relationship to the real-world asset they represent. These include the following:

  • Asset/security token. These are tokens that promise a positive return on an investment. These are economically analogous to bonds and equities.
  • Utility token. These are created to act as something other than a means of payment. For example, a utility token may give direct access to a product or platform or provide a discount on future goods and services offered by the platform. It adds value to the function of a product.
  • Currency/payment token. These are created solely as a means of payment for goods and services external to the platform they exist on.

In a payment context, there is also an important difference between high- and low-value tokens. A high-value token acts as a direct surrogate for a PAN in a transaction and can complete the transaction itself. Low-value tokens (LVTs) also act as stand-ins for PANs but cannot complete transactions. Instead, LVTs must map back to the actual PANs.

Tokenization vs. encryption

Digital tokenization and encryption are two different methods used for data security. A key practical difference between the two is that tokenization can preserve the length and type of the data being protected, whereas encryption typically changes both length and data type.

Encryption makes the protected data unreadable to anyone without a key, even when they can see the encrypted message. Vault-based tokenization does not use a key in this way -- a randomly generated token is not mathematically reversible with a decryption key and can be mapped back to the original only through the token vault. Encryption, by contrast, is decryptable with a key.

Encryption has long been the preferred method of data security. But there has been a recent shift to tokenization as the more cost-effective and secure option. Encryption and tokenization are often used in tandem, however.

[Figure: chart showing how blockchain works. Blockchain relies on tokenization, with blockchain tokens digitally representing real-world assets.]

Tokenization and blockchain

Tokenization in blockchain refers to the issuance of a blockchain token, also known as a security or asset token. Blockchain tokens are digital representations of real-world assets. A real-world asset is tokenized when it is represented digitally as cryptocurrency.

In traditional, centralized economic models, large financial institutions and banks are responsible for certifying the integrity of the transaction ledger. In a blockchain-based economy or token economy, this responsibility and power shifts to individuals, as the integrity of transactions is verified using cryptography on an individual level instead of a centralized one.

This is possible because the cryptocurrency tokens are recorded in a blockchain, a linked series of transaction records, which enables the digital asset to be mapped back to the real-world asset. Blockchains provide an unchangeable, time-stamped record of transactions. Each new set of transactions, or block in the chain, depends on the others in the chain to be verified.

Therefore, a tokenized asset in a blockchain can eventually be traced back to the real-world asset it represents by those authorized to do so -- while still remaining secure -- because transactions have to be verified by every block in the chain.
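The dependence of each block on the ones before it can be sketched as a simple hash chain in Python. This is a minimal illustration that assumes SHA-256 over a JSON encoding of each block; it leaves out consensus, digital signatures and networking, and the field names and the asset identifier in the usage example are hypothetical.

```python
import hashlib
import json
import time

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block: dict) -> str:
    """Hash every field of the block except the stored hash itself."""
    fields = {k: block[k] for k in ("index", "timestamp", "transactions", "prev_hash")}
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, transactions: list) -> dict:
    """Append a time-stamped block that commits to the previous block's hash."""
    block = {
        "index": len(chain),
        "timestamp": time.time(),
        "transactions": transactions,
        "prev_hash": chain[-1]["hash"] if chain else GENESIS_PREV_HASH,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check each block's own hash and its link to the preceding block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
    return True


if __name__ == "__main__":
    chain = []
    add_block(chain, [{"asset": "deed-001", "to": "alice"}])
    add_block(chain, [{"asset": "deed-001", "to": "bob"}])
    print(is_valid(chain))
```

Altering an earlier transaction changes that block's hash, which no longer matches the value stored in the next block, so tampering is detectable anywhere down the chain.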

This was last updated in February 2023
