LZW compression

What is LZW compression?

LZW (Lempel-Ziv-Welch) compression is a method to reduce the size of Tag Image File Format (TIFF) or Graphics Interchange Format (GIF) files. It is a table-based lookup algorithm that replaces repeated data with shorter codes to compress an original file into a smaller file. LZW compression is also suitable for compressing text and PDF files. The algorithm is derived from the LZ78 algorithm that was developed by Abraham Lempel and Jacob Ziv in 1978.

Published by Terry Welch in 1984 as a refinement of Lempel and Ziv's earlier work, the LZW compression algorithm is a type of lossless compression. Lossless algorithms reduce bits in a file by removing statistical redundancy without causing information loss. This makes LZW, along with other lossless methods such as ZIP, different from lossy compression algorithms, which reduce file size by discarding less important or unnecessary information and therefore cause information loss.

The LZW algorithm is commonly used to compress GIF and TIFF image files and occasionally for PDF and TXT files. It is also the algorithm behind the Unix compress utility. The method is simple to implement, versatile and capable of high throughput in hardware implementations. Consequently, LZW is often used for general-purpose data compression in many PC utilities.

How LZW compression works

The LZW compression algorithm reads a sequence of symbols, groups those symbols into strings and then converts each string into a code. Each code has a fixed width, say 12 bits, and is recorded in a table alongside the string it stands for. The table is also called a dictionary or codebook. It stores character sequences chosen dynamically from the input text and maps the longest strings encountered so far to code values.

As the input is read, any string that is already in the table is replaced with its code, effectively compressing the total amount of input. The code takes up less space than the string it replaces, resulting in a smaller file. As the number of long, repeated strings in the input data increases, the algorithm's efficiency also increases. Compression occurs when the output is a single code instead of a longer string of characters. A code can be of any fixed length but always has more bits than a single character, so the savings come only from codes that stand for strings of several characters.

The LZW algorithm does not analyze the incoming text in advance. It simply adds every new string of characters it sees to the code table. Because at each step it matches the longest string already in the table before emitting a code, LZW is referred to as a greedy algorithm.
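The encoding loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the encoder used by any particular file format: the function name lzw_encode and the max_table_size parameter are assumptions made for the example, and the codes are returned as plain integers instead of being packed into bits.

```python
def lzw_encode(data: bytes, max_table_size: int = 4096) -> list[int]:
    """Encode bytes into a list of LZW codes (illustrative sketch)."""
    # Codes 0-255 are preassigned to the 256 single-byte strings.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Greedy step: keep extending the match while it is in the table.
            current = candidate
        else:
            # Emit the code for the longest match found so far.
            codes.append(table[current])
            # Record the new, longer string if the table still has room.
            if next_code < max_table_size:
                table[candidate] = next_code
                next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


codes = lzw_encode(b"TOBEORNOTTOBEORTOBEORNOT")
print(len(codes), "codes for 24 input bytes")
```

On this 24-byte input the sketch emits 16 codes, because the later occurrences of "TO", "BE", "OR" and longer strings are replaced by table codes of 256 and above.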

Code table in LZW compression

Unlike LZ78, which emits each code together with the next literal character, the LZW algorithm starts with a table that already contains every single-byte value, so its output consists of codes only. The table itself is not stored in the compressed file. Typically, the number of table entries is 4,096, which corresponds to 12-bit codes. In the code table, codes 0-255 are assigned to represent single bytes from the input file. Before the algorithm starts encoding, the table contains only these first 256 entries, and the rest of the table is blank.

The remaining codes are assigned to strings as the algorithm proceeds with the compression. When encoding starts, the algorithm identifies repeated sequences in the data and adds them to the code table so that it fills up with more entries. For file compression, codes 256 through 4,095 are used to represent sequences of bytes. These codes refer to substrings, while codes 0-255 refer to individual bytes.
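To make the growth of the table concrete, the following sketch lists the entries that would be added for a short input. The helper name lzw_table_entries is hypothetical, and the 4,096-entry limit is taken from the typical table size mentioned above.

```python
def lzw_table_entries(data: bytes, max_table_size: int = 4096) -> dict[int, bytes]:
    """Return the entries added beyond the initial 256 (illustrative sketch)."""
    # Initial table: one entry per possible byte value, codes 0-255.
    table = {bytes([i]): i for i in range(256)}
    added = {}
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            if len(table) < max_table_size:
                code = len(table)  # next free code: 256, 257, ...
                table[candidate] = code
                added[code] = candidate
            current = bytes([value])
    return added


for code, string in lzw_table_entries(b"ABABABA").items():
    print(code, string)

# A table of 4,096 entries needs codes that are 12 bits wide.
print((4096 - 1).bit_length())
```

For the input ABABABA, the sketch adds three entries: 256 for AB, 257 for BA and 258 for ABA.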

The decoding program that decompresses the file can rebuild the same table by using the algorithm as it processes the encoded input, which is why the table does not need to be stored with the data. It takes each code from the compressed file and translates it through the code table that's being built to find the character or string that code represents.
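A decoder along these lines might look like the sketch below. It assumes the input is a plain list of integer codes rather than a packed bit stream, and the name lzw_decode is chosen for illustration. One detail that the description above does not spell out is the case where a code refers to the entry the decoder is about to create; the sketch handles it by reusing the first byte of the previous string.

```python
def lzw_decode(codes: list[int], max_table_size: int = 4096) -> bytes:
    """Decode a list of LZW codes back into bytes (illustrative sketch)."""
    if not codes:
        return b""
    # The decoder starts from the same 256 single-byte entries as the encoder.
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code names the entry that is about to be created:
            # it must be the previous string plus its own first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output += entry
        # Rebuild the entry the encoder added at the matching step.
        if next_code < max_table_size:
            table[next_code] = previous + entry[:1]
            next_code += 1
        previous = entry
    return bytes(output)


sample = [84, 79, 66, 69, 79, 82, 78, 79, 84, 256, 258, 260, 265, 259, 261, 263]
print(lzw_decode(sample))
```

The sample codes were worked out by hand for the input TOBEORNOTTOBEORTOBEORNOT, and the decoder recovers that text without being given the table.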

Advantages and drawbacks of LZW compression

The LZW algorithm quickly compresses large TIFF or GIF files. It works especially well for files containing a lot of repetitive data, which is common with monochrome images.

One drawback of LZW compression is that files without repetitive information can come out larger after compression than before, defeating the purpose of compression. Another issue is historical: the algorithm was covered by patents, not copyright, so companies had to pay royalties or licensing fees to use it, and these fees could get added to the product cost. Those patents have since expired.
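The size penalty for non-repetitive input can be illustrated with a rough calculation. The sketch below assumes every code is written with a fixed width of 12 bits, which is a simplification (formats such as GIF and TIFF use variable-width codes), and the function names are hypothetical.

```python
def lzw_code_count(data: bytes, max_table_size: int = 4096) -> int:
    """Count the codes an LZW encoder would emit (illustrative sketch)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    count = 0
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            count += 1  # one code emitted for the longest match
            if len(table) < max_table_size:
                table[candidate] = len(table)
            current = bytes([value])
    return count + (1 if current else 0)


def packed_size(data: bytes, code_bits: int = 12) -> int:
    """Output size in bytes, assuming fixed-width codes (an assumption)."""
    return (lzw_code_count(data) * code_bits + 7) // 8


varied = bytes(range(256))   # no repeated pairs of bytes
repetitive = b"A" * 256      # one byte value repeated

print(len(varied), "->", packed_size(varied))
print(len(repetitive), "->", packed_size(repetitive))
```

Under these assumptions, the 256 bytes with no repetition grow to 384 bytes, because each 8-bit byte becomes its own 12-bit code, while the repetitive input shrinks to a small fraction of its original size.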

Finally, LZW is not the most efficient compression algorithm. Other lossless algorithms, such as the DEFLATE method used in ZIP files, often produce smaller files.

LZW compression vs. ZIP compression

LZW and ZIP are both lossless compression methods, meaning no data is lost after compression. TIFF files retain their quality after being compressed into smaller files using either LZW or ZIP. That said, compressed TIFF files can be slightly slower to work with because they require more processing effort to open and save.

LZW and ZIP provide good results with 8-bit TIFF files. For 16-bit TIFF files, the ZIP algorithm performs better than LZW. In fact, LZW tends to make 16-bit files larger. Generally, both algorithms work efficiently when they can group a lot of similar data and work on images that are low on detail and contain few tones. These images compress more than images containing lots of detail or different tones.

This was last updated in January 2023
