solid-state storage garbage collection

What is solid-state storage garbage collection?

Solid-state storage garbage collection, or SSD garbage collection, is an automated process by which a solid-state drive (SSD) improves write performance. Garbage collection proactively eliminates the need for whole block erasures prior to every write operation.

Despite its name, SSD garbage collection has nothing to do with discarding files that are no longer needed. Rather, garbage collection is aimed at optimizing space and improving efficiency.

Why garbage collection is important

Garbage collection is essential because SSDs, which are initially extremely fast, become slower with time. Good maintenance helps preserve SSDs' speed and make them more reliable.

When a file is deleted from a computer, most operating systems (OSes) remove the file's entry from the file system index but do not delete the actual data blocks from the storage media. Hard-disk drives (HDDs) can simply overwrite the unneeded data blocks in place. Flash SSDs, however, must erase the unneeded data blocks before new data can be written.

Flash SSDs divide storage into blocks, which are further divided into pages. Data is read and written at the page level but must be erased at the block level. A significant amount of voltage is needed to erase data, and it is difficult to target that voltage at a more granular level without negatively impacting adjacent cells.
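The mismatch between page-level writes and block-level erases can be illustrated with a toy model. The following is a minimal sketch, not how any real controller is implemented; the `Block` class, its method names and the four-page block size are all hypothetical.

```python
class Block:
    """Toy model of one flash block (hypothetical, for illustration only)."""

    def __init__(self, pages_per_block=4):
        # None stands for an erased page that is ready to be written.
        self.pages = [None] * pages_per_block

    def program(self, page, value):
        # Writes happen one page at a time, and only to an erased page.
        if self.pages[page] is not None:
            raise ValueError("page must be erased before it is written again")
        self.pages[page] = value

    def read(self, page):
        # Reads also happen one page at a time.
        return self.pages[page]

    def erase(self):
        # Erase takes no page argument: it always clears the whole block.
        self.pages = [None] * len(self.pages)


block = Block()
block.program(0, "a")
block.program(1, "keep me")
try:
    block.program(0, "b")      # an in-place overwrite is rejected
    overwrite_allowed = True
except ValueError:
    overwrite_allowed = False
block.erase()                  # clears page 1 as well as page 0
block.program(0, "b")          # now the page can be written again
```

Note that erasing the block to free page 0 also destroys the data in page 1, which is why good data has to be copied elsewhere first.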

The purpose of garbage collection is to increase efficiency by keeping as many empty blocks as possible so that, when the SSD has to write data, it doesn't have to wait for a block to be erased.

How SSD garbage collection works

The inability to overwrite data or to erase it at the page level means that SSDs must handle data updates and deletions very differently from HDDs. For example, if a user updates a file, the updated data must be written to empty pages, sometimes in different blocks. The data in the original pages is then marked stale, which means that the data is no longer valid.

Pages that contain stale data cannot be used until they are erased, and they can be erased only at the block level. To complicate matters, blocks typically include pages that contain good data, along with the pages that contain stale data, making management more difficult.

To reclaim pages with the stale data, the pages with the good data must first be moved to another block so the original block can be erased. This constant shifting of data results in many more program/erase (P/E) cycles than requested by the host system, a situation referred to as write amplification.
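Write amplification is commonly expressed as a ratio of the data physically written to flash to the data the host asked to write. The sketch below assumes that definition; the function name and the page counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def write_amplification(host_pages_written, relocated_pages):
    """Ratio of pages written to flash to pages the host asked to write.

    relocated_pages is the number of good pages the drive had to copy to
    another block so that blocks containing stale data could be erased.
    """
    if host_pages_written <= 0:
        raise ValueError("host_pages_written must be positive")
    flash_pages_written = host_pages_written + relocated_pages
    return flash_pages_written / host_pages_written


# Hypothetical numbers: the host writes 100 pages, and reclaiming space
# forces the drive to relocate 150 pages of good data.
waf = write_amplification(100, 150)   # 250 / 100 = 2.5
ideal = write_amplification(100, 0)   # no relocation, so the ratio is 1.0
```

A ratio of 1.0 means the drive wrote only what the host requested; anything higher represents extra wear caused by moving good data.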

All these extra P/E cycles can reduce the SSD's life span and have a negative impact on the drive's performance. To help address these issues, SSD vendors typically build garbage collection capabilities into their storage controllers, especially for their enterprise drives.

How SSD garbage collection is implemented

The way vendors implement garbage collection can vary significantly among devices, resulting in some drives handling garbage collection more efficiently than others. A drive's effectiveness at garbage collection depends on numerous factors, including the algorithms being used, when garbage collection runs and the overhead that the process incurs.

Despite the differences between garbage collection techniques, the goals are generally the same:

  • to better manage the P/E cycles; and
  • to reduce the impact on performance and overall endurance.

Working in the background, garbage collection systematically does the following:

  • identifies which pages contain stale data;
  • moves the pages with good data to another block; and
  • erases all the data from the original block.
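The three steps above can be sketched in a few lines. This is a simplified illustration, not a vendor algorithm: blocks are modeled as lists of `(state, data)` tuples, and the `collect` function and state names are hypothetical.

```python
VALID, STALE, EMPTY = "valid", "stale", "empty"


def collect(victim, spare):
    """Reclaim the block `victim`, relocating its good pages into `spare`.

    Each block is a list of (state, data) tuples.
    Returns the number of pages that had to be relocated.
    """
    # Step 1: identify which pages hold good data and which are stale.
    good = [data for state, data in victim if state == VALID]

    empty_slots = [i for i, (state, _) in enumerate(spare) if state == EMPTY]
    if len(good) > len(empty_slots):
        raise ValueError("spare block does not have enough empty pages")

    # Step 2: move the pages with good data to another block.
    for slot, data in zip(empty_slots, good):
        spare[slot] = (VALID, data)

    # Step 3: erase all the data from the original block.
    for i in range(len(victim)):
        victim[i] = (EMPTY, None)

    return len(good)


block_x = [(STALE, "old1"), (VALID, "a"), (STALE, "old2"), (VALID, "b")]
block_y = [(EMPTY, None)] * 4
moved = collect(block_x, block_y)
```

After the call, `block_x` is entirely empty and `block_y` holds the two good pages, leaving two of its pages free.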

Many SSD controllers run their garbage collection algorithms during off-peak times to maintain optimal write speeds during normal operations, although not all drives take this approach. Most controllers also incorporate wear leveling into their garbage collection operations to distribute P/E cycles more evenly across the storage blocks to prevent overused blocks from wearing out.
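One simple way to picture wear leveling is a policy that always writes to the erased block with the fewest P/E cycles so far. The sketch below assumes that policy purely for illustration; real controllers use more elaborate schemes, and the function name, block names and counts are hypothetical.

```python
def pick_least_worn(erase_counts, free_blocks):
    """Return the free block that has been erased the fewest times."""
    if not free_blocks:
        raise ValueError("no free blocks available")
    return min(free_blocks, key=lambda block: erase_counts[block])


# Hypothetical P/E counts per block; only A and C are currently erased.
erase_counts = {"A": 120, "B": 35, "C": 36}
chosen = pick_least_worn(erase_counts, ["A", "C"])   # "C"
```

Steering new writes toward block C rather than the heavily used block A keeps the P/E counts from drifting further apart.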

Wear leveling helps extend the life of SSDs, as do garbage collection and TRIM. SSD TRIM is a Serial Advanced Technology Attachment command that enables an OS to inform a NAND flash SSD which data blocks it can erase because they are no longer in use. TRIM, which is not an acronym, is complementary to garbage collection.
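To see why TRIM complements garbage collection, consider a block that holds a deleted file. Without TRIM, the drive has no way to know the file is gone and treats its pages as good data to be copied. The following sketch is a simplified model, not the command's actual interface; the function name and the numeric addresses are hypothetical.

```python
def pages_to_relocate(block_contents, trimmed):
    """Addresses garbage collection must copy before it can erase a block.

    block_contents: addresses of the data stored in the block.
    trimmed: set of addresses the OS has reported as no longer in use.
    """
    return [addr for addr in block_contents if addr not in trimmed]


block_contents = [100, 101, 102, 103]
deleted_file = {101, 102}   # the OS deleted the file stored at these addresses

without_trim = pages_to_relocate(block_contents, set())         # all four pages
with_trim = pages_to_relocate(block_contents, deleted_file)     # [100, 103]
```

With TRIM, garbage collection copies two pages instead of four, which means fewer extra writes and less wear.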

Garbage collection techniques

Although garbage collection techniques vary, the fundamental process is the same: move good data out of a block that contains stale data and then erase the entire block. Consider the example in the accompanying figure, which shows two SSD storage blocks -- Block A and Block B -- as they progress through the data update process.

Diagram illustrates solid-state storage garbage collection for two SSD storage blocks -- Block A and Block B -- as they progress through the data update process.

Although the figure shows only 15 pages per block, a block might contain 64, 128 or 256 pages, with page sizes commonly ranging between 4 KB and 16 KB. Because a block can contain so many pages, garbage collection is essential to using storage efficiently and extending the life of the drive.

Block A starts off with five pages that contain data (pages 1 through 5). When the data in those pages is updated, the SSD controller marks the original pages stale (invalid) and writes the updated data to new pages (pages 1+ through 5+). When the SSD receives additional data, the controller stores it in pages 6 through 10, completely filling Block A. Although five of the pages contain nothing but stale data, no more data can be added to the block until the entire block is erased.

This is where garbage collection kicks in. It copies pages 1+ through 5+ and pages 6 through 10 to Block B and then erases all the pages in Block A. Now all 15 pages in Block A and the remaining five pages in Block B are available to store data. This process runs continuously throughout the life of the drive, helping to maximize storage availability and performance.
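The Block A and Block B walkthrough can be reproduced with a short simulation. This is a sketch of the example as described, using plain lists as a stand-in for flash blocks; the variable names are hypothetical.

```python
PAGES_PER_BLOCK = 15   # matches the figure; real blocks hold far more pages

# Block A starts with pages 1 through 5. Each entry is [label, state].
block_a = [[str(n), "valid"] for n in range(1, 6)]

# The data is updated: the originals are marked stale and the new
# versions (1+ through 5+) are written to fresh pages.
for page in block_a[:5]:
    page[1] = "stale"
block_a += [[f"{n}+", "valid"] for n in range(1, 6)]

# Additional data arrives and fills pages 6 through 10.
block_a += [[str(n), "valid"] for n in range(6, 11)]
block_a_is_full = len(block_a) == PAGES_PER_BLOCK

# Garbage collection: copy the good pages to Block B, then erase Block A.
block_b = [label for label, state in block_a if state == "valid"]
block_a = []

free_in_a = PAGES_PER_BLOCK - len(block_a)   # 15
free_in_b = PAGES_PER_BLOCK - len(block_b)   # 5
```

Ten pages were copied to reclaim five stale ones, which shows how relocating good data contributes to write amplification.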


This was last updated in March 2022
