Key-value storage is a modern approach to data storage intended to make SSDs more efficient than existing block and object storage systems and enable new SSD functionality. It simplifies how data is consumed and stored to eliminate some of the translation steps required to use block storage to store objects.
Key-value storage enables unstructured data, such as photos and videos, to be stored as a single addressable object. It can also be useful for storing various record types -- medical or employee records, for example -- associated with a unique identifier.
In a recent Storage Networking Industry Association webinar, Bill Martin, co-chair of SNIA's Technical Council and a Samsung engineer, discussed the differences between NVMe key-value storage and block and object storage. Here's a look at those comparisons, as well as the advantages of key-value storage and the standards work being done.
NVMe key-value storage vs. block storage
With block storage, data is stored in fixed-size chunks, or blocks, and each block is associated with a unique numeric logical block address (LBA) on a drive. Key value, on the other hand, stores data in an unstructured format, and a key is used to address and locate the data. A key can be a URL, a name or pretty much whatever you want, Martin said. It's a "somewhat unstructured pointer to where that data is," he said.
With block storage, the LBA is a fixed number of bytes, and storage space is allocated in multiples of the block size. In contrast, with key-value storage, keys vary in length from 1 byte to 32 bytes based on the NVM Express group's NVMe command set definition, Martin said. Storage space is allocated in increments of bytes, and the data -- or value -- is associated with just the amount of physical storage needed to store specific data, he added. This contrasts with block storage, where logical blocks are associated with physical blocks on a one-to-one basis.
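The allocation difference can be illustrated with a little arithmetic. The following is a minimal sketch, not a model of any real drive: the 4,096-byte block size and the function names are assumptions made for illustration, and the 32-byte key limit is the figure quoted above.

```python
import math

BLOCK_SIZE = 4096  # assumed logical block size in bytes; real drives vary
MAX_KEY_LEN = 32   # upper key length as quoted in the article


def block_allocation(value_len: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes consumed when space is allocated in whole blocks."""
    return math.ceil(value_len / block_size) * block_size


def kv_allocation(key: bytes, value_len: int) -> int:
    """Bytes consumed when space is allocated per byte of value."""
    if not 1 <= len(key) <= MAX_KEY_LEN:
        raise ValueError("key must be 1 to 32 bytes long")
    return value_len


# A 5,000-byte record needs two 4,096-byte blocks, but only 5,000 bytes as a value.
block_bytes = block_allocation(5000)
kv_bytes = kv_allocation(b"patient/1234", 5000)
```

Under these assumptions, the block path reserves 8,192 bytes for the record, while the key-value path reserves 5,000. A physical device may still have its own internal allocation granularity, which this sketch ignores.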
Key-value vs. object storage
With object storage, data is stored using a 128-bit unique identifier that makes it possible to find data even if you don't know its physical location. Metadata stored along with the unique ID and the data as part of the object in a flat structure enables the storage of large amounts of unstructured data. However, objects are typically stored on a block storage device, and a translation is done on the object storage stack.
Key-value data, on the other hand, is stored using a key of variable length on a native key-value device and doesn't use metadata, according to Martin. Key-value and object storage have more similarities than key-value and block storage, but they aren't the same. Key value "is an underlying protocol that supports object storage at the device level," Martin said.
With object storage, the protocol provides mapping of an object identifier to the object at different layers in the system and may be split across multiple layers under certain circumstances, such as when you have sharding, Martin said. With key value, storage provides mapping of a key to a value and is directly addressed to a specific storage device.
How key value works
Key-value operations include:
- store;
- retrieve;
- delete;
- list; and
- exist.
Data, once stored, isn't updatable or extendable in place, Martin explained. If you want to modify some portion of your value, you must rewrite the entire value. The same happens if you want to add to it; you must delete the existing data and store the new value. The value stored is the complete value.
Data is retrieved as a single value associated with a key, and a portion of the data can be retrieved if you start from the beginning of the value, Martin explained. "The intent with key value is that you are dealing with complete values, and the application … normally would just retrieve the entire value and do whatever it wants with that entire value," he said.
Key-value pairs can be deleted, he said, and you can list all keys stored on the device and test for the existence of a key.
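The behavior Martin described can be modeled with a toy in-memory class. This is a sketch for illustration only: `KeyValueDevice` and its method names are hypothetical and are not the NVMe commands or the SNIA API. It assumes whole-value storage, partial retrieval only from the start of a value, and the 32-byte key limit quoted earlier.

```python
from typing import List, Optional


class KeyValueDevice:
    """Toy in-memory stand-in for a key-value device (hypothetical, not a real API)."""

    MAX_KEY_LEN = 32  # upper key length as quoted in the article

    def __init__(self) -> None:
        self._pairs = {}

    def _check_key(self, key: bytes) -> None:
        if not 1 <= len(key) <= self.MAX_KEY_LEN:
            raise ValueError("key must be 1 to 32 bytes long")

    def store(self, key: bytes, value: bytes) -> None:
        """Store the complete value; there is no in-place update or append."""
        self._check_key(key)
        self._pairs[key] = bytes(value)

    def retrieve(self, key: bytes, length: Optional[int] = None) -> bytes:
        """Return the whole value, or its first `length` bytes."""
        value = self._pairs[key]
        return value if length is None else value[:length]

    def delete(self, key: bytes) -> None:
        del self._pairs[key]

    def exist(self, key: bytes) -> bool:
        return key in self._pairs

    def list_keys(self) -> List[bytes]:
        return sorted(self._pairs)


dev = KeyValueDevice()
dev.store(b"employee/1001", b"name=Ada")

# Adding to a value means reading it, deleting it and storing the full new value.
old = dev.retrieve(b"employee/1001")
dev.delete(b"employee/1001")
dev.store(b"employee/1001", old + b";dept=R&D")
```

The application, not the device, assembles the new value here, which matches the intent that applications deal in complete values.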
The key-value architecture advantages
Block storage and key-value architecture have similar top-level architectures, according to Martin, but with block storage, triple mapping takes place -- from the key to the file system, the file system to the LBA, and the LBA to the physical address. Key value replaces triple mapping with a single mapping table. The key-value pair goes through a thin key-value library and has a protocol at the lower layer of the architecture that maps the key to the value, he said.
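The difference in mapping depth can be sketched with plain dictionaries. Every name, path and address below is made up for illustration; real file systems and flash translation layers are far more involved.

```python
# Block path: three lookups (all names and addresses are hypothetical).
key_to_file = {"photo-42": "/data/photo-42.jpg"}  # key -> file system entry
file_to_lba = {"/data/photo-42.jpg": 2048}        # file system -> LBA
lba_to_physical = {2048: 0x7A000}                 # LBA -> physical address


def block_lookup(key: str) -> int:
    """Resolve a key through the three mapping tables of the block path."""
    path = key_to_file[key]
    lba = file_to_lba[path]
    return lba_to_physical[lba]


# Key-value path: a single mapping table.
key_to_physical = {"photo-42": 0x7A000}           # key -> physical address


def kv_lookup(key: str) -> int:
    """Resolve a key through the single key-value mapping table."""
    return key_to_physical[key]
```

Both functions reach the same physical address, but the block path maintains and consults three tables where the key-value path keeps one.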
The NVMe key-value approach provides the following three advantages, Martin added:
- an increased number of transactions per second;
- a decreased write amplification factor; and
- decreased latency.
Decreased write amplification is particularly significant because it can increase the endurance of an SSD.
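The link between write amplification and endurance follows from the usual definition of the write amplification factor: bytes written to flash divided by bytes written by the host. The sketch below uses a common back-of-the-envelope endurance estimate; the capacity, program/erase cycle count and factors are illustrative assumptions, not figures from the webinar.

```python
def write_amplification(flash_bytes_written: float, host_bytes_written: float) -> float:
    """Write amplification factor: flash writes divided by host writes."""
    return flash_bytes_written / host_bytes_written


def host_write_budget(capacity_gb: float, pe_cycles: int, waf: float) -> float:
    """Rough estimate of host data (in GB) writable before the flash wears out."""
    return capacity_gb * pe_cycles / waf


# Illustrative numbers only: a 1,000 GB drive rated for 3,000 program/erase cycles.
waf_before = write_amplification(300, 100)
budget_before = host_write_budget(1000, 3000, waf_before)
budget_after = host_write_budget(1000, 3000, 1.5)
```

With these assumed numbers, halving the factor from 3.0 to 1.5 doubles the amount of host data the drive can absorb over its life.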
Key-value storage also enables a storage device to manipulate data based on content, so you can search values for a pattern or perform encoding on your value, Martin said. He also noted that key-value storage removes the provisioning overhead of block storage, where the logical-to-physical mapping is preassigned and the address range is limited by the size of the physical storage. In addition, the key can be unique across multiple devices, so you don't need to map it at an upper layer.
Key-value standards and initial products
NVM Express's NVMe key-value storage command set was ratified in June 2020. An initial release of the SNIA key-value APIs was published in April 2020 and is being updated to work with the NVMe commands.
Among the initial key-value products is a key-value-based platform from Samsung spinoff Stellus Technologies, rolled out in February. It provides a file system that can run on Samsung's key-value SSDs and other flash storage.