Annual Update on Computational Storage
Computational storage puts processing power close to where data resides, allowing processing tasks to run in line with data accesses. Stay on top of this IT architecture trend by reviewing this update.
00:01 Chuck Sobey: Hello, and welcome to Flash Memory Summit's annual update on computational storage. I'm Chuck Sobey with ChannelScience, and I'm also your FMS conference chair. We're glad you're able to join us this year, if virtually.
We'll start this update by looking at the problem solved by computational storage. We'll see how computational storage fits with the overall trend toward data-centric computing. We'll clear up some terms that might be confusing on first hearing and introduce the leading computational storage architectures. We'll finish with the industry players and trade association developments, and talk about what to watch for next. In many applications, it takes more time and energy to move data to and from memory than it does to compute with it; even more time and energy are needed to move data from storage to memory in the first place. This is what is driving the economic case for bringing compute to the data.
01:00 CS: Where is compute now? The architecture of a PC or a server looks something like this. A CPU works on the data that's in memory in the DRAM; the data gets to the memory over a bus or network from storage like SSDs. As a specific example, suppose we want to analyze transactions such as stock purchases, retail sales or reservations on, say, Mondays in June. The process is: load the entire database from storage into memory, search the records in memory for Mondays in June, perform the analysis on those records and disregard all of the other data that was loaded into memory.
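As a rough sketch of that conventional flow (the table layout, file name, and use of pandas are illustrative assumptions, not part of the presentation):

```python
# Conventional path: the host loads the full dataset from the SSD into DRAM,
# then filters for the records of interest (Mondays in June) in memory.
import pandas as pd

# Load the ENTIRE transactions table from storage into host memory.
transactions = pd.read_parquet("transactions.parquet")  # illustrative file name

# Only now can the host discard everything that is not a Monday in June.
dates = pd.to_datetime(transactions["date"])
mondays_in_june = transactions[(dates.dt.month == 6) & (dates.dt.dayofweek == 0)]

# The analysis runs on the small filtered subset, but every other record
# was still moved across the bus and held in DRAM just to get here.
print(mondays_in_june["amount"].sum())
```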
So, what changes if we bring compute to the data, specifically, if we bring compute to the SSD? On this slide, we've simply changed the SSD to a computational storage device. This CSD will be described in a few slides, but it has enough processing capability to interpret a command from the host to search the database for records from Mondays in June; then only those records need to be loaded into memory for the subsequent analysis. Clearly, using a computational storage device, in this case, frees CPU cycles for other tasks and reduces network and bus traffic. Furthermore, the records are ready for analysis sooner, driving down latency in getting results.
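For contrast, here is a minimal sketch of that pushdown flow; the `csd_client` module, the `CSD` class, and the `query()` method are hypothetical names used only to illustrate the data movement, not an actual CSD API:

```python
# Pushdown path: the host sends the predicate to the CSD and receives back
# only the matching records.  All names below are hypothetical.
from csd_client import CSD  # hypothetical library

drive = CSD("/dev/nvme0n1")  # the computational storage device

# The drive's on-board processor scans the database in place; only records
# from Mondays in June cross the bus into host memory.
mondays_in_june = drive.query(
    table="transactions",
    predicate="month == 6 AND day_of_week == 'Monday'",
)

# Host CPU cycles and DRAM are spent only on the records that matter.
print(sum(record["amount"] for record in mondays_in_june))
```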
02:46 CS: The industry has standardized on the term computational storage, but in some literature you'll see it referred to as in-storage processing or in-situ processing. These next two terms are specific instances of the general idea of data-centric computing. Near-memory processing has an emphasis on the DRAM hardware, whereas near-data processing is a more general term for the concept and implementation. Processing in-memory and neuromorphic computing are terms describing memory with certain fundamental math functions built into the hardware. This is an old idea that is seeing new life. Examples have been presented at other FMS sessions. Next, we'll provide a quick comparison of this architecture shift from compute-centric to data-centric.
03:31 CS: Compute-centric architectures for high-performance computing are geared toward solving differential equations for things like the fluid dynamics of weather forecasting; the bottleneck is the CPU and memory. Data-centric architectures are designed to analyze petabytes of data; it is a data-first, not compute-first, design goal. This is for applications like search, network analysis, and video creation and transmission. The bottleneck is storage and I/O; computational storage breaks these bottlenecks and provides additional benefits. These additional benefits include freeing up the CPU for other tasks, enabling parallelism, especially in the data center, and reducing the movement of data, thereby improving latency and network bandwidth utilization. These top three major bullets are benefits especially well suited to the needs of data centers and cloud service providers.
04:33 CS: Edge computing is also well served by computational storage's ability to reduce the need to be connected to the cloud. In addition, data that is generated and kept at the edge can be more secure and private if it is not traveling over the networks to the cloud. Furthermore, at the edge, access to the cloud can be uncertain -- this results in unpredictable latency. 5G will be an enabler for edge and IoT applications. 5G is supposed to provide reduced and more predictable latency, so the tail of the latency distribution should be improved. One of the promises of 5G is the enablement of the tactile internet. This means that a human can remotely control a robot and feel the response feedback in what appears to us as real time. This needs to be on the order of a millisecond. On the chart on the right, we see that light travels about 200 miles in 1 millisecond; that's about 300 kilometers.
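As a back-of-the-envelope check of that distance budget (using the vacuum speed of light only; fiber propagation, switching and processing delays would shrink the budget further):

```python
# Rough distance budget for a 1 ms round trip, ignoring fiber slowdown,
# switching, and processing delays, all of which leave even less distance.
LIGHT_KM_PER_MS = 300          # light covers roughly 300 km (about 186 miles) per ms
ROUND_TRIP_BUDGET_MS = 1.0     # tactile-internet target

one_way_km = LIGHT_KM_PER_MS * ROUND_TRIP_BUDGET_MS / 2
one_way_miles = one_way_km * 0.621
print(f"One-way budget: {one_way_km:.0f} km (~{one_way_miles:.0f} miles)")
# Prints roughly 150 km, or about 93 miles; hence "less than 100 miles apart."
```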
05:39 CS: So, the human and the robot should be less than 100 miles apart in order for the round-trip action and feedback to complete in 1 millisecond. Of course, this assumes the latency distribution on the previous slide is tightly controlled. So, how do you make a computational storage device? If we can simplify the description of an SSD as NAND and processing, then we can simplify the description of a computational storage device as an SSD plus more processing.
06:15 CS: There are, of course, multiple ways to implement such an architecture. There are four current computational storage architectures. A computational storage device can incorporate the extra processing as an FPGA alongside the SSD controller ASIC, or the extra processing can be integrated with the SSD controller functions into one custom ASIC. There are also reports of some attempts to change the firmware of standard SSDs to repurpose some of their compute cycles for computational storage tasks -- the performance is bound to be limited in comparison with hardware solutions.
06:53 CS: Additionally, the computational storage processing can take place on an accelerator card that does not have any storage of its own. This can be part of a network fabric or built into an array. Here are text summaries of the architectures in the figure on the previous slide. This slide is for your later study, if you want a more detailed look at the various current architectures I just introduced. The Storage Networking Industry Association, SNIA, has a lot of resources on its website. Where computational storage lands within its structure is in the Compute, Memory, and Storage Initiative's Computational Storage SIG; within this, the activity is concentrated in the Computational Storage TWG. We're happy to note that this TWG formed from conversations that took place at FMS 2018. SNIA currently has a draft specification available for download, now out for public review; you can download it at the link provided. SNIA's draft spec defines capabilities and actions for computational storage. These include management actions to identify a CSD and its features, security actions for access control and encryption, and fundamental operations like storing and retrieving data. SNIA is specifying two types of computational storage services: fixed and programmable.
08:28 CS: A computational storage service, in general, performs computation on data associated with a particular storage device. Programmable services are developed by a user and can provide a wide range of functions. Fixed services are configurable, but their fundamental function is set by the manufacturer. Examples of fixed computational storage services include compression, deduplication and encryption. Programmable computational storage services mentioned in the spec are Berkeley Packet Filter for network traffic analysis, containers, FPGA bitstreams for rapid reconfiguration, and hosting operating systems such as Linux. There are over 40 companies participating in SNIA's initiative, and many of them are presenting and/or sponsoring here at FMS. There are also computational storage startups still in stealth mode.
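To make the fixed-versus-programmable distinction above concrete, here is an illustrative sketch; the `csd_client` module and every method name are hypothetical and are not drawn from the SNIA draft, which defines the concepts rather than a host API:

```python
# Illustrative contrast between a fixed and a programmable computational
# storage service.  All names below are hypothetical.
from csd_client import CSD  # hypothetical library

drive = CSD("/dev/nvme1n1")
payload = b"example log data to be stored"

# Fixed service: the function (compression here) is set by the manufacturer;
# the host can only configure and invoke it.
drive.configure_service("compression", level=3)
drive.write("/data/logs.bin", payload, service="compression")

# Programmable service: the user supplies the computation, for example an
# eBPF program or an FPGA bitstream, and the drive runs it against its data.
with open("filter_mondays.bpf.o", "rb") as f:  # user-developed eBPF object
    program_id = drive.load_program(f.read(), kind="ebpf")
results = drive.run_program(program_id, target="transactions")
print(len(results), "matching records returned")
```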
09:25 CS: Computational storage devices today connect to systems, and to each other, via NVMe. FMS has many presentations on NVMe this year as well as in past years, so here I will only call your attention to the recently formed NVMe Computational Storage Task Group. It will make sure the NVMe specification has the necessary features to support a wide range of computational storage devices. There is close coordination between SNIA and NVM Express, so that the specifications are not in conflict. Many people and companies are members of both groups. This will help the overall adoption of computational storage. There are some big changes coming that could affect key elements of SSDs; these elements are ARM processor cores and Xilinx FPGAs. Nvidia is planning to buy ARM for $40 billion. I expect we'll see more AI and ML in more places.
10:20 CS: AMD is planning to buy Xilinx for $35 billion. I think the expected result is more accelerators in more places. Of course, these deals are subject to government approval in many jurisdictions. In addition to this update and introduction, you'll also hear in this session the very latest from industry leaders at Eideticom, IBM, NR. And I'll share my own update right now: ChannelScience has a proposal that is being evaluated by the U.S. Department of Energy that uses programmable computational storage services on a CSD. Our prototype will provide data triage for scientific instruments like the electron microscope shown here. Wish me luck.
11:08 CS: In conclusion, some developments I'll be watching for in computational storage next year are continued coordination between the standards bodies and perhaps activity from the old OpenFog Consortium; more computational storage startups emerging from stealth mode; more AI/ML implementations, especially at the edge; new opportunities enabled by 5G as it expands; low-cost devices enabled by RISC-V; and computational storage being used to scale the data center in new ways, including further leveraging the parallel nature of NAND flash chips. Note that AWS offers computational storage as an option with AQUA, their Advanced Query Accelerator.
Thank you for watching this presentation, and please be sure to visit our sponsors' virtual booths. I look forward to answering any questions you may have in the chat, and after the event, you can feel free to get in contact with me at the email shown here. Please enjoy the rest of this 15th annual Flash Memory Summit.