The premise underlying computational storage is that it takes too much time and effort and too many resources to move data from where it's stored to where it's processed. The time lag, in particular, contributes to latency -- the enemy of response time. Moving processing closer to the data increases efficiencies, decreases response time and reduces latency.
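A rough back-of-the-envelope model shows why the premise is attractive. The sketch below compares moving an entire data set to the host before filtering it with filtering on the storage device and moving only the matches. Every figure in it (data set size, link bandwidth, selectivity) is a hypothetical assumption chosen for illustration, and the model ignores the time the device itself spends filtering.

```python
# All figures are hypothetical and serve only to illustrate the argument.

def transfer_seconds(num_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to move num_bytes over a link with the given bandwidth."""
    return num_bytes / bandwidth_bytes_per_s

DATASET_BYTES = 1e12    # 1 TB held on the storage device (assumed)
LINK_BANDWIDTH = 2e9    # 2 GB/s between storage and host (assumed)
SELECTIVITY = 0.01      # fraction of the data the application needs (assumed)

# Conventional path: move everything to the host, then filter there.
host_side = transfer_seconds(DATASET_BYTES, LINK_BANDWIDTH)

# Computational storage path: filter on the device, move only the matches.
# On-device processing time is deliberately left out of this simple model.
device_side = transfer_seconds(DATASET_BYTES * SELECTIVITY, LINK_BANDWIDTH)

print(f"move-then-filter: {host_side:.0f} s")
print(f"filter-then-move: {device_side:.0f} s")
```

The gap shrinks as selectivity rises or as on-device processing gets slower, which is why the use case matters as much as the technology.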
In theory, computational storage for IoT and other applications should speed up processing, but theory and reality are often at odds. It comes down to which computational storage use cases make sense. Do they add costs, and can a performance gain justify those costs? These questions must be answered if computational storage is going to take off.
But wait, hasn't the computational storage concept been tried before under a different name? Remember storage offload? Why did it fail?
The storage offload saga
Storage offload tried to address the limited processing power of CPUs. The thinking was that offloading some of the storage processing work would free up the CPU. However, CPU transistor counts kept doubling every 18 to 24 months, increasing processing power exponentially, so who needed storage offloading?
Shared storage also offloaded a lot of the CPU-intensive storage processing from servers. The CPUs that could be put on storage media were either underpowered or too expensive. And application-specific integrated circuits (ASICs) were too expensive and took too long to bring to market. OSes were too big, cumbersome and inefficient to run on the available offload processing power, or too proprietary. And the storage media was primarily HDDs, and putting processing directly on an HDD has inherent issues that make it difficult.
Needless to say, storage offload didn't take off.
What's different today
Moore's Law is running out of steam, and CPUs are no longer doubling their transistor count every 18 to 24 months. Stand-alone shared storage arrays are no longer the only storage-sharing option. Software-defined storage (SDS) has made server-side storage shareable in both scale-up and scale-out configurations.
Small, inexpensive CPUs with quite a bit of power are now available. Most are ARM processors, but low-cost, powerful field-programmable gate arrays (FPGAs) are also being used. Storage media has generally been moving to fast, non-volatile solid-state technologies -- NVMe drives, SATA flash SSDs and storage-class memory -- instead of spinning rust. And finally, Linux and containers have made running applications on ARM processors or FPGAs acceptably efficient.
That brings us back to the use cases. Where does it make sense to use computational storage? The biggest use case is computational storage for IoT. These devices are proliferating at an impressive rate. Statista says there will be more than 75 billion by 2025. Most are relatively small and generate data on an ongoing basis. There isn't a lot of room for the traditional von Neumann computer architecture. And yet, most of these devices must capture, store and analyze data in real time.
Computational storage for IoT
Take the example of the autonomous vehicle. It must determine if the sensory input it's receiving is a reflection, shadow or human being, and it has to do so in microseconds. That's potentially a good computational storage use case.
Another example is the oil and gas industry's need to analyze drill bit vibration in the field. Doing that in real time can prevent field leaks, damage and other unexpected problems.
There are also CCTV cameras, which are getting smaller and doing more with little room for client-server architectures. Camera resolution keeps improving, which creates even more data. Moving that data to a central location where it can be analyzed takes time. Real-time processing in the camera is a must today, particularly for applications such as facial recognition, license plate reading and contraband or explosive identification. Actionable information is time-sensitive and can't wait for the data to be moved. Analyzing data in the field requires processing and storage, making computational storage an ideal fit.
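The camera case can be sketched in a few lines. In the minimal example below, frames are analyzed where they are stored and only small event records leave the device. The `Frame` type, the `score` field standing in for an on-device detector's confidence, the frame size and the threshold are all hypothetical placeholders, not any vendor's API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Frame:
    frame_id: int
    size_bytes: int
    score: float  # hypothetical stand-in for an on-device detector's confidence


def events_only(frames: Iterable[Frame], threshold: float) -> Iterator[dict]:
    """Analyze frames in place and yield only small event records."""
    for frame in frames:
        if frame.score >= threshold:
            yield {"frame_id": frame.frame_id, "score": frame.score}


# Five made-up frames of 6 MB each; two exceed the detection threshold.
scores = [0.10, 0.20, 0.95, 0.30, 0.91]
frames = [Frame(i, 6_000_000, s) for i, s in enumerate(scores)]

events = list(events_only(frames, threshold=0.9))
raw_bytes = sum(f.size_bytes for f in frames)

print(f"{len(events)} event records sent instead of {raw_bytes} bytes of raw video")
```

The raw video can stay on the device for later retrieval, while the time-sensitive information goes out immediately.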
These are just a few of the applications where computational storage for IoT makes sense. There are many others. Generally, if it's an IoT device, computational storage will likely be a possibility.
Inside scale-out server architecture
Another computational storage use case comes from within the server and storage architectures themselves. As previously mentioned, Moore's Law has slowed. It's one of the reasons that CPUs stopped getting faster more than a decade ago and started adding more cores. Scale-out server architectures have proliferated as a result. Scale-out is the core of hyper-converged infrastructure. But what if scale-out became internalized within the server? In other words, what if we had an inside scale-out server architecture?
With this approach, the main CPU cores would be used for application processing and the computational storage cores for CPU-intensive storage workloads, such as snapshots, replication, deduplication, compression, encryption, decryption, virus scanning, metadata search and even content search. The server would become an inside scale-out cluster. There aren't any servers with this architecture yet. It's one of many capabilities computational storage could enable.
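As a purely conceptual sketch of that division of labor, the placement rule might look like the following. No such server exists today, so the workload names, the `place` function and the two core pools are hypothetical labels used only to make the idea concrete.

```python
# Hypothetical placement rule for an "inside scale-out" server.
# Workload names mirror the storage tasks listed above; nothing here
# corresponds to a real scheduler or product API.
STORAGE_WORKLOADS = {
    "snapshot", "replication", "deduplication", "compression",
    "encryption", "decryption", "virus_scan",
    "metadata_search", "content_search",
}


def place(task: str) -> str:
    """Send storage workloads to the drive-side cores, everything else to the host."""
    if task in STORAGE_WORKLOADS:
        return "computational_storage_cores"
    return "main_cpu_cores"


for task in ("compression", "web_request", "deduplication"):
    print(f"{task} -> {place(task)}")
```

A real design would also have to weigh how busy each pool is and where the data lives, which this static rule ignores.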
A word of caution: The world of computational storage is new. Several vendors are already producing products, including Burlywood, Eideticom, NGD Systems, Nyriad, Samsung and ScaleFlux, but there's no computational storage standard. SNIA and OpenFog Consortium have established working groups to create one, but until a standard is settled on, caveat emptor.
There are other issues to watch for. Computational storage devices will cost more than standard storage devices. And a breakthrough in CPU processing could cause computational storage to be sidelined. It's happened before; it can happen again.
There's huge potential in computational storage for IoT and other use cases. But, as with all new technologies, proceed with caution.