How to plan for data storage in IoT deployments

Business objectives, data retention needs and cost all factor into where organizations should store endpoint-generated data in growing IoT deployments.

Although cloud computing has made data storage fiscally feasible and technically possible for many organizations, IT leaders may find that even the cloud has its limits. IT professionals must understand where data storage in IoT deployments should reside based on their organization's needs.

Organizations average 30% data growth year-over-year as a result of their rapidly expanding IoT infrastructure, according to research firm Mordor Intelligence's "IoT Data Management Market – Growth, Trends and Forecast (2020 -- 2025)" report.
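To see what that rate implies for capacity planning, the arithmetic can be sketched as compound growth. This is a rough illustration only: it assumes the 30% average holds constant every year, and the 100 TB starting volume and the function name `projected_volume` are hypothetical, not figures from the report.

```python
def projected_volume(initial_tb: float, annual_growth: float, years: int) -> float:
    """Compound a starting data volume by a fixed yearly growth rate."""
    return initial_tb * (1 + annual_growth) ** years


# Hypothetical starting point: 100 TB of stored IoT data today.
for year in range(6):
    volume = projected_volume(100, 0.30, year)
    print(f"Year {year}: {volume:.1f} TB")
```

Under these assumptions, stored data would more than triple within five years, which is why retention decisions matter early.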

Given the volume of data that endpoint devices generate, experts said IT leaders cannot rely on the cloud to hold that information indefinitely. The cost of transmitting and storing the data, in addition to managing and securing it, will quickly overload most organizations.

To determine the right storage options for IoT data, organizations should take the steps described here.

Identify business goals and infrastructure requirements

Enterprise IT and business leaders must first establish business goals and then identify the data needed to reach those objectives. Only then can they determine what single technology or combination of technologies is best to process and store the data.

IT experts must consider two questions: whether the application needs real-time analysis of the data produced by endpoint devices to drive reactions where even milliseconds count, and how much data they need to keep to meet their goals.

The answers to those questions will influence their choices, said Rob Mesirow, leader of the PwC Connected Solutions and IoT practice and a partner in the firm's Technology, Media and Telecom Risk and Regulatory practice.

In most cases, organizations can send data from endpoint devices to the cloud and store it there. "With the modern cloud architecture, it's safer and a lot cheaper to put it in the cloud with any of the large cloud providers" than relying on an on-premises data center, Mesirow said. However, organizations that need to analyze significant amounts of data or that can't tolerate latency in that analysis need to keep the data closer to the devices.

Similarly, organizations that introduce machine learning or AI into those processes need to consider whether those elements will work better when run at the edge or in the cloud.

Organizations must also take security into account. Consider the sensitivity of the data generated by endpoint devices as well as how secure those devices are and the investment required to adequately secure data stored at or near the endpoint versus in the cloud.

In some cases, organizations need to evaluate whether they're operating IoT devices in areas that lack the infrastructure to transmit data to the cloud, as well as whether they'll be able to implement that infrastructure and at what cost.

"All these factors all need to be considered in tandem," said Massimo Russo, a managing director and senior partner with Boston Consulting Group.

Plan for analysis at the edge

Although most organizations aren't worried about milliseconds, they might still need some data handled with as little latency as possible. IT experts should consider which data points need to be analyzed in real time, which can be analyzed later and how long to retain the various categories of data. The time needed for analytics and the longevity of retention factor into where organizations should process and store data.

For example, an organization that uses connected cameras in vehicles to monitor driver behavior would find streaming the entire video feed to the cloud too costly and complex, Russo said. At the same time, processing such data in the cloud will not provide the real-time analytics needed to identify and correct problematic driver behavior.

In such cases, organizational leaders should have infrastructure to analyze video in real time at the edge and transmit only data about problematic behavior to the cloud.
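The filtering step can be sketched in a few lines. This is a minimal illustration, not a reference design: it assumes an on-device model has already labeled each frame, and the `FrameResult` type, the label names and the 0.8 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FrameResult:
    timestamp: float  # seconds since the start of the trip
    label: str        # hypothetical label produced by an on-device model
    score: float      # model confidence between 0.0 and 1.0


def select_events_to_upload(results, flagged_labels, min_score=0.8):
    """Keep only confident detections of flagged behavior for the cloud."""
    return [
        r for r in results
        if r.label in flagged_labels and r.score >= min_score
    ]


results = [
    FrameResult(0.0, "normal", 0.99),
    FrameResult(1.0, "phone_use", 0.91),
    FrameResult(2.0, "phone_use", 0.55),
    FrameResult(3.0, "drowsy", 0.87),
]
to_upload = select_events_to_upload(results, {"phone_use", "drowsy"})
print(f"{len(to_upload)} of {len(results)} results sent to the cloud")
```

Everything else stays on the device, where it can be discarded or held for a limited period.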

Determine what data needs to be retained, and what does not

Although organizations see data as an asset, not all data holds value. Organizations should keep that in mind as they determine their IoT data storage needs, recognizing that they'll save money and possibly reduce risk by storing only essential data.

For example, a grocery chain with endpoint devices taking frequent temperatures in its refrigerators and freezers may only need average daily temperatures to run through its analytics applications, in which case, moving and storing every single temperature reading taken by those devices would be excessive. That's particularly true if there are hundreds, thousands or even tens of thousands of endpoint devices constantly generating temperature readings in the organization.
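As a rough sketch of that kind of edge-side reduction, the snippet below collapses raw readings into one average per device per day. The device names, timestamps and the `daily_averages` helper are hypothetical, and a real deployment would likely also keep out-of-range readings for food-safety alerts.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_averages(readings):
    """Reduce (device_id, ISO timestamp, temperature in C) readings to daily means."""
    buckets = defaultdict(list)
    for device_id, timestamp, temp_c in readings:
        day = datetime.fromisoformat(timestamp).date()
        buckets[(device_id, day)].append(temp_c)
    return {key: round(mean(values), 2) for key, values in buckets.items()}


readings = [
    ("freezer-1", "2024-01-01T00:00:00", -18.0),
    ("freezer-1", "2024-01-01T12:00:00", -17.0),
    ("freezer-1", "2024-01-02T00:00:00", -19.0),
    ("fridge-7", "2024-01-01T06:00:00", 4.0),
]
summary = daily_averages(readings)
for (device_id, day), avg in sorted(summary.items()):
    print(device_id, day, avg)
```

Only the summary rows would be transmitted and stored, so the volume no longer depends on how often each device samples.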

"If I can do a lot of processing on the edge, then what I store afterward is a lot less," Russo said.

Store data with purpose-built edge devices

Organizations can also opt to store data at the edge, where it is created, for a set period of time. That decision would allow the company to retrieve the data if needed.

For example, an organization using connected cameras could store the entire video feed on the camera. The organization would then have to consider its data retention needs against the storage capacity of the edge devices, the security requirements and the cost.
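One way to frame that trade-off is to estimate how many days of continuous video a device can hold before it must overwrite old footage. The sketch below is back-of-the-envelope arithmetic under stated assumptions: a constant bitrate, the whole card available for video, and hypothetical figures of a 128 GB card and a 2 Mbps stream.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def retention_days(capacity_bytes: float, bitrate_bps: float) -> float:
    """Estimate days of continuous recording that fit in the given capacity."""
    bytes_per_day = bitrate_bps / 8 * SECONDS_PER_DAY
    return capacity_bytes / bytes_per_day


# Hypothetical camera: 128 GB of local storage, 2 Mbps video stream.
days = retention_days(128e9, 2e6)
print(f"About {days:.1f} days of footage fit on the device")
```

If the required retention period is longer than the estimate, the organization would need larger local storage, a lower bitrate or another tier to hold the footage.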

Although some endpoint devices can store and process data, many can only handle lightweight analytics and have limited storage capacity, experts said. Moreover, they warn that designing an IoT environment that puts too much analytics and storage capacity on the endpoint devices creates a very expensive and heavy infrastructure.

Organizations that want or need to have the data stay at the edge, but aren't able to process and store data on the devices, can use edge gateways or micro data centers to balance local storage needs, said James Staten, vice president and principal analyst with Forrester Research.

"We're seeing more capabilities that are put near, in or on the IoT equipment itself," Staten said.
