Understand IoT data management essentials

The flood of data from IoT devices requires careful planning of infrastructure and data management, and current processes might not be up to the task.

IoT data management has changed the way organizations must design their infrastructure to gain the most advantages from IoT technology. The changes caused by IoT and demand for real-time analysis may not be intuitive for IT pros when, historically, they secured data in a centralized, on-premises data center.

Jason Carolan, chief innovation officer of Denver-based data center and colocation company Flexential Corp., discussed the best practices for creating secure infrastructure for IoT data management. The organization augments its data centers with network connectivity to reduce latency, provides cloud services and offers security and compliance consulting.

What is key to creating effective IoT data management infrastructure?

Jason Carolan: There's a term called fog computing: building out applications around the intelligence of things instead of IoT. There's also edge computing, which brings compute and storage closer to the end application so that data can be classified at ingestion in real time.

If you think about all the data that IoT platforms create -- often, the data is related to security or compliance -- you don't want your platform to break; you don't want to lose data. Being able to do more processing hyperlocally is where infrastructure is going. Applications are becoming much more distributed.

When we think about the internet and cloud computing, cloud became very centralized. The larger cloud platforms are typically either on the coast or scattered in the middle of the country. It's very common for a packet to take 50 to 80 milliseconds to move across the U.S. Applications also aren't written just to do one query and then come back and stop; it's more of a constant set of queries, so it's very easy to add a lot of latency when packets keep traversing the network.
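To make the cumulative effect concrete, here is a rough back-of-the-envelope sketch. It assumes each query waits for the previous reply and costs one network crossing in each direction; the 20-query count and the 5 ms edge figure are illustrative assumptions, not numbers from the interview.

```python
def total_network_latency_ms(crossing_ms: float, sequential_queries: int,
                             crossings_per_query: int = 2) -> float:
    """Time spent on the wire when queries run one after another.

    crossing_ms: time for one packet to traverse the network once.
    crossings_per_query: assumed to be 2 (request out, response back).
    """
    return crossing_ms * crossings_per_query * sequential_queries


# Hypothetical workload: 20 sequential queries per user action.
cross_country = total_network_latency_ms(crossing_ms=50, sequential_queries=20)
nearby_edge = total_network_latency_ms(crossing_ms=5, sequential_queries=20)

print(f"cross-country: {cross_country} ms, nearby edge site: {nearby_edge} ms")
```

Under these assumptions, the chatty application spends about two seconds on the network when it talks to a distant region, versus a fraction of a second when the compute sits close by.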

By using edge computing and developing edge-native applications, you're able to reduce that processing time. You're able to improve the security of the platform. You're able to improve the reliability of the platform by doing compute processing closer to the devices. It really depends on the application, but the trend is getting compute, network and storage much closer to the devices so that their data can be processed in real time. There might be a copy of that data that ultimately goes into the hyperscale platforms like Amazon for longer-term analysis -- and we often see that -- so it might be ingesting some data locally and doing some analytics in real time.
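The pattern described here, acting on data locally while keeping a copy for later analysis, might look something like the following minimal sketch. The `Reading` and `EdgeNode` names, the temperature field and the threshold are all hypothetical; a real edge node would forward `cloud_batch` to a cloud service rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Reading:
    device_id: str
    temperature_c: float


@dataclass
class EdgeNode:
    alert_threshold_c: float
    alerts: list = field(default_factory=list)       # handled locally, in real time
    cloud_batch: list = field(default_factory=list)  # copy kept for long-term analysis

    def ingest(self, reading: Reading) -> str:
        """Classify a reading at ingestion and queue a copy for the cloud."""
        label = "alert" if reading.temperature_c >= self.alert_threshold_c else "normal"
        if label == "alert":
            self.alerts.append(reading)
        self.cloud_batch.append(reading)
        return label


node = EdgeNode(alert_threshold_c=80.0)
for r in (Reading("pump-1", 72.5), Reading("pump-2", 91.0)):
    print(r.device_id, node.ingest(r))
```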

How has IoT data changed security practices?

Carolan: What we see now is everything is network-connected, but it takes a real governance practice to make sure that those devices are kept up to date. They often have embedded operating systems that people don't know about, but they can also be a large honeypot sitting there for somebody to hack into.

This is an emerging space that needs a lot of care, feeding and attention because developers tend to develop code and it may not be safe. They get it into production. The business needs to keep moving forward because it's a new application -- it's a new feature and it's critical -- but security really is critical to these types of use cases, especially if you think about our electrical grid, about a car rolling down the road. That means ensuring that we've got certificates in place, ensuring that the code is certified and has been written by the right folks. Security is an area that will require some federal mandates to cover the regulatory aspect of these types of workloads.

What would you recommend IT professionals do to secure IoT data and edge computing?

Carolan: IoT security is an extension of the defense-in-depth approach. I like to call it the security onion. You peel away layers and layers, which are often reinforcing technologies. Things like passwords are great, but multifactor authentication is better. Blacklisting is great -- where you prevent something from happening. But whitelisting is even better -- where you specifically say this application, this device has this type of connectivity profile and it's allowed to do these types of things.

Typically, the industry is very much used to blacklisting the things that aren't permitted, but there's a shift toward whitelisting, where we're saying we've got to explicitly permit this device to do the things that it's doing. There are also authentication layers that ensure the device is authenticated on the network. There are access control protocols -- such as the Extensible Authentication Protocol (EAP) -- that the network can use to grant access to the network.
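As a rough illustration of the whitelisting idea, the sketch below denies everything by default and permits a connection only when it matches a device's explicit connectivity profile. The device ID, hostname, port and `ConnectivityProfile` structure are hypothetical, chosen only to show the shape of the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectivityProfile:
    allowed_destinations: frozenset
    allowed_ports: frozenset


# Hypothetical allowlist: each known device gets an explicit profile.
ALLOWLIST = {
    "thermostat-01": ConnectivityProfile(
        allowed_destinations=frozenset({"edge-gateway.local"}),
        allowed_ports=frozenset({8883}),
    ),
}


def is_permitted(device_id: str, destination: str, port: int) -> bool:
    """Default deny: unknown devices and unlisted connections are refused."""
    profile = ALLOWLIST.get(device_id)
    if profile is None:
        return False
    return destination in profile.allowed_destinations and port in profile.allowed_ports


print(is_permitted("thermostat-01", "edge-gateway.local", 8883))
print(is_permitted("thermostat-01", "example.com", 443))
print(is_permitted("unknown-cam", "edge-gateway.local", 8883))
```

The difference from a blacklist is the default: a device that nobody has described is refused, rather than allowed until someone notices it.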

You want to use those different protocols and messaging layers to ensure the network is safe, so you're not connecting directly to the server. You're connecting to some middleware that says, 'Hey, is this device authenticated? Do I trust that data?' We're seeing this multi-tiered approach of ingesting data into the network, but then not really trusting it until we can verify that that data is, in fact, the data we're expecting.
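One way to picture that middleware tier is a gatekeeper that asks both questions before anything reaches the server. The sketch below is an assumption-heavy illustration, not a description of any particular product: it uses a per-device shared secret with an HMAC signature for the authentication step and a simple range check for the "is this the data we're expecting" step. The device ID, key, field name and range are made up.

```python
import hashlib
import hmac
import json

# Hypothetical per-device shared secrets held by the middleware.
DEVICE_KEYS = {"sensor-7": b"demo-shared-secret"}


def _canonical(payload: dict) -> bytes:
    return json.dumps(payload, sort_keys=True).encode()


def sign(device_id: str, payload: dict) -> str:
    """What the device would compute before sending."""
    return hmac.new(DEVICE_KEYS[device_id], _canonical(payload), hashlib.sha256).hexdigest()


def accept(device_id: str, payload: dict, signature: str) -> bool:
    """Middleware check: is the device authenticated, and is the data plausible?"""
    key = DEVICE_KEYS.get(device_id)
    if key is None:
        return False  # unknown device
    expected = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False  # payload was not signed with this device's key
    value = payload.get("temperature_c")
    return isinstance(value, (int, float)) and -40 <= value <= 125


good = {"temperature_c": 21.5}
print(accept("sensor-7", good, sign("sensor-7", good)))
print(accept("sensor-7", {"temperature_c": 21.5}, "not-a-valid-signature"))
```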

How should organizations create a disaster recovery plan for IoT data?

Carolan: The first part of disaster recovery is understanding your risk tolerance and business continuity -- that is, how you can keep using the platform in the event that something goes wrong. How do you continue those activities post-disaster? Understand what the risk tolerance is for the different types of applications that you have.

At Flexential, we came up with maybe a thousand applications, but there are really five that are key. Understand, out of all the applications that you might have and all the data that you might be creating, which are the most important for you to be able to continue running. Typically, you come up with an RTO and RPO -- a recovery time objective, which is how much time you can be down, and a recovery point objective, which is how much data you can tolerate losing. That's the starting point: looking at your applications and your data and asking how to create a plan, whether that's for a disruption of a few minutes, a disruption of a few hours or potentially even a large-scale disaster where you might be disrupted for days or weeks. There are different formulas for all of that.
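A simple way to work with RTO and RPO is to write them down per application and compare them against what the current setup can deliver. The sketch below assumes that worst-case data loss is roughly the interval between backups and that restore time has been measured in a test; the specific durations are illustrative only.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjective:
    rto: timedelta  # how long the application can be down
    rpo: timedelta  # how much data, measured in time, can be lost


def meets_objective(objective: RecoveryObjective,
                    measured_restore_time: timedelta,
                    backup_interval: timedelta) -> bool:
    """True if the measured restore time and backup cadence fit the targets."""
    return (measured_restore_time <= objective.rto
            and backup_interval <= objective.rpo)


# Hypothetical critical application: 1 hour of downtime, 15 minutes of data loss.
critical = RecoveryObjective(rto=timedelta(hours=1), rpo=timedelta(minutes=15))

print(meets_objective(critical, timedelta(minutes=40), timedelta(minutes=5)))
print(meets_objective(critical, timedelta(minutes=40), timedelta(hours=24)))
```

In the second call, a nightly backup fails the check even though the restore is fast enough, because up to a day of data could be lost.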

We typically tell customers that it is very important to ensure those critical applications and data are off-site, but also next to compute and network gear so that they can be brought up very quickly. Some environments also run what I call active-active applications, where disaster recovery is built into the platform. We're starting to see that more, where an organization could have instances running across the country in real time together. But the first thing you have to do is make sure you have good copies of your data. Most of our customers have at least two methods of data recovery, and we see it as a best practice for any critical data to have at least two copies using different technology, different formats and different places.

A great example of data recovery methods would be using a backup technology to make a copy of that data and then storing it on Amazon [Simple Storage Service]. Then, using a disaster recovery tool like Zerto to replicate the data in real time, make a copy someplace else that's different from the copy that's sitting on top of Amazon. Those two methods give you some protection if there was a ransomware issue that had been in your system for a matter of weeks or days. Your likelihood of having a copy that is safe and secure someplace else is much higher by using a second method.
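The two-method rule can be expressed as a small check over an inventory of copies. This is only a sketch: the `DataCopy` structure and the method and location labels are hypothetical, and it checks method and location but not format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    method: str    # e.g. "backup" or "replication"
    location: str  # where the copy lives


def satisfies_two_method_rule(copies: list) -> bool:
    """At least two distinct recovery methods and two distinct locations."""
    methods = {c.method for c in copies}
    locations = {c.location for c in copies}
    return len(methods) >= 2 and len(locations) >= 2


inventory = [
    DataCopy(method="backup", location="object-storage"),
    DataCopy(method="replication", location="secondary-site"),
]
print(satisfies_two_method_rule(inventory))
```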
