4 industrial IoT data misconceptions that hold back its progress

The industrial internet of things and digitized smart factories have outsized potential to meaningfully and permanently improve the efficiencies and capabilities of countless operations. But despite the hype, IIoT adoption and progress have lagged, and I believe this is due to a series of misconceptions that keep the companies and developers that stand to benefit from IIoT from pursuing and implementing the technology (or from doing so in the most effective ways).

IIoT does impose completely new requirements for real-time data management and analysis. That said, when properly understood and approached correctly, these requirements really shouldn’t be the obstacles to adoption that they are still too often made out to be.

Here are four persistent misconceptions about IIoT, along with more accurate interpretations of the challenges businesses in this space do encounter.

Misconception #1: IIoT doesn’t require unique or new database needs

It very much does. IoT analysts, including those at Gartner, point out that IoT poses entirely new challenges in terms of data volume, data and query complexity, and integration. The difference between a machine working in standalone mode and one working within a networked IIoT remote monitoring system is stark. Unfortunately, organizations are making the mistake of trying to implement an IIoT infrastructure based on existing, traditional databases (such as Microsoft SQL Server, Oracle and so forth). In addition to being expensive, these databases are usually technically incapable of meeting the increased requirements created by the massive data volume that must be processed for IIoT success. While traditional SQL databases are easy to use, they are not built to query machine data streams in real time.

Misconception #2: IIoT implementations require a NoSQL database

Even database experts often assume, erroneously, that high volumes of unstructured data amount to an inevitable NoSQL use case. It is true that NoSQL databases are particularly well-suited to support complex and flexible queries, thanks to their efficient scaling and distributed architectures. However, the infrastructures of NoSQL databases are often very complicated, requiring a great deal of attention to their planning, operation and administration. At the same time, in industrial practice there is almost always relational data that must also be stored, including topologies, firmware information, and ERP or article data. Using both relational and non-relational databases means that two different systems must be run in parallel and kept synchronized. Another challenge is that there is no standardized query language for NoSQL databases; each has its own. Using NoSQL databases like Apache Cassandra, Elasticsearch or MongoDB also means enlisting specialized and experienced programmers, who are expensive, if you can even find them. An alternative that avoids these challenges is to replace pure NoSQL databases with newer and more advanced SQL-based systems that combine the familiarity of ANSI SQL with the scalability and flexibility of NoSQL.
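
To make the single-system argument concrete, here is a minimal sketch of sensor readings and relational firmware data living in one store and being queried together with ordinary SQL. Everything in it is an assumption for illustration: the table and column names are hypothetical, and Python's built-in SQLite stands in for the database only because it runs anywhere. It is not one of the scalable SQL systems described above.

```python
import sqlite3

# In-memory SQLite is only a stand-in; it is not an IIoT-scale database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE machines (            -- relational master data (hypothetical)
        machine_id TEXT PRIMARY KEY,
        firmware   TEXT NOT NULL
    );
    CREATE TABLE readings (            -- sensor stream (hypothetical)
        machine_id  TEXT NOT NULL REFERENCES machines(machine_id),
        ts          INTEGER NOT NULL,  -- seconds since epoch
        temperature REAL NOT NULL
    );
""")
conn.executemany(
    "INSERT INTO machines VALUES (?, ?)",
    [("press-1", "2.0.1"), ("press-2", "2.1.0")],
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("press-1", 1, 70.0), ("press-1", 2, 74.0),
        ("press-2", 1, 81.0), ("press-2", 2, 85.0),
    ],
)


def avg_temperature_by_firmware(connection):
    """Average temperature per firmware version, computed with one SQL join."""
    rows = connection.execute("""
        SELECT m.firmware, AVG(r.temperature)
        FROM readings AS r
        JOIN machines AS m ON m.machine_id = r.machine_id
        GROUP BY m.firmware
        ORDER BY m.firmware
    """).fetchall()
    return dict(rows)


result = avg_temperature_by_firmware(conn)
```

Because both kinds of data sit behind one query language, there is no second system to synchronize and no database-specific query dialect to learn.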

Misconception #3: Time-series databases are the answer

Specialized time-series databases are always in fashion. However, it remains a common mistake to choose a time-series database as the basis for an IIoT platform. These databases are often limited in both their functionality and their scalability with intensive parallel usage. In addition to the visualization of data streams, IIoT necessitates support for frequent analysis operations and data model changes. For instance, these processes may be used to properly diagnose and understand the causes of abnormalities within factory production. An IIoT database must also allow for interactive work with real-time data, including simultaneous reading, writing and execution of ad hoc queries, for use cases such as machine learning, under heavy load.

Additionally, the need for agile processes requires that an IIoT database be able to adapt or extend data schemas at runtime. This is because raw sensor data, ERP data, quality data and so forth are combined to examine production anomalies. For example, anomalies may be associated with certain jobs or traced to specific raw materials from certain suppliers.
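
As a rough illustration of what extending a schema at runtime and correlating anomalies with ERP data can look like, consider the sketch below. The tables, the `is_anomaly` column and the vibration threshold are all hypothetical, and SQLite is again used purely as a convenient stand-in for whatever database an IIoT platform actually runs on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (job_id TEXT PRIMARY KEY, supplier TEXT NOT NULL);
    CREATE TABLE readings (job_id TEXT NOT NULL, vibration REAL NOT NULL);
""")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [("J1", "Acme"), ("J2", "Bolt"), ("J3", "Acme")],
)
conn.executemany(
    "INSERT INTO readings (job_id, vibration) VALUES (?, ?)",
    [("J1", 0.9), ("J1", 0.2), ("J2", 0.1), ("J2", 0.3), ("J3", 0.8)],
)

# Extend the schema while the data is already in place: add a quality flag.
conn.execute(
    "ALTER TABLE readings ADD COLUMN is_anomaly INTEGER NOT NULL DEFAULT 0"
)

VIBRATION_LIMIT = 0.5  # assumed threshold, chosen only for this example
conn.execute(
    "UPDATE readings SET is_anomaly = 1 WHERE vibration > ?",
    (VIBRATION_LIMIT,),
)


def anomalies_by_supplier(connection):
    """Map supplier -> (anomalous readings, total readings)."""
    rows = connection.execute("""
        SELECT j.supplier, SUM(r.is_anomaly), COUNT(*)
        FROM readings AS r
        JOIN jobs AS j ON j.job_id = r.job_id
        GROUP BY j.supplier
        ORDER BY j.supplier
    """).fetchall()
    return {supplier: (bad, total) for supplier, bad, total in rows}


report = anomalies_by_supplier(conn)
```

The point of the sketch is that the new column is added without rebuilding the table, and the supplier breakdown is a single join rather than an export between two systems.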

Data model changes of this nature often require teams to completely rebuild their time-series databases, which is (somewhat ironically) time-consuming, not to mention extremely costly. To work around this, many enterprises use a time-series database alongside a separate relational database that handles non-time-series data. While this solution is quick to implement, growth will rapidly make the setup expensive and increase the difficulty of keeping the data in those different databases in sync.

Misconception #4: AI can only be achieved with better, cleaner data than you have

IIoT developers sometimes assume they lack the data or data hygiene to set up successful AI systems. And it may be the case that inadequate data leads to poor AI-controlled automation.

However, the fear that inadequate data automatically means no useful results can be obtained, or that wrong decisions will be made, is simply unfounded. In practice, most companies pursuing IIoT will build a real-time data store to optimize (not replace) human decision-making with AI technologies and machine learning.

A practical approach here is to monitor analysis results and then gradually and automatically clean up the data as it goes through your process. Trying to completely clean all historical data — and thus delaying development and implementation of intelligent IIoT systems until your data reaches perfection — will backfire by leaving your AI systems with a quantity and depth of data that is too small to move forward properly. It’s usually better to simply get started, collect raw data and develop the use cases along the way.
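
One way to picture cleaning data as it flows through, rather than scrubbing the full history first, is the sketch below. It is only an illustration under stated assumptions: the `Reading` and `CleanedReading` types, the valid range and the choice to fall back to the last good value are all hypothetical, and a real system would pick its own rules per sensor.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    value: Optional[float]  # None models a dropped sample


@dataclass(frozen=True)
class CleanedReading:
    sensor_id: str
    raw: Optional[float]    # original value, always kept
    value: Optional[float]  # value to use for analysis
    issue: Optional[str]    # None when the reading was fine


VALID_RANGE = (-40.0, 150.0)  # assumed plausible range for this sensor type


def clean_stream(readings: Iterable[Reading]) -> Iterator[CleanedReading]:
    """Flag bad readings and substitute the last good value per sensor."""
    last_good: Dict[str, float] = {}
    low, high = VALID_RANGE
    for r in readings:
        if r.value is None:
            issue = "missing"
        elif not (low <= r.value <= high):
            issue = "out_of_range"
        else:
            last_good[r.sensor_id] = r.value
            yield CleanedReading(r.sensor_id, r.value, r.value, None)
            continue
        # Bad reading: keep the raw value and fall back to the last good one
        # (or None if this sensor has not produced a good value yet).
        yield CleanedReading(
            r.sensor_id, r.value, last_good.get(r.sensor_id), issue
        )


stream = [
    Reading("t1", 20.0),
    Reading("t1", None),
    Reading("t1", 999.0),
    Reading("t1", 21.5),
]
cleaned = list(clean_stream(stream))
```

Keeping the raw value next to the cleaned one means the cleaning rules can be tightened later without losing anything that was collected in the meantime.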

Developing an accurate perspective

Organizations in a position to benefit from IIoT should understand that implementing this technology will require wholly new data management and analysis capabilities. Pipelines of sensor data — delivering thousands or even hundreds of thousands of readings per minute in dozens of message formats — must be integrated and analyzed in real time in order to properly monitor, predict and control the behavior of the things in the system. Fast acquisition and analysis of machine data is a prerequisite, while data-driven automation is key to the success of a future-proof IIoT project. IIoT-empowered facilities require data management systems able to:

  • Ensure rapid development and time-to-value;
  • Enable real-time data analysis;
  • Maintain consistent uptime; and
  • Ensure low IT operating costs for hosting, integration and administration.
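
As a small illustration of the integration step mentioned above, where readings arrive in many message formats and must end up in one analyzable shape, the following sketch normalizes two formats into a common record. The `Measurement` type, the field names and both wire formats are assumptions made for the example; real deployments will have their own formats and many more of them.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    sensor_id: str
    ts: int       # seconds since epoch
    value: float


def parse_json(payload: str) -> Measurement:
    """Parse a hypothetical JSON message: {"sensor": ..., "ts": ..., "value": ...}."""
    doc = json.loads(payload)
    return Measurement(doc["sensor"], int(doc["ts"]), float(doc["value"]))


def parse_csv(payload: str) -> Measurement:
    """Parse a hypothetical CSV message: sensor_id,ts,value."""
    sensor_id, ts, value = payload.strip().split(",")
    return Measurement(sensor_id, int(ts), float(value))


# One parser per format; supporting a new format means adding one entry.
PARSERS = {"json": parse_json, "csv": parse_csv}


def normalize(fmt: str, payload: str) -> Measurement:
    """Convert a raw message in the given format into a Measurement."""
    try:
        parser = PARSERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported message format: {fmt!r}") from None
    return parser(payload)


from_json = normalize("json", '{"sensor": "p1", "ts": 10, "value": 3.5}')
from_csv = normalize("csv", "p1,10,3.5\n")
```

Once every message has been reduced to the same record type, downstream analysis no longer needs to know which device or protocol produced it.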

The means to implement the IIoT applications and smart factories of the future are available to businesses today, if they are able to recognize and transcend the above-mentioned misunderstandings.

All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.
