In the first part of this two-part series, I discussed one of the biggest challenges to gathering data in the industrial world: how to collect it effectively from various systems without putting their core functions at risk.
In part 2, I'll explore how context can turn the 80% of effort typically spent preparing data into higher-value, higher-impact work as we move toward more advanced AI.
Context is critical
Once an organization has solved the technical problem of gathering data from its systems without compromising their core functions, the next priority is ensuring that the relationships between the data and the underlying physical world or business processes are not lost or obfuscated. We call this metadata or, more commonly, context.
This context enables both higher-order AI and the attending data scientists to work with the structure of the organization and its data without losing the crucial perspective of the underlying physics of the business. Without it, data engineers and scientists must reintroduce that perspective themselves through a process called normalization: reorganizing the data into a consistent structure so it can be properly used for further analysis. Those steps are time-consuming and frustrating.
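To make the normalization burden concrete, here is a minimal sketch of the kind of mapping a data engineer ends up building by hand when context is missing at the source. The tag names, units and conversion rules are all hypothetical; the point is that someone must research and maintain this table for every source system.

```python
# Hypothetical raw readings from two source systems that name and
# structure the same measurement differently.
plant_a = {"TT-101.PV": 78.4}                 # temperature, degrees C
plant_b = {"Line3/TempSensor/value": 173.1}   # temperature, degrees F

# Hand-built mapping a data engineer must research and maintain:
# source tag -> (canonical name, unit-conversion function)
tag_map = {
    "TT-101.PV": ("reactor_temp_c", lambda v: v),
    "Line3/TempSensor/value": ("reactor_temp_c", lambda v: (v - 32) * 5 / 9),
}

def normalize(readings):
    """Rename tags and convert units into one canonical schema."""
    out = {}
    for tag, value in readings.items():
        name, convert = tag_map[tag]
        out[name] = round(convert(value), 1)
    return out

print(normalize(plant_a))  # {'reactor_temp_c': 78.4}
print(normalize(plant_b))  # {'reactor_temp_c': 78.4}
```

If the source systems had carried their context (measurement type, unit, location) with the data, this table would not need to be rebuilt by every downstream team.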
In addition to the hassle of normalization, there is also the risk that the fidelity of the data is compromised. When source data is transformed to fit into consistent data structures, we call this data degradation. These losses compound because they can occur at every boundary or exchange between the original data source and the final data representation used by the AI, often resulting in valuable information being lost.
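One way degradation creeps in is when a rich source record is forced into a narrower downstream schema. The record fields below are hypothetical, but the pattern is common: each hop keeps only what its schema has room for.

```python
# A rich reading as a historian might store it (hypothetical fields).
source_record = {
    "tag": "TT-101.PV",
    "value": 78.4375,
    "unit": "degC",
    "timestamp_ns": 1_700_000_000_123_456_789,
    "quality": "GOOD_CLAMPED",   # value was limited by the sensor range
}

# A downstream "consistent" schema that keeps only (name, value, second).
flattened = (
    source_record["tag"],
    round(source_record["value"], 1),         # precision lost
    source_record["timestamp_ns"] // 10**9,   # sub-second timing lost
)                                             # unit and quality flag lost

print(flattened)  # ('TT-101.PV', 78.4, 1700000000)
```

Any one of these losses might be acceptable in isolation; the problem the article describes is that they accumulate silently across every exchange between the source and the AI.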
The evolution of AI
As AI technology advances, deep learning in particular, the need to normalize data is diminishing. But there is still a need for underlying systems and exchanges that can maintain the full fidelity of the data and propagate the available context.
Without these, data scientists must augment the available data with any needed context themselves, including feature identification and labeling. This late binding of context in the higher-level data management systems does nothing for the core system: the added context never benefits future users and uses of its data, and all that effort and investment are confined to the one higher-order system. Late binding therefore does not address the 80/20 problem and can ultimately add redundant effort.
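In practice, late binding often looks like the sketch below (all names hypothetical): the analyst reconstructs labels in their own analytics layer, and because nothing flows back to the source system, the next team starts from the same opaque columns.

```python
# What arrives from the source system: values with opaque column names.
raw_rows = [
    {"c1": 78.4, "c2": 1.02},
    {"c1": 81.0, "c2": 0.97},
]

# Context the data scientist reconstructs by hand, bound "late" in the
# analytics layer. The source system never learns these labels, so the
# effort is repeated by every future consumer of the data.
column_context = {"c1": "reactor_temp_c", "c2": "feed_ratio"}

labeled = [
    {column_context[k]: v for k, v in row.items()} for row in raw_rows
]
print(labeled[0])  # {'reactor_temp_c': 78.4, 'feed_ratio': 1.02}
```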
In the future, AI itself might eliminate some of the need for late binding. Advances in AI allow much of the underlying business and operational physics to be discovered from the data itself, and some argue that this is actually a better approach to training and learning. Instead of designing systems to maintain context, the effort goes into translating the models and their lessons back to humans so that appropriate actions can be taken and the models can be applied back to the business and operations processes.
Even if the computational resources and data science talent are available for this approach, the models should use the context and language that speak to the domain experts and others closest to the underlying core systems and processes. They are the ones who implement change and translate AI insights into actions.
In short, AI needs to speak the language of the humans who are using it.
The autonomous future
We are moving toward an autonomous future: a much more closed loop in which the AI informs a cyber-physical system without direct human intervention.
But even in this world, there are humans with desired outcomes. Humans will set the objectives that an autonomous operation will strive for, and the parameters or constraints that it should operate within, such as zero carbon, zero waste, minimal energy and maximum throughput. The tools we choose today will impact our readiness for this autonomous future. It is a lot more durable to fix a problem at the source than to add tooling on top.
Addressing the fidelity of data as it flows from the source to the AI engine, and ensuring that all available context is exchanged along the way, will help ensure that the data reaching the AI engine is of immediate use. For many legacy systems, tools that augment the data with context in as automated a manner as possible can fill this gap.
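For legacy systems, one partly automated way to recover context is to exploit the naming conventions already embedded in tag databases. The sketch below assumes a hypothetical SITE.AREA.MEASUREMENT_NN convention; real plants vary widely, so any such rule would need to be tailored per system.

```python
import re

# Many legacy tag databases encode context in the tag name itself.
# Assuming a SITE.AREA.MEASUREMENT_NN convention (hypothetical), some
# context can be recovered automatically instead of by hand.
TAG_PATTERN = re.compile(
    r"(?P<site>\w+)\.(?P<area>\w+)\.(?P<measure>[A-Z]+)_(?P<index>\d+)"
)

def augment(tag):
    """Derive structured context from a conventionally named tag."""
    m = TAG_PATTERN.fullmatch(tag)
    if m is None:
        return {"tag": tag}   # convention not followed: keep tag as-is
    return {"tag": tag, **m.groupdict()}

print(augment("PLANT1.LINE3.TEMP_01"))
# {'tag': 'PLANT1.LINE3.TEMP_01', 'site': 'PLANT1', 'area': 'LINE3',
#  'measure': 'TEMP', 'index': '01'}
```

Rules like this cannot recover context that was never recorded, but they can cheaply attach site, area and measurement-type metadata to data from systems that will never be re-instrumented.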
As with the Buddhist parable of 83 problems, this article won’t help you with the existential 84th problem: The desire to not have any problems in the first place. But, hopefully, it will help you put those problems — and opportunities — in perspective.
All IoT Agenda network contributors are responsible for the content and accuracy of their posts. Opinions are of the writers and do not necessarily convey the thoughts of IoT Agenda.