Data scientists increasingly play mission-critical roles in IT, as big data continues to surge and predictive systems grow more essential to organizations. However, data scientists aren't a one-size-fits-all proposition, and different skill sets work with different IT domains. Now, the IT domain that most needs a data scientist is IoT.
IoT architecture differs from conventional IT and cloud architecture because of its broad distribution of devices and networking intricacies. The differences in the kind of data on the edge and how it must be processed are just as important as the architecture. Data processing is where IoT data scientists can significantly influence both the quality and use of data.
An IoT data scientist with the right skills can mitigate many of the challenges of working with data at the edge. Organizations that invest in an IoT data scientist will see improvements in several areas. The IoT data scientist's diverse knowledge base will take pressure and time off IT during project deployment and testing. IT can revisit and adapt decisions about data management and algorithm application continuously and quickly, rather than wait until the system is ready for formal testing. An IoT data scientist can also develop a complete understanding of the IoT system's behavior, both in how it operates and in the potential of its output.
Edge data creates unique challenges
IoT data scientists must understand the differences in the processing and management of data on the edge, where IoT happens, versus traditional infrastructure. The following four aspects show the contrasts:
Preprocessing of data. Data doesn't flow out of IoT in tidy, well-formatted records, as it does in conventional systems. IoT data is often sparse or incomplete. It is subject to the whims of the environment and the state of the machine producing it, and it varies under changing conditions. The data is frequently temporal and time-sensitive. IoT data scientists can apply deep learning to spot conditional shifts in data patterns, make predictive assessments of data quality and fill in the gaps as needed.
Sensor fusion. Increasingly, the state of a machine or a process depends on many IoT sensor inputs. The challenge is to integrate the data from disparate devices meaningfully to boost the quality and mitigate the uncertainty of individual results. Data scientists often must customize data integration, which requires specialized methodology to achieve and validate.
Deep learning and AI on the edge. Many IoT applications, such as facial recognition, need AI but also have a real-time component. In such scenarios, the AI application must learn in real time, as there's no room for the latency created by round trips of data to and from a cloud. Deep learning must occur where IoT data is created: in edge computing nodes.
Real-time processes. Another major consideration is the need to aggregate and correlate IoT data for real-time processes, such as fleet management. IoT data is often unstructured and must be tagged and correctly synchronized in real time for proper use because time windows fluctuate, and some applications require instantaneous best-guess corrections.
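Two of the ideas above, filling gaps in sparse readings and fusing several sensors into one estimate, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production method: the function names `fill_gaps` and `fuse` are hypothetical, the samples are assumed to be sorted by timestamp, plain linear interpolation stands in for the deep learning approach the article describes, and the fusion step assumes each sensor reports an independent reading with a known variance (inverse-variance weighting).

```python
from typing import List, Optional, Tuple


def fill_gaps(samples: List[Tuple[float, Optional[float]]]) -> List[Tuple[float, float]]:
    """Fill missing values (None) in (timestamp, value) samples.

    Assumes samples are sorted by timestamp. Gaps between two known
    readings are linearly interpolated; leading or trailing gaps take
    the nearest known value.
    """
    known = [(t, v) for t, v in samples if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")

    filled = []
    for t, v in samples:
        if v is None:
            before = [(tk, vk) for tk, vk in known if tk < t]
            after = [(tk, vk) for tk, vk in known if tk > t]
            if before and after:
                (t0, v0), (t1, v1) = before[-1], after[0]
                # Weight the neighbours by how close they are in time.
                v = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
            else:
                v = (before[-1] if before else after[0])[1]
        filled.append((t, v))
    return filled


def fuse(readings: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse (value, variance) pairs with inverse-variance weighting.

    Returns the fused value and its variance. Less noisy sensors
    (smaller variance) get proportionally more weight.
    """
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, readings)) / total
    return value, 1.0 / total


if __name__ == "__main__":
    # One temperature reading dropped at t=1.0.
    print(fill_gaps([(0.0, 1.0), (1.0, None), (2.0, 3.0)]))
    # A precise sensor (variance 1.0) and a noisy one (variance 4.0).
    print(fuse([(10.0, 1.0), (20.0, 4.0)]))
```

The fused variance is always smaller than that of the best individual sensor, which is the sense in which fusion can mitigate the uncertainty of individual results. Real deployments usually need more than this, for example handling correlated sensors or readings that arrive at different times.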
The skills an IoT data scientist must have
All data scientists should be well-versed in machine learning and deep learning, but IoT data scientists also require different skills from traditional data scientists.
An understanding of signal processing. Data streaming into enterprise processes via IoT channels should be treated like a plethora of signals all flowing into a battlefield command center. The timing and relative strength of the signals are crucial to making sense of the intelligence they can convey. Competence with signal processing mathematics, as well as information theory, is a major advantage in making sense of IoT data.
Knowledge of gateway layers. Between the edge and the enterprise there should always be a gateway layer where security, routing and often data aggregation take place. A strong foundation in how this layer works, the ways it can be configured and the hardware and software options available will be a plus for any data scientist who must match the handling of the data to the actual hardware.
Edge analytics. IoT data scientists should understand how edge analytics differs from cloud analytics because organizations increasingly require real-time response and low latency in IoT applications. Cloud analytics is seldom time-sensitive and doesn't always require granular inputs, while IoT analytics usually is time-sensitive and often depends on granular inputs.
An understanding of blockchain. Knowledge of blockchain is becoming necessary at the edge as its use continues to grow. IT experts creatively apply blockchain to heighten security beyond enterprise firewalls and to audit transactions in decentralized field environments.
Personal skills. IoT data scientists must be cross-disciplinary by nature and capable of multitasking. More than that, however, they should be curious and innovative, and willing to learn new things when the task requires it. It's also a plus if they work well with a wide range of people in other disciplines.
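To make the signal processing skill listed above more concrete, here is a small sketch of one basic technique: estimating the time offset between two sensor signals with cross-correlation. It is only an illustration under assumptions. The function name `estimate_lag` is hypothetical, both signals are assumed to be sampled at the same fixed rate, and the result is a whole number of samples rather than a precise time.

```python
from typing import Sequence


def estimate_lag(a: Sequence[float], b: Sequence[float], max_lag: int) -> int:
    """Estimate how many samples signal b lags behind signal a.

    Computes the cross-correlation of the mean-centered signals for
    every lag in [-max_lag, max_lag] and returns the lag with the
    highest score. A positive result means b is a delayed copy of a.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Remove the mean so a constant offset does not dominate the score.
    ac = [x - mean_a for x in a]
    bc = [x - mean_b for x in b]

    def score(lag: int) -> float:
        # Sum the products of samples that overlap at this lag.
        return sum(
            ac[i] * bc[i + lag]
            for i in range(len(ac))
            if 0 <= i + lag < len(bc)
        )

    return max(range(-max_lag, max_lag + 1), key=score)


if __name__ == "__main__":
    # The same pulse, seen two samples later by the second sensor.
    first = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    second = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
    print(estimate_lag(first, second, max_lag=3))
```

Knowing the lag lets readings from different devices be aligned before they are correlated or fused. With long or noisy signals, a library routine based on the fast Fourier transform would normally replace this direct loop.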