For many people, big data is erroneously synonymous with the Hadoop framework. But Hadoop was designed for batch processing and is not well suited to real-time, streaming data such as IoT data. While IoT data has characteristics similar to big data, it is much more complex. IoT data is:
- Messy, noisy, and sometimes intermittent because sensors are often deployed in the field. IoT data is ultimately collected by sensors sitting somewhere – for example, a sensor could be deployed on a telephone pole or street light. Sensors often cut in and out.
- Often highly unstructured, and sourced from a variety of sensors (fixed and mobile)
- Dynamic – “data in motion” as opposed to the traditional “data at rest”
- Sometimes indirect – a relevant quantity often cannot be measured directly; for example, counting people in a certain area requires a video camera with video analytics
Collecting information from sensors and bringing it into one central computing station is not a scalable long-term solution, particularly as the volume of IoT devices and data is forecast to explode. Bringing such a large amount of data to a relatively small number of data centers, where it is then analyzed in the cloud, will simply not scale. It will also be costly, because transporting bits from one place to another costs money.
With so many devices producing so much data, a correspondingly large array of analytics, compute, storage and networking power and infrastructure is essential. Though analytics will be necessary to the growth and business value of IoT, the traditional approach to analytics won’t be the right fit.
To The Edge!
A clear solution that addresses scale and efficiency has arisen: distribute the analytics to the edge, or very close to it. Enterprises must harness the myriad smart devices and their low-cost computational power to run valuable analytics on the device itself. Multiple devices are usually connected to a local gateway, where more compute power is potentially available, enabling more complex multi-device analytics close to the edge.
How does distributed IoT analytics work? The hierarchy begins with “simple” analytics on the smart device itself, continues with more complex multi-device analytics on the IoT gateways, and ends with the heavy lifting, the big data analytics, running in the cloud. This distribution of analytics takes load off the network and the data centers by creating a model that scales. Distributing the analytics to the edge is the only way to progress.
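To make the hierarchy concrete, here is a minimal sketch of the first two tiers in Python. Every name in it (`DeviceSummary`, `summarize_on_device`, `aggregate_on_gateway`, the sensor IDs and readings) is a hypothetical illustration rather than part of any product. It assumes each device reduces its raw samples to a small summary and the gateway combines those summaries into a single report for the cloud.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DeviceSummary:
    """Compact result a device sends upstream instead of raw samples."""
    device_id: str
    count: int
    mean: float
    maximum: float


def summarize_on_device(device_id, readings):
    """Tier 1: 'simple' analytics on the device itself."""
    return DeviceSummary(device_id, len(readings), mean(readings), max(readings))


def aggregate_on_gateway(summaries):
    """Tier 2: multi-device analytics on the gateway."""
    total = sum(s.count for s in summaries)
    # Weight each device mean by its sample count to get the overall mean.
    weighted_mean = sum(s.mean * s.count for s in summaries) / total
    hottest = max(summaries, key=lambda s: s.maximum)
    return {"samples": total, "mean": weighted_mean,
            "hottest_device": hottest.device_id}


# Illustrative raw readings; in practice these stay on the devices.
raw = {
    "sensor-a": [20.0, 21.0, 22.0],
    "sensor-b": [30.0, 30.0, 30.0, 30.0, 35.0],
}

summaries = [summarize_on_device(dev, vals) for dev, vals in raw.items()]
report = aggregate_on_gateway(summaries)  # Tier 3 (cloud) would receive only this
print(report)
```

In this sketch only the one small report travels to the cloud, rather than all eight raw samples, which is the network offloading effect described above.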
Edge IoT analytics is about more than operational efficiency and scalability. Many business processes do not require “heavy duty” analytics, so the data collected, processed and analyzed on or near the edge can drive automated decisions. For example, a local valve can be turned off when a leak is detected.
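The valve example might look something like the following Python sketch of an automated edge decision. The `Valve` class, the flow readings, the threshold and the "three consecutive samples" rule are all assumptions made for illustration; a real deployment would use its own actuator driver and leak criteria.

```python
class Valve:
    """Hypothetical stand-in for a real actuator driver."""

    def __init__(self):
        self.is_open = True

    def close(self):
        self.is_open = False


def leak_detected(inlet_flow, outlet_flow, threshold=0.5, consecutive=3):
    """Report a leak once inlet exceeds outlet by more than `threshold`
    for `consecutive` samples in a row (a guard against noisy readings)."""
    streak = 0
    for flow_in, flow_out in zip(inlet_flow, outlet_flow):
        streak = streak + 1 if flow_in - flow_out > threshold else 0
        if streak >= consecutive:
            return True
    return False


def control_step(valve, inlet_flow, outlet_flow):
    """Runs locally on the device or gateway; no cloud round trip needed."""
    if valve.is_open and leak_detected(inlet_flow, outlet_flow):
        valve.close()
    return valve.is_open


# A sustained gap between inlet and outlet closes the valve.
leaky = Valve()
control_step(leaky, [10, 10, 10, 10], [10, 9, 9, 9])

# A single noisy sample does not.
noisy = Valve()
control_step(noisy, [10, 10, 10], [10, 8, 10])
```

Requiring several consecutive out-of-range samples is one simple way to keep messy, intermittent sensor data from triggering a false shutoff.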
Where latency is a concern, actions can be taken in real time at the edge, avoiding delays between the sensor-registered event and the reaction to that event. This is especially true of industrial control systems, where sometimes there is no time to transmit the data to a remote cloud. A distributed analytics model addresses such issues.
MISTY 1.0 Premium bundles Edge IoT Analytics in addition to everything in MISTY Pro. MISTY Premium has built-in prediction/machine learning models that can be applied to a set of data from sensors on a high-end gateway. An Artificial Intelligence package will soon be added in a future version of MISTY 2.0 Premium.