Going open-source to handle IIoT real-time data

May 14, 2019
It’s a particularly immense data challenge.

By Jürgen Sutterlüti, VP, energy segment and data analytics with Gantner Instruments

Industrial organizations depend on increasingly vast networks of sensors to gain the real-time insights that drive their business and operational decisions. Supporting these IIoT capabilities, which can now mean collecting 100,000 measurements per second per sensor, requires a particularly robust and finely tuned infrastructure. That infrastructure includes edge-computing devices for nearby low-latency processing, as well as data technologies that can match the flexibility, scalability and performance needs of handling IIoT sensor data at this volume.

It’s a particularly immense data challenge, and it’s one we’ve learned how to navigate from experience.

Our edge-computing devices support the monitoring and analytics used to track conditions such as the stress placed on bridges, the movement and vibration of train tracks, key indicators at energy plants, and other metrics that are integral to performance and safety. To add to the data challenge, monitored metrics in these areas also must adhere to a heterogeneous collection of formats, including analog and digital signals as well as those defined by myriad industrial protocols.

Typical use cases for these edge devices fall into one of three categories:

  • Testing that lasts hours or days and requires data rates reaching 10,000 samples per second
  • Long-term asset monitoring that can extend for years
  • Event-based data logging that can reach an intensive 1,000,000 samples per second and last for a period of minutes. (In this scenario, data is processed immediately by the edge device, with results and raw data files sent to the database thereafter.)
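To give a feel for the volumes these categories imply, here is a rough back-of-the-envelope sketch. The sample size (4 bytes), the single channel, and the durations (24 hours, 5 minutes) are illustrative assumptions, not figures from our deployments; only the sample rates come from the list above.

```python
# Rough sample-count and raw-volume estimates for two of the use-case categories.
# BYTES_PER_SAMPLE and the durations below are illustrative assumptions.

BYTES_PER_SAMPLE = 4  # assumption: one 32-bit value per sample, no timestamp or metadata


def estimate(rate_hz: int, duration_s: int, channels: int = 1) -> tuple[int, int]:
    """Return (total samples, raw bytes) for a given rate, duration and channel count."""
    samples = rate_hz * duration_s * channels
    return samples, samples * BYTES_PER_SAMPLE


scenarios = {
    "test run, 10,000 samples/s for 24 h": estimate(10_000, 24 * 3600),
    "event log, 1,000,000 samples/s for 5 min": estimate(1_000_000, 5 * 60),
}

for name, (samples, raw_bytes) in scenarios.items():
    print(f"{name}: {samples:,} samples, {raw_bytes / 1e9:.2f} GB raw")
```

Under these assumptions, a single channel already produces gigabytes of raw values per test or event, before timestamps, metadata, or additional sensors are counted.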

For us, the correct strategy turned out to be largely an open-source one. Given the challenges of managing and storing real-time data from thousands of sensors, we vetted options and found that an open-source time-series database (CrateDB) and an open-source data streaming platform (Apache Kafka), working in tandem, stood as the most advantageous data-stack option for supporting these data-heavy IIoT use cases. This back-end approach, which we use for our data-layer solution, adds efficiency to our data acquisition, storage and enrichment. It also provides an additional open back end for data handling and storage, sparing internal resources from these duties and helping to accelerate time to market.
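The general pattern of a stream feeding a database can be sketched in a few lines. This is not our production code and it does not use the Kafka or CrateDB client APIs: as loudly labeled stand-ins, Python's `queue.Queue` plays the role of the stream and an in-memory SQLite table plays the role of the time-series store. The table layout, the sensor name `strain-01`, and the batch size are all hypothetical.

```python
import queue
import sqlite3

# Stand-ins (assumptions): queue.Queue plays the role of a stream topic,
# and an in-memory SQLite table plays the role of the time-series store.
stream = queue.Queue()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (ts REAL, sensor_id TEXT, value REAL)")


def publish(ts: float, sensor_id: str, value: float) -> None:
    """Edge side: push one reading onto the stream."""
    stream.put((ts, sensor_id, value))


def drain(batch_size: int = 500) -> int:
    """Consumer side: move queued readings into the store in batches."""
    written = 0
    while not stream.empty():
        batch = []
        while len(batch) < batch_size and not stream.empty():
            batch.append(stream.get())
        # One multi-row insert per batch instead of one insert per reading.
        db.executemany("INSERT INTO readings VALUES (?, ?, ?)", batch)
        db.commit()
        written += len(batch)
    return written


# Simulate 1,200 readings from one hypothetical sensor, then store them.
for i in range(1200):
    publish(i / 1000.0, "strain-01", 0.5 + i * 1e-4)

written = drain()
(count,) = db.execute("SELECT COUNT(*) FROM readings").fetchone()
print(f"wrote {written} readings, table holds {count}")
```

The point of the sketch is the decoupling: the edge side only publishes, while a separate consumer decides how to batch writes into storage, so either side can be scaled or swapped without changing the other.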

This data back-end strategy can adapt and scale as data volume and performance demands increase. That enables IIoT-fueled businesses to better embrace the trend toward more distributed and adaptive monitoring and control applications, an area where fast and efficient data-stream utilization is a must. When we ran benchmarking tests to investigate how best to expand our cloud-connectivity service and data-storage options, it became clear that the tandem approach could yield the flexibility, cost efficiency, and technical capabilities to effectively process real-time data at high volumes.
