Using the hybrid-data model to bridge cloud & on-premises solutions

March 24, 2021
It provides industrial organizations a balanced approach to data storage that supports local plant management and optimization programs.

By Steve Pavlosky, director of digital product management with GE Digital

Industrial companies are increasingly talking about how they want to use cloud-based analytics to optimize plant operations. Leaders see value in having a single platform for data storage and analytics. The question is: How to leverage these analytics on the vast amount of data created by assets in the plant?

One answer is a hybrid model that enables organizations to bridge the cloud and on-premises solutions to obtain the benefits of both.

IT leaders around the world see the potential value. Pilot projects commence, and then, a couple of months into the project, someone in finance sees the bill for cloud-based data ingestion and storage and the red flags go up. 

The cloud is not cost-effective for the storage of the high-resolution, time-series data generated by machines and process equipment. So how do we leverage the power of cloud-based analytics tools on the vast amounts of data created by modern machines? Fortunately, the answer is clear: a hybrid-data-management model that uses historian technology near the source of the data (in the plant or corporate data center) and moves the relevant data at the right frequency to the cloud for analytics.

Machine and process data gathered in the plants is often collected at one-second (or faster) intervals. When you do the math, at one-second intervals each sensor generates about 2.6 million individual values per month, which need to be processed and potentially stored.
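The arithmetic behind that figure can be sketched in a few lines. This is a minimal illustration that assumes a 30-day month and a fixed collection interval; the function name and parameters are illustrative, not part of any historian product.

```python
# Samples produced by one sensor in a month at a fixed collection interval.
# Assumption: a 30-day month; the default interval is the one-second case.
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_month(interval_seconds: float = 1.0, days: int = 30) -> int:
    """Return how many values one sensor produces over `days` days."""
    return int(days * SECONDS_PER_DAY / interval_seconds)


print(samples_per_month())     # 2592000, i.e. about 2.6 million
print(samples_per_month(0.5))  # 5184000 at a half-second interval
```

Faster collection scales the count linearly, which is why per-value ingestion charges add up so quickly.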

Here’s a common scenario involving a major cloud-based analytics platform provider that charges both for the processing (i.e., loading) and the storage of data. Using their storage option for the application, the cost of processing and storing the data for a year is approximately $200 (US) per sensor. The storage option is limited to about 120 sensors at that rate. Another storage option is slightly cheaper and can store 10 times as much data, but is limited to 1,200 sensors. If you have 1,200 sensors at $150 per year, the cost would be $180,000 per year.

A traditional process historian is purpose-built to store time-series data very efficiently. For a 12,000-tag historian license (10x the max of the cloud-based analytics platform), license and support cost plus the server to run the software would be less than $90,000 over a three-year period. That’s more than 80% savings compared to using the cloud for raw storage. Yes, there is electricity cost to think about and IT overhead. However, with 80% savings, the electricity and IT overhead are incremental.
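One way to reproduce the comparison is shown below. It is a rough sketch using only the figures quoted in this article, and it assumes the savings figure compares three years of the 1,200-sensor cloud option against the three-year historian total; the variable names are illustrative.

```python
# Figures quoted in the article (US dollars).
CLOUD_COST_PER_SENSOR_YEAR = 150  # cheaper cloud storage option
CLOUD_MAX_SENSORS = 1_200         # sensor limit of that option
HISTORIAN_TOTAL_3YR = 90_000      # upper bound: license, support and server
HISTORIAN_TAGS = 12_000           # tags covered by the historian license
YEARS = 3

# Assumption: compare both options over the same three-year period.
cloud_total = CLOUD_COST_PER_SENSOR_YEAR * CLOUD_MAX_SENSORS * YEARS
savings = 1 - HISTORIAN_TOTAL_3YR / cloud_total
historian_per_tag_year = HISTORIAN_TOTAL_3YR / HISTORIAN_TAGS / YEARS

print(f"Cloud, 3 years: ${cloud_total:,}")                        # $540,000
print(f"Savings: {savings:.0%}")                                  # 83%
print(f"Historian per tag-year: ${historian_per_tag_year:.2f}")   # $2.50
```

On these numbers the historian also covers ten times as many tags, so the per-tag gap is wider than the headline percentage suggests.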

The missing element here is the cloud-based analytic suite. With a hybrid-data model, industrial organizations can bridge these two technologies to obtain the benefits of both.

In a hybrid-data model, the industrial historian and cloud work together to meet the needs of both the IT and the operations teams. An industrial historian is highly efficient at storing time-series data at scale; users today often store tens of thousands of tags per plant for local analysis and reporting purposes. Industrial historian technology is built for this purpose with key features for cost effectiveness, efficiency and security.

One of the key technologies used to store this large amount of data is compression, which minimizes the data stored to disk or moved between servers. Industrial historian technology also improves query performance and has built-in aggregation features. Furthermore, industrial historian technology has collectors, which move data from a source to a destination. These collectors can use the compression and aggregation functionality to dramatically reduce the amount of data moved. Using the historian’s collectors, users can define both compression ratios and aggregation so that only the important value changes are sent to the cloud, at the rate the analytics need. The result is that very large amounts of high-resolution data stay at the plant, and only a small subset of that data is sent to the cloud for analytics.
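To make the idea concrete, here is a minimal sketch of the two reductions described above. It assumes a simple deadband filter for compression and a fixed-window average for aggregation; commercial historians may use different or more sophisticated algorithms, and the function names, the 0.5 deadband and the 60-second window are all hypothetical.

```python
from statistics import mean


def deadband_filter(samples, deadband):
    """Keep a sample only if it differs from the last kept value by more than `deadband`."""
    kept = []
    for ts, value in samples:
        if not kept or abs(value - kept[-1][1]) > deadband:
            kept.append((ts, value))
    return kept


def aggregate_mean(samples, window_seconds):
    """Average samples into fixed windows, keyed by window start time."""
    buckets = {}
    for ts, value in samples:
        start = ts - ts % window_seconds
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical one-second readings: small noise, then a step change at t = 90 s.
raw = [
    (t, 20.0 + (0.01 if t % 2 else 0.0) + (5.0 if t >= 90 else 0.0))
    for t in range(120)
]

changes = deadband_filter(raw, deadband=0.5)         # significant changes only
per_minute = aggregate_mean(raw, window_seconds=60)  # one value per minute

print(len(raw), len(changes), len(per_minute))  # 120 2 2
```

In this toy case, 120 raw values shrink to two change events or two one-minute averages. Either reduced stream is what would travel to the cloud, while the full-resolution data stays in the plant historian.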

The hybrid-data model

This combination of an on-premises historian and cloud storage for the specific data required for analytics (the hybrid-data model) gives industrial organizations a balanced approach to data storage that supports local plant management and optimization programs while minimizing overall cost.

For the on-prem technology in the hybrid-data model, it’s important to remember that historians offer significant advantages over relational databases (RDBs), which have helped manufacturers gain more information about their operations by supporting simple operator queries, answering questions such as: Which customer ordered the largest shipment? RDBs are built to manage relationships and are ideal for storing contextual or genealogical information about manufacturing processes, but are rarely the best approach for vast amounts of process-data collection and optimization.

On the other hand, historians are designed for manufacturing and process-data acquisition and presentation. They maximize the power of time-series data and excel at answering the questions manufacturers typically need answered to make real-time production decisions, such as: What was today’s hourly unit production average compared with the same day a year ago or two years ago?
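As an illustration of that kind of question, the sketch below compares the hourly unit production average for two dates. It assumes the data is already available as hourly totals; the readings are invented for the example, and a real historian would typically answer this with its built-in aggregation queries rather than hand-written code.

```python
from datetime import date, datetime
from statistics import mean


def hourly_average(hourly_units, day):
    """Mean units per hour for readings that fall on the given calendar day."""
    values = [units for ts, units in hourly_units if ts.date() == day]
    return mean(values) if values else None


# Hypothetical hourly production totals for the same day in two years.
readings = (
    [(datetime(2021, 3, 24, h), 100 + h) for h in range(24)]
    + [(datetime(2020, 3, 24, h), 90 + h) for h in range(24)]
)

today_avg = hourly_average(readings, date(2021, 3, 24))
year_ago_avg = hourly_average(readings, date(2020, 3, 24))

print(today_avg, year_ago_avg, today_avg - year_ago_avg)  # 111.5 101.5 10.0
```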

Historians offer key advantages over RDBs, including built-in data collection capabilities, faster speeds, higher data compression, robust redundancy, enhanced data security, and quicker time to value.

In a hybrid-data model, compression is particularly important. The powerful compression algorithms of plant- or enterprise-wide historians enable users to store years of data easily and securely online, which enhances performance, reduces maintenance, and lowers costs. Archives can be automatically created, backed up, and purged, enabling extended use without the need for a database administrator.

As a result, industrial organizations can leverage increased process visibility for better and faster decisions, increased productivity, and reduced costs for a sustainable competitive advantage.

For example, asset-performance management (APM) solutions typically leverage on-prem historian technology, which sends the relevant data to the cloud. The APM solution accesses the data from the cloud for analytics and optimization. The hybrid model reduces costs and maintenance while ensuring that process engineers have the data that they need for analysis.

In another example, a food manufacturer uses HMI/SCADA and MES solutions in conjunction with a historian for time-series and alarms and events (A&E) data management in a hybrid on-prem/cloud data model. The historian system collects data at very high speed from multiple data sources, aggregates it, and stores it efficiently and securely. A subset of the data is sent to the cloud and leveraged by analytics software. This solution has reduced raw materials costs and decreased customer complaints by 33%.

With the increase in analytics, industrial organizations cannot fully predict what data they will need to answer the next question. Fortunately, the hybrid-data model lets companies use historian technology as a cost-effective, flexible way to collect all the data and have it available to send to the cloud and drive analytics.