Data science is transforming how we approach problem-solving, from preventing infectious diseases to optimizing power grids. Putting theory into practice, however, isn’t always so easy. According to Gartner, 48 percent of companies invested in big data in 2016, but growth has slowed due to challenges with deployment strategies. Several factors contribute to these challenges, such as how data science is embedded within an organization, how the appropriate data infrastructure is built and deployed, and the quality and availability of the data itself. While the data collected and analyzed across industries differs significantly, there are parallels in the approaches and challenges.
Prior to working as a data scientist in industry, my research was focused on data-driven modeling of infectious-disease transmission, control, and the immune response. In this arena, I developed many of the modeling skills I use in my job today—taking data from asset maintenance and operations to create models that contribute value to industrial organizations.
There are many parallels between applications in industrial settings and healthcare. Take, for example, influenza, a well-known respiratory-tract infection transmitted from person to person either through direct contact or through coughing, sneezing, or talking. The impacts of an influenza epidemic range from increased morbidity rates in the immunocompromised (think young or elderly) to economic losses from decreased productivity due to the incapacitating effects of the illness.
When someone visits a doctor after contracting influenza, his or her symptoms and treatments are recorded in medical records. This data may include body temperature, heart rate, blood pressure, when symptoms began, and of course patient characteristics including age, weight, height, and ethnicity. If an epidemic has spread throughout a community, symptoms and treatments from many doctor visits will be recorded.
Through the analysis of this aggregated patient data, we can develop data-driven approaches to diagnose and treat patients. Past cases can suggest treatment options and help identify treatment strategies based on patient profiles (personalized medicine). For high-risk patients, home health monitors can detect worsening health conditions. This data can even shed light on the most effective treatment centers and hospitals, identify susceptible populations, and determine optimal intervention and control strategies.
This approach in healthcare mirrors how we approach asset health in industrial organizations. In the industrial sector, one of the most traumatic events for any organization is asset failure. From stopped production and costly physical damage to safety concerns for employees and the environment, asset failure remains a top risk. While asset failure itself is not comparable to human health, similar methods are used for analyzing the data. Connected sensors and historical maintenance data can be used to analyze equipment failures across similar asset types and identify patterns in asset failure. Classification of past failure events can identify recurring failures as candidates for risk-mitigation strategies, and repair options can be suggested based on past fixes. Condition monitoring of asset health using real-time data gives early warning of potential failures, which can be used to avoid unplanned downtime.
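To make the condition-monitoring idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a description of any production system: the function name `early_warnings`, the window size, the threshold, and the vibration readings are all hypothetical. Each new sensor reading is compared with the mean and standard deviation of the readings just before it, and is flagged when it deviates by more than a chosen number of standard deviations.

```python
from statistics import mean, stdev


def early_warnings(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate strongly from the trailing window.

    Hypothetical helper for illustration: a simple rolling z-score check.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        # A perfectly flat baseline has zero spread; skip it to avoid
        # dividing by zero (a real monitor would need a rule for this case).
        if sigma == 0:
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Made-up vibration readings (arbitrary units); the last value jumps upward.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.9]
print(early_warnings(vibration))  # prints [7]
```

In practice the threshold and window would be tuned per asset type, and a flagged reading would more likely prompt an inspection than an automatic shutdown.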
The art of data science is nuanced within each business sector, but what remains consistent is its alignment with business goals. A healthcare facility may look to improve the quality of patient care in a cost-effective manner; this business objective can be tackled by better coordinating patient data and then developing methods to identify actionable insights in that data and deliver them to the medical practitioner in the right way at the right time.
The same concept applies to my organization, where a business goal is to deliver analytics, in a consumable manner, within a software solution that enables companies to intelligently manage their assets. Coordinating and integrating asset data, and connecting that data with use cases and actionable insights, is central to the challenge of data science problem-solving.
Sarah Lukens is a data scientist in asset-performance management with GE Digital.