
Get more from your production data

23 February 2021

Strategic Data Science is an essential pillar of every Industry 4.0 scenario. A four-step data mining approach based on CRISP-DM supports successful projects, as outlined by Dingeman Knaap and Tim Foreman.

Whenever Big Data is mentioned, most people think of social media or the analysis of customer behaviour in online commerce. However, strategic data analysis is also gaining momentum in the production environment. Frost & Sullivan believes that data analysis in the industrial sector has immense potential – production efficiency could be increased by about 10%, operating costs could be reduced by almost 20% and maintenance costs could be cut by 50% by making use of the data that already exists in the production process.


The issue for many factories today is that while the data is easily collected and stored, little happens after that, and important insights that are hidden in the available information are lost. In addition, there is often a lack of budget and personnel to devote to this task. But those who overcome these hurdles and focus on Industrial Data Science will soon gain new insights to transform their production environment into a data paradise.

Beyond manual analyses

Channelling the huge flood of data and extracting value from the information collected by sensors, controllers and machines is undoubtedly a complex task, as it involves more than standard statistical methods and tools. Manual evaluations and the creation of dashboards and reports are not enough. One reason for this is that dashboards become increasingly complicated as data volume expands. They also fail to show the relevant information at the right time, so an operator cannot see at a glance what is going on and take action. The routines implemented in a normal machine control system for monitoring production processes and detecting errors can identify current deviations and problems. However, they are not able to predict future problems, link information in a meaningful way or perform advanced analysis.

The central task of data analysis in Industry 4.0 scenarios is to extract decision-relevant information from collected data and present it to the right user at the right time. This involves planning the process of converting data into useful information in a conscientious and well-founded manner before implementing it. The process requires close cooperation between data experts (data scientists) and specialists in production processes who know the story behind the data.

Volume, variety & velocity

Data scientists are especially familiar with the '3 Vs' of large data sets: Volume, Variety and Velocity. A modern packaging machine, for example, can easily generate gigabytes of data per day that can be stored over a long period of time. Inspection machines may generate many terabytes each day. Storing this amount of data is not a problem, but using it is a challenge. Furthermore, machines today not only produce more data; the range of data types is also much broader than it was a few years ago. It covers stored measured values as well as raw information from sensors and other metadata, and not only maintenance results but also the associated images. Additionally, data can be generated by the machine operator. This includes cycle times and even written and spoken feedback.
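To give a feel for how quickly such volumes add up, the following back-of-the-envelope sketch estimates daily raw data volume. The sensor count, sampling rate and sample size are illustrative assumptions, not figures from any particular machine.

```python
SECONDS_PER_DAY = 86_400


def bytes_per_day(sensors: int, samples_per_second: int, bytes_per_sample: int) -> int:
    """Raw data volume per day for continuously sampled sensors."""
    return sensors * samples_per_second * bytes_per_sample * SECONDS_PER_DAY


# Assumed example: 20 sensors, one 8-byte sample per millisecond each.
daily = bytes_per_day(sensors=20, samples_per_second=1_000, bytes_per_sample=8)
print(f"{daily / 1e9:.1f} GB per day")
```

Under these assumptions a single machine produces roughly 14 GB of raw sensor data per day, before any images or metadata are counted.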


But that's not all: raw data from sensors is typically read every millisecond and must be treated as streaming data. At the same time, the speed of data analysis is playing an increasingly important role. Updating the dashboards once a day or every hour is simply not enough. An operator needs to be informed about potential problems immediately to avoid difficulties and downtime. Ideally, the machine itself should also be notified in real time so that it can automatically correct itself within the same product cycle. In addition, data may be corrupted by a problem in a sensor or other device, it might go missing or it could be out of date. Because these scenarios can seriously compromise analysis and lead to false conclusions, data scientists must continually check the “Veracity” of the data – a fourth “V”.
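A minimal sketch of what such a veracity check on streaming readings might look like is shown below. The `Reading` fields, the value range and the maximum age are hypothetical; real limits would come from the sensor specification and the process experts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reading:
    """One sensor sample; the field names are illustrative assumptions."""
    timestamp_ms: int          # when the sample was taken
    value: Optional[float]     # None models a missing measurement


def veracity_issues(reading: Reading, now_ms: int,
                    low: float = 0.0, high: float = 100.0,
                    max_age_ms: int = 50) -> List[str]:
    """Return the veracity problems found in a single reading."""
    issues = []
    if reading.value is None:
        issues.append("missing")
    elif not (low <= reading.value <= high):
        # Also catches NaN, because comparisons with NaN are always False.
        issues.append("out_of_range")
    if now_ms - reading.timestamp_ms > max_age_ms:
        issues.append("stale")
    return issues


# Usage: check each reading as it arrives from the stream.
stream = [Reading(1000, 42.0), Reading(1001, None), Reading(900, 250.0)]
for r in stream:
    print(r, veracity_issues(r, now_ms=1002))
```

Readings with a non-empty issue list could be excluded from the analysis or reported separately, so that a faulty sensor does not silently distort the conclusions.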

Industrial Data Science is a relatively new discipline, which is why there is no broadly valid approach that is suitable for every company. Every solution and application requires customised data analysis and modelling to achieve the best possible result. However, a standard approach is useful. The CRISP-DM model (Cross-Industry Standard Process for Data Mining) is the most commonly adapted basis. OMRON simplified and tailored CRISP-DM into a new approach with four steps: preparation; analysis and application development; evaluation; and maintenance. More information about these phases can be found in the box below.

Practical example: SMT line

A data-driven solution does not always have to include complex machine learning models or artificial intelligence. Sometimes, effective data processing to provide the right information at the right time in the right way can be enough. An illustrative example of such a data science project can be found in the whitepaper "Data Science Services by Omron – How to get the full value from your factory floor data", which is available for free download. The project was carried out at the Omron Manufacturing of the Netherlands (OMN) factory on surface-mount technology (SMT) lines where electronic components are mounted and soldered onto printed circuit boards (PCBs).

Developing the potential of Big Data in your own production environment is no small feat, but it’s worth doing. In today’s manufacturing environment, it’s not enough to just collect data and build a few graphs. Instead, filtering out production-relevant information from the data and presenting it to the appropriate audience in the right way is vital. The key is to transform data into useful information. This must be done in close cooperation between data scientists and experts in the production process. Only then can a solution be developed that is popular, often used and generates long-term value.

Data science project approach: preparation, analysis and application development, evaluation and maintenance

Phase 1: Preparation

The preparation phase is the most important. A data science project will never be successful if the goal is unclear. Therefore, in this first step, all participants and domain experts work through the problem or the specific requirement in order to arrive at a clearly defined project goal. They analyse the machine and/or the production process in detail to gain an overview of which data is already available and which still needs to be collected. In this process, an initial data set is collected and analysed as a kind of feasibility study. At the end of the preparation phase, a report is produced that provides insights into the expected generated value and a realistic ROI.

Phase 2: Analysis and application development

In the following stage, the data is collected over a longer period of time in order to obtain a representative picture of the machine and process behaviour. Depending on the project objective, a data pipeline contains the following stages:

- Data collection: Data is collected from various sources – from raw sensor data to information from MES systems.

- Data pre-processing: The collected data is prepared for the analysis step, transformed, merged and cleaned up.

- Data analytics: The developed analysis algorithms and machine learning models are applied.

- Application: The results and conclusions of the data analysis are made available. Examples include visualisations tailored to the situation or target group, or feedback sent directly to the machine.

The necessary machine learning models can be trained and validated together with the other data processing steps. If the validation is successful, an application is developed based on the described data pipeline, which can easily be implemented and executed.
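The four pipeline stages listed above can be sketched as a chain of small functions. This is an illustration under stated assumptions rather than a description of any actual implementation: the function names, the hard-coded temperature records and the median-based rule (standing in for the analysis algorithms or machine learning models) are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def collect() -> List[Record]:
    """Data collection: stand-in for reading sensors or an MES export."""
    return [
        {"cycle": 1, "temp_c": 240.1},
        {"cycle": 2, "temp_c": 239.8},
        {"cycle": 3, "temp_c": None},    # missing sample
        {"cycle": 4, "temp_c": 240.3},
        {"cycle": 5, "temp_c": 240.0},
        {"cycle": 6, "temp_c": 251.7},   # unusually hot
    ]


def preprocess(records: List[Record]) -> List[Record]:
    """Data pre-processing: drop records with missing measurements."""
    return [r for r in records if r["temp_c"] is not None]


def analyse(records: List[Record], tolerance_c: float = 5.0) -> List[Record]:
    """Data analytics: flag records far from the typical (median) value."""
    typical = median(r["temp_c"] for r in records)
    return [r for r in records if abs(r["temp_c"] - typical) > tolerance_c]


def present(flagged: List[Record]) -> List[str]:
    """Application: turn the findings into messages for the operator."""
    return [f"Cycle {r['cycle']}: {r['temp_c']:.1f} C is outside the typical range"
            for r in flagged]


def run_pipeline() -> List[str]:
    return present(analyse(preprocess(collect())))


for message in run_pipeline():
    print(message)
```

Keeping each stage as a separate function means that a stage can be replaced, for example swapping the simple rule for a trained model, without touching the others.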

Phase 3: Evaluation

The application is used in the production environment where performance and business results are evaluated. If the performance does not meet expectations, the previous project phases are repeated.

Phase 4: Service and maintenance

Production processes and machine behaviour are subject to constant change over time, for example because of updates or wear and tear. Therefore, the solution must be revalidated regularly to ensure that it still reflects actual conditions and retains its value. In addition, the amount of data available continues to increase, meaning superior models can often be developed. As a result, existing (machine learning) models should be reviewed regularly.
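One simple way to decide when a model is due for review is to compare its recent prediction error with the error measured when it was originally validated. The sketch below assumes a model that predicts a numeric value. The function names and the tolerance factor of 1.5 are hypothetical choices for illustration.

```python
from statistics import mean
from typing import Sequence


def mean_abs_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predictions and measurements."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def needs_revalidation(baseline_error: float,
                       predicted: Sequence[float],
                       actual: Sequence[float],
                       tolerance: float = 1.5) -> bool:
    """True if the recent error exceeds the validated error by the given factor."""
    return mean_abs_error(predicted, actual) > tolerance * baseline_error


# Usage: error at validation time was 0.2; compare against recent production data.
recent_predictions = [10.0, 10.0, 10.0]
recent_measurements = [10.5, 9.4, 10.6]
print(needs_revalidation(0.2, recent_predictions, recent_measurements))
```

A check like this could run on a schedule, so that a review is triggered by evidence of degraded performance rather than by the calendar alone.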

Dingeman Knaap is senior R&D engineer, Omron Europe, and Tim Foreman is European R&D manager, Omron Europe


Key Points

  • Utilising the data already available in the production process, production efficiency can be increased and operating costs reduced
  • Systems may generate many terabytes each day; storing this amount of data is not a problem, but using it is a challenge
  • A data-driven solution does not always have to include complex machine learning models or artificial intelligence