ABS’s latest advisory provides best practices for creating and using metadata to manage digital information, supporting the growing development and adoption of smart, autonomous and remote-control functions in the marine and offshore industries.
Metadata defines and describes other data and what it relates to, providing context to enable data to be processed and translated into usable information.
“Because of its role in translating other data, making it more usable, metadata is an essential element of smart, autonomous and remote-control functions that rely on data to support decisions by humans or systems,” said Patrick Ryan, ABS Senior Vice President, Global Engineering and Technology.
This standard (ISO 8000-61:2016, Data quality – Part 61: Data quality management: Process reference model) provides a holistic process approach to data quality management, overcoming the difficulties of isolated data quality activities. The basic structure of the data quality management process comprises Implementation, Data-related Support and Resource Provision, as shown in Figure 3 (ISO 8000-61:2016, 2016). The central process, Implementation, represents a continuous quality improvement model consisting of four repeating sub-steps based on the ‘Plan – Do – Check – Act’ (PDCA) cycle (ASQ, 2019). The standard also defines the lower-level processes for data quality management; each is described by a purpose, outcomes and activities to be applied to assure data quality.
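To make the shape of the cycle concrete, the following is a minimal Python sketch of a PDCA-style improvement loop applied to a toy completeness measure. The functions, threshold and imputation rule are illustrative assumptions; ISO 8000-61 defines management processes, not code.

```python
from dataclasses import dataclass

@dataclass
class QualityResult:
    completeness: float   # fraction of non-missing values
    meets_target: bool

def measure_quality(records, target):
    """Check step: measure completeness against the target."""
    filled = sum(1 for r in records if r is not None)
    completeness = filled / len(records)
    return QualityResult(completeness, completeness >= target)

def pdca_cycle(records, target=0.95, max_iterations=5):
    """Plan-Do-Check-Act loop: locate gaps, fill them, re-measure, repeat."""
    result = measure_quality(records, target)
    for _ in range(max_iterations):
        if result.meets_target:                                   # Act: stop when met
            break
        gaps = [i for i, r in enumerate(records) if r is None]    # Plan: find gaps
        for i in gaps:                                            # Do: toy imputation
            records[i] = 0.0
        result = measure_quality(records, target)                 # Check: re-measure
    return result

print(pdca_cycle([1.2, None, 3.4, None, 5.0]))
```

In practice the "Do" step would be a tailored cleansing or collection procedure rather than a fixed imputation rule, which is precisely the supplementation the standard leaves to the implementer.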
This process approach is applicable to managing the quality of digital data sets, including both structured and less structured data. However, the standard only provides a process reference model for managing data quality; it cannot serve as a methodology for data quality management. The process approach and PDCA model are defined at a very high level and are more theoretical than practical; they need to be supplemented by detailed methods or procedures tailored to marine and offshore applications to achieve the outcomes of the defined processes.
ISO 14224 (Petroleum, petrochemical and natural gas industries – Collection and exchange of reliability and maintenance data for equipment) provides a systematic approach for data collection and exchange, resulting in improved quality of data (ISO 14224:2016, 2016). It focuses on the data required in various analyses and provides a standardized data collection format for facilitating the exchange of reliability and maintenance data throughout the operational life cycle. It also sets the foundation for a consistent tracking of failures and maintenance records, allowing the prioritization and implementation of corrective actions.
This standard describes data quality control and assurance practices and provides guidance to the user regarding the quality of reliability and maintenance data. First, before data collection begins, a set of planning measures must be completed to ensure that consistent and compatible data are obtained. Second, the standard defines standardized data collection practices to be applied during and after the data collection process.
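As an illustration of what a standardized collection format enables, the following Python sketch defines a simplified failure-event record. The field names and the failure-mode code are illustrative assumptions in the spirit of ISO 14224's data categories, not the standard's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class FailureEvent:
    """Simplified reliability and maintenance record; fields are
    illustrative placeholders, not the ISO 14224 schema."""
    equipment_class: str           # e.g., "pump", "compressor"
    functional_location: str       # where the asset is installed
    failure_mode: str              # standardized code rather than free text
    detection_method: str          # e.g., "inspection", "condition monitoring"
    failure_date: date
    operation_start_date: Optional[date] = None  # needed for time-to-failure

# Records in a common format can be exchanged and compared across operators:
event = FailureEvent(
    equipment_class="pump",
    functional_location="P-101A",
    failure_mode="FTS",            # hypothetical "fails to start" code
    detection_method="inspection",
    failure_date=date(2023, 4, 12),
    operation_start_date=date(2020, 1, 1),
)
print(event)
```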
Issues Associated with CMMS Data and Impacts
The typical data quality issues with CMMS data are summarized as follows:
- Insufficient sample size of the failure data
- Lack of information on time to failure
- Inconsistency in data type or format
- Logical errors – failure events that refer to an asset or functional location which does not exist. Asset data does not contain a list of relevant failure events; instead, a functional location bridges the failure event and the asset, so the connection between them is indirect. Over time, assets can be replaced, repaired or overhauled and installed elsewhere, breaking this link
- Illogical order of dates
- Missing events – missing rows and missing columns
- Implausible reliability data that is not flagged as wrong – reliability data is artificially changed by service technicians or SMEs based on their domain judgement. Telltale signs include implausible dates (e.g., failure on the day of delivery), similar entries (e.g., two sites with identical failures) and round figures (e.g., 5,000 hours)
- Low richness of failure information – for example, unstructured ‘free-text’ failure descriptions are used rather than standardized failure codes, or, when the date of start of operation and the date of failure are unavailable, ‘related’ dates (e.g., the delivery date or reporting date) are recorded instead of actual dates. These issues reduce the accuracy of advanced reliability analysis (a validation sketch follows this list)
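Several of these issues can be screened for automatically. The following Python sketch flags a few of them; the field names, the known-locations check and the round-figure heuristic are illustrative assumptions, not part of any CMMS product or standard.

```python
from datetime import date

def validate_cmms_event(event: dict, known_locations: set) -> list:
    """Flag common CMMS data quality issues; field names are illustrative."""
    issues = []
    # Logical error: event references a functional location that does not exist
    if event["functional_location"] not in known_locations:
        issues.append("unknown functional location")
    # Illogical order of dates: failure recorded before operation started
    if event["failure_date"] < event["operation_start_date"]:
        issues.append("failure date precedes operation start")
    # Implausible date: failure on the day of delivery
    if event["failure_date"] == event.get("delivery_date"):
        issues.append("failure on delivery date (possibly a substituted date)")
    # Round figure: suspiciously rounded time to failure
    ttf = event.get("time_to_failure_hours")
    if ttf and ttf % 1000 == 0:
        issues.append("round time-to-failure figure")
    return issues

print(validate_cmms_event(
    {"functional_location": "P-101A",
     "operation_start_date": date(2020, 1, 1),
     "delivery_date": date(2021, 6, 1),
     "failure_date": date(2021, 6, 1),
     "time_to_failure_hours": 5000},
    known_locations={"P-101A"},
))
```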
Issues Associated with Event Data Generated by Sensors and Impacts
The data quality issues with the event data generated by sensors are summarized below:
- Missing data – The sensor stops providing readings on its interface, resulting in an absence of values/events. This may be caused by data disruption due to constraints from data compression and security-encryption losses
- Stuck/jammed data – The sensor reading becomes jammed and stuck at an incorrect value
- Bias – The observed sensor reading deviates from the expected value by a constant offset or produces a temporal delay (e.g., sample selection bias, time-period bias)
- Invalid data type – An invalid data type appears for the same data element. This may be caused by data disruption due to constraints from data compression and security-encryption losses
- Different data format/pattern – Different data format/pattern for the same data element
- Wrong timestamps or illogical chronological order – Timestamps are mismatched against the expected timeline or are not in chronological order (a detection sketch follows this list)
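Several of these sensor-stream issues can likewise be detected with simple screening rules. The Python sketch below checks for missing readings, stuck values, constant bias and out-of-order timestamps; the thresholds (stuck_run, bias_tol) are illustrative assumptions, not values from the advisory.

```python
def check_sensor_stream(timestamps, values, expected, stuck_run=5, bias_tol=0.5):
    """Flag sensor data quality issues; thresholds are illustrative."""
    issues = []
    # Missing data: absent readings in the stream
    if any(v is None for v in values):
        issues.append("missing readings")
    clean = [v for v in values if v is not None]
    # Stuck/jammed data: the same value repeated over a long run
    run = 1
    for a, b in zip(clean, clean[1:]):
        run = run + 1 if a == b else 1
        if run >= stuck_run:
            issues.append("stuck value")
            break
    # Bias: mean reading deviates from the expected value by a constant offset
    if clean and abs(sum(clean) / len(clean) - expected) > bias_tol:
        issues.append("constant bias")
    # Wrong timestamps: readings not in chronological order
    if any(t2 <= t1 for t1, t2 in zip(timestamps, timestamps[1:])):
        issues.append("timestamps out of order")
    return issues

print(check_sensor_stream(
    timestamps=[0, 1, 2, 4, 3],
    values=[7.0, 7.0, 7.0, 7.0, 7.0],
    expected=5.0,
))
```

Rules like these catch gross faults; distinguishing a genuine process change from sensor bias or delay generally requires redundant sensors or a physical model, which is beyond a simple screening pass.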