Nautilus Labs released a white paper comparing how different data sources affect the accuracy of the simulations that form the basis for performance and Voyage Optimization models.
The study evaluated two data science metrics, R² (coefficient of determination) and RMSE (root mean squared error), that measure the accuracy of simulations built with three types of data sets:
- Models based on noon reports only;
- Models based on high-frequency sensor data;
- Models based on a combination of a vessel’s noon reports enriched with high-frequency sensor data of similar vessels.
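For readers unfamiliar with the two metrics, the following is a minimal sketch of how R² and RMSE are conventionally computed. It uses the standard textbook definitions, not code from the white paper, and the fuel consumption figures are hypothetical values chosen only for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: typical prediction error, in the target's units."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r2(actual, predicted):
    """Coefficient of determination: share of variance explained by the model."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")
    return 1 - ss_res / ss_tot


# Hypothetical observed vs. simulated fuel consumption (tonnes/day), illustration only.
actual = [20.0, 22.0, 24.0, 26.0]
predicted = [21.0, 21.0, 25.0, 25.0]

print(f"RMSE: {rmse(actual, predicted):.2f}")
print(f"R^2:  {r2(actual, predicted):.2f}")
```

A lower RMSE and a higher R² both indicate a more accurate simulation, which is why the study reports the two side by side.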
The paper found that simulations built on high-frequency sensor data are the most accurate. Where a vessel is not equipped with sensors, however, simulation accuracy can be significantly improved by feeding the underlying model with a combination of data from the vessel’s noon reports and sensor data from similar vessels.
More specifically, the team made the following findings:
- Models based on sensor data (“HFD”) reliably yielded the most accurate predictions.
- Models based only on noon reports (“Noon-only”) yielded the least accurate predictions.
- Models based on the individual vessel’s noon reports combined with sensor data from similar vessels in the Nautilus data pool (“Noon + Nautilus data pool”) improved on the accuracy of “Noon-only” models by a significant margin in every instance. Measured by RMSE, “Noon + Nautilus data pool” provided 62% of the benefits of “HFD” without the increased ship-specific sensor data requirements. Measured by R², it improved predictive power over “Noon-only” by 33%.
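The announcement does not say how the two percentages were derived. One plausible reading, sketched below purely as an assumption, is that the RMSE figure is the share of the error gap between “Noon-only” and “HFD” that the combined model closes, and that the R² figure is a relative improvement over the “Noon-only” score. The function names and all input values are hypothetical placeholders, not results from the white paper.

```python
def share_of_gap_closed(baseline_rmse, candidate_rmse, best_rmse):
    """Fraction of the baseline-to-best RMSE gap that the candidate model closes.

    Assumed interpretation: 0.0 means no better than the baseline,
    1.0 means as good as the best model.
    """
    gap = baseline_rmse - best_rmse
    if gap <= 0:
        raise ValueError("baseline RMSE must be larger than the best RMSE")
    return (baseline_rmse - candidate_rmse) / gap


def relative_improvement(baseline_score, candidate_score):
    """Relative gain of a higher-is-better score (such as R^2) over a baseline."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive")
    return (candidate_score - baseline_score) / baseline_score


# Placeholder values for illustration only; they are NOT taken from the study.
noon_only_rmse, pooled_rmse, hfd_rmse = 10.0, 7.0, 5.0
noon_only_r2, pooled_r2 = 0.60, 0.75

print(f"Share of HFD benefit: {share_of_gap_closed(noon_only_rmse, pooled_rmse, hfd_rmse):.0%}")
print(f"R^2 improvement:      {relative_improvement(noon_only_r2, pooled_r2):.0%}")
```

Under this reading, a figure such as 62% would mean that the combined model recovers most, but not all, of the accuracy gained by installing sensors on the vessel itself.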
“While high-frequency sensors are the gold standard in data collection for seafaring vessels, the reality is that many fleets may not yet be fully equipped with sensors. Being able to produce more accurate simulations even for vessels without sensors brings us much closer to achieving fleet-wide optimization and efficiency rooted in machine learning-based simulations,” said Todd Sundsted, Chief Technology Officer (CTO).