Caterpillar Uses Big Data, Data Analytics, and Machine and Deep Learning to Build Ground-Truth for Training, Validating, and Deploying Classifiers

“We can access machine learning capabilities with a few lines of MATLAB code. Then, using code generation, engineers can deploy their trained classifier into the machine without manual intervention or delays in the process.”

Key Outcomes

  • Sped up design iterations
  • Automated labor-intensive tasks such as ground-truth labeling and comparison of classifier performance
  • Used machine learning apps and code generation to move quickly from collected data to an improved classifier running on the machine

Caterpillar, in collaboration with MathWorks, developed a big data infrastructure with three parts: a web front end to leverage external labelers, a database for searching and retrieving labeled ground-truth, and an engineer interface. This interface, which covers machine learning, visualization, and code generation, enables function developers to use the labeled ground-truth for training, validating, and deploying classifiers.
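As a rough illustration of how labeled ground-truth can feed training and validation, the Python sketch below trains two toy classifiers on a small hand-made dataset and compares their validation accuracy. Everything in it is hypothetical: the single sensor feature, the 0/1 labels, the data values, and both classifiers are assumptions made for illustration only. The workflow described here uses MATLAB, and nothing below reflects Caterpillar's actual data or code.

```python
from statistics import mean

# Hypothetical labeled ground truth: (sensor_value, label) pairs.
# The values are made up; label 1 marks the condition of interest.
TRAIN = [(0.20, 0), (0.30, 0), (0.60, 0), (0.55, 1), (0.80, 1), (0.90, 1)]
VALID = [(0.25, 0), (0.40, 0), (0.62, 1), (0.85, 1)]


def train_midpoint(samples):
    """Fit a threshold halfway between the two class means."""
    mean0 = mean(x for x, y in samples if y == 0)
    mean1 = mean(x for x, y in samples if y == 1)
    threshold = (mean0 + mean1) / 2
    return lambda x: int(x > threshold)


def train_nearest_neighbor(samples):
    """Fit a 1-nearest-neighbor classifier (it memorizes the samples)."""
    return lambda x: min(samples, key=lambda s: abs(s[0] - x))[1]


def accuracy(classifier, samples):
    """Fraction of samples whose predicted label matches the ground truth."""
    return sum(classifier(x) == y for x, y in samples) / len(samples)


def compare(trainers, train, valid):
    """Train each candidate on `train` and score it on `valid`."""
    return {name: accuracy(fit(train), valid) for name, fit in trainers.items()}


RESULTS = compare(
    {"midpoint": train_midpoint, "nearest_neighbor": train_nearest_neighbor},
    TRAIN,
    VALID,
)
BEST = max(RESULTS, key=RESULTS.get)
```

Keeping every candidate behind the same train-then-score interface is what makes the comparison easy to automate: adding another classifier means adding one entry to the dictionary passed to `compare`.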

By automating the task of labeling field data, the system reduces the need for human intervention. It also directs engineers to focus their data collection efforts on the conditions where the need is most critical. In addition, the infrastructure scales with the number of users, the amount of data, and the amount of available compute power.
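The article does not say how the system decides which conditions need more data. One plausible approach, sketched below purely as an assumption, is to group validation results by operating condition and rank the conditions by error rate. The condition names and records are invented for the example.

```python
from collections import defaultdict

# Hypothetical validation records: (operating_condition, prediction_was_correct).
RECORDS = [
    ("dry_soil", True), ("dry_soil", True), ("dry_soil", True), ("dry_soil", False),
    ("wet_clay", True), ("wet_clay", False), ("wet_clay", False),
    ("gravel", True), ("gravel", True),
]


def rank_conditions(records):
    """Return (condition, error_rate, sample_count) tuples, worst first."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for condition, correct in records:
        totals[condition] += 1
        if not correct:
            errors[condition] += 1
    ranked = [(c, errors[c] / totals[c], totals[c]) for c in totals]
    # Highest error rate first; ties go to the condition with fewer samples.
    return sorted(ranked, key=lambda r: (-r[1], r[2]))


PRIORITIES = rank_conditions(RECORDS)
```

In this toy data, `wet_clay` ranks first because two of its three predictions are wrong, so it would be the first candidate for additional field data.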