
Machine Learning Pipeline Phases

Machine learning pipelines and components go through different phases during the course of their development. These phases are based on the status of the pipeline or component learnables. A learnable is a data-dependent parameter that is set when a component processes data using the learn object function. When a pipeline has unset learnables, the pipeline is in the learn phase. When all of a pipeline's learnables are set, the pipeline is in the run phase.

Learn Phase

When you create a pipeline or component object, the data-dependent learnables are empty, which means the pipeline or component is in the learn phase. During this phase, you can modify learn parameters to adjust the behavior of the pipeline to meet your specifications. When you are satisfied with the results, use the learn object function and training data to learn (or train) the pipeline or component. This process sets the learnable values. Learning is different for components and pipelines:

  • Component — The component processes the input data based on the specifications set by the learn parameters and assigns values to the learnables based on the data.

  • Pipeline — A pipeline contains multiple components that process data in a specific order. Each component of a pipeline learns as data passes through the component in the order specified by the pipeline structure. A component learns only after all preceding components have learned. Learning in order ensures that each component sets learnable values using data transformed by the previous components, and prevents inconsistencies between the data used to learn the pipeline and the data used to run the pipeline.
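The ordering rule can be illustrated with a small toy model. The sketch below is written in plain Python and is not the MATLAB pipeline API: the `Center` and `Scale` classes and the `learn_pipeline` helper are hypothetical names invented for illustration. It only shows that the second component sets its learnable from data already transformed by the first.

```python
# Toy model of ordered learning; an illustration only, not the MATLAB API.
from statistics import mean


class Center:
    """Hypothetical component: learns the mean of its input and subtracts it."""

    def __init__(self):
        self.offset = None  # learnable, unset until learn() is called

    def learn(self, data):
        self.offset = mean(data)
        return self.run(data)

    def run(self, data):
        return [x - self.offset for x in data]


class Scale:
    """Hypothetical component: learns the largest magnitude and divides by it."""

    def __init__(self):
        self.factor = None  # learnable, unset until learn() is called

    def learn(self, data):
        self.factor = max(abs(x) for x in data)
        return self.run(data)

    def run(self, data):
        return [x / self.factor for x in data]


def learn_pipeline(components, data):
    """Learn each component in order, feeding it the output of its predecessors."""
    for component in components:
        data = component.learn(data)
    return data


center, scale = Center(), Scale()
output = learn_pipeline([center, scale], [2.0, 4.0, 6.0, 8.0])
```

In this sketch, `scale.factor` is computed from the centered values rather than from the raw input, so running the learned components in the same order on new data applies the same transformations that were in effect during learning.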

Run Phase

A pipeline or component is in the run phase when all its learnables are set and its HasLearned property is true. You can use the learned pipeline or component to process data using the run object function.

During the run phase, all learn parameters are locked and cannot be changed. If you need to change a learn parameter value, use the reset function to unlock the learn parameters and clear all learnable values. This action returns the pipeline or component to the learn phase.

You can update run parameter values at any time, in either phase. Run parameters do not affect the computation of learnable values. Instead, they let you adjust how a component behaves while the pipeline learns or runs.
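The locking behavior described in this section can be modeled with a short toy class. This is a conceptual sketch in Python, not the MATLAB implementation: `ToyComponent`, its `trim` learn parameter, its `digits` run parameter, and the `has_learned` property are all hypothetical stand-ins chosen for illustration.

```python
class ToyComponent:
    """Hypothetical stand-in for a component; an illustration, not the MATLAB API."""

    LEARN_PARAMETERS = {"trim"}  # names that are locked during the run phase

    def __init__(self, trim=0, digits=2):
        self.center = None    # learnable, unset in the learn phase
        self.trim = trim      # learn parameter: values dropped from each end
        self.digits = digits  # run parameter: rounding applied to the output

    @property
    def has_learned(self):
        return self.center is not None

    def __setattr__(self, name, value):
        # Refuse changes to learn parameters once the learnable is set.
        if name in self.LEARN_PARAMETERS and getattr(self, "center", None) is not None:
            raise AttributeError(f"{name!r} is locked; call reset() first")
        super().__setattr__(name, value)

    def learn(self, data):
        kept = sorted(data)[self.trim:len(data) - self.trim]
        self.center = sum(kept) / len(kept)

    def run(self, data):
        if not self.has_learned:
            raise RuntimeError("component is in the learn phase; call learn() first")
        return [round(x - self.center, self.digits) for x in data]

    def reset(self):
        self.center = None  # clear the learnable and return to the learn phase


component = ToyComponent(trim=1)
component.learn([1.0, 2.0, 3.0, 100.0])
learned_center = component.center
component.digits = 0          # run parameters stay editable in the run phase

try:
    component.trim = 0        # learn parameters are locked in the run phase
    locked = False
except AttributeError:
    locked = True

component.reset()             # back to the learn phase
component.trim = 0            # the learn parameter can be changed again
```

The sketch ties the lock to whether the learnable is set, so resetting the component is the only way to make the learn parameter editable again.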

Iterative Pipeline Development

Pipeline development is not always a linear process. A pipeline can move between the learn phase and the run phase many times during development. Two pipeline mechanisms make this iterative process possible.

  • reset function — For a pipeline or component in the run phase, the reset object function unlocks learn parameters and clears any set learnables. You can then adjust a component's learn parameters to tune its behavior. If the component is part of a pipeline, the reset function resets the specified component and all subsequent pipeline components while maintaining the learnable values of any preceding components. Using the reset function returns your pipeline or component to the learn phase.

  • Ordered learning — The learn object function evaluates components in a specific order. If a pipeline contains components that have already learned, the learn object function reuses the set learnables and does not relearn the component. Any learned components must precede all unlearned components in the pipeline order.

Using these tools, you can move a pipeline between the learn and run phases without relearning the entire pipeline. Once you are satisfied with the performance of a component, you can maintain its learnable values and still add new components to the pipeline or modify existing components in the pipeline.
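The interaction between the two mechanisms can be sketched with a toy model. The Python code below is an illustration under stated assumptions, not the MATLAB API: `Step`, `ToyPipeline`, and the `learn_calls` counter are hypothetical names, and the counter exists only to make the reuse of learned components visible.

```python
class Step:
    """Hypothetical component that learns an offset and counts its learn calls."""

    def __init__(self, name):
        self.name = name
        self.offset = None    # learnable
        self.learn_calls = 0  # bookkeeping for this illustration only

    @property
    def has_learned(self):
        return self.offset is not None

    def learn(self, data):
        self.offset = min(data)
        self.learn_calls += 1

    def run(self, data):
        return [x - self.offset + 1 for x in data]

    def reset(self):
        self.offset = None


class ToyPipeline:
    """Hypothetical linear pipeline; an illustration, not the MATLAB API."""

    def __init__(self, steps):
        self.steps = list(steps)

    @property
    def has_learned(self):
        return all(step.has_learned for step in self.steps)

    def learn(self, data):
        for step in self.steps:
            if not step.has_learned:  # reuse learnables that are already set
                step.learn(data)
            data = step.run(data)
        return data

    def reset(self, name):
        """Reset the named step and every step after it; keep earlier learnables."""
        start = [step.name for step in self.steps].index(name)
        for step in self.steps[start:]:
            step.reset()


pipeline = ToyPipeline([Step("a"), Step("b"), Step("c")])
training = [3.0, 5.0, 9.0]

pipeline.learn(training)      # all three steps learn once
pipeline.reset("b")           # clears "b" and "c", keeps "a"
phase_after_reset = "run" if pipeline.has_learned else "learn"
pipeline.learn(training)      # only "b" and "c" learn again
calls = [step.learn_calls for step in pipeline.steps]
```

After the second call to `learn`, the first step has learned once and the two later steps have each learned twice, which matches the behavior described above: earlier learnables are kept, and only the reset portion of the pipeline is relearned.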

Pipeline Phases During Deployment

You can train and execute machine learning pipelines either locally in MATLAB® or while they are deployed to a target. Use the package object function to generate either a standalone application or an archive for deployment on MATLAB Production Server™.

  • With a standalone application, you can execute the pipeline on a system without a licensed copy of MATLAB. Simply share the executable file to allow access to the pipeline.

  • With a deployable archive for MATLAB Production Server, you can integrate the pipeline into web, database, and production enterprise applications. To execute a deployed pipeline, create a client object from MATLAB or Python®. For information on client workflows in MATLAB, see prodserver.pipeline.Client (MATLAB Production Server). For information on client workflows in Python, see matlab.production_server.client.ProdserverClient (MATLAB Production Server).

After you create a pipeline in MATLAB, you can deploy the pipeline in the learn phase, or train the pipeline locally and deploy it in the run phase. A pipeline deployed in the learn phase provides a structural framework for data processing, and allows downstream users to modify learn parameters and train the pipeline using custom data. A pipeline deployed in the run phase provides a data processing workflow tuned for a specific purpose, and allows downstream users to process new data that complies with the data set used to train the pipeline.
