predict

Predict labels using classification XGBoost model

Since R2026a

    Description

    labels = predict(mdl,X) returns a vector of predicted class labels for the predictor data in the table or matrix X, based on the pretrained XGBoost classification model mdl.

    labels = predict(mdl,X,UseParallel=UseParallel) specifies whether to perform computations in parallel.

    [labels,scores] = predict(___) also returns a matrix of classification scores indicating the likelihood that a label comes from a particular class, using any of the input argument combinations in the previous syntaxes. For each observation in X, the predicted class label corresponds to the maximum score among all classes.

    Examples

    Import a pretrained XGBoost multiclass classification model trained using the NLP dataset with 13 classes. The pretrained model is provided with this example.

    The model was trained in Python and saved as a JSON file using model.save_model('trainedMulticlassXGBoostModel.json').

    load nlpdata
    modelfile = "trainedMulticlassXGBoostModel.json";
    Mdl = importModelFromXGBoost(modelfile)
    Mdl = 
      CompactClassificationXGBoost
                   ResponseName: 'Y'
                     ClassNames: [0 1 2 3 4 5 6 7 8 9 10 11 12]
                 ScoreTransform: 'softmax'
                     NumTrained: 390
        ImportedModelParameters: [1×1 struct]
    
    
      Properties, Methods
    
    

    The model is imported as a CompactClassificationXGBoost model object.

    Use dot notation to view the imported model parameters.

    Mdl.ImportedModelParameters
    ans = struct with fields:
                BaseScore: 0.5000
                Objective: 'multi:softmax'
                  Booster: 'Tree'
        NumBoostingRounds: 30
         HasParallelTrees: 1
               NumClasses: 13
                 IsBinary: 0
    
    

    The parameters indicate that the model is a multiclass classification model trained using the 'Tree' booster. To match the imported model, convert the response data Y to a numeric array of zero-based class indices, and convert the predictor matrix X to a full matrix.

    labels = categories(Y);
    
    % Create mapping with 0-based indexing
    labelMap = containers.Map(labels,0:length(labels)-1);
    
    % Encode Y
    Y = cellstr(Y);
    encodedY = zeros(size(Y));
    for i = 1:length(Y)
        encodedY(i) = labelMap(Y{i});
    end
    
    fullX = full(X);
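
    The loop-based encoding above can also be written without a loop. The following is a minimal sketch, assuming the response is a categorical array with no undefined elements; Ytoy is a hypothetical stand-in for Y and is not part of the example data. It relies on double returning the 1-based category index, in the same order as categories.

```matlab
% Toy categorical response standing in for Y (hypothetical data)
Ytoy = categorical(["sports"; "finance"; "sports"; "tech"; "finance"]);

% Loop-based encoding, following the same steps as the example
names = categories(Ytoy);
labelMap = containers.Map(names,0:numel(names)-1);
Ystr = cellstr(Ytoy);
encodedLoop = zeros(size(Ystr));
for i = 1:numel(Ystr)
    encodedLoop(i) = labelMap(Ystr{i});
end

% Vectorized encoding: double() gives the 1-based category index,
% so subtracting 1 yields the zero-based class index
encodedVec = double(Ytoy) - 1;

% Both approaches should agree
disp(isequal(encodedLoop,encodedVec))
```

    Undefined categorical elements convert to NaN with this approach, so check for them with isundefined before encoding if your data might contain any.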

    Use the predict function to predict the class label of the ninth observation.

    predict(Mdl,fullX(9,:))
    ans = 
    0
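
    The encoded response can be used to check predictions over many observations at once. The following is a minimal sketch with hypothetical stand-in vectors so that it runs on its own; in the example, the corresponding inputs would be predict(Mdl,fullX) and encodedY.

```matlab
% Stand-in values (hypothetical). In the example, use
% predictedLabels = predict(Mdl,fullX) and trueLabels = encodedY.
predictedLabels = [0; 3; 12; 3; 7];
trueLabels      = [0; 3; 11; 3; 7];

% Fraction of observations whose predicted label matches the true label
accuracy = mean(predictedLabels == trueLabels);

% Row indices of the misclassified observations
misclassified = find(predictedLabels ~= trueLabels);
```

    Both vectors must have the same orientation and length for the elementwise comparison to be meaningful.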
    

    Input Arguments

    mdl: Compact classification XGBoost model, specified as a CompactClassificationXGBoost model object created with importModelFromXGBoost.

    X: Predictor data to be classified, specified as a numeric matrix or a table.

    Each row of X corresponds to one observation, and each column corresponds to one variable. If a row contains missing values, the software follows the branch direction that the pretrained model learned for missing values. The predictor data cannot include categorical predictors (logical, categorical, char, string, or cell).

    For a numeric matrix, the variables that make up the columns of X must have the same order as the predictor variables used to train mdl.

    For a table:

    • predict does not support multicolumn variables or cell arrays other than cell arrays of character vectors.

    • All predictor variables in X must have the same variable names and data types as those stored in mdl.PredictorNames. X can contain additional variables, such as response variables and observation weights, but predict ignores them.
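
    The table requirements above can be checked before calling predict. The following is a minimal sketch on a toy table; predictorNames and T are hypothetical, and for an imported model the names would come from mdl.PredictorNames. The check is optional and only surfaces problems early.

```matlab
% Hypothetical predictor names; for an imported model, use mdl.PredictorNames
predictorNames = {'x1','x2','x3'};

% Toy table with predictors in a different order plus an extra response variable
T = table([0.2; 1.5],[3; NaN],[7.1; 0.4],[0; 1], ...
    VariableNames={'x3','x1','x2','Y'});

% Every predictor must be present by name
isPresent = ismember(predictorNames,T.Properties.VariableNames);
missingNames = predictorNames(~isPresent);
assert(isempty(missingNames),"Some predictors are missing from the table.");

% Every predictor must be a single numeric column (no multicolumn variables)
P = T(:,predictorNames);
isNumericColumn = varfun(@(v) isnumeric(v) && iscolumn(v),P, ...
    OutputFormat="uniform");
assert(all(isNumericColumn),"All predictors must be numeric column variables.");

% Numeric matrix with columns in the order of predictorNames
Xmat = P{:,:};
```

    Extracting Xmat is useful only if you prefer the matrix syntax, in which case the column order must match the order used to train the model.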

    UseParallel: Flag to run in parallel, specified as a numeric or logical 1 (true) or 0 (false). If you specify UseParallel=true, the predict function executes for-loop iterations by using parfor. The loop runs in parallel when you have Parallel Computing Toolbox™; otherwise, it runs serially.

    Example: UseParallel=true

    Data Types: logical

    Output Arguments

    labels: Predicted class labels, returned as a numeric array.

    The predict function classifies an observation into the class yielding the highest score.

    scores: Class scores, returned as a numeric matrix with one row per observation and one column per class. For each observation and each class, the score represents the confidence that the observation originates from that class. A higher score indicates a higher confidence.
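
    The relationship between scores and labels can be reproduced by hand. The following is a minimal sketch using a hypothetical score matrix and class names; it assumes the columns of the score matrix follow the order of the class names, which this page does not state explicitly.

```matlab
% Hypothetical class names and scores (rows: observations, columns: classes)
classNames = [0 1 2];
scores = [0.10 0.70 0.20
          0.55 0.25 0.20
          0.05 0.15 0.80];

% Column index of the largest score in each row
[maxScore,idx] = max(scores,[],2);

% Map column indices back to class labels, returned as a column vector
labels = reshape(classNames(idx),[],1);
```

    When two classes tie for the largest score, max returns the first one, so this sketch assigns the observation to the class with the lower column index.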

    Version History

    Introduced in R2026a