CompactClassificationNaiveBayes
Compact naive Bayes classifier for multiclass classification
Description
CompactClassificationNaiveBayes is a compact version of the
            naive Bayes classifier. The compact classifier does not include the data used for
            training the naive Bayes classifier. Therefore, you cannot perform some tasks, such as
            cross-validation, using the compact classifier. Use a compact naive Bayes classifier for
            tasks such as predicting the labels of the data.
Creation
Create a CompactClassificationNaiveBayes model from a full, trained
                ClassificationNaiveBayes classifier by
            using compact.
Properties
Predictor Properties
This property is read-only.
Predictor names, specified as a cell array of character vectors. The order of the
            elements in PredictorNames corresponds to the order in which the
            predictor names appear in the training data X.
This property is read-only.
Expanded predictor names, specified as a cell array of character vectors.
If the model uses dummy variable encoding for categorical variables, then
                ExpandedPredictorNames includes the names that describe the
            expanded variables. Otherwise, ExpandedPredictorNames is the same as
                PredictorNames.
This property is read-only.
Categorical predictor
    indices, specified as a vector of positive integers. CategoricalPredictors
    contains index values indicating that the corresponding predictors are categorical. The index
    values are between 1 and p, where p is the number of
    predictors used to train the model. If none of the predictors are categorical, then this
    property is empty ([]).
Data Types: single | double
This property is read-only.
Multivariate multinomial levels, specified as a cell array. The length of
                                CategoricalLevels is equal to the number of
                        predictors (size(X,2)).
The cells of CategoricalLevels correspond to predictors
                        that you specify as 'mvmn' during training, that is, they
                        have a multivariate multinomial distribution. Cells that do not correspond
                        to a multivariate multinomial distribution are empty
                        ([]).
If predictor j is multivariate multinomial, then
                                CategoricalLevels{j}
                        is a list of all distinct values of predictor j in the
                        sample. NaNs are removed from
                                unique(X(:,j)).
Predictor Distribution Properties
This property is read-only.
Predictor distributions, specified as a character vector or cell array of
                        character vectors. fitcnb uses the predictor
                        distributions to model the predictors. This table lists the available
                        distributions.
| Value | Description | 
|---|---|
| 'kernel' | Kernel smoothing density estimate | 
| 'mn' | Multinomial distribution. If you specify 'mn', then all features are components of a multinomial distribution. Therefore, you cannot include 'mn' as an element of a string array or a cell array of character vectors. For details, see Estimated Probability for Multinomial Distribution. | 
| 'mvmn' | Multivariate multinomial distribution. For details, see Estimated Probability for Multivariate Multinomial Distribution. | 
| 'normal' | Normal (Gaussian) distribution | 
If DistributionNames is a 1-by-P cell
                        array of character vectors, then fitcnb models the feature
                                j using the distribution in element
                                j of the cell array.
Example: 'mn'
Example: {'kernel','normal','kernel'}
Data Types: char | string | cell
This property is read-only.
Distribution parameter estimates, specified as a cell array.
                DistributionParameters is a
                K-by-D cell array, where cell
                (k,d) contains the distribution parameter
            estimates for instances of predictor d in class k.
            The order of the rows corresponds to the order of the classes in the property
                ClassNames, and the order of the predictors corresponds to the
            order of the columns of X.
If class k has no observations for predictor j, then DistributionParameters{k,j} is empty ([]).
The elements of DistributionParameters depend on the distributions of the predictors. This table describes the values in DistributionParameters{k,j}.
| Distribution of Predictor j | Value of Cell Array for Predictor j and Class k | 
|---|---|
| kernel | A KernelDistribution model. Display properties using cell indexing and dot notation. For example, to display the estimated bandwidth of the kernel density for predictor 2 in the third class, use Mdl.DistributionParameters{3,2}.Bandwidth. | 
| mn | A scalar representing the probability that token j appears in class k. For details, see Estimated Probability for Multinomial Distribution. | 
| mvmn | A numeric vector containing the probabilities for each possible level of predictor j in class k. The software orders the probabilities by the sorted order of all unique levels of predictor j (stored in the property CategoricalLevels). For more details, see Estimated Probability for Multivariate Multinomial Distribution. | 
| normal | A 2-by-1 numeric vector. The first element is the sample mean and the second element is the sample standard deviation. For more details, see Normal Distribution Estimators. | 
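As a minimal sketch of reading these values, the following assumes the fisheriris sample data set that ships with Statistics and Machine Learning Toolbox (it is not used elsewhere on this page) and the default normal distributions. It compares the stored parameters for one predictor and class with the sample mean and standard deviation computed directly from the data.

```matlab
% Train on the Fisher iris data (an assumed example data set) and compact the model.
load fisheriris                      % provides meas (150-by-4) and species
Mdl  = fitcnb(meas,species);         % default: normal distribution for each predictor
CMdl = compact(Mdl);

% Row 1 corresponds to CMdl.ClassNames{1}; column 2 corresponds to predictor 2.
params = CMdl.DistributionParameters{1,2};   % [mean; standard deviation]

% Recompute the estimates from the raw data for the same class and predictor.
inClass    = strcmp(species,CMdl.ClassNames{1});
directMean = mean(meas(inClass,2));
directStd  = std(meas(inClass,2));
```

With the default (equal) observation weights, params(1) and params(2) should agree with directMean and directStd up to floating-point error.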
This property is read-only.
Kernel smoother type, specified as the name of a kernel or a cell array of kernel
            names. The length of Kernel is equal to the number of predictors
                (size(X,2)).
                Kernel{j} corresponds to
            predictor j and contains a character vector describing the type of
            kernel smoother. If a cell is empty ([]), then fitcnb did not fit a kernel distribution to the corresponding
            predictor.
This table describes the supported kernel smoother types. I{u} denotes the indicator function.
| Value | Kernel | Formula | 
|---|---|---|
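| 'box' | Box (uniform) | $f(x) = 0.5\,I\{\lvert x \rvert \le 1\}$ | 
| 'epanechnikov' | Epanechnikov | $f(x) = 0.75\,(1 - x^2)\,I\{\lvert x \rvert \le 1\}$ | 
| 'normal' | Gaussian | $f(x) = \frac{1}{\sqrt{2\pi}} \exp(-0.5 x^2)$ | 
| 'triangle' | Triangular | $f(x) = (1 - \lvert x \rvert)\,I\{\lvert x \rvert \le 1\}$ | 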
Example: 'box'
Example: {'epanechnikov','normal'}
Data Types: char | string | cell
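The four kernel smoother types can be sketched as plain function handles. This is only an illustration of the standard kernel formulas, not the toolbox implementation; the struct name kernels and the integration grid are assumptions made for the example.

```matlab
% Standard kernel formulas, with the indicator I{|u| <= 1} written as a logical mask.
kernels.box          = @(u) 0.5 * (abs(u) <= 1);
kernels.epanechnikov = @(u) 0.75 * (1 - u.^2) .* (abs(u) <= 1);
kernels.normal       = @(u) exp(-0.5 * u.^2) / sqrt(2*pi);
kernels.triangle     = @(u) (1 - abs(u)) .* (abs(u) <= 1);

% Each kernel is a density, so it should integrate to approximately 1.
u = linspace(-8,8,160001);
names = fieldnames(kernels);
areas = zeros(numel(names),1);
for n = 1:numel(names)
    k = kernels.(names{n});          % pick one kernel function handle
    areas(n) = trapz(u,k(u));        % numerical integral over the grid
end
```

The numerical check confirms that each formula defines a valid density; the box kernel carries a small discretization error because of its jump at |u| = 1.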
Since R2023b
This property is read-only.
Predictor means, specified as a numeric vector. If you specify
                Standardize as 1 or true
            when you train the naive Bayes classifier using fitcnb, then the
            length of the Mu vector is equal to the number of predictors. The
            vector contains 0 values for predictors with nonkernel distributions,
            such as categorical predictors (see DistributionNames).
If you set Standardize to 0 or
                false when you train the naive Bayes classifier using
                fitcnb, then the Mu value is an empty
            vector ([]).
Data Types: double
Since R2023b
This property is read-only.
Predictor standard deviations, specified as a numeric vector. If you specify
                Standardize as 1 or true
            when you train the naive Bayes classifier using fitcnb, then the
            length of the Sigma vector is equal to the number of predictors.
            The vector contains 1 values for predictors with nonkernel
            distributions, such as categorical predictors (see
                DistributionNames).
If you set Standardize to 0 or
                false when you train the naive Bayes classifier using
                fitcnb, then the Sigma value is an empty
            vector ([]).
Data Types: double
This property is read-only.
Kernel smoother density support, specified as a cell array. The length of
                Support is equal to the number of predictors
                (size(X,2)). The cells represent the regions to which
                fitcnb applies the kernel density. If a cell is empty
                ([]), then fitcnb did not fit a kernel distribution to the corresponding
            predictor.
This table describes the supported options.
| Value | Description | 
|---|---|
| 1-by-2 numeric row vector | The density support applies to the specified bounds, for example [L,U], where L and U are the finite lower and upper bounds, respectively. | 
| 'positive' | The density support applies to all positive real values. | 
| 'unbounded' | The density support applies to all real values. | 
This property is read-only.
Kernel smoother window width, specified as a numeric matrix.
                                Width is a
                                K-by-P matrix, where
                                K is the number of classes in the data, and
                                P is the number of predictors
                                (size(X,2)).
Width(k,j) is the kernel smoother window width for the kernel smoothing density of predictor j within class k. NaNs in column j indicate that fitcnb did not fit predictor j using a kernel density.
Response Properties
This property is read-only.
Unique class names used in the training model, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors.
ClassNames has the same data type as Y, and
            has K elements (or rows) for character arrays. (The software treats string arrays as cell arrays of character
    vectors.)
Data Types: categorical | char | string | logical | double | cell
This property is read-only.
Response variable name, specified as a character vector.
Data Types: char | string
Training Properties
Prior probabilities, specified as a numeric vector. The order of the elements in
                Prior corresponds to the elements of
                Mdl.ClassNames.
fitcnb normalizes the prior probabilities
            you set using the 'Prior' name-value pair argument, so that
                sum(Prior) = 1.
The value of Prior does not affect the best-fitting model.
            Therefore, you can reset Prior after training Mdl
            using dot notation.
Example: Mdl.Prior = [0.2 0.8]
Data Types: double | single
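A minimal sketch of resetting Prior on a compact model, assuming the ionosphere data set and the preprocessing used in the Examples section of this page. The variable name empiricalPrior is introduced only for illustration.

```matlab
load ionosphere
X = X(:,3:end);                              % drop the first two predictors, as in the examples
CMdl = compact(fitcnb(X,Y,'ClassNames',{'b','g'}));

empiricalPrior = CMdl.Prior;                 % default priors: class frequencies, in ClassNames order
CMdl.Prior = [0.2 0.8];                      % new priors for classes 'b' and 'g'
```

Because the prior does not affect the fitted distributions, only later predictions change; no retraining is needed.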
Classifier Properties
Misclassification cost, specified as a numeric square matrix, where
                Cost(i,j) is the cost of classifying a point into class
                j if its true class is i. The rows correspond
            to the true class and the columns correspond to the predicted class. The order of the
            rows and columns of Cost corresponds to the order of the classes in
                ClassNames. 
The misclassification cost matrix must have zeros on the diagonal.
The value of Cost does not influence training. You can reset
                Cost after training Mdl using dot
            notation.
Example: Mdl.Cost = [0 0.5 ; 1 0]
Data Types: double | single
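The following sketch shows one way a new cost matrix might change predictions, assuming the ionosphere data set from the Examples section. The cost values are hypothetical and chosen only to make the asymmetry visible.

```matlab
load ionosphere
X = X(:,3:end);
CMdl = compact(fitcnb(X,Y,'ClassNames',{'b','g'}));

labelsDefault = predict(CMdl,X);             % default cost: 0 on the diagonal, 1 elsewhere

% Make classifying a true 'g' observation as 'b' five times as costly as the reverse.
CMdl.Cost = [0 1; 5 0];                      % rows: true class, columns: predicted class
labelsCostly = predict(CMdl,X);

numDefaultG = sum(strcmp(labelsDefault,'g'));
numCostlyG  = sum(strcmp(labelsCostly,'g'));
```

Raising Cost(2,1) makes the model more reluctant to predict 'b', so numCostlyG should be at least as large as numDefaultG.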
Classification score transformation, specified as a character vector or function handle. This table summarizes the available character vectors.
| Value | Description | 
|---|---|
| "doublelogit" | 1/(1 + e^(-2x)) | 
| "invlogit" | log(x / (1 - x)) | 
| "ismax" | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0 | 
| "logit" | 1/(1 + e^(-x)) | 
| "none" or "identity" | x (no transformation) | 
| "sign" | -1 for x < 0; 0 for x = 0; 1 for x > 0 | 
| "symmetric" | 2x - 1 | 
| "symmetricismax" | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to -1 | 
| "symmetriclogit" | 2/(1 + e^(-x)) - 1 | 
For a MATLAB® function or a function you define, use its function handle for the score transformation. The function handle must accept a matrix (the original scores) and return a matrix of the same size (the transformed scores).
Example: Mdl.ScoreTransform = 'logit'
Data Types: char | string | function handle
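To make the function-handle requirement concrete, this sketch writes three of the transformations from the table above as handles that accept a score matrix and return a matrix of the same size. The handle names and the toy score matrix are hypothetical.

```matlab
% Hypothetical handles mirroring three rows of the table above.
logitFcn       = @(x) 1 ./ (1 + exp(-x));
symmetricFcn   = @(x) 2*x - 1;
symmetricIsMax = @(x) 2*double(x == max(x,[],2)) - 1;   % 1 for the row maximum, -1 elsewhere

scores = [0.9 0.1; 0.3 0.7; 0.2 0.8];        % toy score matrix: one row per observation
logitScores = logitFcn(scores);
symScores   = symmetricFcn(scores);
maxScores   = symmetricIsMax(scores);
```

A handle of this form could then be assigned with dot notation, for example CMdl.ScoreTransform = symmetricFcn, where CMdl is a trained model.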
Object Functions
| compareHoldout | Compare accuracies of two classification models using new data | 
| edge | Classification edge for naive Bayes classifier | 
| lime | Local interpretable model-agnostic explanations (LIME) | 
| logp | Log unconditional probability density for naive Bayes classifier | 
| loss | Classification loss for naive Bayes classifier | 
| margin | Classification margins for naive Bayes classifier | 
| partialDependence | Compute partial dependence | 
| plotPartialDependence | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots | 
| predict | Classify observations using naive Bayes classifier | 
| shapley | Shapley values | 
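As a rough sketch of how several of these object functions fit together, the following holds out 30% of the ionosphere data, trains on the rest, and evaluates the compact model on the held-out part. The split fraction and variable names are assumptions made for the example.

```matlab
load ionosphere
X = X(:,3:end);
rng('default')                                % for reproducibility of the partition

c = cvpartition(Y,'HoldOut',0.3);             % stratified 70/30 split
trainIdx = training(c);
testIdx  = test(c);

CMdl = compact(fitcnb(X(trainIdx,:),Y(trainIdx),'ClassNames',{'b','g'}));

[labels,posterior] = predict(CMdl,X(testIdx,:));   % predicted labels and class posteriors
L = loss(CMdl,X(testIdx,:),Y(testIdx));            % classification loss on the test set
m = margin(CMdl,X(testIdx,:),Y(testIdx));          % one margin per test observation
e = edge(CMdl,X(testIdx,:),Y(testIdx));            % classification edge (weighted mean margin)
```

The compact model is sufficient here because none of these functions needs the training data.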
Examples
Reduce the size of a full naive Bayes classifier by removing the training data. Full naive Bayes classifiers hold the training data. You can use a compact naive Bayes classifier to improve memory efficiency.
Load the ionosphere data set. Remove the first two predictors for stability.
load ionosphere
X = X(:,3:end);
Train a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. By default, fitcnb assumes that each predictor is normally distributed within each class.
Mdl = fitcnb(X,Y,'ClassNames',{'b','g'})
Mdl = 
  ClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: {'b'  'g'}
            ScoreTransform: 'none'
           NumObservations: 351
         DistributionNames: {1×32 cell}
    DistributionParameters: {2×32 cell}
  Properties, Methods
Mdl is a trained ClassificationNaiveBayes classifier.
Reduce the size of the naive Bayes classifier.
CMdl = compact(Mdl)
CMdl = 
  CompactClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: {'b'  'g'}
            ScoreTransform: 'none'
         DistributionNames: {1×32 cell}
    DistributionParameters: {2×32 cell}
  Properties, Methods
CMdl is a trained CompactClassificationNaiveBayes classifier.
Display the amount of memory used by each classifier.
whos('Mdl','CMdl')
  Name      Size            Bytes  Class                                                        Attributes

  CMdl      1x1             16436  classreg.learning.classif.CompactClassificationNaiveBayes
  Mdl       1x1            112598  ClassificationNaiveBayes
The full naive Bayes classifier (Mdl) is nearly seven times larger than the compact naive Bayes classifier (CMdl).
To label new observations efficiently, you can remove Mdl from the MATLAB® Workspace, and then pass CMdl and new predictor values to predict.
Train and cross-validate a naive Bayes classifier. fitcnb implements 10-fold cross-validation by default. Then, estimate the cross-validated classification error.
Load the ionosphere data set. Remove the first two predictors for stability.
load ionosphere
X = X(:,3:end);
rng('default') % for reproducibility
Train and cross-validate a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. By default, fitcnb assumes that each predictor is normally distributed within each class.
CVMdl = fitcnb(X,Y,'ClassNames',{'b','g'},'CrossVal','on')
CVMdl = 
  ClassificationPartitionedModel
    CrossValidatedModel: 'NaiveBayes'
         PredictorNames: {'x1'  'x2'  'x3'  'x4'  'x5'  'x6'  'x7'  'x8'  'x9'  'x10'  'x11'  'x12'  'x13'  'x14'  'x15'  'x16'  'x17'  'x18'  'x19'  'x20'  'x21'  'x22'  'x23'  'x24'  'x25'  'x26'  'x27'  'x28'  'x29'  'x30'  'x31'  'x32'}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1×1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'
  Properties, Methods
CVMdl is a ClassificationPartitionedModel cross-validated, naive Bayes classifier. Alternatively, you can cross-validate a trained ClassificationNaiveBayes model by passing it to crossval.
Display the first training fold of CVMdl using dot notation.
CVMdl.Trained{1}
ans = 
  CompactClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: {'b'  'g'}
            ScoreTransform: 'none'
         DistributionNames: {1×32 cell}
    DistributionParameters: {2×32 cell}
  Properties, Methods
Each fold is a CompactClassificationNaiveBayes model trained on 90% of the data.
The full and compact naive Bayes models stored in CVMdl are not intended for predicting on new data. Instead, use them to estimate the generalization error by passing CVMdl to kfoldLoss.
genError = kfoldLoss(CVMdl)
genError = 0.1852
On average, the generalization error is approximately 19%.
You can specify a different conditional distribution for the predictors, or tune the conditional distribution parameters to reduce the generalization error.
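One way to try a different conditional distribution is sketched below: it cross-validates the default normal model and a kernel-density model on the same partition so that the two error estimates are comparable. The variable names are assumptions, and no particular outcome is implied.

```matlab
load ionosphere
X = X(:,3:end);
rng('default')                                % for reproducibility of the folds

cvp = cvpartition(Y,'KFold',10);              % shared partition for a fair comparison

CVNormal = fitcnb(X,Y,'ClassNames',{'b','g'},'CVPartition',cvp);
CVKernel = fitcnb(X,Y,'ClassNames',{'b','g'},'CVPartition',cvp, ...
    'DistributionNames','kernel');            % kernel density for every predictor

errNormal = kfoldLoss(CVNormal);
errKernel = kfoldLoss(CVKernel);
```

Comparing errNormal and errKernel indicates whether the kernel densities help on this data set.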
More About
In the bag-of-tokens model, the value of predictor j is the nonnegative number of occurrences of token j in the observation. The number of categories (bins) in the multinomial model is the number of distinct tokens (number of predictors).
Naive Bayes is a classification algorithm that applies density estimation to the data.
The algorithm leverages Bayes theorem, and (naively) assumes that the predictors are conditionally independent, given the class. Although the assumption is usually violated in practice, naive Bayes classifiers tend to yield posterior distributions that are robust to biased class density estimates, particularly where the posterior is 0.5 (the decision boundary) [1].
Naive Bayes classifiers assign observations to the most probable class (in other words, the maximum a posteriori decision rule). Explicitly, the algorithm takes these steps:
- Estimate the densities of the predictors within each class.
- Model posterior probabilities according to Bayes rule. That is, for all k = 1,...,K,

  $$\hat{P}(Y = k \mid X_1,\ldots,X_P) = \frac{\pi(Y = k)\prod_{j=1}^{P} P(X_j \mid Y = k)}{\sum_{k'=1}^{K} \pi(Y = k')\prod_{j=1}^{P} P(X_j \mid Y = k')},$$

  where:
  - Y is the random variable corresponding to the class index of an observation.
  - X1,...,XP are the random predictors of an observation.
  - $\pi(Y = k)$ is the prior probability that a class index is k.
- Classify an observation by estimating the posterior probability for each class, and then assign the observation to the class yielding the maximum posterior probability.

If the predictors compose a multinomial distribution, then the posterior probability $\hat{P}(Y = k \mid X_1,\ldots,X_P) \propto \pi(Y = k)\,P_{mn}(X_1,\ldots,X_P \mid Y = k)$, where $P_{mn}(X_1,\ldots,X_P \mid Y = k)$ is the probability mass function of a multinomial distribution.
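The three steps above can be sketched in base MATLAB for normally distributed predictors. The toy data, the empirical priors, and all variable names are assumptions for illustration; this is not the toolbox implementation.

```matlab
% Toy training data: two classes, two predictors (values are illustrative only).
Xtrain = [1.0 2.1; 1.2 1.9; 0.8 2.0; 3.0 4.1; 3.2 3.9; 2.8 4.0; 3.1 4.2];
ytrain = [1; 1; 1; 2; 2; 2; 2];
xnew   = [1.1 2.0];                           % observation to classify

K = 2;
logJoint = zeros(1,K);
for k = 1:K
    Xk    = Xtrain(ytrain == k,:);
    prior = size(Xk,1) / size(Xtrain,1);      % empirical prior for class k
    mu    = mean(Xk,1);                       % step 1: per-class normal density estimates
    sigma = std(Xk,0,1);
    logDens = -0.5*log(2*pi) - log(sigma) - 0.5*((xnew - mu)./sigma).^2;
    logJoint(k) = log(prior) + sum(logDens);  % log of prior times product of densities
end

% Step 2: normalize with Bayes rule (subtracting the maximum for numerical stability).
posterior = exp(logJoint - max(logJoint));
posterior = posterior / sum(posterior);

% Step 3: maximum a posteriori decision.
[~,predictedClass] = max(posterior);
```

Working with log densities avoids underflow when the product runs over many predictors.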
Algorithms
If predictor variable j has a conditional normal distribution (see the DistributionNames property), the software fits the distribution to the data by computing the class-specific weighted mean and the unbiased estimate of the weighted standard deviation. For each class k:
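- The weighted mean of predictor j is

  $$\bar{x}_{j|k} = \frac{\sum_{\{i:\,y_i = k\}} w_i x_{ij}}{\sum_{\{i:\,y_i = k\}} w_i},$$

  where $w_i$ is the weight for observation i. The software normalizes weights within a class such that they sum to the prior probability for that class.
- The unbiased estimator of the weighted standard deviation of predictor j is

  $$s_{j|k} = \left[\frac{\sum_{\{i:\,y_i = k\}} w_i \left(x_{ij} - \bar{x}_{j|k}\right)^2}{z_{1|k} - \dfrac{z_{2|k}}{z_{1|k}}}\right]^{1/2},$$

  where $z_{1|k}$ is the sum of the weights within class k and $z_{2|k}$ is the sum of the squared weights within class k.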
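A short base-MATLAB sketch of these two estimators follows. The data, the weights, and the assumed class prior of 0.4 are hypothetical; the second half checks that equal weights recover the ordinary sample statistics.

```matlab
x = [2.0; 3.5; 1.5; 4.0; 3.0];               % predictor j for the observations in class k
w = [0.1; 0.2; 0.3; 0.25; 0.15];             % hypothetical observation weights

w  = w / sum(w) * 0.4;                       % normalize to sum to an assumed class prior of 0.4
z1 = sum(w);                                 % sum of the weights within the class
z2 = sum(w.^2);                              % sum of the squared weights within the class

xbar = sum(w .* x) / z1;                                 % weighted mean
s    = sqrt(sum(w .* (x - xbar).^2) / (z1 - z2/z1));     % unbiased weighted standard deviation

% With equal weights, the estimators reduce to mean(x) and std(x).
wEq   = ones(size(x)) / numel(x);
xbarE = sum(wEq .* x) / sum(wEq);
sEq   = sqrt(sum(wEq .* (x - xbarE).^2) / (sum(wEq) - sum(wEq.^2)/sum(wEq)));
```

The denominator z1 - z2/z1 plays the role of n - 1 in the unweighted case.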
If all predictor variables compose a conditional multinomial distribution (see the DistributionNames property), the software fits the distribution using the Bag-of-Tokens Model. The software stores the probability that token j appears in class k in the property DistributionParameters{k,j}. With additive smoothing [2], the estimated probability is

$$P(\text{token } j \mid \text{class } k) = \frac{1 + c_{j|k}}{P + c_k},$$

where:
- $c_{j|k} = n_k \dfrac{\sum_{\{i:\,y_i = k\}} x_{ij} w_i}{\sum_{\{i:\,y_i = k\}} w_i}$, which is the weighted number of occurrences of token j in class k.
- $n_k$ is the number of observations in class k.
- $w_i$ is the weight for observation i. The software normalizes weights within a class so that they sum to the prior probability for that class.
- $c_k = \sum_{j=1}^{P} c_{j|k}$, which is the total weighted number of occurrences of all tokens in class k.
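The additive-smoothing estimate for the multinomial case can be sketched in a few lines of base MATLAB. The token counts and the equal observation weights are assumptions chosen so the arithmetic is easy to follow.

```matlab
% Token counts for the observations in class k: rows are observations, columns are tokens.
Xk = [2 0 1; 0 0 3; 1 0 0; 4 0 2];
w  = ones(size(Xk,1),1);                      % equal observation weights (an assumption)

nk = size(Xk,1);                              % number of observations in class k
P  = size(Xk,2);                              % number of tokens (predictors)

cjk = nk * (w' * Xk) / sum(w);                % weighted occurrences of each token: [7 0 6]
ck  = sum(cjk);                               % total weighted occurrences of all tokens: 13

probs = (1 + cjk) / (P + ck);                 % additive smoothing: [0.5 0.0625 0.4375]
```

Token 2 never occurs in this class, yet smoothing gives it a small nonzero probability, and the probabilities still sum to 1.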
If predictor variable j has a conditional multivariate
        multinomial distribution (see the DistributionNames property), the
        software follows this procedure:
- The software collects a list of the unique levels, stores the sorted list in CategoricalLevels, and considers each level a bin. Each combination of predictor and class is a separate, independent multinomial random variable.
- For each class k, the software counts instances of each categorical level using the list stored in CategoricalLevels{j}.
- The software stores the probability that predictor j in class k has level L in the property DistributionParameters{k,j}, for all levels in CategoricalLevels{j}. With additive smoothing [2], the estimated probability is

  $$P(\text{predictor } j = L \mid \text{class } k) = \frac{1 + m_{j|k}(L)}{m_j + m_k},$$

  where:
  - $m_{j|k}(L) = n_k \dfrac{\sum_{\{i:\,y_i = k\}} I\{x_{ij} = L\}\, w_i}{\sum_{\{i:\,y_i = k\}} w_i}$, which is the weighted number of observations for which predictor j equals L in class k.
  - $n_k$ is the number of observations in class k.
  - $I\{x_{ij} = L\} = 1$ if $x_{ij} = L$, and 0 otherwise.
  - $w_i$ is the weight for observation i. The software normalizes weights within a class so that they sum to the prior probability for that class.
  - $m_j$ is the number of distinct levels in predictor j.
  - $m_k$ is the weighted number of observations in class k.
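The multivariate multinomial estimate can be sketched the same way. The level names, the equal weights, and the choice to compute the weighted class count as the sum of the weighted level counts (which assumes no missing values) are all assumptions for illustration.

```matlab
xk     = {'red';'blue';'red';'green';'red'};  % predictor j for the observations in class k
levels = {'blue';'green';'red';'yellow'};     % sorted levels seen across all classes
w      = ones(numel(xk),1);                   % equal observation weights (an assumption)

nk = numel(xk);                               % number of observations in class k
mj = numel(levels);                           % number of distinct levels of predictor j

mjkL = zeros(mj,1);
for l = 1:mj
    isLevel = strcmp(xk,levels{l});           % indicator that an observation has level L
    mjkL(l) = nk * sum(w(isLevel)) / sum(w);  % weighted count of level L in class k
end
mk = sum(mjkL);                               % weighted number of observations in class k (assumed)

probs = (1 + mjkL) / (mj + mk);               % additive smoothing: [2; 2; 4; 1] / 9
```

The level 'yellow' does not occur in this class but still receives probability 1/9, which keeps later posterior computations away from zero.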
 
References
[1] Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. Springer Series in Statistics. New York, NY: Springer, 2009. https://doi.org/10.1007/978-0-387-84858-7.
[2] Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval, NY: Cambridge University Press, 2008.
Extended Capabilities
Usage notes and limitations:
- The predict function supports code generation.
- When you train a naive Bayes model by using fitcnb, the following restrictions apply.
  - The value of the 'DistributionNames' name-value pair argument cannot contain 'mn'.
  - The value of the 'ScoreTransform' name-value pair argument cannot be an anonymous function.
 
For more information, see Introduction to Code Generation.
Version History
Introduced in R2014b
fitcnb supports the standardization of predictors with kernel
        distributions. That is, you can specify the Standardize name-value
        argument as true when the DistributionNames
        name-value argument includes at least one "kernel" distribution. Naive
        Bayes models include Mu and Sigma properties that
        contain the means and standard deviations, respectively, used to standardize the predictors
        before training. The properties are empty when fitcnb does not perform
        any standardization.