## Summary Statistics Grouped by Category

### Note

The `nominal` and `ordinal` array data types might be removed in a future release. To represent ordered and unordered discrete, nonnumeric data, use the MATLAB `categorical` data type instead (see Categorical Arrays in the MATLAB documentation).
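As a rough illustration of that recommendation, the sketch below converts the sample data used later on this page to a table with a `categorical` grouping variable and then calls `grpstats` on the table. It assumes a release in which `dataset2table` and table input to `grpstats` are available; the variable names `T` and `statsTbl` are placeholders chosen for this example.

```matlab
% Sketch only: migrate the dataset array to a table with a categorical variable.
load hospital                              % dataset array with a nominal Sex variable

T = dataset2table(hospital);               % convert dataset array to table
T.Sex = categorical(cellstr(T.Sex));       % nominal -> cellstr -> categorical

categories(T.Sex)                          % list the category names

% Same grouped summary as in the dataset-array example, but on a table.
statsTbl = grpstats(T,'Sex',{'min','max'},'DataVars','Weight')
```

Going through `cellstr` is a conservative choice here: it relies only on the documented text conversion of `nominal` arrays rather than on any direct `nominal`-to-`categorical` conversion.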

### Summary Statistics Grouped by Category

This example shows how to compute summary statistics grouped by levels of a categorical variable. You can compute group summary statistics for a numeric array or a dataset array using `grpstats`.

Load sample data.

`load hospital`

The dataset array, `hospital`, has 7 variables (columns) and 100 observations (rows).
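Before grouping, it can help to confirm the dimensions and variable types described above. The following is a minimal sketch using standard dataset-array properties; the output variable names (`nObs`, `nVars`, `varNames`) are chosen for this illustration.

```matlab
load hospital

[nObs, nVars] = size(hospital)            % observations (rows) and variables (columns)
varNames = hospital.Properties.VarNames   % cell array of variable names

class(hospital.Sex)                       % type of the grouping variable
class(hospital.Smoker)                    % type of the smoking indicator
```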

Compute summary statistics by category.

The variable `Sex` is a nominal array with two levels, `Male` and `Female`. Compute the minimum and maximum weights for each gender.

`stats = grpstats(hospital,'Sex',{'min','max'},'DataVars','Weight')`
```
stats = 

              Sex       GroupCount    min_Weight    max_Weight
    Female    Female    53            111           147       
    Male      Male      47            158           202       
```

The dataset array, `stats`, has observations corresponding to the levels of the variable `Sex`. The variable `min_Weight` contains the minimum weight for each group, and the variable `max_Weight` contains the maximum weight for each group.
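Because `stats` is itself a dataset array, its variables can be used in further calculations with dot indexing. The sketch below computes the weight range within each group; `groupNames` and `weightRange` are names introduced only for this illustration.

```matlab
load hospital
stats = grpstats(hospital,'Sex',{'min','max'},'DataVars','Weight');

groupNames  = stats.Properties.ObsNames;            % group labels, one per row of stats
weightRange = stats.max_Weight - stats.min_Weight;  % spread of weights within each group

% Display each group label next to its range.
for k = 1:numel(groupNames)
    fprintf('%s: %d\n', groupNames{k}, weightRange(k));
end
```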

Compute summary statistics by multiple categories.

The variable `Smoker` is a logical array with value `1` for smokers and value `0` for nonsmokers. Compute the minimum and maximum weights for each gender and smoking combination.

```
stats = grpstats(hospital,{'Sex','Smoker'},{'min','max'},...
                 'DataVars','Weight')
```

```
stats = 

                Sex       Smoker    GroupCount    min_Weight    max_Weight
    Female_0    Female    false     40            111           147       
    Female_1    Female    true      13            115           146       
    Male_0      Male      false     26            158           194       
    Male_1      Male      true      21            164           202       
```

The dataset array, `stats`, has one observation (row) for each combination of levels of `Sex` and `Smoker` that occurs in the original data.
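One way to sanity-check a single row of the grouped output is to reproduce it with logical indexing. The sketch below does this for female smokers; the variable names are chosen for this illustration, and the comparison `hospital.Sex == 'Female'` relies on `nominal` arrays supporting equality tests against a level name.

```matlab
load hospital

% Logical mask selecting the Female / Smoker == true group.
isFemaleSmoker = (hospital.Sex == 'Female') & hospital.Smoker;

n    = nnz(isFemaleSmoker)                   % group size
minW = min(hospital.Weight(isFemaleSmoker))  % minimum weight in the group
maxW = max(hospital.Weight(isFemaleSmoker))  % maximum weight in the group
```

These values should agree with the `GroupCount`, `min_Weight`, and `max_Weight` entries for the group with `Sex` equal to `Female` and `Smoker` equal to `true` in the `grpstats` output.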
