Summary Statistics Grouped by Category
Note
The nominal and ordinal array data types are not recommended. To represent ordered and unordered discrete, nonnumeric data, use the Categorical Arrays data type instead.
This example shows how to compute summary statistics grouped by levels of a categorical variable. You can compute group summary statistics for a numeric array or a dataset array using grpstats.
Load sample data.
load hospital
The dataset array, hospital, has 7 variables (columns) and 100 observations (rows).
Compute summary statistics by category.
The variable Sex is a nominal array with two levels, Male and Female. Compute the minimum and maximum weights for each gender.
stats = grpstats(hospital,'Sex',{'min','max'},'DataVars','Weight')
stats = 
              Sex       GroupCount    min_Weight    max_Weight
    Female    Female    53            111           147
    Male      Male      47            158           202
The dataset array, stats, has one observation for each level of the variable Sex. The variable min_Weight contains the minimum weight for each group, and the variable max_Weight contains the maximum weight for each group.
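As a cross-check on what grpstats computes here, the same two numbers for one group can be obtained by hand with logical indexing. This is only a sketch: the variable names isFemale, femaleMin, and femaleMax are introduced for illustration, and it assumes the hospital sample data is available on the path.

```matlab
load hospital                               % sample dataset array used in this example

% Logical index selecting the rows whose Sex level is Female
isFemale = hospital.Sex == 'Female';

% Minimum and maximum weight within that group
femaleMin = min(hospital.Weight(isFemale));
femaleMax = max(hospital.Weight(isFemale));
```

These values should agree with the Female row of stats shown above (111 and 147).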
Compute summary statistics by multiple categories.
The variable Smoker is a logical array with value 1 for smokers and value 0 for nonsmokers. Compute the minimum and maximum weights for each gender and smoking combination.
stats = grpstats(hospital,{'Sex','Smoker'},{'min','max'},...
    'DataVars','Weight')
stats = 
                Sex       Smoker    GroupCount    min_Weight    max_Weight
    Female_0    Female    false     40            111           147
    Female_1    Female    true      13            115           146
    Male_0      Male      false     26            158           194
    Male_1      Male      true      21            164           202
The dataset array, stats, has one observation (row) for each combination of levels of Sex and Smoker that occurs in the original data.
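Because the note at the top of this page recommends categorical arrays over nominal and ordinal arrays, the following sketch shows one way the same kind of grouped summary might look with a table and a categorical grouping variable. It is not part of the original example: the data values are invented for illustration, the names T, bySex, and bySexSmoker are hypothetical, and it assumes a MATLAB release that includes groupsummary (R2018a or later).

```matlab
% Small hypothetical data set (values invented for illustration)
Sex    = categorical({'Female'; 'Male'; 'Female'; 'Male'; 'Female'});
Smoker = [false; true; true; false; false];
Weight = [120; 180; 131; 175; 115];
T = table(Sex, Smoker, Weight);

% Minimum and maximum weight for each level of Sex
bySex = groupsummary(T, 'Sex', {'min','max'}, 'Weight');

% Minimum and maximum weight for each Sex/Smoker combination
bySexSmoker = groupsummary(T, {'Sex','Smoker'}, {'min','max'}, 'Weight');
```

The resulting tables contain the grouping variables, a GroupCount variable, and the variables min_Weight and max_Weight, mirroring the layout of the grpstats output above.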