
Tables of Mixed Data

Store Related Data in Single Container

You can use the table data type to collect mixed-type data and metadata properties, such as variable names, row names, descriptions, and variable units, in a single container. Tables are suitable for column-oriented or tabular data that is often stored as columns in a text file or in a spreadsheet. For example, you can use a table to store experimental data, with rows representing different observations and columns representing different measured variables.

Tables consist of rows and column-oriented variables. Variables in a table can have different data types and different sizes, but the variables must have the same number of rows. Also, the data within a variable is homogeneous, which enables you to treat a table variable like an array of data.

For example, load sample data about patients from the patients.mat MAT-file. Combine blood pressure data into a single variable. Convert a four-category variable called SelfAssessedHealthStatus—which has values of Poor, Fair, Good, or Excellent—to a categorical array. View information about several of the variables.

load patients
BloodPressure = [Systolic Diastolic];
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus);

whos("Age","Smoker","BloodPressure","SelfAssessedHealthStatus")
  Name                            Size            Bytes  Class          Attributes

  Age                           100x1               800  double                   
  BloodPressure                 100x2              1600  double                   
  SelfAssessedHealthStatus      100x1               624  categorical              
  Smoker                        100x1               100  logical                  

Now, create a table from these variables and display it. The variables can be stored together in a table because they all have the same number of rows, 100.

T = table(Age,Smoker,BloodPressure,SelfAssessedHealthStatus)
T=100×4 table
    Age    Smoker    BloodPressure    SelfAssessedHealthStatus
    ___    ______    _____________    ________________________

    38     true       124     93             Excellent        
    43     false      109     77             Fair             
    38     false      125     83             Good             
    40     false      117     75             Fair             
    49     false      122     80             Good             
    46     false      121     70             Good             
    33     true       130     88             Good             
    40     false      115     82             Good             
    28     false      115     78             Excellent        
    31     false      118     86             Excellent        
    45     false      114     77             Excellent        
    42     false      115     68             Poor             
    25     false      127     74             Poor             
    39     true       130     95             Excellent        
    36     false      114     79             Good             
    48     true       130     92             Good             
      ⋮
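Because the data within each table variable is homogeneous, ordinary array functions can be applied to a variable directly. The following is a minimal sketch, assuming the patients.mat sample file is available on the path. The names nRows, nVars, meanAge, and meanBP are illustrative and are not part of the original example.

```matlab
% Rebuild a smaller version of the table so this sketch is self-contained.
load patients
BloodPressure = [Systolic Diastolic];
T = table(Age,Smoker,BloodPressure);

% A table reports its size as rows by variables. The two-column
% BloodPressure variable still counts as a single variable.
[nRows,nVars] = size(T);

% Dot notation returns the underlying array, so array functions work on it.
meanAge = mean(T.Age);
meanBP  = mean(T.BloodPressure);   % one mean per column
```

Here mean operates on T.Age exactly as it would on the workspace variable Age, which is what treating a table variable like an array means in practice.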

Each variable in a table has one data type. If you add a new row to the table, MATLAB® enforces consistency between the data types of the new values and the corresponding table variables. For example, suppose that you try to add a new patient but list the health status first and the age last, as in the expression T(end+1,:) = {"Poor",true,[130 84],37}. MATLAB then returns this error:

Right hand side of an assignment to a categorical array must be a categorical or text representing a category name.

The error occurs because MATLAB cannot assign the numeric value 37 to the categorical variable SelfAssessedHealthStatus.
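One way to see this behavior without stopping a script is to wrap the mismatched assignment in a try/catch block and then add the row with the values in variable order. This is a sketch only, assuming the patients.mat sample file is available. The variable err is an illustrative name.

```matlab
load patients
BloodPressure = [Systolic Diastolic];
SelfAssessedHealthStatus = categorical(SelfAssessedHealthStatus);
T = table(Age,Smoker,BloodPressure,SelfAssessedHealthStatus);

% Values out of order: health status first, age last.
try
    T(end+1,:) = {"Poor",true,[130 84],37};
catch err
    disp(err.message)   % report the problem instead of stopping the script
end

% Values in the same order as the table variables. The text "Poor" is
% accepted because it names an existing category.
T(end+1,:) = {37,true,[130 84],"Poor"};
```

After the second assignment, the table has one additional row whose values match the data type of each variable.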

Access Data Using Numeric or Named Indexing

You can index into a table using parentheses, curly braces, or dot notation. Parentheses allow you to select a subset of the data in a table and preserve the table container. Curly braces and dot notation allow you to extract data from a table. Within each table indexing method, you can specify the rows or variables to access by name or by numeric index.

Consider the sample table from above. Each row in the table, T, represents a different patient. The workspace variable, LastName, contains unique identifiers for the 100 rows. Add row names to the table by setting the RowNames property to LastName and display the first five rows of the updated table.

T.Properties.RowNames = LastName;
T(1:5,:)
ans=5×4 table
                Age    Smoker    BloodPressure    SelfAssessedHealthStatus
                ___    ______    _____________    ________________________

    Smith       38     true       124     93             Excellent        
    Johnson     43     false      109     77             Fair             
    Williams    38     false      125     83             Good             
    Jones       40     false      117     75             Fair             
    Brown       49     false      122     80             Good             

In addition to labeling the data, you can use row and variable names to access data in the table. For example, use named indexing to display the age and blood pressure of the patients Williams and Brown.

T(["Williams","Brown"],["Age","BloodPressure"])
ans=2×2 table
                Age    BloodPressure
                ___    _____________

    Williams    38      125     83  
    Brown       49      122     80  

Now, use numeric indexing to return an equivalent subtable. Return the third and fifth rows from the first and third variables.

T([3 5],[1 3])
ans=2×2 table
                Age    BloodPressure
                ___    _____________

    Williams    38      125     83  
    Brown       49      122     80  

For more information on table indexing, see Access Data in Tables.
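The examples in this section use parentheses only. As a brief sketch of the other indexing forms, the following code contrasts parentheses, curly braces, and dot notation on a smaller table built from the same sample data. The names sub, bp, ages, and smokers are illustrative.

```matlab
load patients
BloodPressure = [Systolic Diastolic];
T = table(Age,Smoker,BloodPressure);
T.Properties.RowNames = LastName;

% Parentheses keep the table container.
sub = T(["Williams","Brown"],"Age");             % 2-by-1 table

% Curly braces extract the contents as an ordinary array.
bp = T{["Williams","Brown"],"BloodPressure"};    % 2-by-2 double

% Dot notation extracts one whole variable.
ages = T.Age;                                    % 100-by-1 double

% A logical condition on one variable can also select rows.
smokers = T(T.Smoker,:);
```

Use parentheses when the result should remain a table, and curly braces or dot notation when a function needs the underlying array.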

Describe Data with Table Properties

In addition to storing data, tables have properties to store metadata, such as variable names, row names, descriptions, and variable units. You can access a property using T.Properties.PropName, where T is the name of the table and PropName is the name of a table property.

For example, add a table description, variable descriptions, and variable units for Age.

T.Properties.Description = "Simulated Patient Data";

T.Properties.VariableDescriptions = ...
    ["" ...
     "true or false" ...
     "Systolic/Diastolic" ...
     "Status Reported by Patient"];

T.Properties.VariableUnits("Age") = "Yrs";

Individual empty strings within VariableDescriptions indicate that the corresponding variable does not have a description. For more information, see the Properties section of table.
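Properties can be read with the same T.Properties.PropName syntax that sets them. The sketch below, which assumes the patients.mat sample file is available, converts each property to a string array so that the values are easy to compare. The names varNames, units, and descr are illustrative.

```matlab
load patients
T = table(Age,Smoker);

% Set metadata for a two-variable table.
T.Properties.VariableUnits("Age") = "Yrs";
T.Properties.VariableDescriptions = ["" "true or false"];

% Read the metadata back, one element per table variable.
varNames = string(T.Properties.VariableNames);
units    = string(T.Properties.VariableUnits);
descr    = string(T.Properties.VariableDescriptions);
```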

To print a table summary, use the summary function.

summary(T)
T: 100x4 table

Description: Simulated Patient Data

Variables:

    Age: double (Yrs)
    Smoker: logical (34 true, true or false)
    BloodPressure: 2-column double (Systolic/Diastolic)
    SelfAssessedHealthStatus: categorical (4 categories, Status Reported by Patient)

Statistics for applicable variables:

                                NumMissing      Min        Median         Max          Mean           Std    

    Age                             0            25             39         50         38.2800        7.2154  
    BloodPressure(:,1)              0           109            122        138        122.7800        6.7128  
    BloodPressure(:,2)              0            68        81.5000         99         82.9600        6.9325  
    SelfAssessedHealthStatus        0                                                                        

Comparison to Cell Arrays

Like a table, a cell array can store mixed-type data in a single container. Unlike a table, however, a cell array provides no metadata that describes its contents, does not force the data in its columns to remain homogeneous, and does not let you access its contents by row name or column name.

For example, convert T to a cell array using the table2cell function. The output cell array contains the same data but has no information about that data. If it is important to keep such information attached to your data, then storing it in a table is a better choice than storing it in a cell array.

C = table2cell(T)
C=100×4 cell array
    {[38]}    {[1]}    {[124 93]}    {[Excellent]}
    {[43]}    {[0]}    {[109 77]}    {[Fair     ]}
    {[38]}    {[0]}    {[125 83]}    {[Good     ]}
    {[40]}    {[0]}    {[117 75]}    {[Fair     ]}
    {[49]}    {[0]}    {[122 80]}    {[Good     ]}
    {[46]}    {[0]}    {[121 70]}    {[Good     ]}
    {[33]}    {[1]}    {[130 88]}    {[Good     ]}
    {[40]}    {[0]}    {[115 82]}    {[Good     ]}
    {[28]}    {[0]}    {[115 78]}    {[Excellent]}
    {[31]}    {[0]}    {[118 86]}    {[Excellent]}
    {[45]}    {[0]}    {[114 77]}    {[Excellent]}
    {[42]}    {[0]}    {[115 68]}    {[Poor     ]}
    {[25]}    {[0]}    {[127 74]}    {[Poor     ]}
    {[39]}    {[1]}    {[130 95]}    {[Excellent]}
    {[36]}    {[0]}    {[114 79]}    {[Good     ]}
    {[48]}    {[1]}    {[130 92]}    {[Good     ]}
      ⋮

To access subsets of data in a cell array, you can only use indexing with parentheses or curly braces.

C(1:5,1:3)
ans=5×3 cell array
    {[38]}    {[1]}    {[124 93]}
    {[43]}    {[0]}    {[109 77]}
    {[38]}    {[0]}    {[125 83]}
    {[40]}    {[0]}    {[117 75]}
    {[49]}    {[0]}    {[122 80]}
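The following sketch illustrates two of the differences described above: a cell array column can become mixed-type without any error, and the variable names must be supplied again to rebuild a table from the cells. It assumes the patients.mat sample file is available and uses the cell2table function, which this topic does not otherwise cover. The names firstTwoClasses and T2 are illustrative.

```matlab
load patients
T = table(Age,Smoker);
C = table2cell(T);

% A cell accepts any value, so a column can silently become mixed-type.
C{1,1} = "thirty-eight";
firstTwoClasses = {class(C{1,1}), class(C{2,1})};

% Restore the numeric value, then rebuild a table. The names are gone
% from C, so they must be provided again.
C{1,1} = T.Age(1);
T2 = cell2table(C,VariableNames=["Age","Smoker"]);
```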

Comparison to Structures

Structures can also store mixed-type data. A structure has fields that you can access by name, just as you can access table variables by name. However, a structure does not force the data in its fields to remain homogeneous, and it provides no metadata to describe its contents.

For example, convert T to a scalar structure where every field is an array, in a way that resembles table variables. Use the table2struct function with the ToScalar name-value argument.

S = table2struct(T,ToScalar=true)
S = struct with fields:
                         Age: [100x1 double]
                      Smoker: [100x1 logical]
               BloodPressure: [100x2 double]
    SelfAssessedHealthStatus: [100x1 categorical]

In this structure, you can access arrays of data by using field names.

S.Age
ans = 100×1

    38
    43
    38
    40
    49
    46
    33
    40
    28
    31
      ⋮

But to access subsets of data in the fields, you can only use numeric indices, and you can only access one field at a time. Table row and variable indexing provides more flexible access to data in a table.

S.Age(1:5)
ans = 5×1

    38
    43
    38
    40
    49
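As a sketch of these differences, the following code shows that the fields of a scalar structure are independent of one another, and that selecting the same rows from several variables takes one table expression but one statement per structure field. It assumes the patients.mat sample file is available. The names lengths, fromTable, ageRows, and smokerRows are illustrative.

```matlab
load patients
T = table(Age,Smoker);
S = table2struct(T,ToScalar=true);

% Nothing ties the fields together: growing one field leaves the other
% unchanged, so the "rows" no longer line up.
S.Age(end+1) = 50;
lengths = [numel(S.Age) numel(S.Smoker)];

% The same five rows from two variables: one expression for the table,
% one statement per field for the structure.
fromTable  = T(1:5,["Age","Smoker"]);
ageRows    = S.Age(1:5);
smokerRows = S.Smoker(1:5);
```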
