rowfun

Apply function to table or timetable rows

Description

B = rowfun(func,A) applies the function func to each row of the table or timetable A and returns the results in the table or timetable B.

func accepts size(A,2) inputs.

If A is a timetable and func aggregates data over groups of rows, then rowfun assigns the first row time from each group of rows in A as the corresponding row time in B. To return B as a table without row times, specify 'OutputFormat' as 'table'.
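The timetable behavior described above can be sketched as follows. This is a minimal illustration, not an official example: the variable names t, g, x, and meanX and the data values are made up for the sketch.

```matlab
% Hypothetical data: four hourly samples that fall into two groups.
t = datetime(2024,1,1) + hours([0; 1; 2; 3]);
g = [1; 2; 1; 2];
x = [10; 20; 30; 40];
TT = timetable(t, g, x);

% Grouped call: func receives the x values of one group at a time.
% B is a timetable; each row time is the first row time of its group.
B = rowfun(@mean, TT, 'GroupingVariables', 'g', ...
    'OutputVariableNames', 'meanX');

% Same computation, returned as a table without row times.
Btab = rowfun(@mean, TT, 'GroupingVariables', 'g', ...
    'OutputVariableNames', 'meanX', 'OutputFormat', 'table');
```

Requesting a table is useful when the first row time of a group is not a meaningful label for the aggregated value.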

B = rowfun(func,A,Name,Value) applies the function func to each row of the table or timetable A with additional options specified by one or more Name,Value pair arguments.

For example, you can specify which variables to pass to the function func and how to call func.

Examples

Apply the function hypot to each row of the 5-by-2 table A to compute the square root of the sum of squares of the variables x and y (the length of the hypotenuse).

Create a table, A, with two variables of numeric data.

rng('default')
x = randi(10,[5,1]);
y = randi(10,[5,1]);
A = table(x,y)
A=5×2 table
    x     y 
    __    __

     9     1
    10     3
     2     6
    10    10
     7    10

Apply the function, hypot, to each row of A. The function hypot takes two inputs and returns one output.

B = rowfun(@hypot,A,'OutputVariableNames','z')
B=5×1 table
      z   
    ______

    9.0554
     10.44
    6.3246
    14.142
    12.207

B is a table.

Append the function output, B, to the input table, A.

[A B]
ans=5×3 table
    x     y       z   
    __    __    ______

     9     1    9.0554
    10     3     10.44
     2     6    6.3246
    10    10    14.142
     7    10    12.207

Define and apply a geometric Brownian motion model to a range of parameters.

Create a function in a file named gbmSim.m that contains the following code.

% Copyright 2015 The MathWorks, Inc.

function [m,mtrue,s,strue] = gbmSim(mu,sigma)
% Discrete approximation to geometric Brownian motion
%
% [m,mtrue,s,strue] = gbmSim(mu,sigma) computes the 
% simulated mean, true mean, simulated standard deviation, 
% and true standard deviation based on the parameters mu and sigma.
numReplicates = 1000; numSteps = 100;
y0 = 1;
t1 = 1;
dt = t1 / numSteps;
y1 = y0*prod(1 + mu*dt + sigma*sqrt(dt)*randn(numSteps,numReplicates));
m = mean(y1); s = std(y1);

% Theoretical values
mtrue = y0 * exp(mu*t1); strue = mtrue * sqrt(exp(sigma^2*t1) - 1);
end

gbmSim accepts two inputs, mu and sigma, and returns four outputs, m, mtrue, s, and strue.

Define the table, params, containing the parameters to pass to the geometric Brownian motion model.

mu = [-.5; -.25; 0; .25; .5];
sigma = [.1; .2; .3; .2; .1];

params = table(mu,sigma)
params =

  5x2 table

     mu      sigma
    _____    _____

     -0.5     0.1 
    -0.25     0.2 
        0     0.3 
     0.25     0.2 
      0.5     0.1 

Apply the function, gbmSim, to the rows of the table, params.

stats = rowfun(@gbmSim,params,...
    'OutputVariableNames',...
    {'simulatedMean' 'trueMean' 'simulatedStd' 'trueStd'})
stats =

  5x4 table

    simulatedMean    trueMean    simulatedStd    trueStd 
    _____________    ________    ____________    ________

       0.60501       0.60653       0.05808       0.060805
       0.77916        0.7788         0.161        0.15733
        1.0024             1        0.3048        0.30688
        1.2795         1.284       0.25851        0.25939
        1.6498        1.6487       0.16285        0.16529

The four variable names specified by the 'OutputVariableNames' name-value pair argument indicate that rowfun should obtain four outputs from gbmSim. You can specify fewer output variable names to return fewer outputs from gbmSim.

Append the function output, stats, to the input, params.

[params stats]
ans =

  5x6 table

     mu      sigma    simulatedMean    trueMean    simulatedStd    trueStd 
    _____    _____    _____________    ________    ____________    ________

     -0.5     0.1        0.60501       0.60653       0.05808       0.060805
    -0.25     0.2        0.77916        0.7788         0.161        0.15733
        0     0.3         1.0024             1        0.3048        0.30688
     0.25     0.2         1.2795         1.284       0.25851        0.25939
      0.5     0.1         1.6498        1.6487       0.16285        0.16529

Create a table, A, where g is a grouping variable.

rng('default')
g = randi(3,[15,1]);
x = rand([15,1]);
y = rand([15,1]);

A = table(g,x,y)
A=15×3 table
    g       x           y    
    _    ________    ________

    3     0.14189     0.70605
    3     0.42176    0.031833
    1     0.91574     0.27692
    3     0.79221    0.046171
    2     0.95949    0.097132
    1     0.65574     0.82346
    1    0.035712     0.69483
    2     0.84913      0.3171
    3     0.93399     0.95022
    3     0.67874    0.034446
    1     0.75774     0.43874
    3     0.74313     0.38156
    3     0.39223     0.76552
    2     0.65548      0.7952
    3     0.17119     0.18687

Define the anonymous function, func, to compute the average difference between x and y.

func = @(x,y) mean(x-y);

Find the average difference between variables in groups 1, 2, and 3 defined by the grouping variable, g.

B = rowfun(func,A,...
    'GroupingVariables','g',...
    'OutputVariableNames','MeanDiff')
B=3×3 table
    g    GroupCount    MeanDiff
    _    __________    ________

    1        4         0.032744
    2        3          0.41822
    3        8          0.14656

The variable GroupCount indicates the number of rows in A for each group.

Input Arguments

Function, specified as a function handle. You can define the function in a file or as an anonymous function. If func corresponds to more than one function file (that is, if func represents a set of overloaded functions), MATLAB® determines which function to call based on the class of the input arguments.

func can accept no more than size(A,2) inputs. By default, rowfun returns the first output of func. To return more than one output from func, use the 'NumOutputs' or 'OutputVariableNames' name-value pair arguments.

Example: func = @(x,y) x.^2+y.^2; takes two inputs and finds the sum of the squares.

Input table, specified as a table or a timetable.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'InputVariables',2 uses only the second variable in A as an input to func.

Specifiers for selecting variables of A to pass to func, specified as the comma-separated pair consisting of 'InputVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, logical vector, or a function handle.

If you specify 'InputVariables' as a function handle, then it must return a logical scalar, and rowfun passes only the variables in A where the function returns 1 (true).
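As a rough sketch of two ways to select inputs, the following passes only the numeric variables of a small table to func, first by name and then with a function handle. The table contents and the names x, y, label, and s are assumptions made for the illustration.

```matlab
% Hypothetical table with two numeric variables and one cell variable.
A = table([1; 2; 3], [4; 5; 6], {'a'; 'b'; 'c'}, ...
    'VariableNames', {'x', 'y', 'label'});

% Select inputs by name: plus receives x and y from each row.
B1 = rowfun(@plus, A, 'InputVariables', {'x', 'y'}, ...
    'OutputVariableNames', 's');

% Select inputs with a function handle: isnumeric returns true for
% x and y and false for label, so only x and y are passed to plus.
B2 = rowfun(@plus, A, 'InputVariables', @isnumeric, ...
    'OutputVariableNames', 's');
```

Both calls should produce the same single-variable table, because the same two variables are selected either way.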

One or more variables in A that define groups of rows, specified as the comma-separated pair consisting of 'GroupingVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector.

The value of 'GroupingVariables' specifies which table variables are the grouping variables, not their data types. A grouping variable can be numeric, or have data type categorical, calendarDuration, datetime, duration, logical, or string.

Rows in A that have the same grouping variable values belong to the same group. rowfun applies func to each group of rows, rather than separately to each row of A. The output, B, contains one row for each group.

If any grouping variable contains NaNs or missing values (such as NaTs, undefined categorical values, or missing strings), then the corresponding rows do not belong to any group, and are excluded from the output.
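The exclusion of rows with missing group values can be illustrated with a small sketch. The data and the names g, x, and meanX are hypothetical.

```matlab
% Hypothetical data: the second row has a missing (NaN) group value.
g = [1; NaN; 1; 2];
x = [10; 99; 30; 40];
A = table(g, x);

% The row with g = NaN belongs to no group, so the value 99
% does not contribute to any group mean.
B = rowfun(@mean, A, 'GroupingVariables', 'g', ...
    'OutputVariableNames', 'meanX');
```

Here B has one row per remaining group, and GroupCount counts only the rows that were assigned to a group.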

Row labels can be grouping variables. You can group on row labels alone, on one or more variables in A, or on row labels and variables together.

  • If A is a table, then the labels are row names.

  • If A is a timetable, then the labels are row times.

The output, B, has one row for each group of rows from the input, A.

  • If you specify 'OutputFormat','uniform' or 'OutputFormat','cell', then the output has one or more columns corresponding to the input table variables that func was applied to.

  • If you specify 'OutputFormat','table' or 'OutputFormat','timetable', then the output has:

    • One or more variables corresponding to the outputs of func.

    • Variables corresponding to the grouping variables.

    • A new variable, GroupCount, whose values are the number of rows of the input A that are in each group.

Indicator for calling func with separate inputs, specified as the comma-separated pair consisting of 'SeparateInputs' and either true, false, 1, or 0.

true

func expects separate inputs. rowfun calls func with size(A,2) inputs, one argument for each data variable.

This is the default behavior.

false

func expects one vector containing all inputs. rowfun creates the input vector to func by concatenating the values in each row of A.
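A minimal sketch of the 'SeparateInputs',false case, assuming a table whose variables are all numeric so that each row concatenates into a numeric row vector. The names x, y, and total are made up for the example.

```matlab
% Hypothetical numeric table.
A = table([1; 2; 3], [4; 5; 6], 'VariableNames', {'x', 'y'});

% With 'SeparateInputs',false, func receives a single input per row,
% here the 1-by-2 vector [x y], rather than x and y as two arguments.
B = rowfun(@(v) sum(v), A, 'SeparateInputs', false, ...
    'OutputVariableNames', 'total');
```

This form is convenient when func is written to operate on one vector and the number of table variables may change.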

Indicator to pass values from cell variables to func, specified as the comma-separated pair consisting of 'ExtractCellContents' and either false, true, 0, or 1.

true

rowfun extracts the contents of a variable in A whose data type is cell and passes the values, rather than the cells, to func.

For grouped computation, the values within each group in a cell variable must allow vertical concatenation.

false

rowfun passes the cells of a variable in A whose data type is cell to func.

This is the default behavior.
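The difference between the two settings can be seen in a small sketch that counts elements of each input. The variable names names and n and the data are assumptions for illustration.

```matlab
% Hypothetical table with one cell variable of character vectors.
names = {'ab'; 'cde'; 'f'};
T = table(names);

% true: func receives the character vector stored in each cell,
% so numel returns the length of each name.
B1 = rowfun(@numel, T, 'ExtractCellContents', true, ...
    'OutputVariableNames', 'n');

% false (default): func receives the 1-by-1 cell itself,
% so numel returns 1 for every row.
B0 = rowfun(@numel, T, 'ExtractCellContents', false, ...
    'OutputVariableNames', 'n');
```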

Variable names for outputs of func, specified as the comma-separated pair consisting of 'OutputVariableNames' and a character vector, cell array of character vectors, or string array, with names that are nonempty and distinct. The number of names must equal the number of outputs desired from func.

Furthermore, the variable names must be valid MATLAB identifiers. If valid MATLAB identifiers are not available for use as variable names, MATLAB uses a cell array of N character vectors of the form {'Var1' ... 'VarN'} where N is the number of variables. You can determine valid MATLAB variable names using the function isvarname.

Number of outputs from func, specified as the comma-separated pair consisting of 'NumOutputs' and 0 or a positive integer. The integer must be less than or equal to the possible number of outputs from func.

Example: 'NumOutputs',2 causes rowfun to call func with two outputs.
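As an illustrative sketch, the following requests both outputs of max for each row of a table that holds one matrix variable. The matrix M is made-up data, and the outputs are read by position so that the sketch does not depend on the default output variable names.

```matlab
% Hypothetical table with a single 3-by-3 matrix variable.
M = [3 9 2; 7 1 8; 4 4 6];
T = table(M);

% Each call receives one row of M as a row vector and returns
% two outputs: the maximum value and its index within the row.
B = rowfun(@max, T, 'NumOutputs', 2);

maxVals = B{:, 1};   % first output of max for each row
maxIdx  = B{:, 2};   % second output of max for each row
```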

Format of B, specified as the comma-separated pair consisting of 'OutputFormat' and one of the values 'auto', 'table', 'timetable', 'uniform', or 'cell'.

'auto' (default) (since R2023a)

rowfun returns an output whose data type matches the data type of the input A.

'table'

rowfun returns a table with one variable for each output of func. For grouped computation, B also contains the grouping variables.

'table' allows you to use a function that returns values of different sizes or data types. However, for ungrouped computation, all of the outputs from func must have one row each time it is called. For grouped computation, all of the outputs from func must have the same number of rows.

If A is a table, then this is the default output format.

'timetable'

rowfun returns a timetable with one variable for each output of func. For grouped computation, B also contains the grouping variables.

rowfun creates the row times of B from the row times of A. If the row times assigned to B do not make sense in the context of the calculations performed using func, then specify the output format as 'OutputFormat','table'.

If A is a timetable, then this is the default output format.

'uniform'

rowfun concatenates the values returned by func into a vector. All of the outputs from func must be scalars with the same data type.

'cell'

rowfun returns the output as a cell array. 'cell' allows you to use a function that returns values of different sizes or data types.
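A short sketch contrasting the 'uniform' and 'cell' formats, using made-up data and the hypothetical variable names x and y. The first function returns a scalar per row, so 'uniform' applies; the second returns vectors of different lengths, so 'cell' is used instead.

```matlab
% Hypothetical numeric table.
A = table([3; 5; 8], [4; 12; 15], 'VariableNames', {'x', 'y'});

% 'uniform': hypot returns one scalar per row, and rowfun
% concatenates the scalars into a numeric column vector.
v = rowfun(@hypot, A, 'OutputFormat', 'uniform');

% 'cell': x:y has a different length in each row, so the
% results are collected in a cell array, one cell per row.
c = rowfun(@(x, y) x:y, A, 'OutputFormat', 'cell');
```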

Function to call if func fails, specified as the comma-separated pair consisting of 'ErrorHandler' and a function handle. Define this function so that it rethrows the error or returns valid outputs for function func.

MATLAB calls the specified error-handling function with two input arguments:

  • A structure with these fields:

    identifier

    Error identifier.

    message

    Error message text.

    index

    Row or group index at which the error occurred.

  • The set of input arguments to function func at the time of the error.

For example,

function [A, B] = errorFunc(S, varargin)
warning(S.identifier, S.message);
A = NaN; B = NaN;
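
An error handler can also be an anonymous function. The following is a minimal sketch, assuming a lookup vector and a table of indices invented for the illustration: the second index is out of range, so func fails for that row and the handler supplies NaN instead.

```matlab
% Hypothetical lookup vector and table of indices.
lookup = [10 20 30];
T = table([1; 5; 2], 'VariableNames', {'k'});

% The handler ignores the error structure S and the original inputs
% and returns NaN as a valid replacement output.
handler = @(S, varargin) NaN;

% lookup(5) throws an indexing error, so the second result
% comes from the handler rather than from func.
vals = rowfun(@(k) lookup(k), T, 'ErrorHandler', handler, ...
    'OutputFormat', 'uniform');
```

Returning NaN keeps the output the same size as the input, which makes failed rows easy to locate afterward with isnan.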

Output Arguments

Output table, returned as a table or a timetable. B can store metadata such as descriptions, variable units, variable names, and row names. For more information, see the Properties sections of table or timetable.

Version History

Introduced in R2013b
