coder.loop.Control

Loop optimization control object

Since R2023a

    Description

    Use instances of the coder.loop.Control class to optimize MATLAB® for loops in the generated code.

    Creation

    Description

    loopSchedule = coder.loop.Control creates a loop control object without the transformSchedule property set.

    Use the object functions to add loop transforms to the loop control object. Provide the loop index names as inputs to the object functions.

    Properties

    transformSchedule

    Loop transform, specified as a coder.loop.Control object.

    This property contains its own transformSchedule property, which is itself a coder.loop.Control object. Use the object functions listed below to add more transforms to your top-level object. Subsequent transforms are stored recursively in the nested transformSchedule properties.

    Object Functions

    apply

    loopSchedule.apply applies the loop transformations contained in the loop control object to the for loop nest that immediately follows the apply call.

    interchange

    loopSchedule = loopSchedule.interchange('loopA_ID','loopB_ID') adds an interchange transform to the loop control object for loops with loop index names loopA_ID and loopB_ID. This prompts the code generator to interchange the loops in the generated code.

    Use this transform when the loops access array elements stored in contiguous memory blocks. With a suitable loop order, the data loaded into cache can be reused without a cache refresh.

    For example, see Interchange for Loops in Generated Code.
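
    The effect of an interchange on memory access can be sketched in C. This is a hand-written illustration, not output of the code generator; the function names are hypothetical, and the matrix is assumed to be stored in column-major order, as MATLAB arrays are.

```c
/* Both functions scale a column-major rows-by-cols matrix by s. */

/* Row loop outside, column loop inside: consecutive inner iterations
   touch elements that are `rows` doubles apart in memory. */
void scale_row_outer(double *a, int rows, int cols, double s)
{
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            a[i + rows * j] *= s;
        }
    }
}

/* Interchanged nest: consecutive inner iterations touch adjacent
   elements, so each cache line that is loaded gets fully used. */
void scale_col_outer(double *a, int rows, int cols, double s)
{
    for (int j = 0; j < cols; j++) {
        for (int i = 0; i < rows; i++) {
            a[i + rows * j] *= s;
        }
    }
}
```

    Both versions compute the same result; only the order in which memory is visited differs.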

    parallelize

    loopSchedule = loopSchedule.parallelize('loopID') adds a parallelize transform to the loop control object for the loop with index name loopID.

    This prompts the code generator to execute the iterations of that loop in parallel, using the threads available on your target. This transform requires EnableOpenMP to be set to true in your code configuration object.

    For example, see Selectively Parallelize for Loops in Generated Code.
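
    As a rough picture of what a parallelized loop looks like in C, the sketch below marks a loop with an OpenMP pragma. It is a hand-written illustration with a hypothetical function name, not output of the code generator. The pragma is guarded so that the code still compiles and runs serially when OpenMP is not enabled.

```c
/* Element-wise product. Each iteration writes a distinct out[i],
   so the iterations are independent and safe to run in parallel. */
void multiply_elements(const double *u, const double *v, double *out, int n)
{
    int i;
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < n; i++) {
        out[i] = u[i] * v[i];
    }
}
```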

    reverse

    loopSchedule = loopSchedule.reverse('loopID') adds a reverse transform to the loop control object for the loop with index name loopID.

    This prompts the code generator to execute the iterations of that loop in reverse order. Use this transform when you know the upper bound of the loop iterator.

    For example, see Reverse for-Loop Iteration Order in Generated Code.
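
    A reversed loop visits the same iterations from the upper bound down to the lower bound. The C sketch below is a hand-written illustration with hypothetical function names, not output of the code generator. It assumes the iterations do not depend on one another, so both orders give the same result.

```c
/* Forward order: i = 0, 1, ..., n-1. */
void square_forward(const int *in, int *out, int n)
{
    for (int i = 0; i < n; i++) {
        out[i] = in[i] * in[i];
    }
}

/* Reversed order: i = n-1, n-2, ..., 0. The upper bound n must be
   known before the loop starts. */
void square_reversed(const int *in, int *out, int n)
{
    for (int i = n - 1; i >= 0; i--) {
        out[i] = in[i] * in[i];
    }
}
```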

    tile

    loopSchedule = loopSchedule.tile('loopID','tileSize','newTileID') adds a tile transform to the loop control object for the loop with index name loopID.

    This prompts the code generator to split that loop into two loops: an outer loop with index newTileID that is tiled according to the tileSize value, and an inner loop with index loopID that iterates within each tile.

    Use this transform to divide the iteration space of a loop into smaller blocks. This involves partitioning a large array in memory into smaller blocks that fit in your cache. Use this transform when you have limited cache availability.

    For example, see Apply Tile Transform to for-Loop in Generated Code.
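
    The shape of a tiled loop can be sketched in C as follows. This is a hand-written illustration, not output of the code generator, and the function name is hypothetical. Here ii plays the role of newTileID, i plays the role of loopID, and tile_size corresponds to tileSize.

```c
/* Scale n elements by s, processing them in blocks of tile_size. */
void scale_tiled(double *x, int n, int tile_size, double s)
{
    /* Outer loop: steps from one tile to the next. */
    for (int ii = 0; ii < n; ii += tile_size) {
        /* Clamp the last tile when n is not a multiple of tile_size. */
        int tile_end = (ii + tile_size < n) ? ii + tile_size : n;

        /* Inner loop: visits the elements inside the current tile. */
        for (int i = ii; i < tile_end; i++) {
            x[i] *= s;
        }
    }
}
```

    Every element is still visited exactly once; tiling changes only how the iterations are grouped.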

    unrollAndJam

    loopSchedule = loopSchedule.unrollAndJam('loopID','unrollFactor') adds an unroll and jam transform to the loop control object for the loop with index name loopID. This prompts the code generator to unroll and jam that loop according to the unrollFactor value.

    Unroll and jam transforms are usually applied to perfectly nested loops, or where all the data elements are accessed within the inner loop. This transform unrolls the body of the inner loop according to the loop index of the outer loop. The default value of the unrollFactor is 2.

    For example, see Apply unrollAndJam on a for-Loop in the Generated Code.
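
    The C sketch below illustrates unroll and jam with a factor of 2 on a matrix-vector product. It is a hand-written illustration with hypothetical function names, not output of the code generator, and it assumes row-major storage for simplicity. The outer loop advances two rows at a time, and the two copies of the inner loop are fused into one.

```c
/* Reference version: y = A*x for a row-major rows-by-cols matrix. */
void matvec_plain(const double *a, const double *x, double *y,
                  int rows, int cols)
{
    for (int i = 0; i < rows; i++) {
        y[i] = 0.0;
        for (int j = 0; j < cols; j++) {
            y[i] += a[i * cols + j] * x[j];
        }
    }
}

/* Unroll and jam by 2: the inner body is duplicated for rows i and
   i+1, so each x[j] that is loaded is used twice. */
void matvec_unroll_jam2(const double *a, const double *x, double *y,
                        int rows, int cols)
{
    int i = 0;
    for (; i + 1 < rows; i += 2) {
        y[i] = 0.0;
        y[i + 1] = 0.0;
        for (int j = 0; j < cols; j++) {
            y[i] += a[i * cols + j] * x[j];
            y[i + 1] += a[(i + 1) * cols + j] * x[j];
        }
    }
    /* Remainder iteration when rows is odd. */
    for (; i < rows; i++) {
        y[i] = 0.0;
        for (int j = 0; j < cols; j++) {
            y[i] += a[i * cols + j] * x[j];
        }
    }
}
```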

    vectorize

    loopSchedule = loopSchedule.vectorize('loopID') adds a vectorize transform to the loop control object for the loop with index name loopID.

    This prompts the code generator to use the SIMD instruction set for your target hardware in the generated code. Set the InstructionSetExtensions property in your code configuration object according to your hardware requirements to apply this transform.

    For example, see Vectorize for Loop in the Generated Code.

    Examples

    Use the parallelize and interchange object functions of the coder.loop.Control object to apply parallelize and interchange transforms to for loops in the generated code.

    Define a function forLoopParallelize. Within the function, create a coder.loop.Control object and add a parallelize transform to the for loop with index name i.

    function [out] = forLoopParallelize(u,v)
    %#codegen
    row = size(u,1);
    col = size(u,2);
    out = zeros(row, col);
    
    schedule = coder.loop.Control;
    schedule = schedule.parallelize('i');
    schedule.apply;
    
    for i = 1:row
        for j = 1:col
            out(i,j) = out(i,j) + u(i, j) * v(i, j);
        end
    end
    
    end

    Generate code for this function by running the following command:

    codegen -config:lib forLoopParallelize -args ...
    {reshape(1:100,[10,10]), 2.*reshape(1:100,[10,10])} -launchreport

    Inspect the generated code in the report to see the parallelized loop. Notice that the code generator uses OpenMP when applicable.

    void forLoopParallelize(const double u[100], const double v[100],
                            double out[100])
    {
      int i;
      int j;
      int out_tmp;
      if (!isInitialized_forLoopParallelize) {
        forLoopParallelize_initialize();
      }
      memset(&out[0], 0, 100U * sizeof(double));
    #pragma omp parallel for num_threads(omp_get_max_threads()) private(j, out_tmp)
    
      for (i = 0; i < 10; i++) {
        for (j = 0; j < 10; j++) {
          out_tmp = i + 10 * j;
          out[out_tmp] += u[out_tmp] * v[out_tmp];
        }
      }
    }

    Define a function forLoopInterchange. Create a coder.loop.Control object and add an interchange transform to the for-loops with indices i and j.

    function out = forLoopInterchange()
    
    out = zeros(100,70);
    
    
    schedule = coder.loop.Control;
    schedule = schedule.interchange('i','j');
    schedule.apply;
    
    for i = 1:100
        for j = 1:70
            out(i,j) = out(i,j) + i*j;
        end
    end

    Generate code for this function by running the following command:

    codegen -config:lib forLoopInterchange -launchreport
    

    Inspect the generated code in the report to see the interchanged loops. Notice that the code generator swaps the two for-loops, so the loop over j is now the outer loop and the loop over i is the inner loop.

    void forLoopInterchange(double out[7000])
    {
      int i;
      int j;
      memset(&out[0], 0, 7000U * sizeof(double));
      for (j = 0; j < 70; j++) {
        for (i = 0; i < 100; i++) {
          int out_tmp;
          out_tmp = i + 100 * j;
          out[out_tmp] += (double)((i + 1) * (j + 1));
        }
      }
    }

    Version History

    Introduced in R2023a