Workflow for Generating a Multithreaded MEX File using dspunfold

  1. Run the entry-point MATLAB® function with the inputs that you want to test. Make sure that the function has no runtime errors. Call codegen on the function and make sure that it generates a MEX file successfully.

  2. Generate the multithreaded MEX file using dspunfold. Specify a state length using the -s option. The state length must be at least as long as the state length of the algorithm in the MATLAB function. By default, -s is set to 0, indicating that the algorithm is stateless.

  3. Run the generated analyzer function. Use the Pass field of its output to verify that the output results of the multithreaded MEX file and the single-threaded MEX file match. Also, check whether the speedup and latency displayed by the analyzer function are satisfactory.

  4. If the output results do not match, increase the state length and generate the multithreaded MEX file again. Alternatively, use the automatic state length detection (specified using -s auto) to determine the minimum state length that matches the outputs.

  5. If the output results match but the speedup and latency are not satisfactory, increase the repetition factor using -r or increase the number of threads using -t. In addition, you can adjust the state length. Adjust the dspunfold options and generate new multithreaded MEX files until you are satisfied with the results.

For best practices for generating the multithreaded MEX file using dspunfold, see the 'Tips' section of dspunfold.
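
The decision logic in steps 3 to 5 can be sketched as a short script. This is a minimal sketch, not part of dspunfold. The thresholds minSpeedup and maxLatency are hypothetical targets, and the result struct is filled in by hand with the field names that the analyzer returns (Latency, Speedup, Pass), using the values from the first analyzer run in the workflow example on this page.

```matlab
% Analyzer result, entered by hand for illustration (first run, -s 32).
result = struct('Latency', 8, 'Speedup', 3.4686, 'Pass', 0);

% Hypothetical acceptance targets; choose values that suit your system.
minSpeedup = 2;     % required speedup factor
maxLatency = 16;    % tolerated latency, in frames

if ~result.Pass
    % Step 4: outputs differ, so the state length is too small.
    action = 'increase state length (-s) or use -s auto';
elseif result.Speedup < minSpeedup || result.Latency > maxLatency
    % Step 5: outputs match, but performance is not acceptable yet.
    action = 'adjust -r, -t, or -s and regenerate';
else
    action = 'done';
end

disp(action)
```

In practice, replace the hand-written struct with the value returned by the generated analyzer function.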

Workflow Example

Run the Entry Point MATLAB Function

Create the entry-point MATLAB function.

function [y,mse] = AdaptiveFilter(x,noise)

persistent rlsf1 ffilt noise_var
if isempty(rlsf1)
    rlsf1 = dsp.RLSFilter(32, 'ForgettingFactor', 0.98);
    ffilt = dsp.FIRFilter('Numerator',fir1(32, .25)); % Unknown System
    noise_var = 1e-4;
end

d = ffilt(x) + noise_var * noise; % desired signal
[y,e] = rlsf1(x, d);

mse = 10*log10(sum(e.^2));
end

The function models an RLS filter that filters the input signal x, using d as the desired signal. The RLS filter produces the filtered output y and the filter error e. The function returns y and mse, the energy of e expressed in decibels.

Run AdaptiveFilter with the inputs that you want to test. Verify that the function runs without errors.

AdaptiveFilter(randn(1000,1), randn(1000,1));

Call codegen on AdaptiveFilter and generate a MEX file.

codegen AdaptiveFilter -args {randn(1000,1), randn(1000,1)}
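
Before moving on, it can be useful to confirm that the generated MEX file agrees with the MATLAB function. The following is a minimal sketch, assuming that codegen used its default output name, AdaptiveFilter_mex, and that a tolerance of 1e-8 is acceptable for this algorithm (the tolerance is an assumption, not a documented value). Both functions are cleared first so that their persistent filter objects start from the same initial state.

```matlab
% Use the same inputs for both implementations.
x     = randn(1000,1);
noise = randn(1000,1);

% Reset persistent state left over from earlier calls.
clear AdaptiveFilter AdaptiveFilter_mex

[yRef, mseRef] = AdaptiveFilter(x, noise);       % MATLAB reference
[yMex, mseMex] = AdaptiveFilter_mex(x, noise);   % generated MEX file

% Compare outputs with a small tolerance (assumed value).
tol     = 1e-8;
maxDiff = max(abs(yRef - yMex));
fprintf('Largest output difference: %g\n', maxDiff);
fprintf('Outputs agree within tolerance: %d\n', maxDiff < tol);
```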

Generate a Multithreaded MEX File Using dspunfold

Set the state length to 32 samples and the repetition factor to 1 (the default, so -r is omitted from the command). Provide a state length that is greater than or equal to the state length of the algorithm in the MATLAB function. When at least one entry of frameinputs is set to true, state length is considered in samples.

dspunfold AdaptiveFilter -args {randn(1000,1), randn(1000,1)} -s 32 -f true
Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the Generated Analyzer Function

The analyzer considers the actual values of the input. To increase the analyzer effectiveness, provide at least two different frames along the first dimension of the inputs. The call below passes four frames of 1000 samples each.

AdaptiveFilter_analyzer(randn(1000*4,1),randn(1000*4,1))
Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64  ... 
Latency = 8 frames
Speedup = 3.5x
Warning: The output results of the multi-threaded MEX file AdaptiveFilter_mt.mexw64 do not match 
the output results of the single-threaded MEX file AdaptiveFilter_st.mexw64. Check that you 
provided the correct state length value to the dspunfold function when you generated the 
multi-threaded MEX file AdaptiveFilter_mt.mexw64. For best practices and possible solutions to
this problem, see the 'Tips' section in the dspunfold function reference page. 
> In coder.internal.warning (line 8)
  In AdaptiveFilter_analyzer 

ans = 

    Latency: 8
    Speedup: 3.4686
       Pass: 0

Increase the State Length

The analyzer did not pass the verification. The warning message indicates that the state length value provided to the dspunfold function is incorrect. Increase the state length to 1000 samples and repeat the process from the previous section.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s 1000 -f true
Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the generated analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1),randn(1000*4,1))
Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64  ... 
Latency = 8 frames
Speedup = 1.8x

ans = 

    Latency: 8
    Speedup: 1.7778
       Pass: 1

The analyzer passed verification. Run the analyzer function with several different sets of input values and make sure that it passes each time.
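
One way to exercise the analyzer with different numerics is to call it in a loop with a different random seed each time. This is a minimal sketch, assuming that the analyzer files generated above are on the path. The trial count of 5 is arbitrary.

```matlab
numTrials = 5;                       % arbitrary number of repetitions
passed    = false(numTrials, 1);

for k = 1:numTrials
    rng(k);                          % different, reproducible inputs per trial
    result = AdaptiveFilter_analyzer(randn(1000*4,1), randn(1000*4,1));
    passed(k) = logical(result.Pass);
end

if all(passed)
    disp('Analyzer passed for every input set.');
else
    fprintf('Analyzer failed for trial(s): %s\n', mat2str(find(passed == 0)'));
end
```

If any trial fails, the state length is likely still too small for some inputs.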

Improve Speedup and Adjust Latency

If you want to increase speedup and your system can afford a larger latency, increase the repetition factor to 2.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s 1000 -r 2 -f true
Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

Run the analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1), randn(1000*4,1))
Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64  ... 
Latency = 16 frames
Speedup = 2.4x

ans = 

    Latency: 16
    Speedup: 2.3674
       Pass: 1

Repeat the process until you achieve satisfactory speedup and latency.
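
To judge whether a latency is affordable, it helps to convert frames into samples and seconds. The sketch below uses the frame size and the analyzer latencies reported on this page. The sample rate fs is a hypothetical value, because this page does not state one. The cross-check at the end relies on the relation latency = 2 x threads x repetition frames from the dspunfold reference page, with a thread count of 4 assumed here because it reproduces the 8 and 16 frames reported above.

```matlab
frameSize     = 1000;     % samples per frame, from the -args inputs
latencyFrames = [8 16];   % analyzer results for -r 1 and -r 2
fs            = 8000;     % Hz, hypothetical sample rate

latencySamples = latencyFrames * frameSize;   % [8000 16000]
latencySeconds = latencySamples / fs;         % [1 2]

% Cross-check against the documented relation (thread count is assumed).
threads    = 4;
repetition = [1 2];
predicted  = 2 * threads * repetition;        % [8 16]

fprintf('-r %d: %d frames = %d samples = %.2f s\n', ...
    [repetition; latencyFrames; latencySamples; latencySeconds]);
```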

Use Automatic State Length Detection

Choose a state length that is greater than or equal to the state length of your algorithm. If it is not easy to determine the state length for your algorithm analytically, use the automatic state length detection tool. Invoke automatic state length detection by setting -s to auto. The tool detects the minimum state length with which the analyzer passes the verification.

dspunfold AdaptiveFilter -args {randn(1000,1),randn(1000,1)} -s auto -f true
Analyzing input MATLAB function AdaptiveFilter
Creating single-threaded MEX file AdaptiveFilter_st.mexw64
Searching for minimal state length (this might take a while)
Checking stateless ... Insufficient
Checking 1000 ... Sufficient
Checking 500 ... Insufficient
Checking 750 ... Insufficient
Checking 875 ... Sufficient
Checking 812 ... Insufficient
Checking 843 ... Sufficient
Checking 827 ... Insufficient
Checking 835 ... Insufficient
Checking 839 ... Sufficient
Checking 837 ... Sufficient
Checking 836 ... Sufficient
Minimal state length is 836
Creating multi-threaded MEX file AdaptiveFilter_mt.mexw64
Creating analyzer file AdaptiveFilter_analyzer

The detected minimal state length is 836 samples.
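
The sequence of values in the log above is consistent with a bisection search between a known insufficient length (stateless, 0) and a known sufficient length (1000). The following sketch is an illustrative model of such a search, not the dspunfold implementation. The predicate isSufficient is a hypothetical stand-in for generating a multithreaded MEX file and checking it with the analyzer. It is hard-coded so that the script runs on its own.

```matlab
% Hypothetical stand-in for "generate MEX file and run analyzer".
isSufficient = @(s) s >= 836;

lo    = 0;       % known insufficient (stateless)
hi    = 1000;    % known sufficient
tried = [];      % state lengths checked, in order

while hi - lo > 1
    mid = floor((lo + hi) / 2);
    tried(end+1) = mid; %#ok<SAGROW>
    if isSufficient(mid)
        hi = mid;    % mid works, so search below it
    else
        lo = mid;    % mid fails, so search above it
    end
end

disp(tried)   % 500 750 875 812 843 827 835 839 837 836
disp(hi)      % 836
```

Each check in the real tool requires generating and testing a MEX file, which is why the search can take a while.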

Run the generated analyzer.

AdaptiveFilter_analyzer(randn(1000*4,1), randn(1000*4,1))
Analyzing multi-threaded MEX file AdaptiveFilter_mt.mexw64  ... 
Latency = 8 frames
Speedup = 1.9x

ans = 

    Latency: 8
    Speedup: 1.9137
       Pass: 1

The analyzer passed the verification.
