Accelerating BER Simulations Using the Parallel Computing Toolbox

This example uses Parallel Computing Toolbox™ to accelerate a simple QPSK bit error rate (BER) simulation. The system consists of a QPSK modulator, an AWGN channel, a QPSK demodulator, and a bit error rate counter.

Set the simulation parameters.

```
EbNoVec = 5:8;        % Eb/No values in dB
totalErrors = 200;    % Number of bit errors needed for each Eb/No value
totalBits = 1e7;      % Total number of bits transmitted for each Eb/No value
```

Allocate memory for the arrays that store the data generated by the function `helper_qpsk_sim_with_awgn`.

```
[numErrors, numBits] = deal(zeros(length(EbNoVec),1));
```

Run the simulation and measure the execution time. To establish baseline performance, this run uses a single processor, so it uses a standard `for`-loop.

```
tic
for idx = 1:length(EbNoVec)
    errorStats = helper_qpsk_sim_with_awgn(EbNoVec,idx, ...
        totalErrors,totalBits);
    numErrors(idx) = errorStats(idx,2);
    numBits(idx) = errorStats(idx,3);
end
simBaselineTime = toc;
```
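The helper `helper_qpsk_sim_with_awgn` is not listed in this example. The following is a minimal stand-in, not the shipped helper, written under several assumptions: the function name `sketch_qpsk_sim` is hypothetical, row `idx` of the output holds `[BER, numErrors, numBits]` (consistent with how the loop indexes `errorStats`), the simulation stops when either the error target or the bit limit is reached, and the QPSK mapping is Gray-coded. It uses only core MATLAB functions.

```matlab
function errorStats = sketch_qpsk_sim(EbNoVec, idx, maxNumErrs, maxNumBits)
% Hypothetical stand-in for helper_qpsk_sim_with_awgn (not the shipped helper).
% Row idx of errorStats holds [BER, numErrors, numBits]; other rows stay zero.
errorStats = zeros(numel(EbNoVec), 3);
EbNo = 10^(EbNoVec(idx)/10);     % Eb/No as a linear ratio
sigma = sqrt(1/(2*EbNo));        % noise std per dimension for unit-energy bits
bitsPerFrame = 2e4;              % must be even: two bits per QPSK symbol
numErrs = 0;
numBits = 0;
% Assumed stopping rule: enough errors collected or bit budget exhausted
while numErrs < maxNumErrs && numBits < maxNumBits
    bits = rand(bitsPerFrame, 1) > 0.5;
    % Gray-mapped QPSK: one bit on I and one on Q, each with amplitude +/-1
    tx = (1 - 2*bits(1:2:end)) + 1i*(1 - 2*bits(2:2:end));
    % AWGN channel: independent Gaussian noise on I and Q
    rx = tx + sigma*(randn(size(tx)) + 1i*randn(size(tx)));
    % Hard-decision demodulation: a negative component decodes to bit 1
    rxBits = zeros(bitsPerFrame, 1);
    rxBits(1:2:end) = real(rx) < 0;
    rxBits(2:2:end) = imag(rx) < 0;
    numErrs = numErrs + sum(rxBits ~= bits);
    numBits = numBits + bitsPerFrame;
end
errorStats(idx, :) = [numErrs/numBits, numErrs, numBits];
end
```

Saved as `sketch_qpsk_sim.m`, this function can be substituted for the helper call in the loop to experiment without the original file. Because it processes whole frames, the final counts can overshoot the requested targets slightly.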

Calculate the BER.

```
ber1 = numErrors ./ numBits;
```
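As a sanity check on the simulated values, the BER can be compared against the closed-form result for QPSK in AWGN. The sketch below assumes Gray-coded QPSK with coherent detection, which this example does not state explicitly; under that assumption the bit error probability is Q(sqrt(2*Eb/No)), which equals 0.5*erfc(sqrt(Eb/No)). The variable name `berTheory` is introduced here for illustration.

```matlab
EbNoVec = 5:8;                         % Eb/No values in dB, as in the example
EbNoLin = 10.^(EbNoVec/10);            % convert dB to a linear ratio
berTheory = 0.5*erfc(sqrt(EbNoLin));   % Q(sqrt(2*Eb/No)) written with erfc
```

With only 200 errors collected per point, the simulated BER should track `berTheory` closely but not exactly.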

Rerun the simulation for the case in which Parallel Computing Toolbox is available. Create a pool of workers.

```
pool = gcp;
assert(~isempty(pool), ['Cannot create parallel pool. '...
    'Try creating the pool manually using ''parpool'' command.'])
```

Determine the number of available workers from the `NumWorkers` property of `pool`. Each worker simulates the full range of Eb/No values, with the error and bit targets divided evenly among the workers, rather than each worker being assigned a single Eb/No point. The former approach provides the largest performance improvement.

```
numWorkers = pool.NumWorkers;
```

Determine the length of `EbNoVec` for use in the `for`-loop nested inside the `parfor`-loop. For proper variable classification, the range of a `for`-loop nested in a `parfor`-loop must be defined by constant numbers or variables.

```
lenEbNoVec = length(EbNoVec);
```
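The loop-range rule is easier to see in isolation. In this minimal sketch (the names `A`, `nRows`, and `nCols` are illustrative and not part of the example), the inner loop bound is computed once outside the `parfor`-loop, and the output is indexed by both loop variables in the same way the simulation indexes `numErrors(idx,n)`.

```matlab
nRows = 4;
nCols = 3;
A = zeros(nRows, nCols);   % output array, sliced along its second dimension
parfor n = 1:nCols
    for k = 1:nRows        % range fixed by a variable defined before the parfor
        A(k, n) = k + 10*n;
    end
end
```

Writing the inner range as a function call, such as `1:size(A,1)`, would prevent `A` from being classified correctly, which is why the length is stored in a variable first.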

Allocate memory for the arrays that store the data generated by the function `helper_qpsk_sim_with_awgn`. Each column holds the results from one worker.

```
[numErrors,numBits] = deal(zeros(length(EbNoVec),numWorkers));
```

Run the simulation and measure the execution time.

```
tic
parfor n = 1:numWorkers
    for idx = 1:lenEbNoVec
        errorStats = helper_qpsk_sim_with_awgn(EbNoVec,idx, ...
            totalErrors/numWorkers,totalBits/numWorkers);
        numErrors(idx,n) = errorStats(idx,2);
        numBits(idx,n) = errorStats(idx,3);
    end
end
simParallelTime = toc;
```

Calculate the BER. In this case, the results from multiple processors must be combined to generate the aggregate BER.

```
ber2 = sum(numErrors,2) ./ sum(numBits,2);
```
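The aggregate BER pools the raw counts across workers instead of averaging the per-worker BER values. The two can differ whenever workers process different numbers of bits, which is likely if each worker stops once it reaches its own error target (an assumption about the helper, which is not shown). The counts below are made up purely to illustrate the arithmetic.

```matlab
numErrors = [40 20];        % hypothetical error counts from two workers
numBits   = [10000 2000];   % hypothetical bit counts from the same workers

% Pooled estimate, as computed in the example: 60/12000 = 0.005
berPooled = sum(numErrors,2) ./ sum(numBits,2);

% Mean of per-worker ratios weights both workers equally: (0.004 + 0.01)/2 = 0.007
berMeanOfRatios = mean(numErrors ./ numBits, 2);
```

The pooled form weights every transmitted bit equally, so it matches what a single long run would report.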

Compare the BER values to verify that the same results are obtained independent of the number of workers.

```
semilogy(EbNoVec',ber1,'-*',EbNoVec',ber2,'-^')
legend('Single Processor','Multiple Processors','location','best')
xlabel('Eb/No (dB)')
ylabel('BER')
grid
```

The BER curves are essentially the same. Any differences are due to the different random number seeds used in each run.

Compare the execution times for each method.

```
fprintf(['\nSimulation time = %4.1f sec for one worker\n', ...
    'Simulation time = %4.1f sec for multiple workers\n'], ...
    simBaselineTime,simParallelTime)
fprintf('Number of processors for parfor = %d\n', numWorkers)
```

```
Simulation time = 24.6 sec for one worker
Simulation time = 6.1 sec for multiple workers
Number of processors for parfor = 6
```
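The two timings can be condensed into a speedup factor and a parallel efficiency. This short sketch hard-codes the values from the sample output above so that it runs on its own; in the live script, the variables `simBaselineTime`, `simParallelTime`, and `numWorkers` already hold them. The names `speedup` and `efficiency` are introduced here for illustration.

```matlab
simBaselineTime = 24.6;   % seconds, single-worker run from the sample output
simParallelTime = 6.1;    % seconds, parfor run from the sample output
numWorkers = 6;           % pool size from the sample output

speedup = simBaselineTime/simParallelTime;   % how many times faster parfor ran
efficiency = speedup/numWorkers;             % fraction of the ideal linear speedup
fprintf('Speedup = %.2f, parallel efficiency = %.2f\n', speedup, efficiency)
```

An efficiency below 1 is expected, since the parallel run also pays for data transfer and worker scheduling. Timings vary by machine and pool size.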