HDL Code Generation for Viterbi Decoder
This example shows HDL code generation support for the Viterbi Decoder block. It shows how to check, generate, and verify the HDL code generated from a fixed-point Viterbi Decoder model. It also discusses the settings you can use to alter the generated HDL code.
Introduction
The model shows HDL code generation for a fixed-point Viterbi Decoder block used in soft-decision convolutional decoding. To learn more about HDL support for the Viterbi Decoder block, refer to the HDL Code Generation section of the block page in the documentation.
To open the model, run the following commands:
modelname = 'hdlcoder_commviterbi';
open_system(modelname);
In this model, the top-level subsystem Viterbi Decoder Subsystem contains the Viterbi Decoder block. To open this subsystem, run the following commands:
systemname = [modelname '/Viterbi Decoder Subsystem'];
open_system(systemname);
The Viterbi Decoding Algorithm
There are three main components to the Viterbi decoding algorithm. They are the branch metric computation (BMC), add-compare-select (ACS), and traceback decoding. The following diagram illustrates the three units in the Viterbi decoding algorithm:
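The branch metric and add-compare-select stages described above can be sketched for a single time step. This is a minimal illustration, not the generated hardware: it assumes a small 4-state, rate 1/2, constraint length 3, (7,5) code rather than the code used in the model, and it assumes the soft-decision convention in which 0 is the most confident "0" and 2^nsdec-1 is the most confident "1". All variable names are hypothetical. Traceback is not shown; the decisions vector is what a traceback unit would store.

```matlab
% Illustrative 4-state trellis: rate 1/2, constraint length 3, (7,5) code.
nsdec     = 3;                     % number of soft-decision bits
maxSoft   = 2^nsdec - 1;           % 7 = most confident "1"
numStates = 4;
rx        = [0 7];                 % one received pair: strong 0, strong 1
metrics   = zeros(1, numStates);   % state metrics before this time step

newMetrics = inf(1, numStates);    % state metrics after this time step
decisions  = zeros(1, numStates);  % survivor decision per next state
for s = 0:numStates-1
    b1 = bitshift(s, -1);          % most recent input bit held in the state
    b0 = bitand(s, 1);             % oldest input bit held in the state
    for u = 0:1
        % Expected code bits for generators 7 (111) and 5 (101).
        expected = [mod(u + b1 + b0, 2), mod(u + b0, 2)];
        % BMC: distance between the received soft values and expected bits.
        branchMetric = sum(expected .* (maxSoft - rx) + (1 - expected) .* rx);
        nextState = 2*u + b1;
        candidate = metrics(s + 1) + branchMetric;    % add
        if candidate < newMetrics(nextState + 1)      % compare
            newMetrics(nextState + 1) = candidate;    % select
            decisions(nextState + 1)  = b0;           % identifies the predecessor
        end
    end
end
disp(newMetrics);
disp(decisions);
```

Each next state has exactly two predecessors that differ only in the oldest bit b0, so storing b0 is enough to identify the surviving branch.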
The Renormalization Method
The Viterbi Decoder prevents the overflow of the state metrics in the ACS component by subtracting the minimum value of the state metrics at each time step, as shown in the following figure:
Obtaining the minimum value of all the state metric elements in one clock cycle results in a poor clock frequency for the circuit. The performance of the circuit may be improved by adding pipeline registers. However, simply subtracting the minimum value delayed by pipeline registers from the state metrics may still lead to overflow. The hardware architecture modifies the renormalization method and avoids the state metric overflow in three steps. First, the architecture calculates values for the threshold and step parameters, based on the trellis structure and the number of soft decision bits. Second, the delayed minimum value is compared to the threshold. Last, if the minimum value is greater than or equal to the threshold value, the implementation subtracts the step value from the state metric; otherwise no adjustment is performed. The following figure illustrates the modified renormalization method:
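The difference between the two renormalization methods can be sketched for one time step. The threshold, step, and metric values below are hypothetical placeholders chosen for illustration; the hardware implementation derives the actual threshold and step from the trellis structure and the number of soft-decision bits, and that derivation is not reproduced here.

```matlab
% Hypothetical values for illustration only.
threshold    = 32;              % assumed threshold
stepValue    = 32;              % assumed step
stateMetrics = [39 43 49 54];   % state metrics at the current time step
delayedMin   = 33;              % minimum computed a few cycles earlier

% Original method: subtract the current minimum in the same time step.
original = stateMetrics - min(stateMetrics);

% Modified method: compare the delayed minimum to the threshold and
% subtract the fixed step only when the threshold is reached.
modified = stateMetrics;
if delayedMin >= threshold
    modified = modified - stepValue;
end
disp(original);
disp(modified);
```

The modified method removes the minimum search from the critical path because the comparison uses a value that has already passed through the pipeline registers.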
Optimal State Metric Word Length Calculation
The hardware implementation calculates the optimal word length of the state metric and compares it with the value you specify for the block. The hardware architecture uses the optimal value if it is smaller than the one you specify. A message is displayed to show the value during HDL code generation. If the calculated value is larger than the value you specify, an error message is reported and the optimal value is displayed.
Applying the calculated optimal state metric word length in the hardware implementation may significantly reduce hardware resource usage if the value you specify is too large. For example, if you set the state metric word length to 16 bits but only 9 bits are required to achieve the same numerical results, applying the calculated optimal state metric word length in the hardware architecture saves approximately 40 percent of the register resources. The calculated optimal state metric word length for some typical trellises is displayed in the following table:
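As a rough check of the 16-bit versus 9-bit example above, the following sketch counts only the state metric storage bits for a constraint length 7 code, which has 2^(7-1) = 64 states. It is a back-of-the-envelope estimate under that assumption and does not account for other registers in the design, so the result is only expected to be in the same range as the approximately 40 percent figure quoted above.

```matlab
constraintLength = 7;
numStates        = 2^(constraintLength - 1);   % 64 states
specifiedWL      = 16;                         % word length set on the block
optimalWL        = 9;                          % word length actually required

bitsSpecified  = numStates * specifiedWL;      % 1024 bits
bitsOptimal    = numStates * optimalWL;        % 576 bits
savingFraction = 1 - bitsOptimal / bitsSpecified;
fprintf('State metric storage saving: %.2f%%\n', 100 * savingFraction);
```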
Check and Generate Code for a Fixed-point Viterbi Model
This model decodes a DVB rate 1/2, constraint length 7, (171,133) convolutional code with 3-bit soft decisions. The decoder runs in continuous mode with a traceback depth of 32. The state metric word length is set to 16 bits. To validate the parameter settings of the Viterbi Decoder block, run the following commands:
workingdir = tempname;
checkhdl(systemname,'TargetDirectory',workingdir);
Running checkhdl generates messages that report:
the default value of TracebackStagesPerPipeline (for more information on this parameter, see the section Pipelining the register-based traceback unit),
the state metric word length used in the HDL code compared with the one set on the block mask,
the total delay introduced by the pipeline registers with respect to the original Viterbi block.
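The code configuration described at the start of this section can also be exercised in software as a reference point. The following sketch uses the Communications Toolbox functions poly2trellis, convenc, and vitdec with the same trellis, traceback depth, and number of soft-decision bits. It is an assumption-laden illustration: the message length, random seed, and noiseless channel are arbitrary choices, and the sketch does not reproduce the fixed-point behavior of the block in the model.

```matlab
% Requires Communications Toolbox.
trellis = poly2trellis(7, [171 133]);   % constraint length 7, (171,133)
tbdepth = 32;                           % traceback depth
nsdec   = 3;                            % soft-decision bits

rng(0);                                 % arbitrary seed for repeatability
msg  = randi([0 1], 200, 1);            % arbitrary message length
code = convenc(msg, trellis);

% Noiseless soft input: 0 maps to 0 (strong 0), 1 maps to 7 (strong 1).
softIn  = code * (2^nsdec - 1);
decoded = vitdec(softIn, trellis, tbdepth, 'cont', 'soft', nsdec);

% Continuous mode delays the decoded output by tbdepth samples.
numErrors = sum(decoded(tbdepth+1:end) ~= msg(1:end-tbdepth));
fprintf('Bit errors after removing the decoding delay: %d\n', numErrors);
```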
To generate HDL for the subsystem containing the Viterbi Decoder block, run the following commands:
workingdir = tempname;
makehdl(systemname,'TargetDirectory',workingdir);
The top-level VHDL file name matches the name of the block in the model. The Viterbi_Decoder component generated in Viterbi_Decoder.vhd contains three components: BranchMetric, ACS, and Traceback. The ACS and Traceback components instantiate the ACSUnit and TracebackUnit components multiple times, respectively. Data type definitions are included in the package file Viterbi_Decoder_Subsystem_pkg.vhd.
To generate a testbench for the subsystem containing the Viterbi Decoder block, run the following command:
makehdltb(systemname,'TargetDirectory',workingdir);
Optimization of the Traceback Unit
There are two methods to optimize the traceback unit: pipelining the register-based traceback or using the RAM-based traceback architecture.
Pipelining the register-based traceback unit
The Viterbi Decoder block decodes every bit by tracing back through a traceback depth you define for the block. Because the block implements a complete traceback for each decision bit, registers are used to store the minimum state index and branch decision in the Traceback Decoding unit. This unit can be pipelined to improve the performance of the generated circuit. To add pipeline registers to the traceback unit, specify the number of traceback stages per pipeline register by setting the TracebackStagesPerPipeline implementation parameter for the Viterbi Decoder block in the HDL block properties dialog. Right-click the Viterbi Decoder block to navigate to the HDL Block Properties menu.
Setting the property value to 4 results in the insertion of a pipeline register for every four traceback units in the model, as illustrated in the following figure:
The TracebackStagesPerPipeline implementation parameter gives you a way to balance circuit performance against register usage based on system requirements. A smaller parameter value adds more pipeline registers and increases the speed of the traceback circuit. A larger value uses fewer registers but decreases the circuit speed. In our experiment with the rate 1/2, constraint length 7, (171,133) convolutional code, changing the TracebackStagesPerPipeline parameter from 4 to 8 cuts the pipeline register usage in half, with the circuit speed changing from 173 MHz to 94 MHz.
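The parameter can also be set from the command line with hdlset_param instead of the dialog. The sketch below assumes that the block inside the subsystem is named Viterbi Decoder; adjust the path if the block name in your copy of the model differs.

```matlab
modelname = 'hdlcoder_commviterbi';
open_system(modelname);

% Assumed block path: the block name inside the subsystem is a guess.
blockpath = [modelname '/Viterbi Decoder Subsystem/Viterbi Decoder'];

% Insert a pipeline register for every four traceback stages.
hdlset_param(blockpath, 'TracebackStagesPerPipeline', 4);

% Read the value back to confirm the setting.
stages = hdlget_param(blockpath, 'TracebackStagesPerPipeline');
disp(stages);
```

Rerun checkhdl or makehdl after changing the value, because the total delay reported for the pipeline registers depends on this setting.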
RAM-based traceback
Instead of using registers, you can choose to use RAMs to save the survivor branch information. This can be done by setting the HDL Architecture property of the Viterbi Decoder block to RAM-based Traceback.
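A command-line equivalent might look like the following sketch. It assumes that the block inside the subsystem is named Viterbi Decoder and that the architecture name accepted by hdlset_param is spelled exactly as it appears in the HDL Block Properties dialog; check the dialog for the exact string before relying on it.

```matlab
modelname = 'hdlcoder_commviterbi';
open_system(modelname);

% Assumed block path: the block name inside the subsystem is a guess.
blockpath = [modelname '/Viterbi Decoder Subsystem/Viterbi Decoder'];

% Assumed architecture string, taken from the property value named above.
hdlset_param(blockpath, 'Architecture', 'RAM-based Traceback');

% Read the value back to confirm the setting.
arch = hdlget_param(blockpath, 'Architecture');
disp(arch);
```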
There are two major differences between the register-based and the RAM-based traceback architectures.
First, the register-based implementation combines the traceback and decode operations into one step and uses the best state found from the minimum operation as the initial state for decoding. The RAM-based implementation traces back through one set of data to find the initial state for decoding the previous set of data.
Second, the register-based implementation decodes one bit after a complete traceback, whereas the RAM-based implementation traces back through M samples, decodes the previous M bits in reverse order, and releases one bit in order at each clock cycle.
Because the two traceback algorithms differ, the RAM-based implementation produces different numerical results than the register-based implementation. A longer traceback depth, for example 10 times the constraint length, is recommended for the RAM-based traceback to achieve a bit error rate (BER) similar to that of the register-based implementation.
The size of RAM required for the implementation depends on the trellis and the traceback depth. The following table summarizes the RAM usage for some typical trellis structures.
Our experiment with the rate 1/2, constraint length 7, (171,133) convolutional code shows that the RAM-based traceback unit uses 90% fewer registers than the register-based traceback unit (with pipelining every 4 stages) under similar clock constraints in synthesis. The two implementations provide a register-RAM tradeoff that can be tailored to the individual design.
Selected References
G. C. Clark Jr. and J. B. Cain, Error-Correction Coding for Digital Communications, New York: Plenum Press, 1981.
G. Feygin and P. G. Gulak, "Architectural tradeoffs for survivor sequence memory management in Viterbi decoders," IEEE Transactions on Communications, vol. 41, no. 3, pp. 425-429, March 1993.