Frame-Based Optical Flow Deployment with LK method
This example shows how to use the frame-to-sample optimization from HDL Coder™ in Simulink® to generate and deploy a sample-based IP core with AXI4-Stream interfaces to hardware and verify it using live streaming data from MATLAB.
This example is an extension of the Generate HDL Code from Frame-Based Models by Using Neighborhood Modeling Methods (HDL Coder) example and explains the steps involved in generating a bitstream and deploying the design. In this example, you:
Simulate and validate a frame-based design for optical flow that uses the Lucas-Kanade method.
Generate an HDL IP core for the design with an AXI4-Stream interface.
Integrate the generated IP core into the reference design, "Default System with External DDR3 Memory Access".
Use a simple script to run the design on hardware with live data.
Prerequisites
This example uses HDL Coder Support Package for AMD FPGA and SoC Devices for bitstream generation. To run the example on hardware, you must run the guided hardware setup included in the support package installation.
On the MATLAB Home tab Toolstrip, in the Environment section, click Add-Ons > Manage Add-Ons.
Locate HDL Coder Support Package for AMD FPGA and SoC Devices, and click Setup.
The setup tool configures the target board and host machine, confirms that the target starts correctly, and verifies host-target communication. For more information, see Set Up AMD FPGA and SoC Devices (SoC Blockset).
Vivado®. To view the supported versions, see HDL Language Support and Supported Third-Party Tools and Hardware (HDL Coder).
SoC board. This example uses the AMD Zynq ZC706 Evaluation Kit.
Lucas-Kanade Method
To solve the optical flow constraint equation for u and v, the Lucas-Kanade method divides the original image into smaller sections and assumes a constant velocity in each section. Then it performs a weighted least-squares fit of the optical flow constraint equation to a constant model for all the sections in the image. For more information, see the opticalFlowLK (Computer Vision Toolbox) object.
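The fit described above can be written out as a short worked derivation. The symbols here are assumptions that follow common notation and do not appear in this example: I_x, I_y, and I_t are the spatial and temporal image derivatives, W is a window weighting function, and Ω is one image section.

```latex
% Optical flow constraint at a single pixel
I_x u + I_y v + I_t = 0

% Weighted least-squares fit over one section \Omega, assuming constant (u, v)
\min_{u,v} \sum_{\mathbf{x} \in \Omega} W^2 \left( I_x u + I_y v + I_t \right)^2

% Setting the derivatives with respect to u and v to zero gives a 2-by-2 system
\begin{bmatrix}
\sum W^2 I_x^2   & \sum W^2 I_x I_y \\
\sum W^2 I_x I_y & \sum W^2 I_y^2
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
= -
\begin{bmatrix}
\sum W^2 I_x I_t \\
\sum W^2 I_y I_t
\end{bmatrix}
```

Solving this small system in each section yields the velocity pair (u, v) for that section.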
Simulation
open_system("hdlFrameOpticalFlowExtMemory");
The DUT splits the input video into a previous frame and a current frame by using a unit delay block. The input signal is a frame of 360-by-640 pixels. The hardware reference design supports a maximum bit width of 16, so the model casts the pixel values to the int16 data type. Simulate the model to see the output frame with the optical flow values overlaid on the input frame.
sim("hdlFrameOpticalFlowExtMemory");
The helperVerifyOpticalFlowDUT script validates the processed frames from simulation against a MATLAB reference. It calculates the structural similarity index (SSIM) between the simulation output frames and the MATLAB reference and verifies that the result exceeds the minimum threshold. The reference is calculated using the estimateFlow function of the opticalFlowLK (Computer Vision Toolbox) object. The script displays the input frame, reference output, and DUT output.
run("helperVerifyOpticalFlowDUT.m");
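The helper script itself is not listed here, so the following is only a minimal sketch of the kind of check it describes, not its actual contents. The synthetic frames, the noise used as a stand-in for the DUT output, and the 0.9 threshold are all hypothetical. The sketch assumes Computer Vision Toolbox (opticalFlowLK, estimateFlow) and Image Processing Toolbox (ssim) are installed.

```matlab
% Hypothetical stand-ins for two consecutive video frames.
rng(0);
prevFrame = rand(360, 640, "single");
currFrame = circshift(prevFrame, [0 1]);   % shift one pixel to the right

% MATLAB reference: Lucas-Kanade flow between the two frames.
lk = opticalFlowLK;
estimateFlow(lk, prevFrame);               % first call stores the previous frame
flowRef = estimateFlow(lk, currFrame);     % second call returns the flow
refOut = flowRef.Vx;                       % horizontal velocity component

% Hypothetical DUT output: the reference plus a small perturbation.
dutOut = refOut + 1e-3*randn(size(refOut), "single");

% Compare the two outputs and check against a hypothetical threshold.
score = ssim(dutOut, refOut);
minSSIM = 0.9;                             % assumed value, not from the example
assert(score > minSSIM, "SSIM %.3f is below the threshold %.2f", score, minSSIM);
```

In the real workflow, dutOut would come from the logged simulation output rather than from a perturbed copy of the reference.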
Generate and Integrate HDL IP Core using HDL Workflow Advisor
Computing optical flow requires a disparity image, which is obtained by subtracting the current frame from the previous frame. This process requires storing the previous frame locally. Because the frame is large, HDL Coder generates sample-based HDL code from the frame-based OpticalFlow DUT and moves the delay (the previous frame) to external memory based on the DelaySizeThreshold parameter. Set the parameter to a value lower than the image size in kilobytes. In this case, set DelaySizeThreshold to 100 kilobytes. For more information, see Offload Large Delays from Frame-Based Models to External Memory (HDL Coder).
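As a quick sanity check on the threshold, the arithmetic can be sketched as follows. It assumes the delayed frame is stored as one int16 value (2 bytes) per pixel, which matches the cast described in the Simulation section but is not stated for the delay itself. The commented-out hdlset_param call is likewise an assumption about how the parameter might be set from the command line.

```matlab
% Frame dimensions from the example; bytes per pixel is an assumption (int16).
rows = 360;
cols = 640;
bytesPerPixel = 2;

frameKB = rows*cols*bytesPerPixel/1024;   % size of one stored frame in kilobytes
thresholdKB = 100;                        % DelaySizeThreshold used in this example

% The delay is a candidate for external memory when it exceeds the threshold.
offload = frameKB > thresholdKB;
fprintf("Frame delay: %g KB, threshold: %g KB, offload: %d\n", ...
    frameKB, thresholdKB, offload);

% Assumed command-line equivalent (requires HDL Coder and the model loaded):
% hdlset_param("hdlFrameOpticalFlowExtMemory", "DelaySizeThreshold", thresholdKB);
```

Under these assumptions the frame occupies 450 KB, well above the 100 KB threshold.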
Start the targeting workflow by right-clicking the Optical Flow DUT subsystem and selecting HDL Code > HDL Workflow Advisor.
In step 1.1, select the IP Core Generation workflow and the platform Xilinx Zynq ZC706 Evaluation Kit.
In step 1.2, set the reference design to Default System with External DDR3 Memory Access.
In step 1.3, map the target platform AXI4-Stream interfaces to the input and output ports of the DUT.
In step 1.4, set the target frequency for the design to 150 MHz.
Step 2 prepares the design for HDL code generation.
Step 3 generates HDL code for the IP core.
Step 4.1 integrates the newly generated IP core into the reference design.
Step 4.2 creates the host interface script and an optional Zynq software interface model. Since this example uses the interface script, and not the model, uncheck Generate Simulink software interface model.
Step 4.3 generates the bitstream. The bitstream file is named system_wrapper.bit and is located at hdl_prj\vivado_ip_prj\vivado_prj.runs\impl_1.
Run Step 4.4 to build and download the FPGA bitstream.
Once the bitstream is generated, you can open the HDL Code tab and select Build Bitstream > Program Target Device to deploy the bitstream for successive runs. Alternatively, you can generate the bitstream by clicking Build Bitstream directly in the HDL Code tab. This step generates the bitstream in an external shell.
You can interact with the FPGA design by reading and writing data from MATLAB on the host computer as described in the Interact with FPGA Design from Host Computer section of the Prototype Generated IP Core on Hardware using FPGA I/O (HDL Coder) example. The host computer sends and receives frames of data from the board as shown in the high-level architecture of the system.
Run the Optical Flow System on FPGA
Set up the interfaces for the vision processing system by using the gs_hdlFrameOpticalFlowExtMemory_setup function generated in step 4.2. The function creates AXI4-Stream interfaces for interfacing with the DUT and is currently configured for the resolution of the visiontraffic_cropped.avi video. The DeployFrameBasedOpticalFlow script uses the AXI4-Stream interface to communicate with the DUT and process the visiontraffic_cropped.avi video.
%% Create fpga object
hProcessor = xilinxsoc();
hFPGA = fpga(hProcessor);

%% Setup fpga object
% This function configures the "fpga" object with the same interfaces as the generated IP core
gs_hdlFrameOpticalFlowExtMemory_setup(hFPGA);

%% Create grid with 5 pixel step for overlaying the optical flow on the input frame.
v = VideoReader('visiontraffic_cropped.avi');
inputFrame = v.readFrame;
borderOffset = 5;
decimFactorRow = 5;
decimFactorCol = 5;
[R, C, ~] = size(inputFrame);
RV = borderOffset:decimFactorRow:(R-borderOffset);
CV = borderOffset:decimFactorCol:(C-borderOffset);
[Y, X] = meshgrid(CV,RV);
scaleFactor = 1/255;

%% Process frames from visiontraffic_cropped.avi
% Create figure for display
figure("Name", "Optical Flow Output from hw");

% Reset the DUT before writing the first frame
writePort(hFPGA, 'CurrFrame', zeros(size(v.read(1),1:2)));
for ii=1:20
    frameIn = v.read(ii);
    frameInGray = single(rgb2gray(frameIn));

    % Normalize the frame
    frameInGray = frameInGray*scaleFactor;

    % Send input frame to Optical Flow IP using hFPGA object.
    writePort(hFPGA, 'CurrFrame', frameInGray);

    % Read output from hardware using hFPGA object.
    vel_Values = readPort(hFPGA, 'VxPixFlow');

    % Create optical flow lines from the real and imag parts of the data.
    tmp = vel_Values(RV,CV);
    vel_Lines = [Y(:), X(:), Y(:)+double(real(tmp(:)))*scaleFactor, X(:)+double(imag(tmp(:)))*scaleFactor];
    frameOut = insertShape(frameInGray,"line",vel_Lines,"ShapeColor","yellow");
    imshow(frameOut);
end
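The overlay step in the script can be tried without hardware. The sketch below rebuilds the decimated grid on a small, hypothetical 20-by-30 frame and uses a synthetic complex flow field in place of the data read from the board. It assumes the real part is the horizontal velocity and the imaginary part is the vertical velocity, and it omits the scaling that the script applies. The names colGrid, rowGrid, and flowLines are illustrative.

```matlab
% Hypothetical small frame, for illustration only.
R = 20;
C = 30;
borderOffset = 5;
decimFactor = 5;

RV = borderOffset:decimFactor:(R-borderOffset);   % sampled rows:    5 10 15
CV = borderOffset:decimFactor:(C-borderOffset);   % sampled columns: 5 10 15 20 25
[colGrid, rowGrid] = meshgrid(CV, RV);            % x and y coordinates of the grid

% Synthetic flow: 2 pixels right (real part) and 1 pixel down (imaginary part).
flow = complex(2*ones(R, C), ones(R, C));
tmp = flow(RV, CV);

% One row per line segment in [x1 y1 x2 y2] order.
flowLines = [colGrid(:), rowGrid(:), ...
             colGrid(:) + real(tmp(:)), rowGrid(:) + imag(tmp(:))];

disp(size(flowLines));    % 15 segments, 4 coordinates each
disp(flowLines(1, :));    % first segment: from (5,5) to (7,6)
```

Because meshgrid returns the column coordinates first, the first two columns of the line matrix are x then y, which is the order that the script passes to insertShape.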
When you finish the example, run the last line of the script to release any hardware resources used by the fpga object:
release(hFPGA);