VectorBlox™ Accelerator Software Development Kit and Neural Network IP

The VectorBlox Accelerator Software Development Kit (SDK) offers the most power-efficient Convolutional Neural Network (CNN)-based Artificial Intelligence/Machine Learning (AI/ML) inference on PolarFire FPGAs. The VectorBlox software provides:

  • OpenVINO™ toolkit-based front-end tools
  • Support for the most common frameworks, including TensorFlow, Caffe, MXNet, PyTorch and DarkNet
  • Quick evaluation without prior FPGA knowledge
  • Software-overlay-based implementation, so there is no need to reprogram the FPGA when updating CNNs

VectorBlox Accelerator SDK Tools

The VectorBlox Accelerator SDK contains a set of tools that compile a neural network description from frameworks such as TensorFlow and ONNX into a Binary Large Object (BLOB). These BLOBs are stored in Flash and loaded into DDR memory during execution. The flow has four stages (a short sketch of the quantization and calibration math follows the list):

  • Model Optimization: Converts a trained network to a common Intermediate Representation (IR) and prepares it for inference by removing layers used only during training and fusing layers (e.g., batch normalization operators)
  • Quantization: Converts the optimized network from FP32 to INT8, allowing it to be represented in less memory with minimal loss in accuracy
  • Calibration: Adjusts the activations and weights of the INT8 model to preserve accuracy
  • Runtime Generation: Creates a BLOB that is written into embedded nonvolatile storage (e.g., SPI Flash)
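
To make the Quantization and Calibration stages concrete, here is a minimal, runnable Python sketch of the underlying math on a single tensor. It is a conceptual illustration only, using a symmetric per-tensor scale as one common approach; the SDK's tools apply these steps to entire networks, and nothing below is the SDK's actual API.

```python
# Conceptual sketch of INT8 quantization and calibration on one tensor.
# This is not the VectorBlox SDK API; the SDK applies these steps to
# whole networks.
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Map FP32 values to INT8 using a symmetric per-tensor scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

# "Calibration" picks activation scales from representative inputs so
# that the INT8 range covers the values actually seen at inference time.
activations = np.random.randn(1000).astype(np.float32)
act_scale = np.percentile(np.abs(activations), 99.9) / 127.0

w = np.random.randn(64).astype(np.float32)
qw, w_scale = quantize_int8(w)
# Dequantizing shows the small, bounded rounding error INT8 introduces.
max_err = np.abs(qw.astype(np.float32) * w_scale - w).max()
print(f"max quantization error: {max_err:.4f}")
```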

Getting Started


*Refer to section 7.3 of the Libero software quick-start guide to learn how to merge these licenses.

The CoreVectorBlox IP


The CoreVectorBlox IP consists of a Matrix Processor (MXP) and an MXP CNN IP. It can be instantiated as a single-core accelerator or in a multi-core configuration when neural network workloads need to be shared. The MXP consists of eight 32-bit Arithmetic Logic Units (ALUs) and is responsible for elementwise tensor operations such as add, sub, xor, shift, mul and dotprod. The MXP CNN IP consists of a 2D array of multiply-accumulate functions implemented using math blocks; as the name suggests, it executes the convolutional layers of CNNs. Multiple networks can be overlaid at runtime and switched dynamically, and time-slicing can be used to run simultaneous networks on a single CoreVectorBlox instantiation.

CoreVectorBlox IP
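
The snippet below is a rough functional model of the elementwise operations listed above, with NumPy standing in for the MXP's eight 32-bit ALU lanes. It illustrates the operations themselves, not how the IP schedules or pipelines them.

```python
# Functional model of the MXP's elementwise tensor operations;
# NumPy stands in for the eight 32-bit ALU lanes.
import numpy as np

a = np.arange(8, dtype=np.int32)   # one "wavefront" of eight lanes
b = np.full(8, 3, dtype=np.int32)

add  = a + b                # elementwise add
sub  = a - b                # elementwise subtract
xor_ = a ^ b                # elementwise xor
shl  = a << 1               # elementwise shift
mul  = a * b                # elementwise multiply
dot  = int(np.dot(a, b))    # dot product, accumulated across lanes

print(add, sub, xor_, shl, mul, dot)
```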

Development Flow


Step 1: Prepare Your Trained Model

Use the Python scripts provided in the SDK to convert your trained model into an optimized INT8 representation called a BLOB. Run the BLOB through the VectorBlox Accelerator Simulator to verify the network's accuracy and confirm that the conversion succeeded.

Trained Model Preparation Development Flow
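
In outline, Step 1 reduces to two calls: convert, then simulate. The Python below is runnable but entirely illustrative; convert_model and simulate are stubs standing in for the SDK's conversion scripts and its Accelerator Simulator, whose real entry points are not named on this page, and the model file is a hypothetical example.

```python
# Illustrative outline of Step 1. Both functions are stubs standing in
# for the SDK's conversion scripts and Accelerator Simulator; the real
# entry points are not documented on this page.
import numpy as np

def convert_model(path: str) -> bytes:
    """Stub: trained model -> optimized INT8 BLOB."""
    return b"BLOB"

def simulate(blob: bytes, image: np.ndarray) -> np.ndarray:
    """Stub: run the BLOB through the simulator on one input."""
    return np.zeros(1000, dtype=np.float32)

blob = convert_model("resnet50.onnx")                 # hypothetical model file
image = np.zeros((1, 3, 224, 224), dtype=np.float32)  # one test input
scores = simulate(blob, image)
print("top class:", int(scores.argmax()))             # sanity-check accuracy here
```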

Step 2: Prepare Your Hardware

The PolarFire FPGA Video and Imaging Kit is configured to run as an AI-enabled smart camera. The SDK includes a pre-compiled kit image bitstream. Write the bitstream into the PolarFire FPGA using the FlashPro programmer included with the kit, then write the BLOB generated in Step 1 into the kit's SPI Flash.

Hardware Preparation Development Flow

Step 3: Write Your Embedded Code

Use the provided C/C++ embedded code in the SoftConsole IDE to generate and program the hex file. Connect the video kit to an HDMI® monitor and turn it on. Modify the embedded code to load and run multiple CNN BLOBs, switch CNNs on the fly, or load CNNs sequentially for simultaneous inferencing (the control flow is sketched below).

Write Embedded Code Development Flow
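
The shipped embedded code is C/C++; the sketch below mirrors the shape of its control loop in Python for readability. Every function here is a stub invented for this illustration, and none of the names come from the SDK. The structure is the point: several BLOBs stay resident in DDR while inference switches between them frame by frame.

```python
# Python mirror of the embedded control loop (the shipped code is
# C/C++). All functions are stubs standing in for hypothetical
# SDK/board calls; only the time-slicing structure is the point.
def load_blob(path: str) -> str: return path          # stub: BLOB resident in DDR
def capture_frame() -> str: return "frame"            # stub: MIPI camera read
def run_inference(model: str, frame: str): return []  # stub: CoreVectorBlox run
def draw_overlay(frame: str, results) -> None: pass   # stub: HDMI overlay write

blobs = [load_blob("classify.blob"), load_blob("detect.blob")]

for frame_index in range(8):                 # bounded loop for the sketch
    frame = capture_frame()
    model = blobs[frame_index % len(blobs)]  # switch CNNs on the fly
    results = run_inference(model, frame)
    draw_overlay(frame, results)
```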

Deployment Options


The VectorBlox SDK is supported on the PolarFire FPGA Video and Imaging Kit (MPF300-VIDEO-KIT-NS).

VectorBlox Smart Camera Reference Design Flow

  1. A video frame is received via MIPI CSI-2
  2. The frame is stored in DDR4 memory via the AXI4 interconnect
  3. Before inference starts, the video frame is read back from DDR4
  4. The frame is converted from RAW to RGB and from RGB to planar R, G, B arrays, then written back into DDR4 (steps 4 and 7 are sketched in code after this list)
  5. The CoreVectorBlox engine runs inference on the R, G, B arrays and writes the results back into memory
  6. The Mi-V soft RISC-V processor sorts the probabilities and writes an overlay frame with bounding boxes and results into DDR4
  7. The original video frame is read back, alpha-blended with the overlay frame and sent out to an HDMI display
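
Because steps 4 and 7 are ordinary array manipulations, a short runnable NumPy sketch can make them concrete. It assumes an already-demosaiced RGB frame (the RAW-to-RGB conversion is hardware-specific) and an arbitrary blend factor of 0.4; both are assumptions for illustration, not details from this page.

```python
# Runnable sketch of steps 4 and 7: split an interleaved RGB frame into
# planar R, G, B arrays for the CoreVectorBlox engine, then alpha-blend
# an overlay frame for HDMI output. The 0.4 alpha is an arbitrary choice.
import numpy as np

h, w = 1080, 1920
frame = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)  # RGB after demosaic

# Step 4: interleaved RGB -> planar R, G, B (each plane written to DDR4)
r_plane, g_plane, b_plane = (frame[:, :, c].copy() for c in range(3))

# Step 7: alpha-blend the overlay (bounding boxes, labels) onto the frame
overlay = np.zeros_like(frame)
alpha = 0.4
blended = (alpha * overlay + (1 - alpha) * frame).astype(np.uint8)
print(blended.shape, r_plane.shape)
```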