Inquire: Call 0086-755-23203480, or reach out via the form below/your sales contact to discuss our design, manufacturing, and assembly capabilities.
Quote: Email your PCB files to Sales@pcbsync.com (Preferred for large files) or submit online. We will contact you promptly. Please ensure your email is correct.
Notes: For PCB fabrication, we require PCB design file in Gerber RS-274X format (most preferred), *.PCB/DDB (Protel, inform your program version) format or *.BRD (Eagle) format. For PCB assembly, we require PCB design file in above mentioned format, drilling file and BOM. Click to download BOM template To avoid file missing, please include all files into one folder and compress it into .zip or .rar format.
Python FPGA Development with Xilinx: PYNQ & Beyond
When I first heard about programming FPGAs with Python, I was skeptical. After years of wrestling with Verilog timing constraints and cryptic synthesis errors, the idea of controlling hardware accelerators from a Jupyter notebook seemed almost too good to be true. But after spending considerable time with the PYNQ ecosystem and various Xilinx Python tools, I can say the landscape has genuinely changed for embedded engineers who want to leverage programmable logic without becoming HDL experts.
This guide covers everything from getting started with PYNQ to building custom hardware overlays, machine learning acceleration, and alternative Python-based HDL frameworks that can target Xilinx devices.
The Xilinx Python ecosystem centers around PYNQ (Python + Zynq), but extends far beyond a single framework. Before diving into specifics, let’s understand how these pieces fit together.
What Makes Zynq Python Development Different
Traditional FPGA development requires intimate knowledge of hardware description languages like Verilog or VHDL. The Zynq architecture changed this equation by combining an ARM processing system (PS) with programmable logic (PL) on a single chip. This architecture enables a natural division of labor: Python runs on the ARM cores while hardware accelerators execute in the FPGA fabric.
The key insight behind Zynq Python development is that most applications don’t need custom RTL. Pre-built hardware overlays can handle common acceleration tasks, leaving software developers free to focus on their algorithms rather than clock domain crossings.
| Component | Purpose | Language |
| --- | --- | --- |
| Processing System (PS) | Runs Linux, Python, application logic | Python/C/C++ |
| Programmable Logic (PL) | Hardware accelerators, custom IP | VHDL/Verilog/HLS |
| AXI Interconnect | PS-PL communication | Hardware protocol |
| Overlays | Pre-built hardware configurations | Bitstream + drivers |
Getting Started with PYNQ on Zynq Boards
PYNQ makes Xilinx Python development accessible by providing a complete software stack, pre-built overlays, and example notebooks. The framework supports multiple boards across the Zynq, Zynq UltraScale+, Kria, and Alveo families.
Supported PYNQ Development Boards
Choosing the right board depends on your application requirements and budget. Here’s a comparison of popular options:
| Board | SoC/FPGA | RAM | Key Features | Price Range |
| --- | --- | --- | --- | --- |
| PYNQ-Z2 | Zynq XC7Z020 | 512MB DDR3 | HDMI in/out, Audio, Arduino/Pmod headers | ~$100 |
| Ultra96-V2 | Zynq UltraScale+ ZU3EG | 2GB DDR4 | WiFi/BT, 96Boards expansion | ~$220 |
| ZCU104 | Zynq UltraScale+ XCZU7EV | 2GB DDR4 | Video codec, DisplayPort | ~$1,100 |
| Kria KV260 | Zynq UltraScale+ K26 SOM | 4GB DDR4 | Vision AI, Smart camera | ~$250 |
For beginners, the PYNQ-Z2 offers the best value. The Zynq XC7Z020 provides enough resources for learning while the board includes practical peripherals like HDMI input/output for video processing projects.
Setting Up Your PYNQ Environment
Getting PYNQ running takes about 15 minutes. Here’s the basic workflow:
1. Download the appropriate SD card image from the PYNQ website.
2. Flash it to a microSD card using tools like Etcher or dd.
3. Insert the card into your board and connect Ethernet.
4. Power on and wait for the boot sequence to complete (about 60 seconds).
5. Access Jupyter notebooks at http://pynq:9090 or http://192.168.2.99:9090.
The default credentials are username “xilinx” and password “xilinx”. From there, you’re immediately ready to run Python code that interacts with hardware.
Overlays are the secret sauce that makes Zynq Python development productive. Think of an overlay as a hardware library – you load it when needed, use its functions through a Python API, and swap it out for another overlay when your requirements change.
Understanding the Overlay Architecture
An overlay consists of three components: the bitstream (.bit file), the hardware handoff file (.hwh), and Python driver code. When you instantiate an overlay in Python, PYNQ automatically downloads the bitstream to the FPGA and creates Python objects for each IP block.
```python
from pynq import Overlay

# Load the base overlay
overlay = Overlay("base.bit")

# Access hardware components through Python
overlay.leds[0].on()
overlay.buttons[0].read()
```
The base overlay included with each PYNQ board provides drivers for on-board peripherals: LEDs, buttons, switches, HDMI, audio, and GPIO interfaces. This lets you start experimenting immediately without building custom hardware.
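To make the "Python objects for each IP block" idea more concrete, here is a minimal sketch of how hardware metadata can be turned into a name-to-address map. The XML below is a heavily simplified, hypothetical stand-in for a hardware handoff file (real .hwh files are far larger and use a richer schema), and `build_ip_map` is an illustrative helper, not part of PYNQ. On a real board, the comparable information is available through the overlay's `ip_dict` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for a hardware handoff file.
# The element and attribute names here are assumptions for illustration only.
HWH_SNIPPET = """
<DESIGN>
  <MODULE INSTANCE="axi_gpio_leds" MODTYPE="axi_gpio" BASEADDR="0x41200000" RANGE="0x10000"/>
  <MODULE INSTANCE="axi_dma_0" MODTYPE="axi_dma" BASEADDR="0x40400000" RANGE="0x10000"/>
</DESIGN>
"""


def build_ip_map(hwh_text):
    """Return {instance_name: {"type", "base", "range"}} from the toy metadata."""
    root = ET.fromstring(hwh_text.strip())
    ip_map = {}
    for module in root.iter("MODULE"):
        ip_map[module.get("INSTANCE")] = {
            "type": module.get("MODTYPE"),
            "base": int(module.get("BASEADDR"), 16),
            "range": int(module.get("RANGE"), 16),
        }
    return ip_map


ip_map = build_ip_map(HWH_SNIPPET)
for name, info in ip_map.items():
    print(f"{name}: {info['type']} at 0x{info['base']:08X}")
```

A framework that knows each block's type and address window can then attach a matching driver class to each name, which is roughly what makes attribute-style access to IP blocks possible.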
Available Pre-Built Overlays
Beyond the base overlay, Xilinx and the community provide specialized overlays for common use cases:
| Overlay | Purpose | Key IP Blocks |
| --- | --- | --- |
| Base | Board peripherals | GPIO, Video, Audio |
| Logictools | Digital pattern generation | Pattern Generator, FSM, Boolean |
| PYNQ-ComputerVision | Image processing | OpenCV-compatible filters |
| BNN-PYNQ | Binary neural networks | Quantized inference engine |
| DPU-PYNQ | Deep learning | Vitis AI DPU |
The logictools overlay deserves special mention for hardware debugging. It turns your PYNQ board into a configurable logic analyzer and pattern generator, perfect for testing external circuits or learning digital logic concepts.
Creating Custom Hardware Overlays
Pre-built overlays cover many scenarios, but eventually you’ll want to accelerate your own algorithms. Creating custom overlays requires Vivado and some understanding of the Zynq architecture, though not necessarily deep HDL expertise.
The Custom Overlay Development Flow
Building a custom overlay follows this general workflow:
1. Create IP blocks using Vivado HLS (C/C++ to RTL) or traditional HDL.
2. Assemble the system in Vivado IP Integrator.
3. Connect IP to the Zynq PS through AXI interfaces.
4. Generate the bitstream and hardware handoff files.
5. Write Python driver code to interface with your IP.
The most accessible path for software developers is Vivado HLS (succeeded by Vitis HLS in recent tool releases). You write C or C++ code with specific pragmas, and the tool synthesizes it into RTL. The result is rarely as efficient as hand-tuned RTL, but it’s often fast enough and dramatically reduces development time.
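The last step of the flow, writing Python driver code, usually comes down to reading and writing memory-mapped registers. The sketch below shows that pattern for a hypothetical HLS-generated adder. `FakeMMIO` is an in-memory stand-in that mimics the `read(offset)` and `write(offset, value)` calls of `pynq.MMIO` so the example runs off-board. The control bits follow the usual HLS block-level handshake (bit 0 start, bit 1 done, bit 2 idle), while the argument and result offsets are assumptions; the real offsets come from the header file the HLS tool generates for your IP.

```python
# Register map of a hypothetical HLS adder IP (offsets are assumptions).
CTRL, ARG_A, ARG_B, RESULT = 0x00, 0x10, 0x18, 0x20
AP_START, AP_DONE, AP_IDLE = 0x1, 0x2, 0x4


class FakeMMIO:
    """In-memory stand-in exposing the same read/write calls as pynq.MMIO."""

    def __init__(self):
        self.regs = {}

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        # Emulate the hardware: on ap_start, compute the sum and flag done.
        if offset == CTRL and value & AP_START:
            total = self.regs.get(ARG_A, 0) + self.regs.get(ARG_B, 0)
            self.regs[RESULT] = total & 0xFFFFFFFF
            self.regs[CTRL] = AP_DONE | AP_IDLE

    def read(self, offset):
        return self.regs.get(offset, 0)


class AdderDriver:
    """Thin Python wrapper around the adder's AXI-Lite register interface."""

    def __init__(self, mmio):
        self.mmio = mmio

    def add(self, a, b):
        self.mmio.write(ARG_A, a)
        self.mmio.write(ARG_B, b)
        self.mmio.write(CTRL, AP_START)
        while not (self.mmio.read(CTRL) & AP_DONE):
            pass  # poll until the IP reports completion
        return self.mmio.read(RESULT)


driver = AdderDriver(FakeMMIO())
print(driver.add(2, 40))
```

On a board, the same `AdderDriver` logic would be pointed at a real MMIO region or at the IP object PYNQ creates for the block, both of which offer the same `read` and `write` calls.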
AXI Interface Considerations
The communication between PS and PL happens through AXI interfaces. Understanding which interface to use is crucial for performance:
AXI GP (General Purpose): Low bandwidth, PS acts as master. Use for configuration registers and control signals.
AXI HP (High Performance): High bandwidth, PL acts as master. Use for DMA transfers and video streams. The Zynq-7000 provides four HP ports.
AXI ACP (Accelerator Coherency Port): Cache-coherent access to DDR. Use when sharing data structures between PS and PL without explicit cache management.
For most acceleration scenarios, you’ll use AXI GP for control and AXI HP for bulk data transfer via DMA.
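A quick back-of-the-envelope check helps when deciding whether a single high-performance port is enough for a data stream. The sketch below compares the payload of an uncompressed 1080p60 RGB stream with the theoretical peak of one 64-bit port. The 100 MHz PL clock is an assumption chosen for illustration, and real sustained throughput will be lower than the peak because of DDR arbitration and burst overhead.

```python
def stream_bandwidth_mb_s(width, height, bytes_per_pixel, fps):
    """Payload bandwidth of an uncompressed video stream in MB/s (1 MB = 1e6 bytes)."""
    return width * height * bytes_per_pixel * fps / 1e6


def port_peak_mb_s(bus_width_bits, clock_mhz):
    """Theoretical peak of one AXI port: one full-width data beat per clock."""
    return bus_width_bits / 8 * clock_mhz


video = stream_bandwidth_mb_s(1920, 1080, 3, 60)
hp_port = port_peak_mb_s(64, 100)  # assumed: 64-bit port, 100 MHz PL clock

print(f"1080p60 RGB stream: {video:.0f} MB/s")
print(f"One 64-bit port at 100 MHz: {hp_port:.0f} MB/s peak")
print(f"Utilization for one direction: {video / hp_port:.0%}")
```

Keep in mind that a pipeline which both reads and writes frames through memory needs this bandwidth in each direction, which is one reason designs often spread traffic across several HP ports.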
Machine Learning Acceleration with Zynq Python
Machine learning inference is where Xilinx Python development really shines. The combination of PYNQ’s Python accessibility and the DPU’s inference performance creates a compelling platform for edge AI applications.
Vitis AI and the DPU
The Deep Learning Processor Unit (DPU) is AMD/Xilinx’s configurable inference accelerator for neural networks. It supports common convolutional architectures including ResNet, YOLO, and SSD; coverage of other model families, such as transformers, depends on the DPU variant and Vitis AI release.
The DPU-PYNQ project provides pre-built overlays and Python APIs that make deployment remarkably simple:
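```python
import numpy as np
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")
overlay.load_model("resnet50.xmodel")

# Run inference through the VART runner. input_image must already be a
# preprocessed float32 array matching the model's input tensor shape.
dpu = overlay.runner
result = np.empty(tuple(dpu.get_output_tensors()[0].dims), dtype=np.float32)
job_id = dpu.execute_async([input_image], [result])
dpu.wait(job_id)
```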
DPU configurations vary by board. Larger boards like the ZCU104 can run multiple DPU cores simultaneously, while the Ultra96-V2 runs a smaller B1024 configuration suitable for real-time video inference at reduced precision.
Performance Expectations
Edge AI performance depends heavily on the specific board and model. Here are typical results for image classification (ResNet-50) and object detection (YOLO-v3):
| Board | DPU Config | ResNet-50 (fps) | YOLO-v3 (fps) |
| --- | --- | --- | --- |
| Ultra96-V2 | B1024 x1 | 60-80 | 10-15 |
| ZCU104 | B4096 x2 | 150-200 | 30-50 |
| Kria KV260 | B4096 x1 | 100-140 | 20-35 |
These numbers far exceed what’s achievable with pure CPU inference on the same ARM cores, demonstrating the value of hardware acceleration.
PYNQ excels at leveraging pre-built hardware, but what if you want to describe hardware itself using Python? Several frameworks enable this, each with different philosophies and target applications.
Amaranth HDL (formerly nMigen)
Amaranth is a modern Python-based hardware description language that generates synthesizable Verilog. Unlike PYNQ, Amaranth doesn’t abstract away hardware – it provides Python syntax for describing logic that will become actual gates and flip-flops.
Key features of Amaranth include a clean module system, strong type checking, built-in simulation, and direct integration with open-source synthesis tools. It’s particularly popular in the open-source hardware community.
```python
from amaranth import *


class Counter(Elaboratable):
    def __init__(self, width):
        self.value = Signal(width)

    def elaborate(self, platform):
        m = Module()
        m.d.sync += self.value.eq(self.value + 1)
        return m
```
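To see what the counter above means in hardware terms, it can help to model its behavior in plain Python. The sketch below is not Amaranth code and does not use its simulator; `simulate_counter` is a hypothetical helper that simply mirrors the semantics of a `width`-bit register that starts at zero, increments once per clock edge, and wraps around when it overflows.

```python
def simulate_counter(width, cycles):
    """Return the register value after each clock edge, starting from reset (0)."""
    mask = (1 << width) - 1
    value = 0
    trace = []
    for _ in range(cycles):
        # A width-bit register keeps only the low `width` bits of value + 1.
        value = (value + 1) & mask
        trace.append(value)
    return trace


trace = simulate_counter(width=3, cycles=10)
print(trace)
```

With a 3-bit width the value returns to zero after reaching 7, which is the same wrap-around the synthesized flip-flops would exhibit.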
MyHDL and Migen
MyHDL is one of the oldest Python HDL projects, dating back to 2004. It uses Python generators to model concurrent hardware processes and can convert designs to VHDL or Verilog. While development has slowed, it remains functional and has extensive documentation.
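The idea of modeling concurrent hardware processes with generators can be illustrated without any HDL library. The toy below uses only plain Python generators and a hand-written scheduler; it is not MyHDL's API, and the names `blinker`, `counter`, and `run` are invented for this sketch. Each process runs until it yields, which stands in for "wait for the next clock tick", and the scheduler resumes every process once per tick.

```python
def blinker(period, log):
    """Toggle a level every `period` ticks."""
    level = 0
    while True:
        for _ in range(period):
            yield  # wait for the next tick
        level ^= 1
        log.append(("blinker", level))


def counter(limit, log):
    """Count ticks modulo `limit`."""
    count = 0
    while True:
        yield  # wait for the next tick
        count = (count + 1) % limit
        log.append(("counter", count))


def run(processes, ticks):
    """Resume every process once per tick, in a fixed order."""
    for proc in processes:
        next(proc)  # advance each process to its first wait point
    for _ in range(ticks):
        for proc in processes:
            next(proc)


log = []
run([blinker(2, log), counter(3, log)], ticks=4)
print(log)
```

Real generator-based HDL frameworks add signals, sensitivity lists, and delta cycles on top of this basic suspend-and-resume mechanism.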
Migen emerged from the LiteX project and focuses on building complete SoC designs. The LiteX ecosystem provides IP cores for common peripherals (UART, SPI, Ethernet, DDR controllers) that can be assembled into working systems entirely from Python.
Comparison of Python HDL Approaches
| Framework | Use Case | Output | Xilinx Support |
| --- | --- | --- | --- |
| PYNQ | Using pre-built hardware | Python control code | Native |
| Amaranth | Designing hardware | Verilog | Via Vivado |
| Migen/LiteX | Building SoCs | Verilog | Via Vivado |
| MyHDL | Educational, prototyping | VHDL/Verilog | Via Vivado |
| cocotb | Verification/testing | Testbenches | Native simulation |
For most Xilinx Python projects, PYNQ remains the practical choice. The alternative frameworks shine when you need to create novel hardware or target platforms beyond Xilinx’s ecosystem.
Practical Xilinx Python Development Tips
After working with these tools across multiple projects, here are the lessons that aren’t always obvious from documentation.
Memory Management Matters
PYNQ provides allocate() for creating DMA-capable buffers. Use these contiguous memory regions for any data transferred to hardware accelerators. Standard NumPy arrays won’t work correctly with DMA operations.
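A minimal sketch of the buffer pattern described above follows. The function takes the allocator and the DMA object as parameters so that it can be read and exercised off-board; on a PYNQ board you would pass `pynq.allocate` and an AXI DMA instance from your loaded overlay. The helper name `dma_roundtrip` is hypothetical, and the sketch assumes a design where the DMA's receive channel returns as many 32-bit words as were sent.

```python
def dma_roundtrip(dma, samples, allocate):
    """Send `samples` through an AXI DMA and return the words that come back.

    On a board, pass `pynq.allocate` as `allocate` and a DMA instance from
    the overlay as `dma` (the instance name depends on your block design).
    """
    n = len(samples)
    # Buffers from allocate() are physically contiguous and DMA-capable.
    in_buf = allocate(shape=(n,), dtype="u4")
    out_buf = allocate(shape=(n,), dtype="u4")
    try:
        in_buf[:] = samples
        dma.sendchannel.transfer(in_buf)
        dma.recvchannel.transfer(out_buf)
        dma.sendchannel.wait()
        dma.recvchannel.wait()
        return list(out_buf)  # copy the data out before releasing the buffers
    finally:
        # Release the contiguous memory explicitly; it is a limited resource.
        in_buf.freebuffer()
        out_buf.freebuffer()
```

On hardware, a call might look like `dma_roundtrip(overlay.axi_dma_0, range(1024), allocate)`, where `axi_dma_0` is a placeholder for whatever the DMA block is called in your design.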
Loading an overlay reconfigures the FPGA, which takes 100-500ms depending on bitstream size. Don’t load overlays in tight loops. Load once at startup and reuse the instance throughout your application.
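One simple way to enforce the load-once rule is to route every overlay request through a cached accessor. The sketch below uses a placeholder `_program_fpga` function in place of the real, slow `pynq.Overlay(...)` call so that it runs anywhere; both helper names are hypothetical. The cache size of one reflects the fact that only one bitstream is active in the fabric at a time.

```python
from functools import lru_cache

load_calls = []


def _program_fpga(bitfile):
    """Placeholder for the expensive step; on a board this would be pynq.Overlay(bitfile)."""
    load_calls.append(bitfile)
    return {"bitfile": bitfile}


@lru_cache(maxsize=1)
def get_overlay(bitfile):
    """Return the currently loaded overlay, reprogramming only when the bitfile changes."""
    return _program_fpga(bitfile)


# A hot loop can now ask for the overlay freely without reconfiguring the FPGA.
for _ in range(1000):
    overlay = get_overlay("base.bit")

print(len(load_calls))
```

Requesting a different bitfile evicts the cached entry and triggers a fresh load, which matches what happens on the device when one overlay replaces another.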
Clock Domain Awareness
Even when using PYNQ, understanding clock domains prevents subtle bugs. The PS runs at its own frequency while PL clocks are independently configurable. Data crossing between domains needs proper synchronization, typically handled by the AXI infrastructure but sometimes requiring attention in custom designs.
Useful Resources for Xilinx Python Development
Here are the essential resources for deepening your Xilinx Python and Zynq Python knowledge:
| Resource | Description | Link |
| --- | --- | --- |
| PYNQ Documentation | Official framework documentation | pynq.readthedocs.io |
| PYNQ GitHub | Source code and examples | github.com/Xilinx/PYNQ |
| PYNQ Workshop | Hands-on training materials | github.com/Xilinx/PYNQ_Workshop |
| DPU-PYNQ | Deep learning acceleration | github.com/Xilinx/DPU-PYNQ |
| PYNQ Community Forum | Technical support | discuss.pynq.io |
| Vitis AI Model Zoo | Pre-trained ML models | github.com/Xilinx/Vitis-AI |
| Amaranth Documentation | Python HDL reference | amaranth-lang.org/docs |
| LiteX Wiki | SoC building framework | github.com/enjoy-digital/litex/wiki |
Frequently Asked Questions About Xilinx Python Development
Can I use PYNQ without knowing any Verilog or VHDL?
Absolutely. The entire point of PYNQ is enabling software developers to use FPGA acceleration without hardware design expertise. The pre-built overlays handle common scenarios like video processing, GPIO control, and machine learning inference. You only need hardware design skills (HLS or HDL) if you want to create custom hardware accelerators beyond what’s available in existing overlays.
What’s the performance difference between Python on PYNQ versus C/C++?
For control-plane operations (configuring registers, managing data flow), the performance difference is negligible. For data-plane operations, performance depends on where the work happens. If the heavy computation runs in the FPGA fabric, Python overhead is minimal since you’re just initiating DMA transfers. For CPU-intensive tasks, C/C++ remains faster, but PYNQ supports mixed Python/C++ workflows through ctypes and Cython.
Can I run Zynq Python code on a standard Raspberry Pi for development?
Not directly, since PYNQ requires the Zynq hardware for overlay operations. However, you can develop and test pure Python logic on any system. To verify custom HDL without a board, the cocotb framework lets you write simulation testbenches in Python. The Jupyter notebook interface also allows remote development against a physical board.
Which board should I buy for machine learning projects?
For learning and prototyping, the Ultra96-V2 offers good value with WiFi, sufficient DPU performance, and reasonable cost. For production-oriented work or larger models, the Kria KV260 provides better performance and a system-on-module form factor suitable for custom carrier boards. The PYNQ-Z2, while excellent for general learning, lacks the UltraScale+ architecture needed for modern Vitis AI workloads.
How does PYNQ compare to other edge AI platforms like NVIDIA Jetson?
Both platforms excel at edge AI but serve different niches. NVIDIA Jetson provides GPU-based acceleration with mature CUDA tooling, making it ideal for applications already developed for GPU inference. PYNQ/Zynq offers more flexibility for custom hardware acceleration beyond neural networks, deterministic latency for control applications, and integration with other FPGA IP. For pure neural network inference, Jetson often provides simpler deployment; for mixed workloads requiring custom hardware, Zynq typically wins.
Video Processing with Zynq Python
One of the most compelling applications for Xilinx Python development is real-time video processing. The PYNQ-Z2’s HDMI input and output ports make it particularly suitable for this use case.
HDMI Pipeline Architecture
The base overlay provides a complete video pipeline that you can manipulate from Python:
```python
from pynq.overlays.base import BaseOverlay
from pynq.lib.video import *

base = BaseOverlay("base.bit")

# Configure HDMI input and output
hdmi_in = base.video.hdmi_in
hdmi_out = base.video.hdmi_out
hdmi_in.configure()
hdmi_out.configure(hdmi_in.mode)
hdmi_in.start()
hdmi_out.start()

# Process frames in Python
while True:
    frame = hdmi_in.readframe()
    # Apply processing here
    hdmi_out.writeframe(frame)
```
Software-only processing achieves roughly 3-5 frames per second for 1080p video. Adding hardware acceleration through custom overlays can push this to 30+ fps for many filters, demonstrating the practical value of the hybrid PS-PL architecture.
OpenCV Integration
PYNQ includes OpenCV, enabling familiar image processing workflows. You can capture frames from HDMI, process them with OpenCV functions, and display results – all from Python. For production applications, the compute-intensive OpenCV functions can be replaced with hardware-accelerated equivalents through custom overlays.
The Future of Python in FPGA Development
The trend toward higher-level FPGA development tools shows no signs of slowing. AMD’s acquisition of Xilinx has accelerated investment in software stacks, and PYNQ continues receiving updates with broader board support and improved integration with Vitis AI.
For embedded engineers, this means Python increasingly becomes a viable option for systems that previously demanded low-level HDL expertise. You can prototype in Python, identify performance bottlenecks, and selectively accelerate critical paths – all without completely changing your development workflow.
The combination of Xilinx Python tools, Zynq Python capabilities, and the broader Python ecosystem creates a genuinely productive environment for embedded AI and acceleration projects. Whether you’re using PYNQ’s overlays or diving deeper with Amaranth HDL, Python has earned its place in the FPGA developer’s toolkit.