Contact Sales & After-Sales Service

Contact & Quotation

  • Inquire: Call 0086-755-23203480, or reach out via the form below/your sales contact to discuss our design, manufacturing, and assembly capabilities.
  • Quote: Email your PCB files to Sales@pcbsync.com (Preferred for large files) or submit online. We will contact you promptly. Please ensure your email is correct.
Drag & Drop Files, Choose Files to Upload You can upload up to 3 files.

Notes:
For PCB fabrication, we require PCB design file in Gerber RS-274X format (most preferred), *.PCB/DDB (Protel, inform your program version) format or *.BRD (Eagle) format. For PCB assembly, we require PCB design file in above mentioned format, drilling file and BOM. Click to download BOM template To avoid file missing, please include all files into one folder and compress it into .zip or .rar format.

Python FPGA Development with Xilinx: PYNQ & Beyond

When I first heard about programming FPGAs with Python, I was skeptical. After years of wrestling with Verilog timing constraints and cryptic synthesis errors, the idea of controlling hardware accelerators from a Jupyter notebook seemed almost too good to be true. But after spending considerable time with the PYNQ ecosystem and various Xilinx Python tools, I can say the landscape has genuinely changed for embedded engineers who want to leverage programmable logic without becoming HDL experts.

This guide covers everything from getting started with PYNQ to building custom hardware overlays, machine learning acceleration, and alternative Python-based HDL frameworks that can target Xilinx devices.

Understanding the Xilinx Python Ecosystem

The Xilinx Python ecosystem centers around PYNQ (Python + Zynq), but extends far beyond a single framework. Before diving into specifics, let’s understand how these pieces fit together.

What Makes Zynq Python Development Different

Traditional FPGA development requires intimate knowledge of hardware description languages like Verilog or VHDL. The Zynq architecture changed this equation by combining an ARM processing system (PS) with programmable logic (PL) on a single chip. This architecture enables a natural division of labor: Python runs on the ARM cores while hardware accelerators execute in the FPGA fabric.

The key insight behind Zynq Python development is that most applications don’t need custom RTL. Pre-built hardware overlays can handle common acceleration tasks, leaving software developers free to focus on their algorithms rather than clock domain crossings.

| Component | Purpose | Language |
|---|---|---|
| Processing System (PS) | Runs Linux, Python, application logic | Python/C/C++ |
| Programmable Logic (PL) | Hardware accelerators, custom IP | VHDL/Verilog/HLS |
| AXI Interconnect | PS-PL communication | Hardware protocol |
| Overlays | Pre-built hardware configurations | Bitstream + drivers |

Getting Started with PYNQ on Zynq Boards

PYNQ makes Xilinx Python development accessible by providing a complete software stack, pre-built overlays, and example notebooks. The framework supports multiple boards across the Zynq, Zynq UltraScale+, Kria, and Alveo families.

Supported PYNQ Development Boards

Choosing the right board depends on your application requirements and budget. Here’s a comparison of popular options:

| Board | SoC/FPGA | RAM | Key Features | Price Range |
|---|---|---|---|---|
| PYNQ-Z2 | Zynq XC7Z020 | 512MB DDR3 | HDMI in/out, Audio, Arduino/Pmod headers | ~$100 |
| Ultra96-V2 | Zynq UltraScale+ ZU3EG | 2GB DDR4 | WiFi/BT, 96Boards expansion | ~$220 |
| ZCU104 | Zynq UltraScale+ XCZU7EV | 2GB DDR4 | Video codec, DisplayPort | ~$1,100 |
| Kria KV260 | Zynq UltraScale+ K26 SOM | 4GB DDR4 | Vision AI, Smart camera | ~$250 |

For beginners, the PYNQ-Z2 offers the best value. The Zynq XC7Z020 provides enough resources for learning while the board includes practical peripherals like HDMI input/output for video processing projects.

Setting Up Your PYNQ Environment

Getting PYNQ running takes about 15 minutes. Here’s the basic workflow:

  1. Download the appropriate SD card image from the PYNQ website.
  2. Flash it to a microSD card using tools like Etcher or dd.
  3. Insert the card into your board and connect Ethernet.
  4. Power on and wait for the boot sequence to complete (about 60 seconds).
  5. Access Jupyter notebooks at http://pynq:9090 or http://192.168.2.99:9090.
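If the notebook server does not come up, a quick reachability check helps separate network problems from boot problems. The sketch below tries the two addresses mentioned above. The helper names (`candidate_urls`, `first_reachable`) are illustrative and not part of PYNQ, and the timeout is an arbitrary assumption.

```python
import urllib.error
import urllib.request


def candidate_urls(hostname="pynq", static_ip="192.168.2.99", port=9090):
    """Return the Jupyter URLs to try, hostname first."""
    return [f"http://{host}:{port}" for host in (hostname, static_ip)]


def first_reachable(urls, timeout=3.0):
    """Return the first URL that answers an HTTP request, or None."""
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout):
                return url
        except urllib.error.HTTPError:
            # The server answered (for example with a 4xx status): it is up.
            return url
        except (urllib.error.URLError, OSError):
            continue  # Not reachable at this address; try the next one.
    return None
```

Calling `first_reachable(candidate_urls())` from a machine on the same network returns the address to open in a browser, or `None` if the board is not answering yet.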

The default credentials are username “xilinx” and password “xilinx”. From there, you’re immediately ready to run Python code that interacts with hardware.


Working with PYNQ Overlays

Overlays are the secret sauce that makes Zynq Python development productive. Think of an overlay as a hardware library – you load it when needed, use its functions through a Python API, and swap it out for another overlay when your requirements change.

Understanding the Overlay Architecture

An overlay consists of three components: the bitstream (.bit file), the hardware handoff file (.hwh), and Python driver code. When you instantiate an overlay in Python, PYNQ automatically downloads the bitstream to the FPGA and creates Python objects for each IP block.
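PYNQ expects the hardware handoff file to sit next to the bitstream with the same base name, so a missing or misnamed .hwh is a common reason a custom overlay fails to load. A minimal pre-flight check might look like the following; `missing_overlay_files` is a hypothetical helper, not a PYNQ function.

```python
from pathlib import Path


def missing_overlay_files(bitfile):
    """Return the overlay files that are absent for a given .bit path."""
    bit = Path(bitfile)
    # The .hwh is looked up beside the bitstream, sharing its base name.
    required = [bit, bit.with_suffix(".hwh")]
    return [str(path) for path in required if not path.is_file()]
```

An empty list means both files are in place; anything else names what still needs to be copied to the board.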

from pynq import Overlay

# Load the base overlay
overlay = Overlay("base.bit")

# Access hardware components through Python
overlay.leds[0].on()
overlay.buttons[0].read()

The base overlay included with each PYNQ board provides drivers for on-board peripherals: LEDs, buttons, switches, HDMI, audio, and GPIO interfaces. This lets you start experimenting immediately without building custom hardware.

Available Pre-Built Overlays

Beyond the base overlay, Xilinx and the community provide specialized overlays for common use cases:

| Overlay | Purpose | Key IP Blocks |
|---|---|---|
| Base | Board peripherals | GPIO, Video, Audio |
| Logictools | Digital pattern generation | Pattern Generator, FSM, Boolean |
| PYNQ-ComputerVision | Image processing | OpenCV-compatible filters |
| BNN-PYNQ | Binary neural networks | Quantized inference engine |
| DPU-PYNQ | Deep learning | Vitis AI DPU |

The logictools overlay deserves special mention for hardware debugging. It turns your PYNQ board into a configurable logic analyzer and pattern generator, perfect for testing external circuits or learning digital logic concepts.

Creating Custom Hardware Overlays

Pre-built overlays cover many scenarios, but eventually you’ll want to accelerate your own algorithms. Creating custom overlays requires Vivado and some understanding of the Zynq architecture, though not necessarily deep HDL expertise.

The Custom Overlay Development Flow

Building a custom overlay follows this general workflow:

  1. Create IP blocks using Vitis HLS (formerly Vivado HLS; C/C++ to RTL) or traditional HDL
  2. Assemble the system in Vivado IP Integrator
  3. Connect IP to the Zynq PS through AXI interfaces
  4. Generate the bitstream and hardware handoff files
  5. Write Python driver code to interface with your IP
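The last step, writing driver code, can be sketched for a hypothetical HLS adder IP with an AXI4-Lite control interface. The register offsets below are assumptions for illustration (the real map comes from the driver header generated for your IP), and `FakeAdderMMIO` is a software stand-in so the driver logic can be tried without a board. The driver only relies on an object with `read(offset)` and `write(offset, value)` methods, which is the shape of PYNQ's memory-mapped I/O interface.

```python
class AdderDriver:
    """Sketch of a driver for a hypothetical HLS adder IP."""

    # Assumed register map; check the files generated for your own IP.
    CTRL = 0x00    # bit 0: ap_start, bit 1: ap_done
    ARG_A = 0x10
    ARG_B = 0x18
    RESULT = 0x20

    def __init__(self, mmio):
        # `mmio` only needs read(offset) and write(offset, value).
        self.mmio = mmio

    def add(self, a, b, max_polls=1000):
        self.mmio.write(self.ARG_A, a)
        self.mmio.write(self.ARG_B, b)
        self.mmio.write(self.CTRL, 0x1)          # set ap_start
        for _ in range(max_polls):
            if self.mmio.read(self.CTRL) & 0x2:  # ap_done set?
                return self.mmio.read(self.RESULT)
        raise TimeoutError("IP did not signal ap_done")


class FakeAdderMMIO:
    """Software stand-in for the hardware, for off-board experiments."""

    def __init__(self):
        self.regs = {}

    def write(self, offset, value):
        self.regs[offset] = value
        if offset == AdderDriver.CTRL and value & 0x1:
            a = self.regs.get(AdderDriver.ARG_A, 0)
            b = self.regs.get(AdderDriver.ARG_B, 0)
            self.regs[AdderDriver.RESULT] = (a + b) & 0xFFFFFFFF
            self.regs[AdderDriver.CTRL] = 0x2    # report ap_done

    def read(self, offset):
        return self.regs.get(offset, 0)


print(AdderDriver(FakeAdderMMIO()).add(4, 5))  # prints 9
```

Keeping the register access behind a small object like this makes it possible to unit-test the driver on a laptop and swap in the real memory-mapped interface on the board.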

The most accessible path for software developers is Vitis HLS (called Vivado HLS in older tool releases). You write C or C++ code with specific pragmas, and the tool synthesizes it into RTL. The result is rarely as efficient as hand-written RTL, but it’s often fast enough and dramatically reduces development time.

AXI Interface Considerations

The communication between PS and PL happens through AXI interfaces. Understanding which interface to use is crucial for performance:

AXI GP (General Purpose): Low bandwidth, PS acts as master. Use for configuration registers and control signals.

AXI HP (High Performance): High bandwidth, PL acts as master. Use for DMA transfers and video streams. The Zynq-7000 provides four HP ports.

AXI ACP (Accelerator Coherency Port): Cache-coherent access to DDR. Use when sharing data structures between PS and PL without explicit cache management.

For most acceleration scenarios, you’ll use AXI GP for control and AXI HP for bulk data transfer via DMA.
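A quick back-of-the-envelope calculation shows why bulk data belongs on a high-performance port. The sketch below compares the bandwidth a 1080p60 RGB stream needs with the theoretical peak of a single port. The 64-bit width and 100 MHz PL clock are assumptions chosen for illustration, and the peak figure ignores protocol and DDR overhead, so real throughput will be lower.

```python
def stream_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bandwidth needed to move an uncompressed video stream."""
    return width * height * bytes_per_pixel * fps


def port_peak_bytes_per_second(data_width_bits, clock_hz):
    """Theoretical peak of a port moving one data beat per clock cycle."""
    return (data_width_bits // 8) * clock_hz


required = stream_bytes_per_second(1920, 1080, 3, 60)
peak = port_peak_bytes_per_second(64, 100_000_000)  # assumed port settings

print(f"required:    {required / 1e6:.1f} MB/s")  # required:    373.2 MB/s
print(f"port peak:   {peak / 1e6:.1f} MB/s")      # port peak:   800.0 MB/s
print(f"utilisation: {required / peak:.0%}")      # utilisation: 47%
```

Under these assumptions one stream already uses close to half of a port, which is why a design that both reads and writes video usually spreads traffic across more than one HP port or raises the PL clock.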

Machine Learning Acceleration with Zynq Python

Machine learning inference is where Xilinx Python development really shines. The combination of PYNQ’s Python accessibility and the DPU’s inference performance creates a compelling platform for edge AI applications.

Vitis AI and the DPU

The Deep Learning Processor Unit (DPU) is AMD/Xilinx’s configurable inference accelerator for neural networks. It targets convolutional architectures such as ResNet, YOLO, and SSD; support for other model families depends on the DPU variant and the Vitis AI release.

The DPU-PYNQ project provides pre-built overlays and Python APIs that make deployment remarkably simple:

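import numpy as np
from pynq_dpu import DpuOverlay

overlay = DpuOverlay("dpu.bit")
overlay.load_model("resnet50.xmodel")
dpu = overlay.runner  # VART runner for the loaded model

# Allocate input/output buffers matching the model's tensor shapes
in_shape = tuple(dpu.get_input_tensors()[0].dims)
out_shape = tuple(dpu.get_output_tensors()[0].dims)
input_data = [np.empty(in_shape, dtype=np.float32, order="C")]
output_data = [np.empty(out_shape, dtype=np.float32, order="C")]

# Run inference; input_image is assumed to be preprocessed to in_shape[1:]
input_data[0][0, ...] = input_image
job_id = dpu.execute_async(input_data, output_data)
dpu.wait(job_id)
result = output_data[0]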
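For a classification model, the inference result is typically a vector of raw scores, and turning it into class probabilities is often left to the CPU. The sketch below assumes that is the case (whether softmax runs in hardware depends on how the model was compiled) and uses a short made-up score list in place of a real output tensor.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, k=5):
    """Return (class_index, probability) pairs, most likely first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


raw_scores = [1.0, 3.0, 0.5, 2.0]   # stand-in for a flattened output tensor
probs = softmax(raw_scores)
print(top_k(probs, k=2)[0][0])      # prints 1
```

The class index returned by `top_k` is then looked up in the label file that belongs to the model.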

DPU configurations vary by board. Larger boards like the ZCU104 can run multiple DPU cores simultaneously, while the Ultra96-V2 runs a smaller B1024 configuration suitable for real-time video inference at reduced precision.

Performance Expectations

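Edge AI performance depends heavily on the specific board, DPU configuration, and model. Here are typical results for image classification (ResNet-50) and object detection (YOLO-v3):

| Board | DPU Config | ResNet-50 (fps) | YOLO-v3 (fps) |
|---|---|---|---|
| Ultra96-V2 | B1024 x1 | 60-80 | 10-15 |
| ZCU104 | B4096 x2 | 150-200 | 30-50 |
| Kria KV260 | B4096 x1 | 100-140 | 20-35 |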

These numbers far exceed what’s achievable with pure CPU inference on the same ARM cores, demonstrating the value of hardware acceleration.


Beyond PYNQ: Alternative Python HDL Frameworks

PYNQ excels at leveraging pre-built hardware, but what if you want to describe hardware itself using Python? Several frameworks enable this, each with different philosophies and target applications.

Amaranth HDL (formerly nMigen)

Amaranth is a modern Python-based hardware description language that generates synthesizable Verilog. Unlike PYNQ, Amaranth doesn’t abstract away hardware – it provides Python syntax for describing logic that will become actual gates and flip-flops.

Key features of Amaranth include a clean module system, strong type checking, built-in simulation, and direct integration with open-source synthesis tools. It’s particularly popular in the open-source hardware community.

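from amaranth import *

class Counter(Elaboratable):
    def __init__(self, width):
        # Output register that holds the current count
        self.value = Signal(width)

    def elaborate(self, platform):
        m = Module()
        # Increment on every clock edge of the default "sync" domain
        m.d.sync += self.value.eq(self.value + 1)
        return m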
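When testing a design like the counter above, it helps to have a plain-Python reference model to compare simulated values against. The sketch below is such a model: it is ordinary Python, not Amaranth API, and `counter_model` is a hypothetical name. It captures the one behaviour that is easy to overlook, namely that a fixed-width signal wraps around at 2**width.

```python
def counter_model(width, cycles, start=0):
    """Yield the counter value on each clock cycle, wrapping at 2**width."""
    value = start
    for _ in range(cycles):
        yield value
        value = (value + 1) % (1 << width)


print(list(counter_model(width=2, cycles=6)))  # prints [0, 1, 2, 3, 0, 1]
```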

MyHDL and Migen

MyHDL is one of the oldest Python HDL projects, dating back to 2004. It uses Python generators to model concurrent hardware processes and can convert designs to VHDL or Verilog. While development has slowed, it remains functional and has extensive documentation.
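The idea of modeling concurrent hardware with generators can be illustrated without MyHDL itself. In the sketch below each generator plays the role of one hardware process, and a tiny loop advances all of them once per tick. This is plain Python written only to show the concept; the names `blinker` and `run` are made up, and MyHDL's real API (decorators, signals, and its own simulator) looks different.

```python
def blinker(name, period):
    """A 'process' that toggles its output every `period` ticks."""
    state = 0
    tick = 0
    while True:
        if tick % period == 0:
            state ^= 1
        yield name, state
        tick += 1


def run(processes, ticks):
    """Advance every process once per tick, like a very small simulator."""
    trace = []
    for _ in range(ticks):
        trace.append(dict(next(process) for process in processes))
    return trace


trace = run([blinker("fast", 1), blinker("slow", 2)], ticks=4)
print(trace[0])  # prints {'fast': 1, 'slow': 1}
```

Each generator keeps its own local state between ticks, which is what makes generators a natural fit for describing processes that all run "at the same time".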

Migen was developed at M-Labs and is the foundation that the LiteX project builds on for assembling complete SoC designs. The LiteX ecosystem provides IP cores for common peripherals (UART, SPI, Ethernet, DDR controllers) that can be assembled into working systems entirely from Python.

Comparison of Python HDL Approaches

| Framework | Use Case | Output | Xilinx Support |
|---|---|---|---|
| PYNQ | Using pre-built hardware | Python control code | Native |
| Amaranth | Designing hardware | Verilog | Via Vivado |
| Migen/LiteX | Building SoCs | Verilog | Via Vivado |
| MyHDL | Educational, prototyping | VHDL/Verilog | Via Vivado |
| cocotb | Verification/testing | Testbenches | Via HDL simulators |

For most Xilinx Python projects, PYNQ remains the practical choice. The alternative frameworks shine when you need to create novel hardware or target platforms beyond Xilinx’s ecosystem.

Practical Xilinx Python Development Tips

After working with these tools across multiple projects, here are the lessons that aren’t always obvious from documentation.

Memory Management Matters

PYNQ provides allocate() for creating DMA-capable buffers. Use these physically contiguous memory regions for any data transferred to hardware accelerators. Standard NumPy arrays are not guaranteed to be physically contiguous, so they won’t work correctly with DMA operations.

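import numpy as np
from pynq import allocate

# Create buffer for hardware DMA, laid out as (height, width, channels)
input_buffer = allocate(shape=(1080, 1920, 3), dtype=np.uint8)

# Copy data to buffer; frame_data is an existing array of the same shape
input_buffer[:] = frame_data

# Now safe to pass to hardware; dma is an AXI DMA instance from the loaded overlay
dma.sendchannel.transfer(input_buffer)
dma.sendchannel.wait()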

Overlay Loading Takes Time

Loading an overlay reconfigures the FPGA, which takes 100-500ms depending on bitstream size. Don’t load overlays in tight loops. Load once at startup and reuse the instance throughout your application.
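One way to enforce the load-once rule is to route every overlay request through a small cache. The sketch below is a hypothetical helper, not part of PYNQ: `OverlayCache` takes any callable that loads a bitstream (on a board that could be `pynq.Overlay`) and only calls it when a different bitstream is requested. A counting stand-in loader is used here so the behaviour can be checked off-board.

```python
class OverlayCache:
    """Keep one loaded overlay; reload only when the bitstream changes."""

    def __init__(self, loader):
        # `loader` is any callable that takes a bitstream path.
        self._loader = loader
        self._bitfile = None
        self._overlay = None

    def get(self, bitfile):
        if bitfile != self._bitfile:
            # Loading a full bitstream replaces whatever was in the fabric,
            # so only the most recent overlay is kept.
            self._overlay = self._loader(bitfile)
            self._bitfile = bitfile
        return self._overlay


loads = []
cache = OverlayCache(loader=lambda path: loads.append(path) or f"overlay:{path}")
for _ in range(3):
    cache.get("base.bit")
print(len(loads))  # prints 1
```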

Clock Domain Awareness

Even when using PYNQ, understanding clock domains prevents subtle bugs. The PS runs at its own frequency while PL clocks are independently configurable. Data crossing between domains needs proper synchronization, typically handled by the AXI infrastructure but sometimes requiring attention in custom designs.

Useful Resources for Xilinx Python Development

Here are the essential resources for deepening your Xilinx Python and Zynq Python knowledge:

| Resource | Description | Link |
|---|---|---|
| PYNQ Documentation | Official framework documentation | pynq.readthedocs.io |
| PYNQ GitHub | Source code and examples | github.com/Xilinx/PYNQ |
| PYNQ Workshop | Hands-on training materials | github.com/Xilinx/PYNQ_Workshop |
| DPU-PYNQ | Deep learning acceleration | github.com/Xilinx/DPU-PYNQ |
| PYNQ Community Forum | Technical support | discuss.pynq.io |
| Vitis AI Model Zoo | Pre-trained ML models | github.com/Xilinx/Vitis-AI |
| Amaranth Documentation | Python HDL reference | amaranth-lang.org/docs |
| LiteX Wiki | SoC building framework | github.com/enjoy-digital/litex/wiki |

Frequently Asked Questions About Xilinx Python Development

Can I use PYNQ without knowing any Verilog or VHDL?

Absolutely. The entire point of PYNQ is enabling software developers to use FPGA acceleration without hardware design expertise. The pre-built overlays handle common scenarios like video processing, GPIO control, and machine learning inference. You only need HDL knowledge if you want to create custom hardware accelerators beyond what’s available in existing overlays.

What’s the performance difference between Python on PYNQ versus C/C++?

For control-plane operations (configuring registers, managing data flow), the performance difference is negligible. For data-plane operations, performance depends on where the work happens. If the heavy computation runs in the FPGA fabric, Python overhead is minimal since you’re just initiating DMA transfers. For CPU-intensive tasks, C/C++ remains faster, but PYNQ supports mixed Python/C++ workflows through ctypes and Cython.

Can I run Zynq Python code on a standard Raspberry Pi for development?

Not directly, since PYNQ requires the Zynq hardware for overlay operations. However, you can develop and test pure Python logic on any system. PYNQ itself does not simulate the programmable logic; to exercise HDL without a board, you can write Python testbenches with cocotb and run them in an HDL simulator. The Jupyter notebook interface also allows remote development against a physical board.

Which board should I buy for machine learning projects?

For learning and prototyping, the Ultra96-V2 offers good value with WiFi, sufficient DPU performance, and reasonable cost. For production-oriented work or larger models, the Kria KV260 provides better performance and a system-on-module form factor suitable for custom carrier boards. The PYNQ-Z2, while excellent for general learning, lacks the UltraScale+ architecture needed for modern Vitis AI workloads.

How does PYNQ compare to other edge AI platforms like NVIDIA Jetson?

Both platforms excel at edge AI but serve different niches. NVIDIA Jetson provides GPU-based acceleration with mature CUDA tooling, making it ideal for applications already developed for GPU inference. PYNQ/Zynq offers more flexibility for custom hardware acceleration beyond neural networks, deterministic latency for control applications, and integration with other FPGA IP. For pure neural network inference, Jetson often provides simpler deployment; for mixed workloads requiring custom hardware, Zynq typically wins.

Video Processing with Zynq Python

One of the most compelling applications for Xilinx Python development is real-time video processing. The PYNQ-Z2’s HDMI input and output ports make it particularly suitable for this use case.

HDMI Pipeline Architecture

The base overlay provides a complete video pipeline that you can manipulate from Python:

from pynq.overlays.base import BaseOverlay
from pynq.lib.video import *

base = BaseOverlay("base.bit")

# Configure HDMI input and output
hdmi_in = base.video.hdmi_in
hdmi_out = base.video.hdmi_out
hdmi_in.configure()
hdmi_out.configure(hdmi_in.mode)
hdmi_in.start()
hdmi_out.start()

# Process frames in Python (interrupt the kernel to stop)
try:
    while True:
        frame = hdmi_in.readframe()
        # Apply processing here
        hdmi_out.writeframe(frame)
finally:
    # Release the video pipelines when the loop ends
    hdmi_out.close()
    hdmi_in.close()

Software-only processing achieves roughly 3-5 frames per second for 1080p video. Adding hardware acceleration through custom overlays can push this to 30+ fps for many filters, demonstrating the practical value of the hybrid PS-PL architecture.
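Frame-rate figures like these are easy to measure for your own pipeline. The sketch below times a read/process/write loop and reports frames per second. The function name `measure_fps` and the stand-in callables are illustrative; on a board, the read and write callables would be the HDMI input and output frame calls.

```python
import time


def measure_fps(read_frame, process, write_frame, num_frames=30):
    """Run the read/process/write loop and return frames per second."""
    start = time.perf_counter()
    for _ in range(num_frames):
        write_frame(process(read_frame()))
    elapsed = time.perf_counter() - start
    return num_frames / elapsed if elapsed > 0 else float("inf")


# Off-board stand-ins so the helper can be tried anywhere.
fps = measure_fps(
    read_frame=lambda: [0] * 1024,
    process=lambda frame: [255 - px for px in frame],
    write_frame=lambda frame: None,
)
print(f"{fps:.1f} fps")
```

Measuring with and without the processing step shows how much of the frame budget the Python code consumes, which is a useful signal for deciding what to move into the programmable logic.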

OpenCV Integration

PYNQ includes OpenCV, enabling familiar image processing workflows. You can capture frames from HDMI, process them with OpenCV functions, and display results – all from Python. For production applications, the compute-intensive OpenCV functions can be replaced with hardware-accelerated equivalents through custom overlays.

The Future of Python in FPGA Development

The trend toward higher-level FPGA development tools shows no signs of slowing. AMD’s acquisition of Xilinx has accelerated investment in software stacks, and PYNQ continues receiving updates with broader board support and improved integration with Vitis AI.

For embedded engineers, this means Python increasingly becomes a viable option for systems that previously demanded low-level HDL expertise. You can prototype in Python, identify performance bottlenecks, and selectively accelerate critical paths – all without completely changing your development workflow.

The combination of Xilinx Python tools, Zynq Python capabilities, and the broader Python ecosystem creates a genuinely productive environment for embedded AI and acceleration projects. Whether you’re using PYNQ’s overlays or diving deeper with Amaranth HDL, Python has earned its place in the FPGA developer’s toolkit.
