Inquire: Call 0086-755-23203480, or reach out via the form below/your sales contact to discuss our design, manufacturing, and assembly capabilities.
Quote: Email your PCB files to Sales@pcbsync.com (Preferred for large files) or submit online. We will contact you promptly. Please ensure your email is correct.
Notes: For PCB fabrication, we require PCB design file in Gerber RS-274X format (most preferred), *.PCB/DDB (Protel, inform your program version) format or *.BRD (Eagle) format. For PCB assembly, we require PCB design file in above mentioned format, drilling file and BOM. Click to download BOM template To avoid file missing, please include all files into one folder and compress it into .zip or .rar format.
Building video pipelines on FPGAs is one of those tasks that looks straightforward on paper but gets complicated fast once you start connecting IP blocks. After debugging more video timing issues than I care to count, I’ve learned that understanding the core Xilinx video IPs—VDMA, TPG, and VTC—is essential before tackling any serious embedded vision project. This guide breaks down each IP core and shows how they work together.
AMD/Xilinx provides a robust set of video processing IP cores designed around the AXI4-Stream video protocol. Three cores form the backbone of most video pipeline designs: the Xilinx VDMA for memory transfers, the Xilinx TPG for test pattern generation, and the Xilinx VTC for timing control. Each serves a specific purpose, but they’re designed to work together seamlessly.
Why These Three IPs Matter
When you’re bringing up a new video system, you face a chicken-and-egg problem. How do you verify your display output works when you don’t have a camera connected? How do you test your frame buffer without a reliable video source? That’s where TPG comes in—it generates known patterns you can trace through your entire pipeline. VTC provides the timing signals that keep everything synchronized, and VDMA handles the critical task of moving frame data to and from DDR memory.
Xilinx VDMA: Video Direct Memory Access Deep Dive
The AXI Video Direct Memory Access core handles high-bandwidth data transfers between system memory and AXI4-Stream video peripherals. Unlike standard DMA controllers that move linear blocks of data, the Xilinx VDMA understands video frames—two-dimensional data structures with specific width, height, and stride requirements.
VDMA Core Architecture
The VDMA operates with two independent channels:
Channel | Direction | Function
MM2S | Memory to Stream | Reads frames from DDR, outputs AXI4-Stream
S2MM | Stream to Memory | Receives AXI4-Stream, writes frames to DDR
Both channels can operate simultaneously and asynchronously. This independence is crucial for real-time video processing where input and output frame rates might differ slightly.
Key VDMA Configuration Parameters
Parameter | Range | Impact
AXI4 Data Width | 32-1024 bits | Memory bandwidth
Stream Data Width | 8-1024 bits | Pixel bus width
Frame Buffers | Up to 32 | Triple buffering support
Address Width | 32 or 64 bits | Memory addressing range
Line Buffer Depth | Configurable | Clock domain crossing buffer
Genlock Synchronization
One feature that initially confused me was Genlock. In production systems, you need to prevent the read and write channels from accessing the same buffer simultaneously—that causes tearing artifacts. The VDMA supports four Genlock modes:
Mode | Description
Genlock Master | Controls frame buffer sequencing
Genlock Slave | Follows master's buffer selection
Dynamic Genlock Master | Allows runtime buffer switching
Dynamic Genlock Slave | Follows dynamic master selection
For a typical capture-to-display pipeline, configure the S2MM channel as master and MM2S as slave. This ensures the display always reads from a completed frame.
VDMA Register Programming Essentials
The control registers sit at specific offsets from the base address. Here’s the quick reference I keep handy:
// Common register offsets
#define MM2S_CONTROL 0x00
#define MM2S_STATUS 0x04
#define S2MM_CONTROL 0x30
#define S2MM_STATUS 0x34
#define S2MM_VSIZE 0xA0
#define S2MM_HSIZE 0xA4
#define S2MM_STRIDE 0xA8
#define S2MM_FRAMEBUF1 0xAC
The startup sequence matters. As I read PG020, you first set the run bit in the channel's control register, then write the frame buffer addresses, then the stride and HSIZE, and write VSIZE last. The VSIZE write is what actually triggers the transfer, a detail that cost me hours of debugging.
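The S2MM programming order can be sketched in C as follows. This is a minimal sketch, not the vendor driver: `vdma_write` and `vdma_s2mm_start` are hypothetical helpers, the control bit positions are as I recall them from PG020 and should be checked against your core version, and `base` is assumed to point at the already-mapped VDMA register space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets from the quick reference above. */
#define S2MM_CONTROL   0x30u
#define S2MM_VSIZE     0xA0u
#define S2MM_HSIZE     0xA4u
#define S2MM_STRIDE    0xA8u
#define S2MM_FRAMEBUF1 0xACu

/* Control bits (assumed from PG020; verify for your core version). */
#define VDMACR_RS            (1u << 0) /* run/stop */
#define VDMACR_CIRCULAR_PARK (1u << 1) /* cycle through the frame stores */

/* Hypothetical helper: 32-bit write at a byte offset from the base. */
static void vdma_write(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / 4u] = value;
}

/* Hypothetical helper: program and start the S2MM (write) channel. */
static void vdma_s2mm_start(volatile uint32_t *base,
                            const uint32_t *frame_addrs, size_t num_frames,
                            uint32_t width_bytes, uint32_t stride_bytes,
                            uint32_t lines)
{
    assert(base != NULL && frame_addrs != NULL);
    assert(num_frames >= 1 && stride_bytes >= width_bytes);

    /* 1. Set the run bit before programming the frame geometry. */
    vdma_write(base, S2MM_CONTROL, VDMACR_RS | VDMACR_CIRCULAR_PARK);

    /* 2. One start address per frame store, registers 4 bytes apart. */
    for (size_t i = 0; i < num_frames; i++)
        vdma_write(base, S2MM_FRAMEBUF1 + 4u * (uint32_t)i, frame_addrs[i]);

    /* 3. Line pitch and active line length, both in bytes. */
    vdma_write(base, S2MM_STRIDE, stride_bytes);
    vdma_write(base, S2MM_HSIZE, width_bytes);

    /* 4. VSIZE goes last: this write starts the transfer. */
    vdma_write(base, S2MM_VSIZE, lines);
}
```

For a 1920×1080 frame at 4 bytes per pixel, `width_bytes` and `stride_bytes` would both be 7680 and `lines` would be 1080. On real hardware `base` would come from your platform's memory mapping; in a unit test it can simply be a plain array.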
Xilinx TPG: Video Test Pattern Generator
The Video Test Pattern Generator creates synthetic video streams for system bring-up and debugging. I consider the Xilinx TPG mandatory in any video design, even if you plan to remove it in production. Being able to inject known patterns at any pipeline stage is invaluable for isolating problems.
Available Test Patterns
The TPG offers an impressive variety of patterns, each targeting specific verification needs:
Pattern ID | Pattern Type | Verification Purpose
0x00 | Pass Through | Verify input path
0x01 | Horizontal Ramp | DAC linearity
0x02 | Vertical Ramp | Line timing
0x09 | Color Bars | Color accuracy
0x0A | Zone Plate | Motion artifacts
0x0C | Cross Hatch | Geometry distortion
0x0F | Checker Board | Pixel alignment
0x10 | Pseudorandom | Compression testing
TPG Operating Modes
The TPG can operate in two distinct modes that affect how it generates timing:
Slave Mode (AXI4-Stream Input Enabled): The TPG uses incoming video timing from an upstream source. It can either pass through the input video or substitute test patterns while maintaining the original timing.
Master Mode (AXI4-Stream Input Disabled): The TPG generates its own timing based on register settings. This mode requires connection to a VTC for timing generation or internal timing configuration.
Configuring TPG for Common Resolutions
Runtime configuration happens through AXI4-Lite registers:
Register | Offset | Purpose
CONTROL | 0x00 | Enable/reset
ACTIVE_HEIGHT | 0x10 | Vertical active pixels
ACTIVE_WIDTH | 0x18 | Horizontal active pixels
BACKGROUND_PATTERN | 0x20 | Pattern selection
MOTION_SPEED | 0x38 | Moving box speed
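For 1080p output, set ACTIVE_HEIGHT to 1080 and ACTIVE_WIDTH to 1920. AXI4-Stream video carries only active pixels, so the TPG needs no blanking parameters; blanking is reintroduced downstream by the VTC and the AXI4-Stream to Video Out bridge.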
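A bare-metal configuration using the register offsets from the table might look like the sketch below. The helper name `tpg_start` is hypothetical, and the control bits (bit 0 to start, bit 7 to auto-restart every frame) are an assumption based on the usual HLS block-level control layout, so confirm them in PG103 for your TPG version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets from the table above. */
#define TPG_CONTROL            0x00u
#define TPG_ACTIVE_HEIGHT      0x10u
#define TPG_ACTIVE_WIDTH       0x18u
#define TPG_BACKGROUND_PATTERN 0x20u

/* Pattern ID from the pattern table above. */
#define TPG_PATTERN_COLOR_BARS 0x09u

/* Assumed HLS-style control bits; verify against PG103. */
#define TPG_CTRL_AP_START     (1u << 0)
#define TPG_CTRL_AUTO_RESTART (1u << 7)

/* Hypothetical helper: set frame size and pattern, then start the core. */
static void tpg_start(volatile uint32_t *base, uint32_t width,
                      uint32_t height, uint32_t pattern_id)
{
    assert(base != NULL);
    assert(width > 0 && height > 0);

    base[TPG_ACTIVE_HEIGHT / 4u] = height;
    base[TPG_ACTIVE_WIDTH / 4u] = width;
    base[TPG_BACKGROUND_PATTERN / 4u] = pattern_id;

    /* Start last, so the first frame already uses the new settings. */
    base[TPG_CONTROL / 4u] = TPG_CTRL_AP_START | TPG_CTRL_AUTO_RESTART;
}
```

Calling `tpg_start(base, 1920, 1080, TPG_PATTERN_COLOR_BARS)` would request 1080p color bars, with `base` pointing at the mapped TPG registers.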
Foreground Overlay Features
Beyond background patterns, the TPG can overlay foreground elements:
Overlay Type | Use Case
Moving Box | Motion blur assessment
Cross Hairs | Alignment verification
Color Box | Region identification
The moving box is particularly useful—its color indicates which video path is active when debugging multi-stream systems.
Xilinx VTC: Video Timing Controller Essentials
The Video Timing Controller serves as the timing reference for video systems. The Xilinx VTC can detect incoming video timing, generate output timing, or do both simultaneously. It’s the glue that synchronizes all the video IPs in your pipeline.
VTC Operational Modes
Mode | Function | Typical Use
Detector Only | Measures input timing | Video capture
Generator Only | Creates output timing | Video output
Both | Detect and generate | Pass-through with processing
Timing Parameters Explained
Video timing involves more than just resolution. The VTC manages all these parameters:
Parameter | Description
Active Width/Height | Visible pixel area
HSync Start/End | Horizontal sync pulse position
VSync Start/End | Vertical sync pulse position
HBlank Start/End | Horizontal blanking interval
VBlank Start/End | Vertical blanking interval
Total Width/Height | Including blanking
For 1080p60, the active area is 1920×1080, but total frame size is 2200×1125 including blanking. Getting these numbers wrong causes immediate visible problems—rolling, tearing, or complete loss of sync.
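To see where 2200×1125 comes from, it helps to derive the sync positions and totals from the porch and sync widths. The sketch below assumes the commonly cited CEA-861 values for 1080p60 (horizontal 88/44/148, vertical 4/5/36 for front porch, sync, back porch), which are not given in this article, and counts positions from the first active pixel or line. The struct and function names are hypothetical, and the VTC's own register conventions may differ, so check PG016 before copying numbers across.

```c
#include <assert.h>
#include <stdint.h>

/* One axis (horizontal or vertical) described by its four intervals. */
struct axis_timing {
    uint32_t active;      /* visible pixels or lines */
    uint32_t front_porch; /* blanking before the sync pulse */
    uint32_t sync_width;  /* sync pulse length */
    uint32_t back_porch;  /* blanking after the sync pulse */
};

/* Positions counted from the first active pixel or line. */
struct axis_derived {
    uint32_t sync_start;
    uint32_t sync_end;
    uint32_t total;
};

/* Hypothetical helper: derive sync positions and the total size. */
static struct axis_derived derive_axis(struct axis_timing t)
{
    struct axis_derived d;

    assert(t.active > 0);
    d.sync_start = t.active + t.front_porch;
    d.sync_end = d.sync_start + t.sync_width;
    d.total = d.sync_end + t.back_porch;
    return d;
}
```

With the assumed 1080p60 values, the horizontal axis gives sync from 2008 to 2052 and a total of 2200; the vertical axis gives sync from 1084 to 1089 and a total of 1125.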
The VTC generator typically connects to the AXI4-Stream to Video Out bridge. Two timing modes are supported:
Slave Timing Mode: The Video Out core controls VTC timing through clock enable. This provides lower latency and tighter synchronization with the incoming stream.
Master Timing Mode: The VTC free-runs independently. Phase relationship between stream and output depends on startup conditions.
I recommend slave mode for most designs. It automatically handles the phase alignment between your video stream and output timing.
Frame Sync Signals
The VTC generates up to 16 configurable frame sync outputs. These pulses trigger events at specific positions within each frame:
fsync[0] – Start of active video
fsync[1] – Custom user position
…
fsync[15] – Another custom position
Use frame syncs to trigger VDMA transfers, synchronize external hardware, or create interrupt events for software processing.
Building a Complete Video Pipeline
Now let’s connect these IPs into a working system. A typical capture-to-display pipeline looks like this:
Video Input → VTC (Detect) → Video In to AXI4-Stream
→ VDMA (S2MM) → DDR Memory → VDMA (MM2S)
→ AXI4-Stream to Video Out → VTC (Generate) → Video Output
Clock Domain Considerations
Video pipelines typically span multiple clock domains:
Domain | Typical Frequency | Components
AXI4-Lite | 100 MHz | Control registers
Video Stream | 150-300 MHz | VDMA data path
Pixel Clock | 148.5 MHz (1080p60) | VTC, video I/O
Memory Interface | 200+ MHz | DDR controller
The VDMA includes internal line buffers for clock domain crossing between stream and memory interfaces. Size these buffers appropriately—undersized buffers cause underrun/overflow errors.
Pipeline Debugging Strategy
When things don’t work, use this systematic approach:
1. Substitute TPG for input source: Verify output path works independently
2. Check VTC lock status: Confirm timing detection if using external video
3. Monitor VDMA status registers: Look for frame count increments and error flags
4. Verify buffer addresses: Ensure proper memory allocation and alignment
5. Check clock frequencies: Use debug probes to confirm actual clock rates
Common VDMA Status Register Errors
Error Bit | Meaning | Likely Cause
DMAIntErr | Internal DMA error | Configuration mismatch
DMASlvErr | Slave error | Memory access issue
DMADecErr | Decode error | Invalid address
SGIntErr | Scatter-gather error | Descriptor problem
SOFEarlyErr | Early start of frame | Timing mismatch
EOLLateErr | Late end of line | HSIZE incorrect
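When polling a channel's status register, a small decoder saves looking up bit positions each time. The sketch below is illustrative only: the bit positions are as I recall them from PG020 (bits 4 to 7 and bit 15) and should be verified for your core version, SGIntErr is left out because I am not certain of its position, and `vdma_decode_errors` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vdma_err {
    uint32_t mask;
    const char *name;
};

/* Assumed bit positions in MM2S_STATUS / S2MM_STATUS; verify in PG020. */
static const struct vdma_err vdma_errs[] = {
    { 1u << 4,  "DMAIntErr"   },
    { 1u << 5,  "DMASlvErr"   },
    { 1u << 6,  "DMADecErr"   },
    { 1u << 7,  "SOFEarlyErr" },
    { 1u << 15, "EOLLateErr"  },
};

/* Hypothetical helper: collect the names of all error bits set in status.
 * Stores up to max_names entries and returns how many were stored. */
static size_t vdma_decode_errors(uint32_t status, const char **names,
                                 size_t max_names)
{
    size_t count = 0;

    assert(names != NULL);
    for (size_t i = 0; i < sizeof vdma_errs / sizeof vdma_errs[0]; i++) {
        if ((status & vdma_errs[i].mask) != 0 && count < max_names)
            names[count++] = vdma_errs[i].name;
    }
    return count;
}
```

A status value with only the halted bit set decodes to zero errors, which is a quick way to distinguish "stopped" from "stopped because of a fault".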
Linux Driver Integration
For embedded Linux systems, Xilinx provides kernel drivers for these IP cores. The dmaengine driver xilinx_dma.c handles the VDMA, while the V4L2 drivers xilinx-tpg.c and xilinx-vtc.c manage their respective IPs.
Device Tree Configuration
Proper device tree bindings are essential:
vdma: dma@43000000 {
    compatible = "xlnx,axi-vdma-6.3";
    reg = <0x43000000 0x10000>;
    xlnx,num-fstores = <3>;
    xlnx,flush-fsync = <1>;
    dma-channel@43000030 {
        compatible = "xlnx,axi-vdma-s2mm-channel";
        xlnx,datawidth = <32>;
    };
};
tpg: v_tpg@43c10000 {
    compatible = "xlnx,v-tpg-8.0";
    reg = <0x43c10000 0x10000>;
    xlnx,max-height = <1080>;
    xlnx,max-width = <1920>;
};
V4L2 Control Interface
Configure TPG patterns at runtime using standard V4L2 tools:
# Set color bars pattern
v4l2-ctl -d /dev/v4l-subdev0 -c test_pattern=9
# Configure motion speed
v4l2-ctl -d /dev/v4l-subdev0 -c motion_speed=4
Useful Resources and Documentation
Official AMD/Xilinx Documentation
Document | Product Guide | Coverage
AXI VDMA | PG020 | Complete VDMA reference
Video Test Pattern Generator | PG103 | TPG configuration
Video Timing Controller | PG016 | VTC setup and timing
Video IP Reference Guide | UG934 | AXI4-Stream video protocol
Vivado AXI Reference | UG1037 | AXI interface specifications
Application Notes and Downloads
XAPP742: AXI VDMA Reference Design – Multi-stream video system example
XAPP1218: AXI VDMA for KC705 – Loopback configuration demo
XAPP1205: High-Performance Video on Zynq – Memory optimization techniques
Xilinx GitHub (embeddedsw): Bare-metal drivers and examples
Development Boards with Video Support
Board | Device | Video I/O
ZCU102 | Zynq UltraScale+ | HDMI, DisplayPort
KC705 | Kintex-7 | HDMI via FMC
Kria KV260 | Zynq UltraScale+ | MIPI, DisplayPort
ZYBO Z7 | Zynq-7000 | HDMI, VGA
Frequently Asked Questions
What’s the difference between VDMA and regular AXI DMA?
Regular AXI DMA handles one-dimensional data transfers—linear blocks of memory. The VDMA is specifically designed for two-dimensional video data, understanding concepts like frame width, height, and stride (the distance in bytes between the start of consecutive lines). VDMA also includes frame buffer management and Genlock synchronization features that standard DMA lacks.
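The role of stride is easiest to see in the address arithmetic. The sketch below uses a hypothetical helper, `pixel_offset`, and assumes a packed format with a whole number of bytes per pixel; it computes the byte offset of pixel (x, y) inside a frame buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of pixel (x, y) within a frame buffer.
 * stride_bytes is the distance between the starts of consecutive lines
 * and may be larger than the visible line length (padding for alignment). */
static uint32_t pixel_offset(uint32_t x, uint32_t y,
                             uint32_t stride_bytes, uint32_t bytes_per_pixel)
{
    /* The pixel must lie inside the line, not in the padding. */
    assert((x + 1u) * bytes_per_pixel <= stride_bytes);
    return y * stride_bytes + x * bytes_per_pixel;
}
```

For example, a 1920-pixel line at 4 bytes per pixel occupies 7680 bytes, but with the stride padded to 8192 the first pixel of line 1 sits at offset 8192 rather than 7680. A linear DMA has no notion of that gap; the VDMA skips it on every line.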
Can I use TPG in production designs or is it only for testing?
While TPG is primarily a development and debug tool, there are legitimate production uses. Some designs keep TPG for fallback patterns when input video is lost, for displaying test screens during manufacturing, or for creating synthetic backgrounds in overlay applications. However, note that evaluation licenses time out, so production use requires proper licensing.
How do I calculate the correct pixel clock for a given resolution?
Pixel clock depends on total frame size (including blanking) and frame rate. For 1080p60: Total pixels = 2200 × 1125 = 2,475,000 per frame. At 60 fps: 2,475,000 × 60 = 148.5 MHz. The VTC needs this clock to generate proper timing. Common clocks are 74.25 MHz (1080i60/720p60), 148.5 MHz (1080p60), and 297 MHz (4K30).
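The same calculation can be written as a small C function. The name `pixel_clock_hz` is hypothetical, and the 720p60 and 4K30 totals mentioned below (1650×750 and 4400×2250) are the standard values as I know them rather than figures from this article.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pixel clock in Hz from the total frame size
 * (active area plus blanking) and the frame rate. */
static uint64_t pixel_clock_hz(uint32_t h_total, uint32_t v_total,
                               uint32_t fps)
{
    assert(h_total > 0 && v_total > 0 && fps > 0);
    /* Widen before multiplying so large modes cannot overflow 32 bits. */
    return (uint64_t)h_total * v_total * fps;
}
```

`pixel_clock_hz(2200, 1125, 60)` gives 148,500,000 Hz, matching the 148.5 MHz figure; the assumed 720p60 and 4K30 totals give 74,250,000 Hz and 297,000,000 Hz respectively.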
Why does my VDMA transfer hang with no visible errors?
The most common cause is writing VSIZE before setting buffer addresses. VDMA starts the transfer when VSIZE is written, so buffer configuration must happen first. Also check that the stream clock is running—a stopped clock means VDMA waits forever. Use the park pointer register to verify which buffer VDMA is currently processing.
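Reading the park pointer register can be sketched as below. The offset (0x28) and field positions are as I recall them from PG020 and should be verified for your core version; `vdma_read_park` and the struct are hypothetical names, and `base` is assumed to point at the mapped VDMA registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed offset of the park pointer register; verify in PG020. */
#define VDMA_PARK_PTR_REG 0x28u

struct vdma_park_info {
    uint8_t rd_frame_ref;   /* bits 4:0   - MM2S park reference */
    uint8_t wr_frame_ref;   /* bits 12:8  - S2MM park reference */
    uint8_t rd_frame_store; /* bits 20:16 - frame MM2S is working on */
    uint8_t wr_frame_store; /* bits 28:24 - frame S2MM is working on */
};

/* Hypothetical helper: read and decode the park pointer register. */
static struct vdma_park_info vdma_read_park(const volatile uint32_t *base)
{
    struct vdma_park_info info;
    uint32_t reg;

    assert(base != NULL);
    reg = base[VDMA_PARK_PTR_REG / 4u];

    info.rd_frame_ref   = (uint8_t)(reg & 0x1Fu);
    info.wr_frame_ref   = (uint8_t)((reg >> 8) & 0x1Fu);
    info.rd_frame_store = (uint8_t)((reg >> 16) & 0x1Fu);
    info.wr_frame_store = (uint8_t)((reg >> 24) & 0x1Fu);
    return info;
}
```

If the frame store fields never change between two reads taken a few frames apart, the channel is not advancing, which points at a missing stream clock or a transfer that was never triggered.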
How many frame buffers should I configure for the VDMA?
Three buffers (triple buffering) is the standard choice for most applications. It guarantees a complete frame is always available for display while another is being written. Two buffers can work but risk tearing artifacts. More than three buffers increase latency without proportional benefit for most real-time video applications.
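Laying out the buffers in memory is simple arithmetic, sketched below. The helper `layout_frame_buffers` is hypothetical, as are the base address and the 1 MiB alignment used in the example; pick whatever alignment your memory system and data width actually require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: compute start addresses for num_frames buffers
 * placed back to back, each rounded up to a power-of-two alignment.
 * Returns the total number of bytes reserved. */
static uint32_t layout_frame_buffers(uint32_t base_addr, uint32_t stride_bytes,
                                     uint32_t lines, uint32_t alignment,
                                     uint32_t *addrs, size_t num_frames)
{
    uint32_t frame_bytes, slot_bytes;

    assert(addrs != NULL && num_frames >= 1);
    /* Alignment must be a non-zero power of two for the mask trick. */
    assert(alignment != 0 && (alignment & (alignment - 1u)) == 0);

    frame_bytes = stride_bytes * lines;
    slot_bytes = (frame_bytes + alignment - 1u) & ~(alignment - 1u);

    for (size_t i = 0; i < num_frames; i++)
        addrs[i] = base_addr + (uint32_t)i * slot_bytes;

    return slot_bytes * (uint32_t)num_frames;
}
```

A 1080p frame at 4 bytes per pixel needs 8,294,400 bytes, so with 1 MiB alignment each slot is 8 MiB and three buffers reserve 24 MiB. That footprint is worth checking against your reserved-memory region before blaming the VDMA for corrupted frames.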
Wrapping Up
The Xilinx VDMA, TPG, and VTC form a well-integrated video processing foundation. Understanding how each IP contributes to the overall pipeline—memory buffering, test pattern generation, and timing control—helps you design robust video systems and troubleshoot issues efficiently. Start with simple configurations, verify each stage independently, and build complexity gradually. The investment in understanding these cores pays dividends across all your video FPGA projects.