DigitalCold's ARM Challenge FPGA Implementation
DigitalCold converted FreeFull's entry to our ARM challenge and made it run on an FPGA. The hardware used is a Digilent BASYS 2 FPGA development board: Digilent's BASYS 2 Website.
The signal from the 50 MHz clock runs through the FPGA, which drives 10 output signals to the VGA port. Two of those wires are connected directly to the Horizontal and Vertical Sync pins. The other 8 are divided into 3 groups: three each for red and green, and two for blue. Each group is passed through a DAC (digital-to-analog converter), which turns the digital value into a voltage between 0V and 0.7V that is then routed to the respective pin on the connector. The other 10 pins on a standard VGA connector (DE-15) are either not connected or connected to ground. Below the break DigitalCold explains it in more detail.
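To make the color path concrete, here is a minimal Python sketch of what each DAC does with its bits. It assumes the DACs are linear, so the lowest code gives 0V and the highest code gives 0.7V; the function names are made up for illustration and are not part of DigitalCold's project.

```python
# Software model of the three color DACs (3 bits red, 3 bits green, 2 bits blue).
# Assumption: each DAC is linear between 0 V (code 0) and 0.7 V (maximum code).

FULL_SCALE_VOLTS = 0.7


def dac_volts(code: int, bits: int) -> float:
    """Return the analog level for a digital code on a DAC of the given width."""
    max_code = (1 << bits) - 1
    if not 0 <= code <= max_code:
        raise ValueError(f"code {code} does not fit in {bits} bits")
    return FULL_SCALE_VOLTS * (code / max_code)


def channel_volts(red: int, green: int, blue: int) -> tuple:
    """Return the (red, green, blue) voltages for 3-bit, 3-bit and 2-bit codes."""
    return dac_volts(red, 3), dac_volts(green, 3), dac_volts(blue, 2)


if __name__ == "__main__":
    for rgb in [(0, 0, 0), (7, 0, 0), (3, 3, 1), (7, 7, 3)]:
        print(rgb, channel_volts(*rgb))
```

With only 2 bits, blue has 4 possible levels, while red and green have 8 each.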
Technical Details (by DigitalCold)
Here is a still showing the demo running off of the FPGA. The laptop is running the original ARM version in the top left and a C (SDL driven) version in the bottom right:
Another picture showing a close up of the BASYS 2 board. Notice that there is no external oscillator installed in the 8 pin DIP slot:
Additional Pictures
Nearly a year ago, I was impressed by FreeFull's ARM demo entry. The code was so simple, yet it created such a nice effect. It only made sense that I started with this demo.
Recently, I acquired a BASYS 2 board for cheap (~$60 with the academic discount). The part that caught my interest the most was the VGA output. I already had experience dealing with raw VGA from the ARM competition, so I wondered how difficult it would be to translate that knowledge over to an FPGA. After a few months of casual tinkering, distractions (school), and research, I finally got a working VGA driver.
Now all I needed was a cool, but simple, graphics demo to go with it. Sure, I could have made something from scratch, but I knew that my inexperience with FPGA programming could lead to a lot of frustration if I just tried to jump straight into an idea. This is where FreeFull's demo came in. It already works, the source is available, and it meets the requirements, which are discussed below.
Hardware Discussion
The BASYS 2 FPGA development board is an entry level board sporting an older Xilinx FPGA, the Spartan 3E. It's one of the most affordable FPGA development boards out there with VGA output. There is one caveat: there isn't enough RAM on board to allow for a framebuffer much greater than 100x80 pixels (8KB @ 8 bits per pixel). This makes VGA programming quite hard. Think about the freedoms a framebuffer gives you:
- You are free to write anywhere in the framebuffer at (usually) any time
- Whatever you write into a framebuffer stays there (ignoring double buffering and screen clearing)
- Once you have access to the framebuffer, you no longer need to know about the low-level VGA timing information. You only need the width, depth (height), and bit width of the FB
Take all those away and what do you have? Vicious timing requirements. Without a framebuffer, you have no memory of the VGA output. A framebuffer allows you to act asynchronously from the VGA driver; without one, you must act synchronously with the display. This creates challenges when it comes to meeting timing constraints. Basically, you have to calculate the data for one pixel in the time before the next pixel is due. Not having a framebuffer also means your pixel state is lost the moment it gets written to the screen.
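A quick back-of-the-envelope sketch shows why this matters. It compares the memory needed for the 100x80 framebuffer mentioned above with a full 640x480 one, and works out the time available per pixel, assuming the 25 MHz pixel clock this implementation ends up using. The function names are illustrative only.

```python
# Rough numbers behind the "no framebuffer" constraint.

def framebuffer_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Memory needed to store one full frame, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


def pixel_period_ns(pixel_clock_hz: float) -> float:
    """Time between consecutive pixels, in nanoseconds."""
    return 1e9 / pixel_clock_hz


if __name__ == "__main__":
    print("100x80  @ 8 bpp:", framebuffer_bytes(100, 80, 8), "bytes")
    print("640x480 @ 8 bpp:", framebuffer_bytes(640, 480, 8), "bytes")
    # Assumption: a 25 MHz pixel clock.
    print("time per pixel :", pixel_period_ns(25_000_000), "ns")
```

A full 640x480 frame at 8 bits per pixel needs 307,200 bytes, far more than the roughly 8KB available, and each pixel has to be produced within 40 ns.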
VGA Driver: Theory
A VGA driver isn't actually as complicated as you might think. All it really does is maintain counters for the X and Y positions. Based on these counters, it knows when to send synchronization pulses to the display and when to actually display pixels. Here is a timing diagram of the process (not to scale):
Here are the corresponding 640x480@60Hz VGA timing tables from tinyvga.com:
Vertical Region | Scanlines |
---|---|
Front Porch | 10 |
VSync Width | 2 |
Back Porch | 33 |
Display Area | 480 |
Frame Height | 525 |
Horizontal Region | Pixels |
---|---|
Front Porch | 16 |
HSync Width | 96 |
Back Porch | 48 |
Display Area | 640 |
Scanline Width | 800 |
Each of the intervals shown has different timings depending on the resolution and refresh rate you decide to display at. In the demo's case, it uses a modest 640x480 pixel display running at 60 Hz. The timing for this mode revolves around one special clock value: 25.175 MHz. At this frequency, each clock pulse represents exactly one pixel. From now on I will refer to this as the pixel clock. In my implementation, I actually use a 25 MHz clock instead (it is easier to derive from the board's 50 MHz oscillator).
Vertical timing is measured in scanlines and horizontal timing in pixels. In the vertical front porch, there are 10 scanlines or 8000 pixels. The usage of "pixels" is contextually dependent when talking about VGA. VGA is all about timing, so when talking about pixels not in the display area, we are really referring to a unit of time.
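The counter idea can be sketched as a small Python model that uses the numbers from the tables above. This is not the Verilog driver itself. It assumes the counters start at the top-left of the display area (so the order is display, front porch, sync, back porch), and it reports only whether a sync pulse is active, not its electrical polarity.

```python
# Software model of the X/Y counters for 640x480@60Hz.
# Assumption: counting starts at the first visible pixel of the first visible line.

H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 pixel clocks per scanline
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525 scanlines per frame


def vga_state(tick: int):
    """Return (x, y, hsync_active, vsync_active, visible) for a pixel-clock tick."""
    x = tick % H_TOTAL
    y = (tick // H_TOTAL) % V_TOTAL
    hsync_active = H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC
    vsync_active = V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC
    visible = x < H_VISIBLE and y < V_VISIBLE
    return x, y, hsync_active, vsync_active, visible


if __name__ == "__main__":
    print("ticks per frame:", H_TOTAL * V_TOTAL)
    print("vertical front porch in pixel clocks:", V_FRONT * H_TOTAL)
    print("first tick of HSync:", vga_state(H_VISIBLE + H_FRONT))
```

In hardware the same logic is two counters and a handful of comparators; the modulo arithmetic here only stands in for the counters wrapping around.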
Knowing the total horizontal and vertical lengths, the refresh rate is defined as:
refresh rate = pixel clock / (scanline width × frame height)
We can easily calculate the VGA driver's refresh rate: 25,000,000 / (800 × 525) ≈ 59.52 Hz. This is pretty close to 60 Hz. Even though the 640x480@60Hz specification states a 25.175 MHz clock, my test monitor doesn't seem to mind.
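The same arithmetic as a short Python check, comparing the 25 MHz clock used here with the nominal 25.175 MHz clock. The function name is just for illustration.

```python
# Refresh rate = pixel clock / (scanline width * frame height).

def refresh_rate_hz(pixel_clock_hz: float,
                    scanline_width: int = 800,
                    frame_height: int = 525) -> float:
    """Frames per second for a given pixel clock and total frame size."""
    return pixel_clock_hz / (scanline_width * frame_height)


if __name__ == "__main__":
    print(round(refresh_rate_hz(25_000_000), 2))  # prints 59.52
    print(round(refresh_rate_hz(25_175_000), 2))  # prints 59.94
```

The 25 MHz clock runs the display about 0.7% slower than the nominal timing.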
VGA Driver: Implementation
Here is a block diagram showing all of the inputs and outputs for the VGA driver Verilog module:
Additional interface signals have been omitted for clarity. Here is a table listing all of the vga_driver's signals:
Signal | I/O | Note |
---|---|---|
CLK_50MHz | Input | CLK from FPGA resource |
RED[2:0] | Output (VGA) | DAC Red input |
GREEN[2:0] | Output (VGA) | DAC Green input |
BLUE[1:0] | Output (VGA) | DAC Blue input |
VS | Output (VGA) | VSync |
HS | Output (VGA) | HSync |
CLK_DATA | Output (Interface) | Asserted when pixel data is ready to be read in |
COLOR_DATA[7:0] | Input (Interface) | Read on CLK_DATA |
CURX[9:0] | Output (Interface) | Export of the current X position |
CURY[8:0] | Output (Interface) | Export of the current Y position |
VBLANK | Output (Interface) | New frame (posedge) |
HBLANK | Output (Interface) | New scanline (posedge) |
The VGA driver abstracts away all the nasty timing details and provides a relatively clean and simple interface for clients to use. All they have to do is provide pixel data at the positive edge of CLK_DATA and it will display! No questions asked.
The actual demo code is in the top module of the Verilog project. It uses this simple interface to generate its pixel data and send it straight to the driver.
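To give a feel for what a client of this interface does, here is a hedged Python sketch of a per-pixel function. Everything in it is an assumption made for illustration: the RRRGGGBB bit order of COLOR_DATA, the `frame` counter (imagined as incrementing on each VBLANK), and the pattern itself, which is a stand-in and not FreeFull's effect.

```python
# Hypothetical client of the driver interface: one color byte per pixel,
# computed only from the current position and a frame counter (no framebuffer).

def pack_rgb332(red: int, green: int, blue: int) -> int:
    """Pack 3-bit red, 3-bit green, 2-bit blue into a byte (assumed RRRGGGBB)."""
    return ((red & 0b111) << 5) | ((green & 0b111) << 2) | (blue & 0b11)


def color_data(curx: int, cury: int, frame: int) -> int:
    """Stand-in pattern: XOR texture in red, scrolling bands in green and blue."""
    red = ((curx ^ cury) >> 3) & 0b111
    green = ((cury + frame) >> 4) & 0b111
    blue = ((curx + frame) >> 6) & 0b11
    return pack_rgb332(red, green, blue)


def render_scanline(cury: int, frame: int, width: int = 640) -> bytes:
    """Produce one visible scanline, one byte per CLK_DATA pulse."""
    return bytes(color_data(curx, cury, frame) for curx in range(width))


if __name__ == "__main__":
    line = render_scanline(cury=100, frame=0)
    print(len(line), "bytes, first eight:", list(line[:8]))
```

The point of the sketch is the shape of the computation: each pixel depends only on CURX, CURY, and a small amount of state, which is what makes a framebuffer-free demo possible.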
VGA Clock Stability
One of the out-of-box issues with the BASYS 2 board is its built-in oscillator. The oscillator produces a noisy clock signal, noisy enough to affect timing-critical circuits. VGA, being very timing critical, shows these effects in the form of edge tearing: every scanline appears to vibrate in a quite distracting manner. Thankfully, Digilent knew that this would be a problem and included an extra slot for an external oscillator. They also talk about it (briefly) in the BASYS 2 reference manual and even recommend a part number! Unfortunately, the recommended part, SG-8002JF-PCC, is surface mount (which is WRONG); it should be an 8-pin DIP, 4-lead part (SGR-8002DC-PCC) instead.
Project Migration Considerations
This project is written in Verilog HDL and aims to be as portable as possible. The only things that must be considered when moving to a different device and development environment are the constraints.
- Clock constraints: the design expects a 50 MHz clock input coming from the designated BASYS 2 global clock input. This clock is then divided down to 25MHz to create the pixel clock. Considerations for migration include: input frequency and clock input location (pin constraint). If your design uses an external oscillator, be sure to update the UCF file to point to this location. If your clock does not match 50MHz, clock management logic will have to be added.
- Pin constraints: the project expects certain inputs and outputs to be located on specific pin names. These vary wildly from device to device, especially when it comes to package type (TQ144 vs CP132). Read the corresponding datasheet for your platform to complete migration.
- Timing constraints: as stated before, the method used to generate pixel data is timing critical. The implementation on the Spartan 3E for the BASYS 2 board, built with Xilinx's ISE 14.4, meets timing constraints. These constraints determine whether the longest delay path goes beyond one clock cycle. If this constraint is not met, data will not reach the VGA driver in time for drawing. In your development environment, be sure to enable timing constraints so that you are notified if the synthesized design fails to meet timing.
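The 50 MHz to 25 MHz step from the clock constraints above can be modeled in a few lines. This sketch assumes the division is done with a simple toggle flip-flop, which is one common approach; the article does not say how the actual project divides the clock, and the function names are illustrative.

```python
# Model of a divide-by-two clock: a flip-flop that toggles on every rising
# edge of the input clock. Assumption: this is how the pixel clock is derived.

def divide_by_two(rising_edges: int) -> list:
    """Return the divided clock level after each input rising edge."""
    level = 0
    levels = []
    for _ in range(rising_edges):
        level ^= 1  # toggle once per input rising edge
        levels.append(level)
    return levels


def output_frequency_hz(input_hz: float) -> float:
    """One full output period (high then low) spans two input periods."""
    return input_hz / 2


if __name__ == "__main__":
    print(divide_by_two(8))
    print(output_frequency_hz(50_000_000), "Hz")
```

If the input clock on your board is not 50 MHz, this simple halving no longer gives 25 MHz, which is why the clock constraint calls for additional clock management logic in that case.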
Deliverables
- C Version (SDL Library Required, .c)
- Xilinx ISE 14.4 Project (.zip)
- Raw Verilog Files (.zip)
- Raw Bitfile (programming file, .bit)