TAPA RTL Simulation

While the TAPA-generated .xo can be used for cosim with the Vitis tool chain, this step is quite slow. The Vitis cosim employs a set of complicated simulation models for external IPs (e.g., AXI interconnect, HBM, DDR) to best approximate the actual running environment. Setting up the cosim for a basic vector-add application takes more than 10 minutes in Vitis, even though the actual simulation only takes a few seconds.

To address this long turnaround time, TAPA provides a lightweight simulation methodology. Instead of using sophisticated simulation models, we use very basic objects that can be set up instantly. For example, we use a plain buffer wrapped in an AXI interface to mimic an external DRAM. In this way, we can start the simulation in just a few seconds.

While the internals of our simulation objects are not as accurate, they share the exact same interface (the AXI protocol) as their Vitis counterparts. Therefore, they are more than enough to filter out most logic errors in the user logic. After fixing the basic functional bugs with the TAPA fast cosim, users can always run another pass of Vitis cosim if they want a more realistic simulation (e.g., latency and bandwidth similar to real devices).
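To make the "plain buffer" idea concrete, here is a minimal software analogy in C++. It is only a sketch and not TAPA's actual simulation object: the class name PlainBufferMemory and its methods are hypothetical, and a real model would also speak the AXI handshake. The point is that a flat array answering reads and writes needs no setup beyond allocating the storage.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Conceptual stand-in for an external DRAM: a flat byte array that answers
// burst reads and writes immediately. Hypothetical; not TAPA's actual model.
class PlainBufferMemory {
 public:
  explicit PlainBufferMemory(std::size_t size_bytes) : data_(size_bytes, 0) {}

  // Handles a write burst: stores `payload` starting at byte address `addr`.
  void WriteBurst(std::size_t addr, const std::vector<std::uint8_t>& payload) {
    CheckRange(addr, payload.size());
    std::copy(payload.begin(), payload.end(),
              data_.begin() + static_cast<std::ptrdiff_t>(addr));
  }

  // Handles a read burst: returns `len` bytes starting at byte address `addr`.
  std::vector<std::uint8_t> ReadBurst(std::size_t addr, std::size_t len) const {
    CheckRange(addr, len);
    const auto first = data_.begin() + static_cast<std::ptrdiff_t>(addr);
    return std::vector<std::uint8_t>(first,
                                     first + static_cast<std::ptrdiff_t>(len));
  }

 private:
  // Rejects accesses outside the buffer; written to avoid integer overflow.
  void CheckRange(std::size_t addr, std::size_t len) const {
    if (addr > data_.size() || len > data_.size() - addr) {
      throw std::out_of_range("access outside the simulated memory");
    }
  }

  std::vector<std::uint8_t> data_;
};
```

Such a model has no notion of latency, bandwidth limits, or bank conflicts, which is why it catches functional bugs quickly but cannot replace a Vitis cosim pass for performance questions.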

https://user-images.githubusercontent.com/32432619/164995378-a5d1ea4b-a673-42ef-9f9d-4e0dcc9ce527.png

Basic Usage

The TAPA fast cosim and the Vitis cosim can be used interchangeably with the same host program. If you provide the .xo object from tapac, the TAPA fast cosim is invoked. If you provide the .xclbin object generated by Vitis with the hw_emu target, the Vitis simulation is invoked. Finally, if you provide the .xclbin object generated by Vitis with the hw target, the design is executed on board. Take the vector-add design for example:

To run TAPA fast cosim:

./vadd --bitstream VecAdd.xo ${DATA_SIZE}

To run Vitis cosim:

./vadd --bitstream VecAdd_hw_emu.xclbin ${DATA_SIZE}

To run on-board execution:

./vadd --bitstream VecAdd_hw.xclbin ${DATA_SIZE}
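The three commands above differ only in the file passed to --bitstream. The sketch below summarizes that mapping as a small C++ helper. It is purely illustrative: RunMode and GuessRunMode are hypothetical names, and the helper guesses from the file name only, following the naming used in the vector-add example. It makes no claim about how TAPA itself inspects the file.

```cpp
#include <string>

// Hypothetical summary of the three execution modes described above.
enum class RunMode { kFastCosim, kVitisCosim, kOnBoard, kUnknown };

// Returns true if `s` ends with `suffix`.
bool EndsWith(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Guesses the mode from the file name alone. Assumes the naming convention of
// the vector-add example (*_hw_emu.xclbin, *_hw.xclbin); other names map to
// kUnknown.
RunMode GuessRunMode(const std::string& bitstream) {
  if (EndsWith(bitstream, ".xo")) return RunMode::kFastCosim;
  if (EndsWith(bitstream, "_hw_emu.xclbin")) return RunMode::kVitisCosim;
  if (EndsWith(bitstream, "_hw.xclbin")) return RunMode::kOnBoard;
  return RunMode::kUnknown;
}
```

For instance, GuessRunMode("VecAdd.xo") yields RunMode::kFastCosim. A helper like this could be used by a wrapper script or test harness to label log output. The host program itself needs no such logic, since the same binary handles all three cases.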

View Waveform

If you run ./vadd --help, you will see two options related to the TAPA fast cosim:

  • ./vadd -xosim_work_dir xxx will save all intermediate data and files in the given work dir.

  • ./vadd -xosim_save_waveform will save all waveforms into a .wdb file in the work dir. The waveform is kept only if the work dir is also specified, because otherwise the work dir itself is not kept.

Simulation Frozen

Very often, the simulation gets stuck forever (e.g., due to a deadlock). In this case, you can debug your program as follows:

  • Use the -xosim_work_dir option to save all intermediate files.

  • Press Ctrl-C to abort the simulation if it is stuck.

  • Look for [work-dir]/output/run/run_cosim.tcl, which is the script passed to Vivado for simulation.

  • Run the script in the Vivado GUI: vivado -mode gui -source run_cosim.tcl. This opens the Vivado window, where you can watch the progress of the simulation in real time. You can stop the simulation partway through and inspect the waveform.

Tips

Do not forget to add the option parsing code to the top of the main function in your host program.

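#include <gflags/gflags.h>

int main(int argc, char* argv[]) {
  // Parse the command-line options (e.g., --bitstream, -xosim_work_dir) first.
  gflags::ParseCommandLineFlags(&argc, &argv, /*remove_flags=*/true);

  // ...
}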

Limitation

Currently, we do not support cross-channel access for HBM. In other words, each AXI interface can access only one HBM channel.