Setup & Execution for OpenCL and XRT Data Transfer Benchmarks on Xilinx FPGA

Introduction

This test suite benchmarks the bandwidth and latency of data transfers between the CPU and the FPGA's global memory. It focuses on the transfer speeds achieved by different APIs (the native XRT API and OpenCL/OCL) between the host and the FPGA board.

Below, you will find setup and execution instructions to reproduce my results.


Setup

Local Install

Operating System

The emulations were run on both CentOS 7.9.2009 and CentOS 8.4.2105. Their respective disk image (.iso) files can be found in the CentOS vault provided by Internet Initiative Japan, linked here:

https://ftp.iij.ad.jp/pub/linux/centos-vault/

If you plan to install it locally, you will need to download the disk image and then flash it onto a USB drive with at least 16 GB of storage to create bootable media. To flash it, I recommend Rufus: https://rufus.ie/.

Next, reboot your computer and enter the BIOS. From there, select the flashed USB drive as the boot device, then follow the on-screen instructions to install CentOS. Here is a useful tutorial to help visualize the steps: https://www.youtube.com/watch?v=4viTo4gulQk

Software and Tools

If you plan to run the tests locally, the environment needs to be set up before the tests can be run. Keep the software versions consistent across the following tools; for my emulation and hardware tests, the Xilinx tools version used was 2021.2.

The needed software is listed below:

  1. Vivado and Vitis. For a simpler installation, use the Linux Self-Extracting Web Installer or the Single-File installer, then follow the instructions from the installer.

    Vivado: https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vivado-design-tools.html

    Vitis: https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vitis.html

    The default installation of these files is in the /opt/Xilinx/ folder. If they are not placed there automatically, you will need to copy them over manually.

  2. Xilinx Runtime (XRT) and Platform Packages

    The Alveo U200 was used as the target platform for emulation. However, any of the supported platforms found at https://xilinx.github.io/Vitis_Accel_Examples/2021.2/html/shells.html can be used.

    Xilinx Runtime (XRT) & Deployment Target Platform: https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/alveo.html

    The above will provide you with 2 *.rpm files. Ensure that you have admin access and then run the command sudo yum install *.rpm for each .rpm file.

    The default installation of XRT is in the /opt/xilinx/ folder, and the target platform can be found under /opt/xilinx/platforms/. If they are not placed there automatically, you will need to copy them over manually.

  3. Ensure that Python >= 2.7.5 is installed by running python --version.

    You will also need Python 3.6+ in order to run the scripts that parse the logs and graph the data.

    For CentOS 7, Python 2.7.5 is installed by default; however, that is not the case for CentOS 8. The latest Python version available for CentOS 8 is Python 3.9.6; to install it, run sudo yum install python39.

AWS Server

If you plan to run the tests on an AWS EC2 F1 Instance, you will need:

  1. Setup AWS & AWS Permissions

    Assuming that you already have an AWS account, you will need to:

    1. Set up AWS IAM permissions for creating FPGA images. Navigate to IAM > Policies > Create policy.

      Select Service EC2, check the permissions CreateFpgaImage and DescribeFpgaImages, and for Resources, select All, then click Next. Name the policy and then create it.

      Next, go to Users or User groups and attach the policy to your user or to a group you belong to.

    2. Set up an S3 bucket for the compiled target *.awsxclbin. First, install the AWS CLI. Then:

      Set your credentials (found on your console.aws.amazon.com page), region (us-east-1), and output format (json):

      aws configure
      

      Create an S3 bucket (choose a unique bucket name)

      aws s3 mb s3://<bucket-name> --region us-east-1
      

      Create a temporary file used to create folders in the S3 bucket.

      touch temp.txt
      

      Choose a Design Checkpoint (dcp) folder name.

      aws s3 cp temp.txt s3://<bucket-name>/<dcp-folder-name>/
      

      Choose a logs folder name.

      aws s3 cp temp.txt s3://<bucket-name>/<logs-folder-name>/
      
  2. FPGA Developer AMI: https://aws.amazon.com/marketplace/pp/prodview-gimv3gqbpe57k. Subscribe to this software and continue to configuration.

    From here, choose software version 1.12.2 (Jan 31, 2023) in order to use the 2021.2 version of the tools. Make sure that the region is set to US East (N. Virginia) as that is the closest location that supports EC2 F1 instances.

    Next, continue to launch. In the Choose Action box, select Launch through EC2. If you are planning to run emulations, choose the z1d.2xlarge instance; if you are planning to run on hardware (FPGA), choose the f1.2xlarge instance.

    In the Configure storage block, click Advanced, expand Volume 2 and change Delete on termination to Yes. You can now click Launch instance.

    From here, assuming AWS permissions are set up correctly, you will need to set up an Elastic IP for this EC2 instance. First, click Allocate Elastic IP address, and then click Allocate. Next, under Actions, associate the address with the newly created instance and its private IP address.

    Next, edit your Security Group's Inbound rules: create a new rule for SSH with source My IP, then click Save rules to whitelist your IP for access to the instance.

    Now, to connect to the EC2 instance through SSH, perform the command

    ssh -i "<path>/<to>/<private>/<key>" centos@<public ip address>
    
  3. Setup EC2 Instance

    1. Patch Outdated CentOS Mirror List

      sudo sed -i s/mirror.centos.org/vault.centos.org/g /etc/yum.repos.d/*.repo
      sudo sed -i s/^#.*baseurl=http/baseurl=http/g /etc/yum.repos.d/*.repo
      sudo sed -i s/^mirrorlist=http/#mirrorlist=http/g /etc/yum.repos.d/*.repo
      

      NOTE: if the caret characters above do not copy correctly (LaTeX formats the caret specially), copy the commands from https://serverfault.com/a/1161847.

      Once updated, refresh the cache.

      yum clean all ; yum makecache
      
    2. Setup Vitis

      git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR
      cd $AWS_FPGA_REPO_DIR
      source vitis_setup.sh
      
    3. Set up XRT. Replace the release tag below with the version for your tools, found in the table at https://github.com/aws/aws-fpga/blob/master/Vitis/docs/XRT_installation_instructions.md; the value shown is correct for version 2021.2.

      XRT_RELEASE_TAG=202120.2.12.427
      
      cd aws-fpga
      source vitis_setup.sh
      cd $VITIS_DIR/Runtime
      export XRT_PATH="${VITIS_DIR}/Runtime/${XRT_RELEASE_TAG}"
      git clone http://www.github.com/Xilinx/XRT.git -b ${XRT_RELEASE_TAG} ${XRT_PATH}
      
      cd ${XRT_PATH}
      sudo ./src/runtime_src/tools/scripts/xrtdeps.sh
      
      cd build
      scl enable devtoolset-9 bash
      ./build.sh
      
      cd Release
      sudo yum reinstall xrt_*-aws.rpm -y
      

      If the above fails, you may need to refresh the cache and then rerun the commands above.

      yum clean all ; yum makecache
      

NOTE: you will need to source the following two scripts before you compile or run anything.

source /home/centos/src/project_data/aws-fpga/vitis_setup.sh
source /home/centos/src/project_data/aws-fpga/vitis_runtime_setup.sh

The value to use for PLATFORM is $AWS_PLATFORM.

Now that the environment is set up, refer to EXECUTION for steps to run the tests.


Execution

Before you can run anything, you will need to clone the repository, in which you will find three folders: local_emu, aws_emu, and aws_hw. Note that emulation will take a long time for larger data sizes.

If your Python installation is not in a standard location, you will need to export python=path/to/python so that the bash scripts can run the Python scripts.

These tests are set up so that ./src/template contains 6 designs, each testing one type of data transfer action. Each design performs the same vector-add function: two identical vectors of variable size are added, and the result is returned and then verified. The size of the data transferred (the vector length) is incremented by powers of 2, from the smallest possible value to the largest possible value.

Three parameters/variables are needed (see the kernel sketch after this list):

  1. Vector A: The first vector to be added.
  2. Vector B: The second vector to be added.
  3. Size: The number of elements in each vector, passed so the FPGA kernel knows how much data to process for the vector addition.
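
To make the interface concrete, here is a minimal sketch of what such a vector-add kernel could look like. This is an illustration written for this README rather than the repository's actual source; the function name vadd and the exact argument layout are assumptions.

    // Illustrative sketch only -- not the repository's exact kernel source.
    // A minimal Vitis-style vector-add kernel taking the parameters listed
    // above, plus an output buffer that the host reads back and verifies.
    extern "C" void vadd(const int *a,   // Vector A in global memory
                         const int *b,   // Vector B in global memory
                         int *out,       // result buffer in global memory
                         int size) {     // number of elements in each vector
        for (int i = 0; i < size; ++i) {
            out[i] = a[i] + b[i];
        }
    }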

In a typical test run (./src/run_compile.sh), each design folder is copied to ./src/$PROJECT, where it is compiled for the target platform and run. The compilation stage is run asynchronously for efficiency, while each execution is run individually to record maximum performance. As the designs are compiled and run, information about the run is printed out, and everything is logged to ./logs, one log per design (6 logs per test run, minus any designs that are not supported or were not run).
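
For reference, the two host-to-global-memory transfer paths being compared look roughly like the following on the host side. This is a hedged sketch, not the repository's code: the function names, buffer names, and the kernel handle krnl are assumptions, and only the host-to-device copy is shown.

    // Illustrative sketch only -- not the repository's code.
    #define CL_HPP_MINIMUM_OPENCL_VERSION 120
    #define CL_HPP_TARGET_OPENCL_VERSION 120
    #include <CL/cl2.hpp>        // OpenCL C++ wrapper
    #include <xrt/xrt_bo.h>      // XRT native C++ API
    #include <xrt/xrt_device.h>
    #include <xrt/xrt_kernel.h>
    #include <vector>

    // OpenCL path: wrap host memory in a cl::Buffer and migrate it to the device.
    void opencl_host_to_gmem(cl::Context &ctx, cl::CommandQueue &q, std::vector<int> &host_a) {
        cl::Buffer buf_a(ctx, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR,
                         host_a.size() * sizeof(int), host_a.data());
        q.enqueueMigrateMemObjects({buf_a}, 0 /* 0 = migrate host -> device */);
        q.finish();  // wait for the transfer to complete
    }

    // XRT native path: allocate a buffer object in the kernel's memory bank,
    // copy the host data into it, and sync it to the device.
    void xrt_host_to_gmem(xrt::device &dev, xrt::kernel &krnl, std::vector<int> &host_a) {
        xrt::bo bo_a(dev, host_a.size() * sizeof(int), krnl.group_id(0));
        bo_a.write(host_a.data());
        bo_a.sync(XCL_BO_SYNC_BO_TO_DEVICE);
    }

The repository's designs presumably time spans like these for each data size; consult the sources in ./src/template for the exact calls used.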

From here, to gather information (data transfer size and elapsed time) from the test runs, run the script ./src/parse_logs.py, which creates .csv files in ./logs_data with the needed information. To graph the .csv files, run ./src/graph_data.py, which plots the transfer size on the x-axis against the elapsed time on the y-axis on a log10 scale.

Local Emulation

  1. Navigate to the local_emu folder.
    cd local_emu
    
  2. Change the PLATFORM variable in src/run_compile.sh to the target platform found in /opt/xilinx/platforms/ (the folder name of the platform). Save and exit.
  3. Change the data_sizes variable in src/update_data_size.py to the data size values you want. Each value is the number of elements in the vector to transfer. For example, data_sizes = [16, 32] will transfer 16 integers in the first test run and 32 integers in the next. Note that each element is a 4-byte integer, so the total bytes sent equals the number of integers multiplied by 4. Save and exit.
  4. Go into the src folder and run run_compile.sh.
    cd src; sh run_compile.sh
    

AWS EC2 Z1D Emulation

  1. Navigate to the aws_emu folder.
    cd aws_emu
    
  2. Change the data_sizes variable in src/update_data_size.py to the data size values you want. Each value is the number of elements in the vector to transfer. For example, data_sizes = [16, 32] will transfer 16 integers in the first test run and 32 integers in the next. Note that each element is a 4-byte integer, so the total bytes sent equals the number of integers multiplied by 4. Save and exit.
  3. Go into the src folder and run run_compile.sh.
    cd src; sh run_compile.sh
    

AWS EC2 F1 Hardware

  1. Start on a z1d.2xlarge instance, and navigate to the aws_hw folder.
    cd aws_hw
    
  2. Change the data_sizes variable in src/update_data_size.py to the data size values you want. Each value is the number of elements in the vector to transfer. For example, data_sizes = [16, 32] will transfer 16 integers in the first test run and 32 integers in the next. Note that each element is a 4-byte integer, so the total bytes sent equals the number of integers multiplied by 4. Save and exit.
  3. Go into the src folder and run run_compile.sh. This step may take up to 3 hours.
    cd src; sh run_compile.sh
    
  4. Create a backup of the newly compiled binary files. This will create a backup folder .backup in the aws_hw folder.
    sh backup.sh
    
  5. Open create_afi.sh and edit S3_BUCKET_NAME, S3_DCP, and S3_LOGS according to what you named the bucket and folders. Note that for this step, you must have set up AWS & permissions as described in SETUP.
  6. Run create_afi.sh
    sh create_afi.sh
    
  7. Save the folders in .backup/src containing the *.awsxclbin files to your local machine. An easy way to do this is over SFTP; this tutorial walks through it: https://www.youtube.com/watch?v=o-dH2C_Nz-E
  8. Wait until the Amazon FPGA Image (AFI) is available. You can check its status by finding the AFI ID in the *_afi_id.txt file in the .backup/src/<PROJECT> folder and then running:
    aws ec2 describe-fpga-images --fpga-image-ids <AFI ID>
    
    The status JSON should show
    ...
    "State": {
        "Code": "available"
    },
    ...
    when the AFI is ready.
  9. Once ready, create an AWS EC2 F1 instance and set up Vitis and XRT as shown in SETUP, in the same manner as the z1d instance was set up.
  10. Now in the same manner as before, copy the project files from the backup folder back into the src directory of the newly cloned project. Thus, when you navigate to aws_hw/src, you will find ocl_cpu_to_gmem, ocl_cpu_to_gmem_rw, ....
  11. Now, execute the test on hardware by running run_hw.sh
    sh run_hw.sh
    

Parsing Log Files & Graphing Results

You will notice that after every test, the ./logs folder becomes populated with log files, one for each project/design. These log files contain execution information: the type of data transfer, the size, and the time it took to complete. Below are the steps to parse them into CSV format and graph them.

Note that running parse_logs.py and graph_data.py requires Python 3.6+.

  1. Install the needed python library.
    pip install plotnine
    
  2. Navigate to the src folder.
    cd ./src
    
  3. Create the logs_data folder.
    mkdir logs_data
    
  4. Parse the logs and generate CSV files in logs_data.
    python parse_logs.py
    
  5. Edit the graph titles and output file names to your liking by editing these lines in graph_data.py:
    labs(title=f'Benchmark {action} Speeds vs Data Sizes ({log_type})',
         x='Transfer Size (GB) (log scale)',
         y='Transfer Speed (GB/s)') +
    and
    plot.save(f'../logs_data/{log_type}_{action}_log.png', width=10, height=6, dpi=300)
  6. Graph the CSV files into PNGs in logs_data.
    python graph_data.py
    
