Vitis HLS Study Notes--HLS Introductory Examples Collection-Contents

Hi94 · Published 2024-04-25 22:38:05

1. Overview of the Example Collection

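The examples come from the Xilinx/Vitis-HLS-Introductory-Examples repository on GitHub: https://github.com/Xilinx/Vitis-HLS-Introductory-Examples

This collection complements my earlier post, "Vitis HLS Study Notes--HLS Optimization Directive Examples-Contents" (CSDN blog). This collection focuses on demonstrating HLS features, while the earlier post demonstrates HLS optimization directives. The examples in the earlier post require compiling both host code and PL (programmable logic) code, whereas the examples here run entirely in the Vitis HLS simulation environment, which makes their behavior easier to observe. The two sets supplement each other and together support a deeper understanding of Vitis HLS.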

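The examples in this collection are grouped as follows:

Interface: common examples showing the use of various interface modes and protocols
Pipelining: common examples showing pipeline pragmas for loops and functions
Task_Level_Parallelism: examples of task-level parallel programming models and topologies
Modeling: math and DSP examples, plus other commonly used models and algorithms
Misc: other examples, such as RTL black boxes in C++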

2. Interface

2.1 Aggregation_Disaggregation

2.1.1 aggregation_of_m_axi_ports (DONE)

#pragma HLS AGGREGATE compact=auto

"Vitis HLS Study Notes--Aggregation and Disaggregation-AXI Master Interface" (CSDN blog)

2.1.2 aggregation_of_nested_structs

Nested structs.

2.1.3 aggregation_of_struct

2.1.4 auto_disaggregation_of_struct

2.1.5 disaggregation_of_axis_port

2.1.6 struct_ii_issue

Initiation interval (II) violation.

2.2 Memory

2.2.1 ecc_flags

ECC: error checking and correction.

2.2.2 manual_burst (DONE)

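If automatic burst inference does not occur in the design, manual bursts can be performed with the hls::burst_maxi data type.

"Vitis HLS Study Notes--Manually Controlled MAXI Burst Transfers" (CSDN blog)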

2.2.3 max_widen_port_width (DONE)

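max_widen_bitwidth is an optional parameter, because the compiler automatically adjusts the port data width based on the data type; the parameter caps how far that automatic widening may go.

"Vitis HLS Study Notes--MAXI Bit-Width Widening" (CSDN blog)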

2.2.4 memory_bottleneck (DONE)

Achieve II=1 by removing redundant memory accesses from the code.

"Vitis HLS Study Notes--Optimizing Local Memory Access Bottlenecks" (CSDN blog)
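As a rough illustration of what removing redundant memory accesses can look like (a sketch under assumptions, not the repository's code): the first function reads three array elements per iteration, which is more than a dual-port memory can deliver in one cycle, while the second carries the two previous values in local variables so that each iteration performs a single read. The names sum3_naive, sum3_cached, and LEN are hypothetical, and the pipeline pragma appears only as a comment so the code builds with an ordinary C++ compiler.

```cpp
#include <cassert>

const int LEN = 128;  // hypothetical array length

// Three reads of mem[] per iteration: the memory ports limit the achievable II.
int sum3_naive(int mem[LEN]) {
    int sum = 0;
    for (int i = 2; i < LEN; ++i) {
        // #pragma HLS PIPELINE II=1   (would be placed here in an HLS build)
        sum += mem[i] + mem[i - 1] + mem[i - 2];
    }
    return sum;
}

// One read of mem[] per iteration: earlier values are kept in local variables.
int sum3_cached(int mem[LEN]) {
    int prev2 = mem[0];
    int prev1 = mem[1];
    int sum = 0;
    for (int i = 2; i < LEN; ++i) {
        // #pragma HLS PIPELINE II=1   (would be placed here in an HLS build)
        int cur = mem[i];           // the only memory read in the loop body
        sum += cur + prev1 + prev2;
        prev2 = prev1;              // shift the window by one element
        prev1 = cur;
    }
    return sum;
}
```

Both functions compute the same result; only the number of memory reads per iteration differs.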

2.2.5 ram_uram (DONE)

BIND_STORAGE type=ram_2p impl=uram; DEPENDENCE inter WAR false, where WAR means Write-After-Read.

"Vitis HLS Study Notes--Resource Binding-Using URAM" (CSDN blog)

"Vitis HLS Study Notes--Resource Binding-Using URAM (1)" (CSDN blog)

2.2.6 rom_lookup_table_math

sin_table[i] = (din1_t)(32768.0 * real_val);
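The line above scales a sine sample by 2^15 before storing it. A minimal sketch of how such a table might be filled and read follows. It assumes that din1_t is a 16-bit signed integer, that the table has 256 entries, and that the samples cover [-pi/2, pi/2); none of these details are stated in this post. Because the contents depend only on constants and the array is never written elsewhere, an HLS tool can usually implement an array like this as a ROM.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

typedef int16_t din1_t;                    // assumed element type
const int TABLE_SIZE = 256;                // assumed table depth
const double PI = 3.14159265358979323846;

// Fill the table with sin() samples over [-pi/2, pi/2), scaled by 2^15.
void init_sin_table(din1_t sin_table[TABLE_SIZE]) {
    for (int i = 0; i < TABLE_SIZE; ++i) {
        double real_val = std::sin(PI * (i - TABLE_SIZE / 2) / TABLE_SIZE);
        sin_table[i] = (din1_t)(32768.0 * real_val);
    }
}

// Look up one sample; idx must lie in [0, TABLE_SIZE).
din1_t sin_lookup(int idx) {
    assert(idx >= 0 && idx < TABLE_SIZE);
    din1_t sin_table[TABLE_SIZE];
    init_sin_table(sin_table);
    return sin_table[idx];
}
```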

2.2.7 using_axi_master (DONE)

"Vitis HLS Study Notes--AXI4 Master Interface" (CSDN blog)

2.3 Register

2.3.1 using_axi_lite (DONE)

2.3.2 using_axi_lite_with_user_defined_offset

2.4 Streaming

2.4.1 axi_stream_to_master (DONE)

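hls::stream<int,…> count; is used so that the tool can more easily optimize the design into a pipeline automatically.

"Vitis HLS Study Notes--AXI_STREAM_TO_MASTER" (CSDN blog)

"Vitis HLS Study Notes--Understanding Streams (1)" (CSDN blog)

"Vitis HLS Study Notes--Understanding Streams (2)" (CSDN blog)

"Vitis HLS Study Notes--Understanding Streams (3)" (CSDN blog)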

2.4.2 using_array_of_streams

hls::stream<int> s_in[M]: an array of streams.

2.4.3 using_axi_stream_no_side_channel_data

No side channels: hls::axis<type, 0, 0, 0>. Side-channel signals are distinct from the transfer control signals.

2.4.4 using_axi_stream_with_side_channel_data

With side channels: hls::axis<type, WUser, WId, WDest>;

2.4.5 using_axi_stream_with_struct

See the slide "HLS - Interface Synthesis: Paradigms".

2.4.6 using_axis_array_stream_no_side_channel_data

3. Pipelining

3.1 Functions

3.1.1 function_instantiate

Function instantiation.

3.1.2 hier_func

Hierarchical functions; the __SYNTHESIS__ macro.

3.2 Loops

3.2.1 imperfect_loop (DONE)

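The loop bound is a variable, or loop-body statements appear in an outer loop level.

"Vitis HLS Study Notes--Perfect Loop Nests in the Eyes of HLS" (CSDN blog)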

3.2.2 perfect_loop (DONE)

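The loop bounds are fixed constants, and the loop body appears only in the innermost loop.

"Vitis HLS Study Notes--Perfect Loop Nests in the Eyes of HLS" (CSDN blog)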
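The difference between the imperfect and perfect nests described in 3.2.1 and 3.2.2 can be sketched with a row-sum example. This is an illustrative assumption, not the repository's code; ROWS, COLS, and both function names are hypothetical. In the first version, statements sit in the outer loop body. In the second, the reset and the write-back are moved into the innermost loop behind conditions, so nothing remains between the two loop headers.

```cpp
#include <cassert>

const int ROWS = 4;  // hypothetical sizes
const int COLS = 8;

// Imperfect nest: logic appears in the outer loop body, outside the inner loop.
void row_sums_imperfect(int in[ROWS][COLS], int out[ROWS]) {
    for (int i = 0; i < ROWS; ++i) {
        int acc = 0;                          // statement before the inner loop
        for (int j = 0; j < COLS; ++j) {
            acc += in[i][j];
        }
        out[i] = acc;                         // statement after the inner loop
    }
}

// Perfect nest: constant bounds, and all logic lives in the innermost loop.
void row_sums_perfect(int in[ROWS][COLS], int out[ROWS]) {
    int acc = 0;
    for (int i = 0; i < ROWS; ++i) {
        for (int j = 0; j < COLS; ++j) {
            if (j == 0) acc = 0;              // reset moved into the inner loop
            acc += in[i][j];
            if (j == COLS - 1) out[i] = acc;  // write-back moved into the inner loop
        }
    }
}
```

Both versions produce the same sums; the second form is the shape that lets an HLS tool flatten the nest into a single loop.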

3.2.3 pipelined_loop

3.2.4 using_free_running_pipeline (DONE)

DATAFLOW

"Vitis HLS Study Notes--FRP Free-Running Pipeline" (CSDN blog)

4. Task_Level_Parallelism

4.1 Control_driven

4.1.1 Bypassing

4.1.1.1 input_bypass (DONE)

a -> tmp1 -> tmp4
                  + --> tmp3
     b    -> tmp2

----------------------------------

a -> tmp1 -> tmp4
                  + --> tmp3
b -> tmp2 -> tmp5

"Vitis HLS Study Notes--Abstract Parallel Programming Model-Bad Examples" (CSDN blog)

4.1.1.2 middle_bypass (DONE)

a -> tmp1 ------>
                  + --> tmp3
b -> tmp2 -> tmp4

----------------------------------

a -> tmp1 -> tmp5
                  + --> tmp3
b -> tmp2 -> tmp4

"Vitis HLS Study Notes--Abstract Parallel Programming Model-Bad Examples" (CSDN blog)

4.1.1.3 output_bypass (DONE)

a -> tmp1 -> b
a -> tmp2

-------------------------

a -> tmp1 -> b
a -> tmp3 -> tmp2

"Vitis HLS Study Notes--Abstract Parallel Programming Model-Bad Examples" (CSDN blog)

4.1.2 Channels

4.1.2.1 Vitis

use FIFOs instead of the default PIPOs on host

4.1.2.2 merge_split

<hls_np_channel.h> (number of parallel channel)

4.1.2.3 simple_fifos (DONE)

"Vitis HLS Study Notes--Abstract Parallel Programming Model-Control-Driven and Data-Driven" (CSDN blog)

4.1.2.4 using_fifos (DONE)

#pragma HLS performance target_ti=32, where ti = transaction interval.

"Vitis HLS Study Notes--Choosing FIFO or PIPO Channels" (CSDN blog)

4.1.2.5 using_pipos (DONE)

"Vitis HLS Study Notes--Choosing FIFO or PIPO Channels" (CSDN blog)

4.1.2.6 using_stream_of_blocks

hls::stream_of_blocks<block_data_t>

4.2 Data_driven

4.2.1 handling_deadlock (DONE)

hls_thread_local hls::stream<int, 100> s1;

"Vitis HLS Study Notes--Control-Driven TLP-Handling Deadlock" (CSDN blog)

4.2.2 mixed_control_and_data_driven

hls_thread_local hls::task t[4];

4.2.3 simple_data_driven (DONE)

"Vitis HLS Study Notes--Abstract Parallel Programming Model-Control-Driven and Data-Driven" (CSDN blog)

5. Modeling

5.1 Pointers

5.1.1 basic_arithmetic (DONE)

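The function returns nothing, but it modifies the array data through the pointer, so the array can effectively be regarded as the function's output.

"Vitis HLS Study Notes--Basic Pointers and Pointer Arithmetic" (CSDN blog)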
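A small sketch of the pattern, with a hypothetical function and a fixed length of four chosen only for illustration: the function has no return value and delivers its result by writing through the pointer it receives.

```cpp
#include <cassert>

// No return value: results are written back through `d`, so the array
// behind the pointer serves as both input and output.
// Reads d[1..4] and overwrites d[0..3] with the running sum.
void pointer_arith(int *d) {
    assert(d != nullptr);
    int acc = 0;
    for (int i = 0; i < 4; ++i) {
        acc += *(d + i + 1);   // read via pointer arithmetic
        *(d + i) = acc;        // write via pointer arithmetic (the "output")
    }
}
```

Calling it on the five-element array {0, 1, 2, 3, 4} leaves {1, 3, 6, 10, 4} in place, since each element is read before it is overwritten.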

5.1.2 basic_pointers (DONE)

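A static variable is typically implemented in hardware as a register or memory element, and its value persists across calls.

"Vitis HLS Study Notes--Basic Pointers and Pointer Arithmetic" (CSDN blog)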
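The persistence of a static variable can be seen in a few lines of plain C++. The function name accumulate is hypothetical, and the mapping to a register is what one would typically expect from synthesis rather than something this snippet demonstrates.

```cpp
#include <cassert>

// `acc` is initialized once and keeps its value between calls; after
// synthesis, state like this would typically be held in a register.
int accumulate(int x) {
    static int acc = 0;
    acc += x;
    return acc;
}
```

Successive calls with 5, 3, and -8 return 5, 8, and 0, because each call continues from the value left by the previous one.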

5.1.3 multiple_pointers

The restricted scope of a local static variable is enforced by the compiler.

5.1.4 native_casts

5.1.5 stream_better (DONE)

"Vitis HLS Study Notes--Avoiding Multi-Access Pointers" (CSDN blog)

5.1.6 stream_good (DONE)

The ap_fifo interface can be specified either with a Tcl command or with the #pragma HLS INTERFACE directive.

"Vitis HLS Study Notes--Avoiding Multi-Access Pointers" (CSDN blog)

5.1.7 using_double (DONE)

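Pointers to pointers should be avoided where possible, because a double pointer adds a level of indirection to every data access, which leads to extra logic overhead.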

5.2 basic_loops_primer

pipeline off, unroll

5.3 fixed_point_sqrt

Uses a custom sqrt function; it is still advisable to prefer <hls_math.h>.

5.4 free_running_kernel_remerge_ii4to1

Initiation Interval (II), ap_ctrl_none

5.5 using_C++_templates

5.6 using_arbitrary_precision_arith

<ap_int.h>

5.7 using_arbitrary_precision_casting

5.8 using_fixed_point

5.9 using_float_and_double (DONE)

#include <stdint.h>  // uint32_t, int8_t

typedef union {
    float fp_num;
    uint32_t raw_bits;
    struct {
        uint32_t mant : 23;  // mantissa
        uint32_t bexp : 8;   // biased exponent
        uint32_t sign : 1;   // sign
    };
} float_num_t;

float float_mul_pow2(float x, int8_t n);

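These functions multiply a floating-point number (single or double precision) by a power of two. Because multiplying by a power of two reduces to a simple 8-bit or 11-bit addition on the biased exponent (for single and double precision, respectively), it is far more efficient than a general multiplication. The functions also perform some basic overflow and underflow checks, which can be removed if desired by defining the preprocessor macro AESL_FP_MATH_NO_BOUNDS_TESTS.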
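The exponent trick can be sketched for single precision as follows. This is not the repository's implementation: the function name float_mul_pow2_sketch is hypothetical, it uses memcpy and bit masks instead of the union's bit-fields so that it is well defined in standard C++, and it does not model the AESL_FP_MATH_NO_BOUNDS_TESTS switch. The bounds handling shown (pass special values through, saturate to infinity, flush to zero) is one reasonable choice, stated here as an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Multiply a float by 2^n by adding n to its 8-bit biased exponent.
float float_mul_pow2_sketch(float x, int8_t n) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);          // reinterpret the float's bits

    const uint32_t sign = bits & 0x80000000u;
    const int32_t  bexp = (int32_t)((bits >> 23) & 0xFFu);
    const uint32_t mant = bits & 0x007FFFFFu;

    uint32_t out;
    if (bexp == 0xFF || bexp == 0) {
        out = bits;                               // NaN, infinity, zero, denormal: pass through
    } else if (bexp + n >= 0xFF) {
        out = sign | 0x7F800000u;                 // overflow: signed infinity
    } else if (bexp + n <= 0) {
        out = sign;                               // underflow: signed zero, no denormals
    } else {
        out = sign | ((uint32_t)(bexp + n) << 23) | mant;
    }

    float result;
    std::memcpy(&result, &out, sizeof result);
    return result;
}
```

For example, float_mul_pow2_sketch(1.5f, 3) yields 12.0f and float_mul_pow2_sketch(8.0f, -2) yields 2.0f, with no multiplier involved.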

5.10 using_vectors (DONE)

hls::vector<T, N>, suited to SIMD (Single Instruction, Multiple Data) operations.

"Vitis HLS Study Notes--Vector Data Types" (CSDN blog)

5.11 variable_bound_loops (DONE)

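The variable loop bound problem: the bound is a function argument, so it is unknown at compile time and must be passed in at run time.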

Loop: for (x=0; x<width; x++) {
    out_accum += A[x];
}

"Vitis HLS Study Notes--Loop Bounds Containing Variables" (CSDN blog)
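One common way around a variable bound is to iterate up to a compile-time maximum and guard the loop body with the run-time condition. The sketch below assumes a hypothetical maximum N and hypothetical function names. With a fixed trip count, a tool can determine the loop latency and unroll the loop, at the cost of always running N iterations.

```cpp
#include <cassert>

const int N = 32;  // hypothetical maximum trip count, known at compile time

// Variable bound: the trip count depends on the run-time argument `width`.
int sum_variable_bound(int A[N], int width) {
    assert(width >= 0 && width <= N);
    int out_accum = 0;
    for (int x = 0; x < width; ++x) {
        out_accum += A[x];
    }
    return out_accum;
}

// Fixed bound: always iterate N times and guard the body with the condition.
int sum_max_bound(int A[N], int width) {
    assert(width >= 0 && width <= N);
    int out_accum = 0;
    for (int x = 0; x < N; ++x) {
        if (x < width) {
            out_accum += A[x];
        }
    }
    return out_accum;
}
```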

6. Misc

6.1 initialization_and_reset

6.1.1 global_array_RAM (DONE)

A global array is one defined outside any function: ap_int<10> A[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

"Vitis HLS Study Notes--global_array_RAM Initialization and Reset" (CSDN blog)

"Vitis HLS Study Notes--Initialization and Reset" (CSDN blog)

6.1.2 static_array_RAM (DONE)

static ap_int<10> A[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

"Vitis HLS Study Notes--static RAM/ROM" (CSDN blog)

"Vitis HLS Study Notes--Initialization and Reset" (CSDN blog)

6.1.3 static_array_ROM (DONE)

BIND_STORAGE variable=A type=ROM_1P impl=BRAM;

"Vitis HLS Study Notes--static RAM/ROM" (CSDN blog)

"Vitis HLS Study Notes--Initialization and Reset" (CSDN blog)

6.1.4 static_array_of_struct_with_array_RAM

Array of structs.

6.1.5 static_struct_with_array_RAM

Struct.

6.1.6 static_struct_with_array_RAM_Versal

6.2 malloc_removed

#include "malloc_removed.h"

6.3 rtl_as_blackbox

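7. Study Plan

This example collection is rich in content. In future blog posts I will pick out the important parts for detailed discussion and add the relevant links here.

This table of contents also lets me quickly look up the relevant topics.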
