Halide Study Notes: Reading the Halide Tutorial Source, Lesson 15

Source: Internet | Published under: matlab dsp编程 | Editor: 程序博客网 | Time: 2024/06/05 19:13

Halide Getting Started Tutorial, Lesson 15

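This lesson has two parts. The first part shows how to write generators.
The second part is a shell script that shows how to use the compiled generator binary to produce the required header files and static libraries.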

// Halide tutorial lesson 15: Generators part 1

// This lesson demonstrates how to encapsulate Halide pipelines into
// reusable components called generators.

// On linux, you can compile and run it like so:
// g++ lesson_15*.cpp ../tools/GenGen.cpp -g -std=c++11 -fno-rtti -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_15_generate
// bash lesson_15_generators_usage.sh

#include "Halide.h"
#include <stdio.h>

using namespace Halide;

// Generators are a more structured way to do ahead-of-time
// compilation of Halide pipelines. Instead of writing an int main()
// with an ad-hoc command-line interface like we did in lesson 10, we
// define a class that inherits from Halide::Generator.
class MyFirstGenerator : public Halide::Generator<MyFirstGenerator> {
public:
    // We declare the Inputs to the Halide pipeline as public
    // member variables. They'll appear in the signature of our generated
    // function in the same order as we declare them.
    Input<uint8_t> offset{"offset"};
    Input<Buffer<uint8_t>> input{"input", 2};

    // We also declare the Outputs as public member variables.
    Output<Buffer<uint8_t>> brighter{"brighter", 2};

    // Typically you declare your Vars at this scope as well, so that
    // they can be used in any helper methods you add later.
    Var x, y;

    // We then define a method that constructs and returns the Halide
    // pipeline.
    // Note: the method must be named generate(). Like main(), the name
    // is fixed by convention and must not be changed.
    void generate() {
        // In lesson 10, here is where we called
        // Func::compile_to_file. In a Generator, we just need to
        // define the Output(s) representing the output of the pipeline.
        brighter(x, y) = input(x, y) + offset;

        // Schedule it.
        brighter.vectorize(x, 16).parallel(y);
    }
};

// We compile this file along with tools/GenGen.cpp. That file defines
// an "int main(...)" that provides the command-line interface to use
// your generator class. We need to tell that code about our
// generator. We do this like so:
HALIDE_REGISTER_GENERATOR(MyFirstGenerator, my_first_generator)

// If you like, you can put multiple Generators in the one file. This
// could be a good idea if they share some common code. Let's define
// another more complex generator:
class MySecondGenerator : public Halide::Generator<MySecondGenerator> {
public:
    // This generator will take some compile-time parameters
    // too. These let you compile multiple variants of a Halide
    // pipeline. We'll define one that tells us whether or not to
    // parallelize in our schedule:
    GeneratorParam<bool> parallel{"parallel", /* default value */ true};

    // ... and another representing a constant scale factor to use:
    GeneratorParam<float> scale{"scale",
            1.0f /* default value */,
            0.0f /* minimum value */,
            100.0f /* maximum value */};

    // You can define GeneratorParams of all the basic scalar
    // types. For numeric types you can optionally provide a minimum
    // and maximum value, as we did for scale above.

    // You can also define GeneratorParams for enums. To make this
    // work you must provide a mapping from strings to your enum
    // values.
    enum class Rotation { None, Clockwise, CounterClockwise };
    GeneratorParam<Rotation> rotation{"rotation",
            /* default value */
            Rotation::None,
            /* map from names to values */
            {{ "none", Rotation::None },
             { "cw",   Rotation::Clockwise },
             { "ccw",  Rotation::CounterClockwise }}};

    // We'll use the same Inputs as before:
    Input<uint8_t> offset{"offset"};
    Input<Buffer<uint8_t>> input{"input", 2};

    // And a similar Output. Note that we don't specify a type for the Buffer:
    // at compile-time, we must specify an explicit type via the "output.type"
    // GeneratorParam (which is implicitly defined for this Output).
    Output<Buffer<>> output{"output", 2};

    // And we'll declare our Vars here as before.
    Var x, y;

    void generate() {
        // Define the Func. We'll use the compile-time scale factor as
        // well as the runtime offset param.
        Func brighter;
        brighter(x, y) = scale * (input(x, y) + offset);

        // We'll possibly do some sort of rotation, depending on the
        // enum. To get the value of a GeneratorParam, cast it to the
        // corresponding type. This cast happens implicitly most of
        // the time (e.g. with scale above).
        Func rotated;
        switch ((Rotation)rotation) {
        case Rotation::None:
            rotated(x, y) = brighter(x, y);
            break;
        case Rotation::Clockwise:
            rotated(x, y) = brighter(y, 100-x);
            break;
        case Rotation::CounterClockwise:
            rotated(x, y) = brighter(100-y, x);
            break;
        }

        // We'll then cast to the desired output type, i.e. the type
        // given at compile time via output.type.
        output(x, y) = cast(output.type(), rotated(x, y));

        // The structure of the pipeline depended on the generator
        // params. So will the schedule.

        // Let's start by vectorizing the output. We don't know the
        // type though, so it's hard to pick a good factor. Generators
        // provide a helper called "natural_vector_size" which will
        // pick a reasonable factor for you given the type and the
        // target you're compiling to.
        output.vectorize(x, natural_vector_size(output.type()));

        // Now we'll possibly parallelize it:
        if (parallel) {
            output.parallel(y);
        }

        // If there was a rotation, we'll schedule that to occur per
        // scanline of the output and vectorize it according to its
        // type.
        if (rotation != Rotation::None) {
            rotated
                .compute_at(output, y)
                .vectorize(x, natural_vector_size(rotated.output_types()[0]));
        }
    }
};

// Register our second generator:
HALIDE_REGISTER_GENERATOR(MySecondGenerator, my_second_generator)

// After compiling this file, see how to use it in
// lesson_15_generators_usage.sh
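To make concrete what the pipeline in my_first_generator computes, here is a minimal plain C++ sketch of brighter(x, y) = input(x, y) + offset. It does not use Halide. The helper name brighten_reference and the row-major std::vector layout are assumptions made for illustration only. The sketch relies on both operands being 8-bit unsigned values in the Halide pipeline, so the sum wraps modulo 256. C++ promotes uint8_t operands to int, so the cast back to uint8_t is needed to reproduce that wraparound.

```cpp
#include <cstdint>
#include <vector>

// Plain C++ reference for brighter(x, y) = input(x, y) + offset.
// Hypothetical helper for illustration; not part of Halide and not the
// code that the generator emits.
std::vector<uint8_t> brighten_reference(const std::vector<uint8_t> &input,
                                        int width, int height,
                                        uint8_t offset) {
    std::vector<uint8_t> out(input.size());
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            // C++ promotes uint8_t operands to int, so cast the sum back
            // to uint8_t to get 8-bit wraparound (modulo 256).
            out[y * width + x] =
                static_cast<uint8_t>(input[y * width + x] + offset);
        }
    }
    return out;
}
```

For a 2x2 image {0, 10, 200, 250} and an offset of 10, the sketch yields {10, 20, 210, 4}: 250 + 10 wraps around to 4. A reference like this can be handy for checking the output of the generated function on small inputs.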

1. Template for writing a structured generator

class MyGenerator : public Halide::Generator<MyGenerator> {
public:
    Input a, b;   // ...
    Output c, d;  // ...
    GeneratorParam<T> ...
    void generate() {
    }
};
HALIDE_REGISTER_GENERATOR(MyGenerator, my_generator)
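The template has two structural ingredients: the class passes itself as the template argument of its base class (the curiously recurring template pattern), and a registration macro ties the class to a name that can later be selected on the command line. The sketch below is a toy model of that structure in plain C++. It is not Halide's implementation, and every name in it (ToyGeneratorBase, ToyGenerator, TOY_REGISTER_GENERATOR, run_generator) is hypothetical.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy model only: NOT Halide's implementation. All names are hypothetical.
struct ToyGeneratorBase {
    virtual ~ToyGeneratorBase() = default;
    virtual std::string build() = 0;
};

// CRTP base: the template parameter is the derived class, so build() can
// call Derived::generate() without generate() being virtual.
template <typename Derived>
struct ToyGenerator : ToyGeneratorBase {
    std::string build() override {
        Derived *self = static_cast<Derived *>(this);
        self->generate();
        return self->description;
    }
};

using Factory = std::function<std::unique_ptr<ToyGeneratorBase>()>;

// Name -> factory table, consulted the way "-g name" selects a generator.
std::map<std::string, Factory> &registry() {
    static std::map<std::string, Factory> table;
    return table;
}

// A static Registrar object adds one entry to the table before main() runs.
struct Registrar {
    Registrar(const std::string &name, Factory f) {
        registry()[name] = std::move(f);
    }
};

#define TOY_REGISTER_GENERATOR(Class, name)                          \
    static Registrar registrar_##name(#name, [] {                    \
        return std::unique_ptr<ToyGeneratorBase>(new Class());       \
    });

// A "generator" in the toy model: it only records a description string.
struct MyToyGenerator : ToyGenerator<MyToyGenerator> {
    std::string description;
    void generate() { description = "brighter(x, y) = input(x, y) + offset"; }
};
TOY_REGISTER_GENERATOR(MyToyGenerator, my_toy_generator)

// Look up a generator by name and run it; returns "" if the name is unknown.
std::string run_generator(const std::string &name) {
    auto it = registry().find(name);
    if (it == registry().end()) return "";
    return it->second()->build();
}
```

In this toy model, run_generator("my_toy_generator") finds the registered factory, constructs the class, and calls its generate() method through the base class. This mirrors, in spirit only, why a generator source file needs no main() of its own: the main() in tools/GenGen.cpp only needs the registered name.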

# Halide tutorial lesson 15: Generators part 2
# Note: this part shows how to use the compiled generator binary to
# produce the corresponding header files and static libraries.

# This shell script demonstrates how to use a binary containing
# Generators from the command line. Normally you'd call these binaries
# from your build system of choice rather than running them manually
# like we do here.

# This script assumes that you're in the tutorials directory, and the
# generator has been compiled for the current system and is called
# "lesson_15_generate".

# To run this script:
# bash lesson_15_generators_usage.sh

# First we define a helper function that checks that a file exists
check_file_exists()
{
    FILE=$1
    if [ ! -f "$FILE" ]; then
        echo "$FILE not found"
        exit -1
    fi
}

# And another helper function to check if a symbol exists in an object file
check_symbol()
{
    FILE=$1
    SYM=$2
    if !(nm "$FILE" | grep "$SYM" > /dev/null); then
        echo "$SYM not found in $FILE"
        exit -1
    fi
}

# Bail out on error
#set -e

# Set up LD_LIBRARY_PATH so that we can find libHalide.so
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:../bin
export DYLD_LIBRARY_PATH=${DYLD_LIBRARY_PATH}:../bin

#########################
# Basic generator usage #
#########################

# First let's compile the first generator for the host system:
./lesson_15_generate -g my_first_generator -o . target=host

# That should create a pair of files in the current directory:
# "my_first_generator.a", and "my_first_generator.h", which define a
# function "my_first_generator" representing the compiled pipeline.

check_file_exists my_first_generator.a
check_file_exists my_first_generator.h
check_symbol my_first_generator.a my_first_generator

#####################
# Cross-compilation #
#####################

# We can also use a generator to compile object files for some other
# target. Let's cross-compile a windows 32-bit object file and header
# for the first generator:

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_win32 \
    -o . \
    target=x86-32-windows

# This generates a file called "my_first_generator_win32.lib" in the
# current directory, along with a matching header. The function
# defined is called "my_first_generator_win32".

check_file_exists my_first_generator_win32.lib
check_file_exists my_first_generator_win32.h

################################
# Generating pipeline variants #
################################

# Note: the key point of this section is how to use the generator's
# command-line arguments.

# The full set of command-line arguments to the generator binary is:

# -g generator_name : Selects which generator to run. If you only have
# one generator in your binary you can omit this.

# -o directory : Specifies which directory to create the outputs
# in. Usually a build directory.

# -f name : Specifies the name of the generated function. If you omit
# this, it defaults to the generator name.

# -n file_base_name : Specifies the basename of the generated file(s). If
# you omit this, it defaults to the name of the generated function.

# -e static_library,o,h,assembly,bitcode,stmt,html: A list of
# comma-separated values specifying outputs to create. The default is
# "static_library,h". "assembly" generates assembly equivalent to the
# generated object file. "bitcode" generates llvm bitcode for the pipeline.
# "stmt" generates human-readable pseudocode for the pipeline (similar to
# setting HL_DEBUG_CODEGEN). "html" generates an html version of the
# pseudocode, which can be much nicer to read than the raw .stmt file.

# -r file_base_name : Specifies that the generator should create a
# standalone file for just the runtime. For use when generating multiple
# pipelines from a single generator, to be linked together in one
# executable. See example below.

# -x .old=new,.old2=.new2,... : A comma-separated list of file extension
# pairs to substitute during file naming.

# target=... : The target to compile for.

# my_generator_param=value : The value of your generator params.

# Let's now generate some human-readable pseudocode for the first
# generator:
# Note: the examples below use the generators from part 1 of this
# lesson and are not explained further here.

./lesson_15_generate -g my_first_generator -e stmt -o . target=host

check_file_exists my_first_generator.stmt

# The second generator has generator params, which can be specified on
# the command-line after the target. Let's compile a few different variants:
./lesson_15_generate -g my_second_generator -f my_second_generator_1 -o . \
target=host parallel=false scale=3.0 rotation=ccw output.type=uint16

./lesson_15_generate -g my_second_generator -f my_second_generator_2 -o . \
target=host scale=9.0 rotation=ccw output.type=float32

./lesson_15_generate -g my_second_generator -f my_second_generator_3 -o . \
target=host parallel=false output.type=float64

check_file_exists my_second_generator_1.a
check_file_exists my_second_generator_1.h
check_symbol      my_second_generator_1.a my_second_generator_1
check_file_exists my_second_generator_2.a
check_file_exists my_second_generator_2.h
check_symbol      my_second_generator_2.a my_second_generator_2
check_file_exists my_second_generator_3.a
check_file_exists my_second_generator_3.h
check_symbol      my_second_generator_3.a my_second_generator_3

# Use of these generated object files and headers is exactly the same
# as in lesson 10.

######################
# The Halide runtime #
######################

# Each generated Halide object file contains a simple runtime that
# defines things like how to run a parallel for loop, how to launch a
# cuda program, etc. You can see this runtime in the generated object
# files.

echo "The halide runtime:"
nm my_second_generator_1.a | grep "[SWT] _\?halide_"

# Let's define some functions to check that the runtime exists in a file.
check_runtime()
{
    if !(nm "$1" | grep "[TSW] _\?halide_" > /dev/null); then
        echo "Halide runtime not found in $1"
        exit -1
    fi
}

check_no_runtime()
{
    if nm "$1" | grep "[TSW] _\?halide_" > /dev/null; then
        echo "Halide runtime found in $1"
        exit -1
    fi
}

# Declarations and documentation for these runtime functions are in
# HalideRuntime.h

# If you're compiling and linking multiple Halide pipelines, then the
# multiple copies of the runtime should combine into a single copy
# (via weak linkage). If you're compiling and linking for multiple
# different targets (e.g. avx and non-avx), then the runtimes might be
# different, and you can't control which copy of the runtime the
# linker selects.

# You can control this behavior explicitly by compiling your pipelines
# with the no_runtime target flag. Let's generate and link several
# different versions of the first pipeline for different x86 variants:

# (Note that we'll ask the generators to just give us object files ("-e o"),
# instead of static libraries, so that we can easily link them all into a
# single static library.)

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_basic \
    -e o,h \
    -o . \
    target=host-x86-64-no_runtime

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_sse41 \
    -e o,h \
    -o . \
    target=host-x86-64-sse41-no_runtime

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_avx \
    -e o,h \
    -o . \
    target=host-x86-64-avx-no_runtime

# These files don't contain the runtime
check_no_runtime my_first_generator_basic.o
check_symbol     my_first_generator_basic.o my_first_generator_basic
check_no_runtime my_first_generator_sse41.o
check_symbol     my_first_generator_sse41.o my_first_generator_sse41
check_no_runtime my_first_generator_avx.o
check_symbol     my_first_generator_avx.o my_first_generator_avx

# We can then use the generator to emit just the runtime:
./lesson_15_generate \
    -r halide_runtime_x86 \
    -e o,h \
    -o . \
    target=host-x86-64
check_runtime halide_runtime_x86.o

# Linking the standalone runtime with the three generated object files
# gives us three versions of the pipeline for varying levels of x86,
# combined with a single runtime that will work on nearly all x86
# processors.
ar q my_first_generator_multi.a \
    my_first_generator_basic.o \
    my_first_generator_sse41.o \
    my_first_generator_avx.o \
    halide_runtime_x86.o

check_runtime my_first_generator_multi.a
check_symbol  my_first_generator_multi.a my_first_generator_basic
check_symbol  my_first_generator_multi.a my_first_generator_sse41
check_symbol  my_first_generator_multi.a my_first_generator_avx

echo "Success!"
