Halide Study Notes: Reading the Halide Tutorial Source, Part 15
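Halide Introductory Tutorial 15
This lesson has two parts. The first part shows how to write generators.
The second part is a shell script that shows how to use the compiled generator binary to produce the required headers and static libraries.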
// Halide tutorial lesson 15: Generators part 1

// This lesson demonstrates how to encapsulate Halide pipelines into
// reusable components called generators.

// On linux, you can compile and run it like so:
// g++ lesson_15*.cpp ../tools/GenGen.cpp -g -std=c++11 -fno-rtti -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_15_generate
// bash lesson_15_generators_usage.sh

#include "Halide.h"
#include <stdio.h>

using namespace Halide;

// Generators are a more structured way to do ahead-of-time
// compilation of Halide pipelines. Instead of writing an int main()
// with an ad-hoc command-line interface like we did in lesson 10, we
// define a class that inherits from Halide::Generator.
class MyFirstGenerator : public Halide::Generator<MyFirstGenerator> {
public:
    // We declare the Inputs to the Halide pipeline as public
    // member variables. They'll appear in the signature of our generated
    // function in the same order as we declare them.
    Input<uint8_t> offset{"offset"};
    Input<Buffer<uint8_t>> input{"input", 2};

    // We also declare the Outputs as public member variables.
    Output<Buffer<uint8_t>> brighter{"brighter", 2};

    // Typically you declare your Vars at this scope as well, so that
    // they can be used in any helper methods you add later.
    Var x, y;

    // We then define a method that constructs the Halide pipeline.
    // The method must be called generate(). Like main(), the name is
    // fixed by convention, so do not change it.
    void generate() {
        // In lesson 10, here is where we called
        // Func::compile_to_file. In a Generator, we just need to
        // define the Output(s) representing the output of the pipeline.
        brighter(x, y) = input(x, y) + offset;

        // Schedule it.
        brighter.vectorize(x, 16).parallel(y);
    }
};

// We compile this file along with tools/GenGen.cpp. That file defines
// an "int main(...)" that provides the command-line interface to use
// your generator class. We need to tell that code about our
// generator. We do this like so:
HALIDE_REGISTER_GENERATOR(MyFirstGenerator, my_first_generator)

// If you like, you can put multiple Generators in the one file. This
// could be a good idea if they share some common code. Let's define
// another more complex generator:
class MySecondGenerator : public Halide::Generator<MySecondGenerator> {
public:
    // This generator will take some compile-time parameters
    // too. These let you compile multiple variants of a Halide
    // pipeline. We'll define one that tells us whether or not to
    // parallelize in our schedule. Note that this choice is made when
    // the generator runs (compile time), not when the pipeline runs.
    GeneratorParam<bool> parallel{"parallel", /* default value */ true};

    // ... and another representing a constant scale factor to use:
    GeneratorParam<float> scale{"scale",
            1.0f /* default value */,
            0.0f /* minimum value */,
            100.0f /* maximum value */};

    // You can define GeneratorParams of all the basic scalar
    // types. For numeric types you can optionally provide a minimum
    // and maximum value, as we did for scale above.

    // You can also define GeneratorParams for enums. To make this
    // work you must provide a mapping from strings to your enum
    // values.
    enum class Rotation { None, Clockwise, CounterClockwise };
    GeneratorParam<Rotation> rotation{"rotation",
            /* default value */
            Rotation::None,
            /* map from names to values */
            {{ "none", Rotation::None },
             { "cw",   Rotation::Clockwise },
             { "ccw",  Rotation::CounterClockwise }}};

    // We'll use the same Inputs as before:
    Input<uint8_t> offset{"offset"};
    Input<Buffer<uint8_t>> input{"input", 2};

    // And a similar Output. Note that we don't specify a type for the Buffer:
    // at compile-time, we must specify an explicit type via the "output.type"
    // GeneratorParam (which is implicitly defined for this Output).
    Output<Buffer<>> output{"output", 2};

    // And we'll declare our Vars here as before.
    Var x, y;

    void generate() {
        // Define the Func. We'll use the compile-time scale factor as
        // well as the runtime offset param.
        Func brighter;
        brighter(x, y) = scale * (input(x, y) + offset);

        // We'll possibly do some sort of rotation, depending on the
        // enum. To get the value of a GeneratorParam, cast it to the
        // corresponding type. This cast happens implicitly most of
        // the time (e.g. with scale above).
        Func rotated;
        switch ((Rotation)rotation) {
        case Rotation::None:
            rotated(x, y) = brighter(x, y);
            break;
        case Rotation::Clockwise:
            rotated(x, y) = brighter(y, 100-x);
            break;
        case Rotation::CounterClockwise:
            rotated(x, y) = brighter(100-y, x);
            break;
        }

        // We'll then cast to the desired output type, i.e. the type
        // given at compile time via output.type.
        output(x, y) = cast(output.type(), rotated(x, y));

        // The structure of the pipeline depended on the generator
        // params. So will the schedule.

        // Let's start by vectorizing the output. We don't know the
        // type though, so it's hard to pick a good factor. Generators
        // provide a helper called "natural_vector_size" which will
        // pick a reasonable factor for you given the type and the
        // target you're compiling to.
        output.vectorize(x, natural_vector_size(output.type()));

        // Now we'll possibly parallelize it:
        if (parallel) {
            output.parallel(y);
        }

        // If there was a rotation, we'll schedule that to occur per
        // scanline of the output and vectorize it according to its
        // type.
        if (rotation != Rotation::None) {
            rotated
                .compute_at(output, y)
                .vectorize(x, natural_vector_size(rotated.output_types()[0]));
        }
    }
};

// Register our second generator:
HALIDE_REGISTER_GENERATOR(MySecondGenerator, my_second_generator)

// After compiling this file, see how to use it in
// lesson_15_generators_usage.sh
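The index arithmetic in the rotation switch of MySecondGenerator is easy to misread, so here is a minimal scalar sketch of the same mapping in plain C++, with no Halide involved. The function name rotate_reference and the parameter last are assumptions made for illustration: last stands in for the hard-coded 100 in the tutorial, i.e. the largest valid coordinate of a square image. It only mirrors the coordinate mapping and ignores scale, offset, and the output cast.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class Rotation { None, Clockwise, CounterClockwise };

// Scalar reference for the coordinate mapping used in the tutorial:
//   Clockwise:        rotated(x, y) = src(y, last - x)
//   CounterClockwise: rotated(x, y) = src(last - y, x)
// `src` is a row-major square image of side (last + 1).
std::vector<uint8_t> rotate_reference(const std::vector<uint8_t>& src,
                                      int last, Rotation r) {
    const int n = last + 1;
    assert(static_cast<int>(src.size()) == n * n);

    // Helper that reads src(x, y) from the row-major storage.
    auto at = [&](int x, int y) { return src[y * n + x]; };

    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            uint8_t v = 0;
            switch (r) {
            case Rotation::None:             v = at(x, y);        break;
            case Rotation::Clockwise:        v = at(y, last - x); break;
            case Rotation::CounterClockwise: v = at(last - y, x); break;
            }
            dst[y * n + x] = v;
        }
    }
    return dst;
}
```

For a 2x2 image stored as {1, 2, 3, 4} (rows "1 2" and "3 4"), calling rotate_reference with last = 1 and Rotation::Clockwise gives rows "3 1" and "4 2", which is the image turned 90 degrees clockwise.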
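1. Generator skeleton

// T and default_value below are placeholders, not real types or values.
class MyGenerator : public Halide::Generator<MyGenerator> {
public:
    Input<T> a{"a"}, b{"b"};       // inputs, in generated-signature order
    // ...
    Output<T> c{"c"}, d{"d"};      // outputs
    // ...
    GeneratorParam<T> p{"p", default_value};   // optional compile-time parameters

    void generate() {
        // define and schedule the pipeline here
    }
};
HALIDE_REGISTER_GENERATOR(MyGenerator, my_generator)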
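To make the shape of this skeleton less magical, the following is a toy sketch of how a class that inherits from a template of itself, plus a registration macro, can let a shared main() look generators up by name. Everything here (ToyGenerator, TOY_REGISTER_GENERATOR, registry, run_by_name) is hypothetical and written only for illustration. It is not Halide's implementation and does not use any Halide API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins, NOT Halide's real classes.
struct ToyGeneratorBase {
    virtual ~ToyGeneratorBase() = default;
    virtual std::string run() = 0;
};

// The base class forwards run() to Derived::generate(), which is why the
// derived class has to provide a method with exactly that name.
template <typename Derived>
struct ToyGenerator : ToyGeneratorBase {
    std::string run() override {
        return static_cast<Derived*>(this)->generate();
    }
};

using Factory = std::function<std::unique_ptr<ToyGeneratorBase>()>;

// Function-local static avoids static-initialization-order problems.
std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> r;
    return r;
}

// Constructing a Registrar at namespace scope registers a factory by name.
struct Registrar {
    Registrar(const std::string& name, Factory f) {
        assert(registry().count(name) == 0);  // names must be unique
        registry()[name] = std::move(f);
    }
};

#define TOY_REGISTER_GENERATOR(Class, name)                         \
    static Registrar registrar_##name(#name, [] {                   \
        return std::unique_ptr<ToyGeneratorBase>(new Class());      \
    });

struct MyToyGenerator : ToyGenerator<MyToyGenerator> {
    std::string generate() { return "pipeline built by MyToyGenerator"; }
};
TOY_REGISTER_GENERATOR(MyToyGenerator, my_toy_generator)

// Roughly what a "-g name" lookup in a shared main() might do.
std::string run_by_name(const std::string& name) {
    auto it = registry().find(name);
    if (it == registry().end()) return "unknown generator: " + name;
    return it->second()->run();
}
```

In this toy, run_by_name("my_toy_generator") finds the factory registered by the macro, constructs the class, and calls its generate() method. The tutorial's combination of GenGen.cpp and HALIDE_REGISTER_GENERATOR plays a comparable role, though its internals may differ.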
# Halide tutorial lesson 15: Generators part 2

# This shell script demonstrates how to use a binary containing
# Generators from the command line to produce the corresponding
# headers and static libraries. Normally you'd call these binaries
# from your build system of choice rather than running them manually
# like we do here.

# This script assumes that you're in the tutorials directory, and the
# generator has been compiled for the current system and is called
# "lesson_15_generate".

# To run this script:
# bash lesson_15_generators_usage.sh

# First we define a helper function that checks that a file exists
check_file_exists()
{
    FILE=$1
    if [ ! -f "$FILE" ]; then
        echo "$FILE not found"
        exit 1
    fi
}

# And another helper function to check if a symbol exists in an object file
check_symbol()
{
    FILE=$1
    SYM=$2
    if ! (nm "$FILE" | grep "$SYM" > /dev/null); then
        echo "$SYM not found in $FILE"
        exit 1
    fi
}

# Bail out on error
#set -e

# Set up LD_LIBRARY_PATH so that we can find libHalide.so
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:../bin
export DYLD_LIBRARY_PATH=${DYLD_LIBRARY_PATH}:../bin

#########################
# Basic generator usage #
#########################

# First let's compile the first generator for the host system:
./lesson_15_generate -g my_first_generator -o . target=host

# That should create a pair of files in the current directory:
# "my_first_generator.a", and "my_first_generator.h", which define a
# function "my_first_generator" representing the compiled pipeline.

check_file_exists my_first_generator.a
check_file_exists my_first_generator.h
check_symbol my_first_generator.a my_first_generator

#####################
# Cross-compilation #
#####################

# We can also use a generator to compile object files for some other
# target. Let's cross-compile a windows 32-bit object file and header
# for the first generator:

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_win32 \
    -o . \
    target=x86-32-windows

# This generates a file called "my_first_generator_win32.lib" in the
# current directory, along with a matching header. The function
# defined is called "my_first_generator_win32".

check_file_exists my_first_generator_win32.lib
check_file_exists my_first_generator_win32.h

################################
# Generating pipeline variants #
################################

# The full set of command-line arguments to the generator binary is
# listed below. This list is the key part of this lesson.

# -g generator_name : Selects which generator to run. If you only have
# one generator in your binary you can omit this.

# -o directory : Specifies which directory to create the outputs
# in. Usually a build directory.

# -f name : Specifies the name of the generated function. If you omit
# this, it defaults to the generator name.

# -n file_base_name : Specifies the basename of the generated file(s). If
# you omit this, it defaults to the name of the generated function.

# -e static_library,o,h,assembly,bitcode,stmt,html: A list of
# comma-separated values specifying outputs to create. The default is
# "static_library,h". "assembly" generates assembly equivalent to the
# generated object file. "bitcode" generates llvm bitcode for the pipeline.
# "stmt" generates human-readable pseudocode for the pipeline (similar to
# setting HL_DEBUG_CODEGEN). "html" generates an html version of the
# pseudocode, which can be much nicer to read than the raw .stmt file.

# -r file_base_name : Specifies that the generator should create a
# standalone file for just the runtime. For use when generating multiple
# pipelines from a single generator, to be linked together in one
# executable. See example below.

# -x .old=new,.old2=.new2,... : A comma-separated list of file extension
# pairs to substitute during file naming.

# target=... : The target to compile for.

# my_generator_param=value : The value of your generator params.

# The examples below exercise the generators defined in part 1 of this
# lesson. Let's first generate some human-readable pseudocode for the
# first generator:

./lesson_15_generate -g my_first_generator -e stmt -o . target=host

check_file_exists my_first_generator.stmt

# The second generator has generator params, which can be specified on
# the command-line after the target. Let's compile a few different variants:
./lesson_15_generate -g my_second_generator -f my_second_generator_1 -o . \
target=host parallel=false scale=3.0 rotation=ccw output.type=uint16

./lesson_15_generate -g my_second_generator -f my_second_generator_2 -o . \
target=host scale=9.0 rotation=ccw output.type=float32

./lesson_15_generate -g my_second_generator -f my_second_generator_3 -o . \
target=host parallel=false output.type=float64

check_file_exists my_second_generator_1.a
check_file_exists my_second_generator_1.h
check_symbol      my_second_generator_1.a my_second_generator_1
check_file_exists my_second_generator_2.a
check_file_exists my_second_generator_2.h
check_symbol      my_second_generator_2.a my_second_generator_2
check_file_exists my_second_generator_3.a
check_file_exists my_second_generator_3.h
check_symbol      my_second_generator_3.a my_second_generator_3

# Use of these generated object files and headers is exactly the same
# as in lesson 10.

######################
# The Halide runtime #
######################

# Each generated Halide object file contains a simple runtime that
# defines things like how to run a parallel for loop, how to launch a
# cuda program, etc. You can see this runtime in the generated object
# files.

echo "The halide runtime:"
nm my_second_generator_1.a | grep "[SWT] _\?halide_"

# Let's define some functions to check that the runtime exists in a file.
check_runtime()
{
    if ! (nm "$1" | grep "[TSW] _\?halide_" > /dev/null); then
        echo "Halide runtime not found in $1"
        exit 1
    fi
}

check_no_runtime()
{
    if nm "$1" | grep "[TSW] _\?halide_" > /dev/null; then
        echo "Halide runtime found in $1"
        exit 1
    fi
}

# Declarations and documentation for these runtime functions are in
# HalideRuntime.h

# If you're compiling and linking multiple Halide pipelines, then the
# multiple copies of the runtime should combine into a single copy
# (via weak linkage). If you're compiling and linking for multiple
# different targets (e.g. avx and non-avx), then the runtimes might be
# different, and you can't control which copy of the runtime the
# linker selects.

# You can control this behavior explicitly by compiling your pipelines
# with the no_runtime target flag. Let's generate and link several
# different versions of the first pipeline for different x86 variants:

# (Note that we'll ask the generators to just give us object files ("-e o"),
# instead of static libraries, so that we can easily link them all into a
# single static library.)

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_basic \
    -e o,h \
    -o . \
    target=host-x86-64-no_runtime

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_sse41 \
    -e o,h \
    -o . \
    target=host-x86-64-sse41-no_runtime

./lesson_15_generate \
    -g my_first_generator \
    -f my_first_generator_avx \
    -e o,h \
    -o . \
    target=host-x86-64-avx-no_runtime

# These files don't contain the runtime
check_no_runtime my_first_generator_basic.o
check_symbol     my_first_generator_basic.o my_first_generator_basic
check_no_runtime my_first_generator_sse41.o
check_symbol     my_first_generator_sse41.o my_first_generator_sse41
check_no_runtime my_first_generator_avx.o
check_symbol     my_first_generator_avx.o my_first_generator_avx

# We can then use the generator to emit just the runtime:
./lesson_15_generate \
    -r halide_runtime_x86 \
    -e o,h \
    -o . \
    target=host-x86-64
check_runtime halide_runtime_x86.o

# Linking the standalone runtime with the three generated object files
# gives us three versions of the pipeline for varying levels of x86,
# combined with a single runtime that will work on nearly all x86
# processors.
ar q my_first_generator_multi.a \
    my_first_generator_basic.o \
    my_first_generator_sse41.o \
    my_first_generator_avx.o \
    halide_runtime_x86.o

check_runtime my_first_generator_multi.a
check_symbol  my_first_generator_multi.a my_first_generator_basic
check_symbol  my_first_generator_multi.a my_first_generator_sse41
check_symbol  my_first_generator_multi.a my_first_generator_avx

echo "Success!"
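Once my_first_generator_multi.a holds three variants of the same pipeline, the calling program still has to decide which one to run. Below is a rough C++ sketch of that dispatch step. It is self-contained, so the three variants are replaced by one scalar stub, and PipelineFn, brighten_stub, Variant and select_variant are all hypothetical names. The real generated headers declare functions that take Halide buffer structs rather than raw pointers, so treat the signature here as a simplification.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical simplified signature. The real generated functions
// (my_first_generator_basic, _sse41, _avx) take Halide buffer arguments.
using PipelineFn = int (*)(uint8_t offset, const uint8_t* in,
                           uint8_t* out, int n);

// Scalar stand-in used for all three "variants" in this sketch.
// It mirrors brighter(x, y) = input(x, y) + offset with uint8 wraparound.
static int brighten_stub(uint8_t offset, const uint8_t* in,
                         uint8_t* out, int n) {
    assert(in != nullptr && out != nullptr);
    for (int i = 0; i < n; ++i) {
        out[i] = static_cast<uint8_t>(in[i] + offset);
    }
    return 0;  // zero signals success
}

struct Variant {
    std::string name;  // which generated function this entry stands for
    PipelineFn fn;
};

// Pick the most capable variant that the CPU supports.
Variant select_variant(bool has_avx, bool has_sse41) {
    if (has_avx)   return {"my_first_generator_avx",   brighten_stub};
    if (has_sse41) return {"my_first_generator_sse41", brighten_stub};
    return {"my_first_generator_basic", brighten_stub};
}
```

How the two booleans are obtained is left open. On GCC or Clang for x86, one option is __builtin_cpu_supports("avx") and __builtin_cpu_supports("sse4.1"). In a real program each table entry would point at the corresponding function from the generated headers instead of the stub.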