Halide学习笔记----Halide tutorial源码阅读13

来源：互联网发布：notepad++ mac 编辑：程序博客网时间：2024/06/04 22:06

Halide入门13

// Halide tutorial lesson 13: Tuples// Halide入门第13课：元组// This lesson describes how to write Funcs that evaluate to multiple// values.// 本科介绍如何编写多值函数// On linux, you can compile and run it like so:// g++ lesson_13*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_13 -std=c++11// LD_LIBRARY_PATH=../bin ./lesson_13#include "Halide.h"#include <stdio.h>#include <algorithm>using namespace Halide;int main(int argc, char **argv) {    // So far Funcs (such as the one below) have evaluated to a single    // scalar value for each point in their domain.    // 之前介绍的函数为单值函数，即在每一个像素点计算一个值    Func single_valued;    Var x, y;    single_valued(x, y) = x + y;    // One way to write a Func that returns a collection of values is    // to add an additional dimension that indexes that    // collection. This is how we typically deal with color. For    // example, the Func below represents a collection of three values    // for every x, y coordinate indexed by c.    // 一种多值函数的描述方法是增加一个维度，通过这个额外的维度来索引多值函数。    // 像通常的颜色通道就是通过颜色这个维度来索引像素点的r(x,y)/g(x,y)/b(x,y)多值函数的。    Func color_image;    Var c;    color_image(x, y, c) = select(c == 0, 245, // Red value                                  c == 1, 42,  // Green value                                  132);        // Blue value    // This method is often convenient because it makes it easy to    // operate on this Func in a way that treats each item in the    // collection equally:    // 这种方法通常很简单，平等对待多值函数的每一个函数即可。    Func brighter;    brighter(x, y, c) = color_image(x, y, c) + 10;    // However this method is also inconvenient for three reasons.    //    // 1) Funcs are defined over an infinite domain, so users of this    // Func can for example access color_image(x, y, -17), which is    // not a meaningful value and is probably indicative of a bug.    //    // 2) It requires a select, which can impact performance if not    // bounded and unrolled:    // brighter.bound(c, 0, 3).unroll(c);    //    // 3) With this method, all values in the collection must have the    // same type. While the above two issues are merely inconvenient,    // this one is a hard limitation that makes it impossible to    // express certain things in this way.    // 上述哪种方法有如下三个不便之处    // 1）函数定义在一个无线区域上，因此调用函数会出现一些无意义的调用，这样会导致一些可能存在的很隐晦的bug    // 2）如果没有边界条件限制和代码平铺，会很影响程序的性能。    // 3）使用这种方法的多值函数必须是同种的数据类型。对于不同中数据类型的多值函数，本方法不是很方便。    // It is also possible to represent a collection of values as a    // collection of Funcs:    // 同样可以类似函数数组那样来表达多值函数。    Func func_array[3];    func_array[0](x, y) = x + y;    func_array[1](x, y) = sin(x);    func_array[2](x, y) = cos(y);    // This method avoids the three problems above, but introduces a    // new annoyance. Because these are separate Funcs, it is    // difficult to schedule them so that they are all computed    // together inside a single loop over x, y.    // 这种方法可以有效避免上面提到的三个问题，但是引入了新的问题，因为这三个函数是相互独立的，很难同时    // 调度这样的多值函数。    // A third alternative is to define a Func as evaluating to a    // Tuple instead of an Expr. A Tuple is a fixed-size collection of    // Exprs. Each Expr in a Tuple may have a different type. The    // following function evaluates to an integer value (x+y), and a    // floating point value (sin(x*y)).    // 第三种可行的方法是将函数定义成元组，而不是表达式。元组的每一个表达式可以有不同的数据类型。    // 如下的元组有整型值x+y和浮点型值sin(x*y)    Func multi_valued;    multi_valued(x, y) = Tuple(x + y, sin(x * y));    // Realizing a tuple-valued Func returns a collection of    // Buffers. We call this a Realization. It's equivalent to a    // std::vector of Buffer objects:    // 实现（这里的realize指的是halide的jit编译和执行）元组型函数返回一个多个buffer。    // 相当于一个buffer向量对象，需要注意的是，每个buffer可以有不同的数据类型。    {        Realization r = multi_valued.realize(80, 60);        assert(r.size() == 2);        Buffer<int> im0 = r[0];        Buffer<float> im1 = r[1];        assert(im0(30, 40) == 30 + 40);        assert(im1(30, 40) == sinf(30 * 40));    }    // All Tuple elements are evaluated together over the same domain    // in the same loop nest, but stored in distinct allocations. The    // equivalent C++ code to the above is:    {        int multi_valued_0[80*60];        float multi_valued_1[80*60];        for (int y = 0; y < 80; y++) {            for (int x = 0; x < 60; x++) {                multi_valued_0[x + 60*y] = x + y;                multi_valued_1[x + 60*y] = sinf(x*y);            }        }    }    // When compiling ahead-of-time, a Tuple-valued Func evaluates    // into multiple distinct output buffer_t structs. These appear in    // order at the end of the function signature:    // int multi_valued(...input buffers and params...,    //                  buffer_t *output_1, buffer_t *output_2);    // 当采用提前编译的方法编译时，元组型多值函数按照定义时的顺序，返回多值函数的buffer    // You can construct a Tuple by passing multiple Exprs to the    // Tuple constructor as we did above. Perhaps more elegantly, you    // can also take advantage of C++11 initializer lists and just    // enclose your Exprs in braces:    // 可以通过元组构造函数的方法将多值表达式传给多值函数，也可以采用c++11语法的初始化列表的方式用大括号    // 初始化。    Func multi_valued_2;    multi_valued_2(x, y) = {x + y, sin(x*y)};    // Calls to a multi-valued Func cannot be treated as Exprs. The    // following is a syntax error:    // Func consumer;    // 调用多值函数不再能将它们当作一个表达式处理了    // consumer(x, y) = multi_valued_2(x, y) + 10;    // Instead you must index a Tuple with square brackets to retrieve    // the individual Exprs:    // 取而代之的是可以采用元组的方法，用类似数组下标的方式分别索引处理。    Expr integer_part = multi_valued_2(x, y)[0];    Expr floating_part = multi_valued_2(x, y)[1];    Func consumer;    consumer(x, y) = {integer_part + 10, floating_part + 10.0f};    // Tuple reductions.    {        // Tuples are particularly useful in reductions, as they allow        // the reduction to maintain complex state as it walks along        // its domain. The simplest example is an argmax.        // 元组在约减操作中非常有用，因为它们允许在约减操作时维护非常复杂的状态。        // First we create a Buffer to take the argmax over.        Func input_func;        input_func(x) = sin(x);        Buffer<float> input = input_func.realize(100);        // Then we define a 2-valued Tuple which tracks the index of        // the maximum value and the value itself.        // 定义一个二值函数，返回最大值的坐标和最大值。        Func arg_max;        // Pure definition.        arg_max() = {0, input(0)};        // Update definition.        RDom r(1, 99);        Expr old_index = arg_max()[0];        Expr old_max   = arg_max()[1];        Expr new_index = select(old_max < input(r), r, old_index);        Expr new_max   = max(input(r), old_max);        arg_max() = {new_index, new_max};        // The equivalent C++ is:        int arg_max_0 = 0;        float arg_max_1 = input(0);        for (int r = 1; r < 100; r++) {            int old_index = arg_max_0;            float old_max = arg_max_1;            int new_index = old_max < input(r) ? r : old_index;            float new_max = std::max(input(r), old_max);            // In a tuple update definition, all loads and computation            // are done before any stores, so that all Tuple elements            // are updated atomically with respect to recursive calls            // to the same Func.            arg_max_0 = new_index;            arg_max_1 = new_max;        }        // Let's verify that the Halide and C++ found the same maximum        // value and index.        {            Realization r = arg_max.realize();            Buffer<int> r0 = r[0];            Buffer<float> r1 = r[1];            assert(arg_max_0 == r0(0));            assert(arg_max_1 == r1(0));        }        // Halide provides argmax and argmin as built-in reductions        // similar to sum, product, maximum, and minimum. They return        // a Tuple consisting of the point in the reduction domain        // corresponding to that value, and the value itself. In the        // case of ties they return the first value found. We'll use        // one of these in the following section.        // Halide提供了argmax/argmin的内置函数，返回最大值在的坐标和相应的最大值。    }    // Tuples for user-defined types.    {        // Tuples can also be a convenient way to represent compound        // objects such as complex numbers. Defining an object that        // can be converted to and from a Tuple is one way to extend        // Halide's type system with user-defined types.        // 元组可以方便的表达其他的复合对象，这里以复数为例，讲述如何拓展Halide的数据类型。        // 主要是元组的使用和Halide内置运算符重载等内容。这里的struc和公有成员变量和成员函数的class一致        struct Complex {            Expr real, imag;            // Construct from a Tuple            Complex(Tuple t) : real(t[0]), imag(t[1]) {}            // Construct from a pair of Exprs            Complex(Expr r, Expr i) : real(r), imag(i) {}            // Construct from a call to a Func by treating it as a Tuple            Complex(FuncRef t) : Complex(Tuple(t)) {}            // Convert to a Tuple            operator Tuple() const {                return {real, imag};            }            // Complex addition            Complex operator+(const Complex &other) const {                return {real + other.real, imag + other.imag};            }            // Complex multiplication            Complex operator*(const Complex &other) const {                return {real * other.real - imag * other.imag,                        real * other.imag + imag * other.real};            }            // Complex magnitude, squared for efficiency            Expr magnitude_squared() const {                return real * real + imag * imag;            }            // Other complex operators would go here. The above are            // sufficient for this example.        };        // Let's use the Complex struct to compute a Mandelbrot set.        Func mandelbrot;        // The initial complex value corresponding to an x, y coordinate        // in our Func.        Complex initial(x/15.0f - 2.5f, y/6.0f - 2.0f);        // Pure definition.        Var t;        mandelbrot(x, y, t) = Complex(0.0f, 0.0f);        // We'll use an update definition to take 12 steps.        // 通过RDom来限制循环次数        RDom r(1, 12);        Complex current = mandelbrot(x, y, r-1);        // The following line uses the complex multiplication and        // addition we defined above.        mandelbrot(x, y, r) = current*current + initial;        // We'll use another tuple reduction to compute the iteration        // number where the value first escapes a circle of radius 4.        // This can be expressed as an argmin of a boolean - we want        // the index of the first time the given boolean expression is        // false (we consider false to be less than true).  The argmax        // would return the index of the first time the expression is        // true.        // 用一个元组来计算迭代过程中，复数点一次超出了半径为4的圆。可以用argmin来实现。        Expr escape_condition = Complex(mandelbrot(x, y, r)).magnitude_squared() < 16.0f;        Tuple first_escape = argmin(escape_condition);        // We only want the index, not the value, but argmin returns        // both, so we'll index the argmin Tuple expression using        // square brackets to get the Expr representing the index.        // 由于我们需要的知识坐标，因此只需引用元组的第一个元素即可        Func escape;        escape(x, y) = first_escape[0];        // Realize the pipeline and print the result as ascii art.        Buffer<int> result = escape.realize(61, 25);        const char *code = " .:-~*={}&%#@";        for (int y = 0; y < result.height(); y++) {            for (int x = 0; x < result.width(); x++) {                printf("%c", code[result(x, y)]);            }            printf("\n");        }    }    printf("Success!\n");    return 0;}

编译和执行：

$ g++ lesson_13*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_13 -std=c++11$ ./lesson_13

结果：

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::-------------------::::::::::::::::::::::::::::::::::::-------------------------------::::::::::::::::::::::::::---------------------------------------:::::::::::::::::::----------------------~~~~~~~~~--------------::::::::::::::------------------~~~~~~~~~*****~~~~~~-----------::::::::::----------------~~~~~~~~~****={#%@=**~~~~~-----------:::::::-------------~~~~~~~~~****==={%    %{=***~~~~----------::::------------~~~~~~~******==& @%       @#}} =*~~~~----------::--------~~~~~~~***======{{&               #{=*~~~~---------:-------~~~~~~***==} @% %%&%                  =**~~~----------------~~~***==={}&#                         }=**~~~~---------------~*                                   &{=**~~~~---------------~~~***==={}&#                         }=**~~~~----------------~~~~~~***==} @% %%&%                  =**~~~----------:--------~~~~~~~***======{{&               #{=*~~~~---------::------------~~~~~~~******==& @%       @#}} =*~~~~----------::::-------------~~~~~~~~~****==={%    %{=***~~~~----------:::::::----------------~~~~~~~~~****={#%@=**~~~~~-----------::::::::::------------------~~~~~~~~~*****~~~~~~-----------::::::::::::::----------------------~~~~~~~~~--------------:::::::::::::::::::---------------------------------------::::::::::::::::::::::::::-------------------------------::::::::::::::::::::::::::::::::::::-------------------::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::Success!

本节要点提炼：
1. 用元组构建多值函数
multi_value(x, y) = Tuple(x+y, x*y);
或者采用c++11风格的
multi_value(x, y) = {x+y, x*y};
元组多值函数的引用
multi_value(x, y)[0];// 多值函数的第一个分量
multi_value(x, y)[1];// 多值函数的第二分量

argmin/argmax内置约减函数，返回的是一个多值函数，第一个为对应的坐标，第二个为对应的最大值或者最小值
拓展Halide数据类型，运算符需要重载。
可以用RDom变量来达到指定次数的循环。

阅读全文

'); })();