The TensorRT workflow

Source: Internet. Editor: 程序博客网. Time: 2024/05/09 08:17

TensorRT has two workflows: one for the development stage and one for the production stage.

The figure below shows the flow for the development stage:

Most of the samples shipped with TensorRT follow this workflow: first build the PLAN data stream, then validate it.

The figure below shows the flow for the execution (production) stage:

Taking samplePlugin from the TensorRT samples as an example, the PLAN file can be saved right after the following code segment:

    PluginFactory pluginFactory;
    IHostMemory *gieModelStream{ nullptr };
    caffeToGIEModel("mnist.prototxt", "mnist.caffemodel",
                    std::vector<std::string>{ OUTPUT_BLOB_NAME }, 1,
                    &pluginFactory, gieModelStream);
    pluginFactory.destroyPlugin();

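This stage loads the prototxt, the model file and the plugin, and generates data in TensorRT's own format (containing the prototxt and the model). After this code, the data stream held by gieModelStream can be saved as a PLAN file (i.e. serialized to disk). The pointer to the data is gieModelStream->data() and its size is gieModelStream->size().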
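The "serialize to disk" step can be sketched as below. This is a minimal sketch, not code from samplePlugin: `writePlanFile` is a hypothetical helper name, and it uses only the C++ standard library, so it takes a raw pointer and a byte count rather than any TensorRT type.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not part of TensorRT or samplePlugin):
// writes `size` bytes starting at `data` to `path` in binary mode.
// Returns true only if the file was opened and every byte was written.
bool writePlanFile(const std::string& path, const void* data, std::size_t size)
{
    // Binary mode matters: the PLAN data is an opaque byte stream, not text.
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out)
        return false;

    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    out.flush();
    return out.good();
}
```

In the sample above, the call would presumably look like `writePlanFile("mnist.plan", gieModelStream->data(), gieModelStream->size())`, where the file name `mnist.plan` is an assumption chosen for illustration.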

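In the production stage, the PLAN file generated above is loaded and deserialized directly, and the following code then runs inference. This avoids calling caffeToGIEModel() every time the application starts, which takes a long time because it not only parses the prototxt and caffemodel but also performs optimization work.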

    // parse the mean file and subtract it from the image
    const float *meanData = reinterpret_cast<const float*>(meanBlob->getData());
    float data[INPUT_H*INPUT_W];
    for (int i = 0; i < INPUT_H*INPUT_W; i++)
        data[i] = float(fileData[i]) - meanData[i];
    meanBlob->destroy();

    // deserialize the engine
    IRuntime* runtime = createInferRuntime(gLogger);
    ICudaEngine* engine = runtime->deserializeCudaEngine(gieModelStream->data(), gieModelStream->size(), &pluginFactory);
    IExecutionContext *context = engine->createExecutionContext();

    // run inference
    float prob[OUTPUT_SIZE];
    doInference(*context, data, prob, 1);

    // destroy the engine
    context->destroy();
    engine->destroy();
    runtime->destroy();
    pluginFactory.destroyPlugin();
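The listing above still reads the engine from the in-memory gieModelStream. In a production application the bytes would come from the PLAN file instead, and that load step could be sketched as follows. `readPlanFile` is a hypothetical helper name, and the sketch uses only the C++ standard library.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of TensorRT or samplePlugin):
// reads the whole file at `path` into memory.
// Returns an empty vector if the file cannot be opened, is empty,
// or cannot be read completely.
std::vector<char> readPlanFile(const std::string& path)
{
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return {};

    const std::streamsize size = in.tellg();
    if (size <= 0)
        return {};

    std::vector<char> buffer(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    if (!in.read(buffer.data(), size))
        return {};
    return buffer;
}
```

The returned buffer's `data()` and `size()` would then take the place of `gieModelStream->data()` and `gieModelStream->size()` in the `deserializeCudaEngine` call shown above, with the same `&pluginFactory` argument.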
