Supporting DXVA 2.0 in DirectShow

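  My first translation, on the Direct3D device manager, is here: http://blog.csdn.net/qq_33892166/article/details/53325887
  This topic describes how to support DirectX Video Acceleration (DXVA) 2.0 in a DirectShow decoder filter. Specifically, it describes the communication between the decoder and the video renderer. It does not describe how to implement DXVA decoding.
  This topic assumes that you are familiar with writing DirectShow filters. For more information, see the topic Writing DirectShow Filters in the DirectShow SDK documentation (https://msdn.microsoft.com/en-us/library/dd391013(v=vs.85).aspx ). The code examples assume that the decoder filter derives from the CTransformFilter class and is defined as follows: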

class CDecoder : public CTransformFilter
{
public:
    static CUnknown* WINAPI CreateInstance(IUnknown *pUnk, HRESULT *pHr);

    HRESULT CompleteConnect(PIN_DIRECTION direction, IPin *pPin);

    HRESULT InitAllocator(IMemAllocator **ppAlloc);
    HRESULT DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *pProp);

    // TODO: The implementations of these methods depend on the specific decoder.
    HRESULT CheckInputType(const CMediaType *mtIn);
    HRESULT CheckTransform(const CMediaType *mtIn, const CMediaType *mtOut);
    // Overrides CTransformFilter::GetMediaType. A qualified name is not
    // valid inside the class declaration, so it is declared unqualified.
    HRESULT GetMediaType(int iPosition, CMediaType *pMediaType);

private:
    CDecoder(HRESULT *pHr);
    ~CDecoder();

    CBasePin * GetPin(int n);

    HRESULT ConfigureDXVA2(IPin *pPin);
    HRESULT SetEVRForDXVA2(IPin *pPin);

    HRESULT FindDecoderConfiguration(
        /* [in] */  IDirectXVideoDecoderService *pDecoderService,
        /* [in] */  const GUID& guidDecoder,
        /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
        /* [out] */ BOOL *pbFoundDXVA2Configuration
        );

private:
    IDirectXVideoDecoderService *m_pDecoderService;

    DXVA2_ConfigPictureDecode m_DecoderConfig;
    GUID                      m_DecoderGuid;
    HANDLE                    m_hDevice;

    FOURCC                    m_fccOutputFormat;
};

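  In this topic, the term decoder refers to the decoder filter, which receives compressed video and outputs uncompressed video. The term decoder device refers to the hardware video accelerator implemented by the graphics driver.
  A decoder must perform the following basic steps to support DXVA 2.0:
  (4) Provide a custom allocator that allocates Direct3D surfaces.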
2. Migration Notes
  If you are migrating from DXVA 1.0 to DXVA 2.0, be aware of the following significant differences between the two versions:
  (1) DXVA 2.0 does not use the IAMVideoAccelerator and IAMVideoAcceleratorNotify interfaces, because the decoder can access the DXVA 2.0 APIs directly through the IDirectXVideoDecoder interface.
  (2) During media type negotiation, the decoder does not use a video acceleration GUID as the subtype. Instead, the subtype is simply the uncompressed video format (such as NV12), exactly as with software decoding.
  (3) The procedure for configuring the accelerator has changed. In DXVA 1.0, the decoder calls Execute with a DXVA_ConfigPictureDecode structure to configure the accelerator. In DXVA 2.0, the decoder uses the IDirectXVideoDecoderService interface, as described in the next section.
  (6) The decoder is no longer responsible for checking when data buffers are safe for updates. Therefore, DXVA 2.0 has no method equivalent to IAMVideoAccelerator::QueryRenderStatus.
  (7) Subpicture blending is done by the video renderer, using the DXVA 2.0 video processor APIs. Decoders that provide subpictures (for example, DVD decoders) should send subpicture data on a separate output pin.
  For decoding operations, DXVA 2.0 uses the same data structures as DXVA 1.0. (Translator's note: "data structures" here presumably means the structs that carry the decoding data.)
  The EVR filter supports DXVA 2.0. The Video Mixing Renderer filters (VMR-7 and VMR-9) support only DXVA 1.0.
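  Migration note (2) says the output subtype is the uncompressed format itself, for example NV12. Direct3D identifies such YUV formats by their FOURCC code, and a FOURCC-based media subtype GUID carries the same code in its first 32-bit field. The sketch below is a portable illustration of that relationship, not DirectShow code: MakeFourCC mirrors the Windows MAKEFOURCC macro, while SubtypeStandIn and SubtypeMatchesFormat are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Portable equivalent of the Windows MAKEFOURCC macro: the four characters
// are packed with the first character in the lowest byte.
constexpr std::uint32_t MakeFourCC(char c0, char c1, char c2, char c3)
{
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c0))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 24);
}

// Hypothetical stand-in that models only the Data1 field of a subtype GUID.
struct SubtypeStandIn
{
    std::uint32_t Data1;
};

// Hypothetical helper: true when a FOURCC-based subtype and a render target
// format code name the same pixel format.
constexpr bool SubtypeMatchesFormat(const SubtypeStandIn& subtype,
                                    std::uint32_t formatCode)
{
    return subtype.Data1 == formatCode;
}

// NV12 packs to 0x3231564E ('N' = 0x4E in the low byte, '2' = 0x32 in the high byte).
static_assert(MakeFourCC('N', 'V', '1', '2') == 0x3231564Eu, "NV12 FOURCC value");
```

  This is why the sample code later in this topic can compare a render target format against the filter's output FOURCC with a plain cast.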
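3. Finding a Decoder Configuration
  After the decoder negotiates the output media type, it must find a compatible configuration for the DXVA decoder device. You can perform this step inside the output pin's CBaseOutputPin::CompleteConnect method. This step ensures that the graphics driver supports the capabilities the decoder needs, before the decoder commits to using DXVA.
  3) Call IDirect3DDeviceManager9::OpenDeviceHandle to get a handle to the renderer's Direct3D device.
  6) Loop through the array of decoder GUIDs to find the ones that the decoder filter supports. For example, for an MPEG-2 decoder, you would look for DXVA2_ModeMPEG2_MOCOMP, DXVA2_ModeMPEG2_IDCT, or DXVA2_ModeMPEG2_VLD.
  7) When you find a candidate decoder device GUID, pass it to the IDirectXVideoDecoderService::GetDecoderRenderTargets method. This method returns an array of render target formats, specified as D3DFORMAT values.
  8) Loop through the render target formats and look for one that matches your output format. Typically, a decoder device supports a single render target format. The decoder filter should connect to the renderer using this subtype. In the first call to CompleteConnect, the decoder can determine the render target format and then return it as a preferred output type.
  10) Assuming that the previous steps succeed, store the Direct3D device handle, the decoder device GUID, and the configuration structure. The filter uses this information to create the decoder device.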

HRESULT CDecoder::ConfigureDXVA2(IPin *pPin)
{
    UINT    cDecoderGuids = 0;
    BOOL    bFoundDXVA2Configuration = FALSE;
    GUID    guidDecoder = GUID_NULL;

    DXVA2_ConfigPictureDecode config;
    ZeroMemory(&config, sizeof(config));

    // Variables that follow must be cleaned up at the end.
    IMFGetService               *pGetService = NULL;
    IDirect3DDeviceManager9     *pDeviceManager = NULL;
    IDirectXVideoDecoderService *pDecoderService = NULL;

    GUID   *pDecoderGuids = NULL; // size = cDecoderGuids
    HANDLE hDevice = INVALID_HANDLE_VALUE;

    // Query the pin for IMFGetService.
    HRESULT hr = pPin->QueryInterface(IID_PPV_ARGS(&pGetService));

    // Get the Direct3D device manager.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE,
            IID_PPV_ARGS(&pDeviceManager)
            );
    }

    // Open a new device handle.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->OpenDeviceHandle(&hDevice);
    }

    // Get the video decoder service.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->GetVideoService(
            hDevice, IID_PPV_ARGS(&pDecoderService));
    }

    // Get the decoder GUIDs.
    if (SUCCEEDED(hr))
    {
        hr = pDecoderService->GetDecoderDeviceGuids(
            &cDecoderGuids, &pDecoderGuids);
    }

    if (SUCCEEDED(hr))
    {
        // Look for the decoder GUIDs we want.
        for (UINT iGuid = 0; iGuid < cDecoderGuids; iGuid++)
        {
            // Do we support this mode?
            if (!IsSupportedDecoderMode(pDecoderGuids[iGuid]))
            {
                continue;
            }

            // Find a configuration that we support.
            hr = FindDecoderConfiguration(pDecoderService, pDecoderGuids[iGuid],
                &config, &bFoundDXVA2Configuration);
            if (FAILED(hr))
            {
                break;
            }

            if (bFoundDXVA2Configuration)
            {
                // Found a good configuration. Save the GUID and exit the loop.
                guidDecoder = pDecoderGuids[iGuid];
                break;
            }
        }
    }

    if (!bFoundDXVA2Configuration)
    {
        hr = E_FAIL; // Unable to find a configuration.
    }

    if (SUCCEEDED(hr))
    {
        // Store the things we will need later.
        SafeRelease(&m_pDecoderService);
        m_pDecoderService = pDecoderService;
        m_pDecoderService->AddRef();

        m_DecoderConfig = config;
        m_DecoderGuid = guidDecoder;
        m_hDevice = hDevice;
    }

    if (FAILED(hr))
    {
        if (hDevice != INVALID_HANDLE_VALUE)
        {
            pDeviceManager->CloseDeviceHandle(hDevice);
        }
    }

    // The GUID array is allocated by GetDecoderDeviceGuids and must be
    // freed by the caller. CoTaskMemFree accepts NULL.
    CoTaskMemFree(pDecoderGuids);

    SafeRelease(&pGetService);
    SafeRelease(&pDeviceManager);
    SafeRelease(&pDecoderService);
    return hr;
}

HRESULT CDecoder::FindDecoderConfiguration(
    /* [in] */  IDirectXVideoDecoderService *pDecoderService,
    /* [in] */  const GUID& guidDecoder,
    /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
    /* [out] */ BOOL *pbFoundDXVA2Configuration
    )
{
    HRESULT hr = S_OK;
    UINT cFormats = 0;
    UINT cConfigurations = 0;

    D3DFORMAT                   *pFormats = NULL;     // size = cFormats
    DXVA2_ConfigPictureDecode   *pConfig = NULL;      // size = cConfigurations

    // Find the valid render target formats for this decoder GUID.
    hr = pDecoderService->GetDecoderRenderTargets(
        guidDecoder,
        &cFormats,
        &pFormats
        );

    if (SUCCEEDED(hr))
    {
        // Look for a format that matches our output format.
        for (UINT iFormat = 0; iFormat < cFormats;  iFormat++)
        {
            if (pFormats[iFormat] != (D3DFORMAT)m_fccOutputFormat)
            {
                continue;
            }

            // Fill in the video description. Set the width, height, format,
            // and frame rate.
            DXVA2_VideoDesc videoDesc = {0};

            FillInVideoDescription(&videoDesc); // Private helper function.
            videoDesc.Format = pFormats[iFormat];

            // Get the available configurations.
            hr = pDecoderService->GetDecoderConfigurations(
                guidDecoder,
                &videoDesc,
                NULL, // Reserved.
                &cConfigurations,
                &pConfig
                );

            if (FAILED(hr))
            {
                break;
            }

            // Find a supported configuration.
            for (UINT iConfig = 0; iConfig < cConfigurations; iConfig++)
            {
                if (IsSupportedDecoderConfig(pConfig[iConfig]))
                {
                    // This configuration is good.
                    *pbFoundDXVA2Configuration = TRUE;
                    *pSelectedConfig = pConfig[iConfig];
                    break;
                }
            }

            CoTaskMemFree(pConfig);
            break;

        } // End of formats loop.
    }

    CoTaskMemFree(pFormats);

    // Note: It is possible to return S_OK without finding a configuration.
    return hr;
}


// Returns TRUE if the decoder supports a given decoding mode.
BOOL IsSupportedDecoderMode(const GUID& mode);

// Returns TRUE if the decoder supports a given decoding configuration.
BOOL IsSupportedDecoderConfig(const DXVA2_ConfigPictureDecode& config);

// Fills in a DXVA2_VideoDesc structure based on the input format.
void FillInVideoDescription(DXVA2_VideoDesc *pDesc);
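
  The helper declarations above are left to the specific decoder, so the following is only one possible shape for IsSupportedDecoderConfig, written as portable C++ so that it can be compiled without the Windows SDK. ConfigStandIn is a hypothetical stand-in that models a single member of DXVA2_ConfigPictureDecode (ConfigBitstreamRaw), and the acceptance policy is an assumption chosen for illustration: a decoder that can only submit raw bitstream buffers. A real decoder would inspect whichever members matter for its codec.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for DXVA2_ConfigPictureDecode, reduced to the one
// member this sketch inspects. The real structure has many more members.
struct ConfigStandIn
{
    unsigned ConfigBitstreamRaw;  // 1: picture data is sent as raw bitstream buffers
};

// Assumed policy: this decoder only produces raw bitstream buffers, so it
// accepts a configuration only when the driver expects that buffer type.
bool IsSupportedDecoderConfigSketch(const ConfigStandIn& config)
{
    return config.ConfigBitstreamRaw == 1;
}

// Mirrors the inner loop of FindDecoderConfiguration: returns the index of
// the first acceptable configuration, or -1 when none qualifies.
int FindFirstSupportedConfig(const std::vector<ConfigStandIn>& configs)
{
    for (std::size_t i = 0; i < configs.size(); ++i)
    {
        if (IsSupportedDecoderConfigSketch(configs[i]))
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

  Returning -1 here corresponds to the case noted in the sample code where the search finishes with S_OK but without a configuration; the caller then falls back to software decoding or fails the connection.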

4. Notifying the Video Renderer
  1) Query the renderer's input pin for the IMFGetService interface.
  3) Call IDirectXVideoMemoryConfiguration::GetAvailableSurfaceTypeByIndex in a loop, incrementing the dwTypeIndex variable from zero. Stop when the method returns DXVA2_SurfaceType_DecoderRenderTarget in the pdwType parameter. This step ensures that the video renderer supports hardware-accelerated decoding. For the EVR filter, this step always succeeds.

HRESULT CDecoder::SetEVRForDXVA2(IPin *pPin)
{
    HRESULT hr = S_OK;

    IMFGetService                       *pGetService = NULL;
    IDirectXVideoMemoryConfiguration    *pVideoConfig = NULL;

    // Query the pin for IMFGetService.
    hr = pPin->QueryInterface(__uuidof(IMFGetService), (void**)&pGetService);

    // Get the IDirectXVideoMemoryConfiguration interface.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE, IID_PPV_ARGS(&pVideoConfig));
    }

    // Notify the EVR.
    if (SUCCEEDED(hr))
    {
        DXVA2_SurfaceType surfaceType;

        for (DWORD iTypeIndex = 0; ; iTypeIndex++)
        {
            hr = pVideoConfig->GetAvailableSurfaceTypeByIndex(iTypeIndex, &surfaceType);

            if (FAILED(hr))
            {
                break;
            }

            if (surfaceType == DXVA2_SurfaceType_DecoderRenderTarget)
            {
                hr = pVideoConfig->SetSurfaceType(DXVA2_SurfaceType_DecoderRenderTarget);
                break;
            }
        }
    }

    SafeRelease(&pGetService);
    SafeRelease(&pVideoConfig);

    return hr;
}

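  If the decoder finds a valid configuration and successfully notifies the video renderer, the decoder can use DXVA for decoding. The decoder must implement a custom allocator for its output pin, as described in the next section.
5. Allocating Uncompressed Buffers
  In DXVA 2.0, the decoder is responsible for allocating the Direct3D surfaces used as uncompressed video buffers. Therefore, the decoder must implement a custom allocator that creates the surfaces. The media samples provided by this allocator hold pointers to the Direct3D surfaces. The EVR retrieves a pointer to the surface by calling IMFGetService::GetService on the media sample. The service identifier is MR_BUFFER_SERVICE.
  To provide the custom allocator, perform the following steps:
  1) Define a class for the media samples. This class can derive from CMediaSample. Inside this class, do the following:
    a) Store a pointer to the Direct3D surface.
    b) Implement the IMFGetService interface. In the GetService method, if the service GUID is MR_BUFFER_SERVICE, query the Direct3D surface for the requested interface. Otherwise, GetService can return MF_E_UNSUPPORTED_SERVICE.
    c) Override the CMediaSample::GetPointer method to return E_NOTIMPL.
  2) Define a class for the allocator. The allocator can derive from the CBaseAllocator class. Inside this class, do the following:
    a) Override the CBaseAllocator::Alloc method. Inside this method, call IDirectXVideoAccelerationService::CreateSurface to create the surfaces. (The IDirectXVideoDecoderService interface inherits this method from IDirectXVideoAccelerationService.)
  3) In your filter's output pin, override the CBaseOutputPin::InitAllocator method. Inside this method, create an instance of your custom allocator.
  4) In your filter, implement the CTransformFilter::DecideBufferSize method. The pProperties parameter indicates the number of surfaces that the EVR requires. Add to this value the number of surfaces that your decoder requires, and call IMemAllocator::SetProperties on the allocator.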
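  The surface count arithmetic in step 4 can be sketched in portable C++ as follows. This is a minimal illustration under stated assumptions, not the filter's actual DecideBufferSize: AllocatorPropsStandIn is a stand-in that copies the field names of the DirectShow ALLOCATOR_PROPERTIES structure, and AddDecoderSurfaces and its decoderSurfaces parameter are hypothetical names. How many surfaces a decoder needs depends on the codec (for example, on how many reference frames it keeps).

```cpp
#include <cassert>

// Stand-in for the DirectShow ALLOCATOR_PROPERTIES structure (same field names).
struct AllocatorPropsStandIn
{
    long cBuffers;  // number of buffers; here, the number of Direct3D surfaces
    long cbBuffer;  // size of each buffer, in bytes
    long cbAlign;   // buffer alignment
    long cbPrefix;  // bytes reserved before each buffer
};

// Hypothetical helper: takes the properties requested by the renderer and
// returns the properties the decoder would pass to IMemAllocator::SetProperties.
AllocatorPropsStandIn AddDecoderSurfaces(AllocatorPropsStandIn requested,
                                         long decoderSurfaces)
{
    assert(decoderSurfaces >= 0);

    // Treat a nonsensical negative request as "no surfaces requested".
    if (requested.cBuffers < 0)
    {
        requested.cBuffers = 0;
    }

    // The pool must cover the surfaces the renderer holds plus the surfaces
    // the decoder keeps for its own use.
    requested.cBuffers += decoderSurfaces;
    return requested;
}
```

  For instance, if the renderer asks for 3 surfaces and the decoder keeps 4, the allocator is asked for 7. The other fields are passed through unchanged in this sketch.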
  The following code shows how to implement the media sample class:

class CDecoderSample : public CMediaSample, public IMFGetService
{
    friend class CDecoderAllocator;

public:

    CDecoderSample(CDecoderAllocator *pAlloc, HRESULT *phr)
        : CMediaSample(NAME("DecoderSample"), (CBaseAllocator*)pAlloc, phr, NULL, 0),
          m_pSurface(NULL),
          m_dwSurfaceId(0)
    {
    }

    // Release the surface reference that SetSurface added.
    ~CDecoderSample()
    {
        SafeRelease(&m_pSurface);
    }

    // Note: CMediaSample does not derive from CUnknown, so we cannot use the
    //       DECLARE_IUNKNOWN macro that is used by most of the filter classes.

    STDMETHODIMP QueryInterface(REFIID riid, void **ppv)
    {
        CheckPointer(ppv, E_POINTER);

        if (riid == IID_IMFGetService)
        {
            *ppv = static_cast<IMFGetService*>(this);
            AddRef();
            return S_OK;
        }
        else
        {
            return CMediaSample::QueryInterface(riid, ppv);
        }
    }

    STDMETHODIMP_(ULONG) AddRef()
    {
        return CMediaSample::AddRef();
    }

    STDMETHODIMP_(ULONG) Release()
    {
        // Return a temporary variable for thread safety.
        ULONG cRef = CMediaSample::Release();
        return cRef;
    }

    // IMFGetService::GetService
    STDMETHODIMP GetService(REFGUID guidService, REFIID riid, LPVOID *ppv)
    {
        if (guidService != MR_BUFFER_SERVICE)
        {
            return MF_E_UNSUPPORTED_SERVICE;
        }
        else if (m_pSurface == NULL)
        {
            return E_NOINTERFACE;
        }
        else
        {
            return m_pSurface->QueryInterface(riid, ppv);
        }
    }

    // Override GetPointer because this class does not manage a system memory buffer.
    // The EVR uses the MR_BUFFER_SERVICE service to get the Direct3D surface.
    STDMETHODIMP GetPointer(BYTE ** ppBuffer)
    {
        return E_NOTIMPL;
    }

private:

    // Sets the pointer to the Direct3D surface.
    void SetSurface(DWORD surfaceId, IDirect3DSurface9 *pSurf)
    {
        SafeRelease(&m_pSurface);

        m_pSurface = pSurf;
        if (m_pSurface)
        {
            m_pSurface->AddRef();
        }

        m_dwSurfaceId = surfaceId;
    }

    IDirect3DSurface9   *m_pSurface;
    DWORD               m_dwSurfaceId;
};


  The following code shows how to implement the allocator's Alloc method:

HRESULT CDecoderAllocator::Alloc()
{
    CAutoLock lock(this);

    HRESULT hr = S_OK;

    if (m_pDXVA2Service == NULL)
    {
        return E_UNEXPECTED;
    }

    hr = CBaseAllocator::Alloc();

    // If the requirements have not changed, do not reallocate.
    if (hr == S_FALSE)
    {
        return S_OK;
    }

    if (SUCCEEDED(hr))
    {
        // Free the old resources.
        Free();

        // Allocate a new array of pointers.
        m_ppRTSurfaceArray = new (std::nothrow) IDirect3DSurface9*[m_lCount];
        if (m_ppRTSurfaceArray == NULL)
        {
            hr = E_OUTOFMEMORY;
        }
        else
        {
            ZeroMemory(m_ppRTSurfaceArray, sizeof(IDirect3DSurface9*) * m_lCount);
        }
    }

    // Allocate the surfaces.
    // CreateSurface takes a back buffer count and creates that many surfaces
    // plus one, so m_lCount - 1 yields m_lCount surfaces.
    if (SUCCEEDED(hr))
    {
        hr = m_pDXVA2Service->CreateSurface(
            m_dwWidth,
            m_dwHeight,
            m_lCount - 1,
            (D3DFORMAT)m_dwFormat,
            D3DPOOL_DEFAULT,
            0,
            DXVA2_VideoDecoderRenderTarget,
            m_ppRTSurfaceArray,
            NULL
            );
    }

    if (SUCCEEDED(hr))
    {
        for (m_lAllocated = 0; m_lAllocated < m_lCount; m_lAllocated++)
        {
            CDecoderSample *pSample = new (std::nothrow) CDecoderSample(this, &hr);

            if (pSample == NULL)
            {
                hr = E_OUTOFMEMORY;
                break;
            }
            if (FAILED(hr))
            {
                break;
            }

            // Assign the Direct3D surface pointer and the index.
            pSample->SetSurface(m_lAllocated, m_ppRTSurfaceArray[m_lAllocated]);

            // Add to the sample list.
            m_lFree.Add(pSample);
        }
    }

    if (SUCCEEDED(hr))
    {
        m_bChanged = FALSE;
    }
    return hr;
}


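  The following code shows the allocator's Free method:

void CDecoderAllocator::Free()
{
    CMediaSample *pSample = NULL;

    // Delete every sample in the free list.
    do
    {
        pSample = m_lFree.RemoveHead();
        if (pSample)
        {
            delete pSample;
        }
    } while (pSample);

    // Release the surfaces and the array that holds them.
    if (m_ppRTSurfaceArray)
    {
        for (long i = 0; i < m_lAllocated; i++)
        {
            SafeRelease(&m_ppRTSurfaceArray[i]);
        }

        delete [] m_ppRTSurfaceArray;
        m_ppRTSurfaceArray = NULL; // Guard against a double delete if Free is called again.
    }
    m_lAllocated = 0;
}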

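  2) Release the IDirectXVideoDecoderService and IDirectXVideoDecoder pointers.
  The data structures used for DXVA 2.0 decoding operations are the same as for DXVA 1.0.
  After calling Execute, call IMemInputPin::Receive to deliver the frame to the video renderer, as with software decoding. The Receive method is asynchronous; after it returns, the decoder can continue decoding the next frame. The display driver prevents any decoding commands from overwriting the buffer while the buffer is in use. The decoder should not reuse a surface to decode another frame until the renderer has released the sample. When the renderer releases the sample, the allocator puts the sample back into its pool of available samples. To get the next available sample, call CBaseOutputPin::GetDeliveryBuffer, which in turn calls IMemAllocator::GetBuffer.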
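
  The reuse rule in the paragraph above (a surface may be decoded into again only after the renderer has released its sample) can be modeled with a small free-list pool. The sketch below is portable C++ and deliberately not DirectShow code: SurfacePoolSketch, TryGet, and Release are hypothetical names, and surfaces are represented by their index only. One simplification to note is that TryGet fails immediately when nothing is free, whereas a real allocator's GetBuffer would normally wait for a sample to come back.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical model of the allocator's pool of samples. Each sample is
// identified by the index of the surface it wraps.
class SurfacePoolSketch
{
public:
    explicit SurfacePoolSketch(std::size_t count) : m_total(count)
    {
        for (std::size_t i = 0; i < count; ++i)
        {
            m_free.push_back(i);
        }
    }

    // Plays the role of GetBuffer: hands out a free surface index, or returns
    // false when every surface is still held by the decoder or the renderer.
    bool TryGet(std::size_t *pIndex)
    {
        if (m_free.empty())
        {
            return false;
        }
        *pIndex = m_free.front();
        m_free.pop_front();
        return true;
    }

    // Plays the role of the final release of a sample: only now does the
    // surface become available for decoding another frame.
    void Release(std::size_t index)
    {
        assert(index < m_total);
        m_free.push_back(index);
    }

    std::size_t FreeCount() const { return m_free.size(); }

private:
    std::deque<std::size_t> m_free;   // indices of surfaces that may be reused
    std::size_t             m_total;  // total number of surfaces in the pool
};
```

  With a pool of two surfaces, a third TryGet fails until one of the first two is released, which is the situation the decoder is in while the renderer still holds earlier frames.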

