[WTL Study] - [Day 3] - Learning the THUNK Technique

Preface, as before: this post is only a record of my own thinking while learning. If anything is wrong, corrections are welcome.

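       Today I continued with Chapter 10 of ATL Internals: Working with ATL 8, Second Edition, by Christopher Tavares, Kirk Fertitta, Brent Rector, and Chris Sells, and studied some of the design ideas behind CWindowImpl. Reading the book with the source code open beside it was humbling. To learn a language well you also need a solid understanding of the compiler. That was true of ATL_NO_VTABLE in the previous post, and it is just as true of the THUNK covered today.
As before, I'll work through it in layers.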

First: why use a THUNK at all? What problem does it solve?
         Let's first see what the authors say:
To handle window messages, every window needs a window procedure (WndProc). This WndProc is set in the lpfnWndProc member of the WNDCLASSEX structure used during window registration. You might have noticed that in the expansion of DECLARE_WND_CLASS and DECLARE_WND_CLASS_EX, the name of the windows procedure is StartWindowProc. StartWindowProc is a static member function of CWindowImplBase. Its job is to establish the mapping between the CWindowImpl-derived object's HWND and the object's this pointer. The goal is to handle calls made by Windows to a WndProc global function and map them to a member function on an object. The mapping between HWND and an object's this pointer is done by the StartWindowProc when handling the first window message.[1] After the new HWND is cached in the CWindowImpl-derived object's member data, the object's real window procedure is substituted for the StartWindowProc.

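      [My understanding] A traditional Win32 program must register a window class before creating a window, by passing a WNDCLASS or WNDCLASSEX structure to the API function RegisterClass or RegisterClassEx. The most important part of registration is supplying the entry address of a callback function (WndProc), which receives and responds to window messages once registration has succeeded. The ATL designers wrapped these repetitive steps in classes, so that object-oriented development matches how people think: each window is abstracted as a window object that manages its own lifetime (registration, creation, destruction, style definition, and so on) and receives and responds to its own messages. But consider how Windows calls the window callback. The callback type is defined as typedef LRESULT (CALLBACK* WNDPROC)(HWND, UINT, WPARAM, LPARAM). Windows ties each call to a particular window only through the window handle, the HWND obtained when the window is created (for example hwnd = CreateWindow(...)) and passed back as the first argument. (Huh? At this point you may well be thinking: what does that have to do with writing the callback as a class method and calling it through the window object? Fine, let's try it with the example below.)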

    // thunkStudy1.cpp : Defines the entry point for the console application.
    #include <tchar.h>
    #include <Windows.h>
    #include <map>
    #include <assert.h>

    class Window
    {
    public:
        Window();
        ~Window();

    public:
        BOOL Create();
        LRESULT WndProc(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam);

    protected:
        HWND m_hWnd;
    };

    Window::Window()
    {
    }

    Window::~Window()
    {
    }

    BOOL Window::Create()
    {
        LPCTSTR lpszClassName = _T("ClassName");
        HINSTANCE hInstance = GetModuleHandle(NULL);

        WNDCLASSEX wcex = {sizeof(WNDCLASSEX)};
        wcex.style = CS_HREDRAW | CS_VREDRAW;
        wcex.lpfnWndProc = WndProc;   // a non-static member function used directly as the callback (this line does not compile)
        wcex.cbClsExtra = 0;
        wcex.cbWndExtra = 0;
        wcex.hInstance = hInstance;
        wcex.hbrBackground = (HBRUSH)(COLOR_WINDOW+1);
        wcex.lpszClassName = lpszClassName;
        RegisterClassEx(&wcex);

        m_hWnd = CreateWindow(lpszClassName, NULL, WS_OVERLAPPEDWINDOW,
            CW_USEDEFAULT, 0, CW_USEDEFAULT, 0, NULL, NULL, hInstance, NULL);
        if (m_hWnd == NULL)
            return FALSE;

        ShowWindow(m_hWnd, SW_SHOW);
        UpdateWindow(m_hWnd);
        return TRUE;
    }

    LRESULT Window::WndProc(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam)
    {
        switch (message)
        {
        case WM_LBUTTONUP:
            MessageBox(m_hWnd, _T("LButtonUp"), _T("Message"), MB_OK | MB_ICONINFORMATION);
            break;
        case WM_RBUTTONUP:
            MessageBox(m_hWnd, _T("RButtonUp"), _T("Message"), MB_OK | MB_ICONINFORMATION);
            break;
        case WM_DESTROY:
            PostQuitMessage(0);
            break;
        default:
            break;
        }
        return DefWindowProc(m_hWnd, message, wParam, lParam);
    }

    int APIENTRY _tWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPTSTR lpCmdLine, int nCmdShow)
    {
        Window wnd;
        wnd.Create();

        MSG msg;
        while (GetMessage(&msg, NULL, 0, 0))
        {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
        return (int)msg.wParam;
    }

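In the example above, the Window member function WndProc is passed directly to WNDCLASSEX as the callback at registration. The result is an error at compile time. To see why, you need to understand what the compiler does with a class. Put simply, every member function (except static member functions; I mention them on purpose, so mark this spot [*], we'll come back to it) has an extra, implicit parameter, this. When a member function is called, this is initialized with the address of the object on which the function was invoked. A member function cannot declare the this parameter itself; the compiler defines it implicitly. So you can think of the "callback" in the Window class as having been rewritten by the compiler into something like LRESULT WndProc(Window* this, HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam), whose type no longer matches WNDPROC. That is not the main topic here, and it is only my own understanding after some Googling. If it's wrong, corrections and discussion are welcome.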
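The type mismatch can be checked without any Windows headers. The following is a minimal sketch, not ATL code: Handle, Result, Callback, and call_through are hypothetical stand-ins invented for this illustration (Callback plays the role of WNDPROC, shortened to two parameters). It shows that a static member function has an ordinary function-pointer type, while a non-static one has a pointer-to-member type that does not convert to it.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for HWND and LRESULT so the sketch builds anywhere.
using Handle = void*;
using Result = long;

// Plain callback shape, analogous to WNDPROC (two parameters for brevity).
using Callback = Result (*)(Handle, unsigned);

struct Window {
    // Non-static: the compiler passes a hidden `this`.
    Result MemberProc(Handle, unsigned) { return 1; }
    // Static: no `this`, so the type is an ordinary function pointer.
    static Result StaticProc(Handle, unsigned) { return 2; }
};

static_assert(std::is_same<decltype(&Window::StaticProc), Callback>::value,
              "static member function matches the callback type");
static_assert(std::is_same<decltype(&Window::MemberProc),
                           Result (Window::*)(Handle, unsigned)>::value,
              "non-static member function has a pointer-to-member type");
static_assert(!std::is_convertible<decltype(&Window::MemberProc), Callback>::value,
              "pointer-to-member does not convert to the callback type");

// Like window class registration, this accepts only the plain callback type.
Result call_through(Callback cb, Handle h, unsigned msg) { return cb(h, msg); }

// Usage: only the static function can be handed over.
void demo() { assert(call_through(&Window::StaticProc, nullptr, 0) == 2); }
```

Passing &Window::MemberProc to call_through would be rejected by the compiler for the same reason the Create function above fails to build.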

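        OK, enough digression. Back to the topic: since a class member function cannot be used directly as the callback that receives window messages, how can the window procedure (WndProc) be wrapped into a class? Enter the THUNK. As quoted above, the job of StartWindowProc is "to establish the mapping between the CWindowImpl-derived object's HWND and the object's this pointer. The goal is to handle calls made by Windows to a WndProc global function and map them to a member function on an object. The mapping between HWND and an object's this pointer is done by the StartWindowProc when handling the first window message." The thunk is the mechanism it installs to carry out that mapping. The basic principle of a thunk is to allocate a small block of memory and set the window procedure (WndProc) to the address of that block. The code in the block replaces the window procedure's first argument (the window handle, HWND) with the object's this pointer and then jumps to the class's WindowProc function. That completes the conversion from a window procedure to a class member function.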

Now that we know what problem the THUNK solves, let's look at how it does it.

      First, what the THUNK does:

    // Step 1: window class registration names StartWindowProc as the callback.
    // (StartWindowProc is the lpfnWndProc entry in the WNDCLASSEX initializer below.)
    #define DECLARE_WND_CLASS_EX(WndClassName, style, bkgnd) \
    static CWndClassInfo& GetWndClassInfo() { \
        static CWndClassInfo wc = { \
            { sizeof(WNDCLASSEX), style, StartWindowProc, \
              0, 0, NULL, NULL, NULL, (HBRUSH)(bkgnd + 1), NULL, \
              WndClassName, NULL \
            }, \
            NULL, NULL, IDC_ARROW, TRUE, 0, _T("") \
        }; \
        return wc; \
    }

    // Declaration of the entry callback
    static LRESULT CALLBACK
    StartWindowProc(
        HWND hWnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam);

    // Declaration of WindowProc, the window class's actual message handler.
    // Through the thunk set up by StartWindowProc, its first parameter (HWND)
    // actually carries the object's this pointer.
    static LRESULT CALLBACK
    WindowProc(
        HWND hWnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam);

    ....

    // Step 2: when the first WM_XXX message arrives, control enters
    // StartWindowProc, the entry callback named at registration.
    template <class TBase, class TWinTraits>
    LRESULT CALLBACK
    CWindowImplBaseT< TBase, TWinTraits >::StartWindowProc(
        HWND hWnd,
        UINT uMsg,
        WPARAM wParam, LPARAM lParam)
    {
        CWindowImplBaseT< TBase, TWinTraits >* pThis =
            (CWindowImplBaseT< TBase, TWinTraits >*)
                _AtlWinModule.ExtractCreateWndData();
        ATLASSERT(pThis != NULL);
        pThis->m_hWnd = hWnd;
        // m_thunk is covered later; for now just note what it is for
        pThis->m_thunk.Init(pThis->GetWindowProc(), pThis);
        WNDPROC pProc = pThis->m_thunk.GetWNDPROC();
        WNDPROC pOldProc = (WNDPROC)::SetWindowLongPtr(hWnd,
            GWLP_WNDPROC, (LONG_PTR)pProc);
        return pProc(hWnd, uMsg, wParam, lParam);
    }

    // Step 3: the thunk installed by StartWindowProc replaces WindowProc's
    // first argument (HWND) with the object's this pointer and jumps to WindowProc.
    template <class TBase, class TWinTraits>
    LRESULT CALLBACK
    CWindowImplBaseT< TBase, TWinTraits >::WindowProc(HWND hWnd,
        UINT uMsg, WPARAM wParam, LPARAM lParam) {
        // The incoming hWnd has been replaced with the object's this pointer
        CWindowImplBaseT< TBase, TWinTraits >* pThis =
            (CWindowImplBaseT< TBase, TWinTraits >*)hWnd;
        ...
        // pass to the message map to process
        LRESULT lRes;
        // This is where the window message is finally handled
        BOOL bRet = pThis->ProcessWindowMessage(pThis->m_hWnd,
            uMsg, wParam, lParam, lRes, 0);
        ...
    }

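StartWindowProc is a static function, which is why static member functions were singled out earlier (at the [*] mark): having no this parameter, it can be used directly as a window procedure. As we can see, its parameters are exactly the four parameters of a window procedure. That was a lot of code, but the summary is simple. What the THUNK is really trying to do is this: call a class member function as the window procedure that responds to window messages. The idea is easier to see in code: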

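    // Conceptual sketch only, not the real ATL code
    template <class TBase, class TWinTraits>
    LRESULT CALLBACK CWindowImplBaseT< TBase, TWinTraits >::StartWindowProc(HWND hWnd,
        UINT uMsg, WPARAM wParam, LPARAM lParam)
    {
        // What the THUNK really does: replace the first argument hWnd with this,
        // then call a class member function to respond to the window message.
        ((CWindowImplBaseT<TBase, TWinTraits> *)hWnd)->ProcessWindowMessage(...);
    }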
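The effect of that cast can be mimicked in portable C++ without any machine code. The following is a rough sketch under stated assumptions, not how ATL implements it: Handle, the simplified Window class, SoftThunk, and dispatch_demo are all hypothetical names made up for this illustration, and SoftThunk is an ordinary struct standing in for the real thunk, which is generated x86 instructions.

```cpp
#include <cassert>

using Handle = void*;  // hypothetical stand-in for HWND

class Window {
public:
    explicit Window(Handle h) : m_hWnd(h) {}

    // Static, so its type matches a plain callback. When it runs, the first
    // argument no longer holds a handle: it holds the object's address.
    static long WindowProc(Handle hWnd, unsigned msg) {
        Window* pThis = static_cast<Window*>(hWnd);
        return pThis->ProcessWindowMessage(pThis->m_hWnd, msg);
    }

    // Ordinary member function that does the real work.
    long ProcessWindowMessage(Handle realHandle, unsigned msg) {
        m_lastHandle = realHandle;
        return static_cast<long>(msg) * 2;
    }

    Handle lastHandle() const { return m_lastHandle; }

private:
    Handle m_hWnd;                  // the real handle, cached in the object
    Handle m_lastHandle = nullptr;  // records what the handler received
};

// Software stand-in for the machine-code thunk: it ignores the handle the
// "system" passes in and forwards the stored object pointer instead.
struct SoftThunk {
    long (*proc)(Handle, unsigned);
    void* pThis;
    long Invoke(Handle /*hWndFromSystem*/, unsigned msg) const {
        return proc(pThis, msg);
    }
};

long dispatch_demo(unsigned msg) {
    int token = 0;
    Handle realHandle = &token;  // pretend window handle
    Window w(realHandle);
    SoftThunk thunk{&Window::WindowProc, &w};
    long result = thunk.Invoke(realHandle, msg);
    assert(w.lastHandle() == realHandle);  // the object still knows its real handle
    return result;
}
```

Because the object caches its real handle in m_hWnd, nothing is lost when the first argument is overwritten, which is why StartWindowProc stores hWnd in the object before installing the thunk.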
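That strips the THUNK bare and shows its real intent. Next, let's see how that goal is reached step by step. After all this talk, we still haven't seen what a THUNK looks like.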
A thunk is a set of machine instructions built on-the-fly and executed. The thunk's job is to replace the HWND on the stack with the CWindowImpl object's this pointer before calling the CWindowImpl static member function WindowProc to further process the message. The ASM instructions that replace the HWND with the object's this pointer are kept in a data structure called the _stdcallthunk. Versions of this structure are defined for 32-bit x86, AMD64, ALPHA, MIPS, SH3, ARM, and IA64 (Itanium) processors. The 32-bit x86 definition follows:

    #if defined(_M_IX86)
    PVOID __stdcall __AllocStdCallThunk(VOID);
    VOID  __stdcall __FreeStdCallThunk(PVOID);

    // I won't go into this compiler directive here; search for "byte alignment"
    // for details, since covering it fully would take a book. In short, pack(1)
    // removes padding so the members sit back to back as contiguous machine code.
    #pragma pack(push,1)
    struct _stdcallthunk {
        DWORD   m_mov;     // mov dword ptr [esp+0x4], pThis
                           // (esp+0x4 is hWnd)
        DWORD   m_this;    // Our CWindowImpl this pointer
        BYTE    m_jmp;     // jmp WndProc
        DWORD   m_relproc; // relative jmp

        BOOL Init(DWORD_PTR proc, void* pThis) {
            m_mov = 0x042444C7; // bytes C7 44 24 04 (the comment in the original listing reads 0C, but the value encodes 04)
            m_this = PtrToUlong(pThis);
            m_jmp = 0xe9;
            m_relproc = DWORD((INT_PTR)proc - ((INT_PTR)this+sizeof(_stdcallthunk)));
            // write block from data cache and
            //  flush from instruction cache
            FlushInstructionCache(GetCurrentProcess(), this,
                sizeof(_stdcallthunk));
            return TRUE;
        }

        // some thunks will dynamically allocate the
        // memory for the code
        void* GetCodeAddress() {
            return this;
        }
        void* operator new(size_t) {
            return __AllocStdCallThunk();
        }

        void operator delete(void* pThunk) {
            __FreeStdCallThunk(pThunk);
        }
    };
    #pragma pack(pop)

    #elif defined(_M_AMD64)
    ... // Other processors omitted for clarity

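The excerpt quoted with the code above already states what this THUNK struct is for: it is a struct that lives in data memory, and the machine code (assembly instructions) written into it by the Init function is what gets executed. Before the static message handler (WindowProc) of the class (a class derived from CWindowImpl) is called, it replaces WindowProc's first argument, hWnd, with the object's this pointer. There is one more point to know here. Starting with Windows XP SP2, to counter the endless stream of buffer overflow attacks, Windows introduced a feature called Data Execution Prevention (DEP). When DEP is enabled, data on the heap and on the stack cannot be executed, so a thunk placed in memory obtained from an ordinary new would crash as soon as it ran. That is why the THUNK struct overloads operator new and operator delete, routing allocation and release through __AllocStdCallThunk and __FreeStdCallThunk, which hand out memory that is allowed to execute (for a deeper analysis, see here). If it's unclear what Init in the struct above is doing, here is another way to look at it: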

    BOOL Init(DWORD_PTR proc, void* pThis) {
        // The machine code for "mov dword ptr [esp+4], imm32" is C7 44 24 04.
        // The DWORD that follows is mov's second operand, i.e. pThis.
        m_mov = 0x042444C7;
        m_this = PtrToUlong(pThis);
        // The machine code for jmp is E9. The DWORD that follows is the
        // relative jump offset (the value of m_relproc).
        m_jmp = 0xe9;
        m_relproc = DWORD((INT_PTR)proc - ((INT_PTR)this+sizeof(_stdcallthunk)));
        ....
        return TRUE;
    }

which translates to the following assembly:

    __asm
    {
        mov dword ptr [esp+4], pThis  ; on entry the stack holds: RetAddr, hWnd, message, wParam, lParam, ... so [esp+4] is hWnd
        jmp WndProc                   ; encoded as the relative offset DWORD((INT_PTR)proc-((INT_PTR)this+sizeof(_stdcallthunk)))
    }

As for how the relative jump offset is computed, see the article C++ Thunk技术(初学版) (C++ Thunk Technique, Beginner's Edition).
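The calculation itself is short enough to work through by hand. The jmp rel32 instruction (opcode E9) jumps relative to the address of the instruction that follows it, and since jmp is the last instruction in the 13-byte thunk, that address is the thunk's start plus sizeof(_stdcallthunk). The sketch below reproduces the arithmetic and the byte layout in portable C++. The names encode_thunk, rel32, put32, and kThunkSize, and every address used with them, are hypothetical values chosen only for illustration; the code builds the bytes but never executes them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 4 (mov opcode and modrm bytes) + 4 (this) + 1 (jmp opcode) + 4 (rel32)
constexpr std::size_t kThunkSize = 13;
using ThunkBytes = std::array<std::uint8_t, kThunkSize>;

// Store a 32-bit value in little-endian order, as x86 expects.
void put32(ThunkBytes& b, std::size_t at, std::uint32_t v) {
    for (std::size_t i = 0; i < 4; ++i)
        b[at + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Offset from the end of the thunk to the target procedure. Unsigned
// wraparound yields the two's-complement encoding for backward jumps.
std::uint32_t rel32(std::uint32_t thunkAddr, std::uint32_t procAddr) {
    return procAddr - (thunkAddr + static_cast<std::uint32_t>(kThunkSize));
}

// Lay out the same 13 bytes that _stdcallthunk::Init produces on 32-bit x86.
ThunkBytes encode_thunk(std::uint32_t thunkAddr, std::uint32_t thisAddr,
                        std::uint32_t procAddr) {
    ThunkBytes b{};
    put32(b, 0, 0x042444C7u);                 // C7 44 24 04: mov dword ptr [esp+4], imm32
    put32(b, 4, thisAddr);                    // imm32 = this pointer
    b[8] = 0xE9;                              // jmp rel32
    put32(b, 9, rel32(thunkAddr, procAddr));  // displacement to WindowProc
    return b;
}
```

With a thunk at the made-up address 0x00400000 and a procedure at 0x00401000, the displacement is 0x1000 - 13 = 0x00000FF3. A procedure located below the thunk gives a negative displacement stored in two's complement.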

After the thunk has been set up in StartWindowProc, each window message is routed from the CWindowImpl object's thunk to a static member function of CWindowImpl, to a member function of the CWindowImpl object itself.

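Closing words: I wrote this post on and off over about a week. Shameless amounts of overtime aside, the main reason is that I wanted to sort out my process and thinking in learning the THUNK mechanism by writing it down. My skill is limited, though, and the result feels like it starts strong and ends weak. I wanted to cover everything, but explaining every point fully seemed like it would never end. Fortunately the web is rich in resources, so consider this post a starting point for better ones. Again, if anything is wrong, corrections are welcome.
Finally, thanks to the authors of the following articles for sharing: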
http://blog.csdn.net/maplewasp/article/details/2343819
http://www.cnblogs.com/jasonsun/archive/2010/08/11/1797462.html
http://www.cppblog.com/Streamlet/archive/2010/10/24/131064.html
http://www.cnblogs.com/georgepei/archive/2012/03/30/2425472.html