An Analysis of WebKit in Qt

Source: Internet. Editor: 程序博客网. Time: 2024/04/30 15:19

An Analysis of WebKit in Qt (Part 1)

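WebKit is a third-party component newly integrated into Qt 4. As usual, before digging into the code I first get a rough overview. The overview below is reposted from elsewhere.
WebKit consists of three modules: JavaScriptCore, WebCore, and WebKit. WebKit is also used as the name of the whole project.
Its directory structure (not verified):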

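WebCore

- Page: content related to the outer frame (Frame, Page, History, Focus, Window)

- Loader: resource loading and the cache

- HTML: DOM HTML content and parsing

- DOM: DOM Core content

- XML: XML content and parsing

- Render: layout

- CSS: DOM CSS content

- Binding: bindings between the DOM and JavaScriptCore

- Editing: everything related to editing

JavaScriptCore: the JavaScript engine

- API: basic JavaScript functionality

- Binding: bindings to other facilities, such as DOM, C, JNI

- DerivedSources: automatically generated code

- ForwardingHeaders: header files with no real content of their own

- PCRE: Perl-Compatible Regular Expressions

- KJS: the JavaScript kernel

- WTF: KDE's C++ template library

Unicode: Unicode library

Tools: tools library

CURL: URL client transfer library

Platform: platform-specific functionality such as graphics and images, fonts, Unicode, I/O, input methods, and so on.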

The examples shipped with Qt include several for WebKit. I picked previewer as the project to analyze.

An Analysis of WebKit in Qt (Part 2)

previewer is an example shipped with Qt. This is what it looks like when running:
[Screenshot of the previewer example, from net_worm's blog]

I traced the code by entering a URL. Below is the call stack saved at a breakpoint, kept here for reference.

     QtWebKitd4.dll!WebCore::MainResourceLoader::loadNow(WebCore::ResourceRequest & r={...}) 行458    C++
     QtWebKitd4.dll!WebCore::MainResourceLoader::load(const WebCore::ResourceRequest & r={...}, const WebCore::SubstituteData & substituteData={...}) 行494 + 0x12 字节    C++
     QtWebKitd4.dll!WebCore::DocumentLoader::startLoadingMainResource(unsigned long identifier=0x00000004) 行807 + 0x32 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::continueLoadAfterWillSubmitForm(WebCore::PolicyAction __formal=PolicyUse) 行3274 + 0x16 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::continueLoadAfterNavigationPolicy(const WebCore::ResourceRequest & __formal={...}, WTF::PassRefPtr<WebCore::FormState> formState={...}, bool shouldContinue=true) 行3968    C++
     QtWebKitd4.dll!WebCore::FrameLoader::callContinueLoadAfterNavigationPolicy(void * argument=0x01d424e8, const WebCore::ResourceRequest & request={...}, WTF::PassRefPtr<WebCore::FormState> formState={...}, bool shouldContinue=true) 行3906    C++
     QtWebKitd4.dll!WebCore::PolicyCheck::call(bool shouldContinue=true) 行4963 + 0x3b 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::continueAfterNavigationPolicy(WebCore::PolicyAction policy=PolicyUse) 行3899    C++
     QtWebKitd4.dll!WebCore::FrameLoaderClientQt::slotCallPolicyFunction(int action=0x00000000) 行194    C++
     QtWebKitd4.dll!WebCore::FrameLoaderClientQt::dispatchDecidePolicyForNavigationAction(void (WebCore::PolicyAction)* function=0x10018f0c, const WebCore::NavigationAction & action={...}, const WebCore::ResourceRequest & request={...}, WTF::PassRefPtr<WebCore::FormState> __formal={...}) 行938    C++
     QtWebKitd4.dll!WebCore::FrameLoader::checkNavigationPolicy(const WebCore::ResourceRequest & request={...}, WebCore::DocumentLoader * loader=0x00f63ff8, WTF::PassRefPtr<WebCore::FormState> formState={...}, void (void *, const WebCore::ResourceRequest &, WTF::PassRefPtr<WebCore::FormState>, bool)* function=0x1004e661, void * argument=0x01d424e8) 行3868    C++
     QtWebKitd4.dll!WebCore::FrameLoader::loadWithDocumentLoader(WebCore::DocumentLoader * loader=0x00f63ff8, WebCore::FrameLoadType type=FrameLoadTypeRedirectWithLockedHistory, WTF::PassRefPtr<WebCore::FormState> prpFormState={...}) 行2291    C++
     QtWebKitd4.dll!WebCore::FrameLoader::loadWithNavigationAction(const WebCore::ResourceRequest & request={...}, const WebCore::NavigationAction & action={...}, WebCore::FrameLoadType type=FrameLoadTypeRedirectWithLockedHistory, WTF::PassRefPtr<WebCore::FormState> formState={...}) 行2226    C++
     QtWebKitd4.dll!WebCore::FrameLoader::loadURL(const WebCore::KURL & newURL={...}, const WebCore::String & referrer={...}, const WebCore::String & frameName={...}, WebCore::FrameLoadType newLoadType=FrameLoadTypeRedirectWithLockedHistory, WebCore::Event * event=0x00000000, WTF::PassRefPtr<WebCore::FormState> prpFormState={...}) 行2174    C++
     QtWebKitd4.dll!WebCore::FrameLoaderClientQt::createFrame(const WebCore::KURL & url={...}, const WebCore::String & name={...}, WebCore::HTMLFrameOwnerElement * ownerElement=0x00f681a0, const WebCore::String & referrer={...}, bool allowsScrolling=false, int marginWidth=0xffffffff, int marginHeight=0xffffffff) 行981 + 0x70 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::loadSubframe(WebCore::HTMLFrameOwnerElement * ownerElement=0x00f681a0, const WebCore::KURL & url={...}, const WebCore::String & name={...}, const WebCore::String & referrer={...}) 行472 + 0x74 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::requestFrame(WebCore::HTMLFrameOwnerElement * ownerElement=0x00f681a0, const WebCore::String & urlString={...}, const WebCore::AtomicString & frameName={...}) 行442 + 0x29 字节    C++
     QtWebKitd4.dll!WebCore::HTMLFrameElementBase::openURL() 行105    C++
     QtWebKitd4.dll!WebCore::HTMLFrameElementBase::setNameAndOpenURL() 行161    C++
     QtWebKitd4.dll!WebCore::HTMLFrameElementBase::setNameAndOpenURLCallback(WebCore::Node * n=0x00f681a0) 行166    C++
     QtWebKitd4.dll!WebCore::ContainerNode::dispatchPostAttachCallbacks() 行572 + 0x7 字节    C++
     QtWebKitd4.dll!WebCore::ContainerNode::attach() 行587    C++
     QtWebKitd4.dll!WebCore::Element::attach() 行648    C++
     QtWebKitd4.dll!WebCore::HTMLFrameElementBase::attach() 行194    C++
     QtWebKitd4.dll!WebCore::HTMLFrameElement::attach() 行67    C++
     QtWebKitd4.dll!WebCore::HTMLParser::insertNode(WebCore::Node * n=0x00f681a0, bool flat=false) 行351    C++
     QtWebKitd4.dll!WebCore::HTMLParser::parseToken(WebCore::Token * t=0x00f65fd0) 行256 + 0x19 字节    C++
>    QtWebKitd4.dll!WebCore::HTMLTokenizer::processToken() 行1902 + 0x20 字节    C++
     QtWebKitd4.dll!WebCore::HTMLTokenizer::parseTag(WebCore::SegmentedString & src={...}, WebCore::HTMLTokenizer::State state={...}) 行1484 + 0x12 字节    C++
     QtWebKitd4.dll!WebCore::HTMLTokenizer::write(const WebCore::SegmentedString & str={...}, bool appendData=true) 行1730 + 0x23 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::write(const char * str=0x01d3f5c0, int len=0x000001df, bool flush=false) 行1039 + 0x23 字节    C++
     QtWebKitd4.dll!WebCore::FrameLoader::addData(const char * bytes=0x01d3f5c0, int length=0x000001df) 行1891    C++
     QtWebKitd4.dll!WebCore::FrameLoaderClientQt::committedLoad(WebCore::DocumentLoader * loader=0x00f881e0, const char * data=0x01d3f5c0, int length=0x000001df) 行680    C++
     QtWebKitd4.dll!WebCore::FrameLoader::committedLoad(WebCore::DocumentLoader * loader=0x00f881e0, const char * data=0x01d3f5c0, int length=0x000001df) 行3513    C++
     QtWebKitd4.dll!WebCore::DocumentLoader::commitLoad(const char * data=0x01d3f5c0, int length=0x000001df) 行356    C++
     QtWebKitd4.dll!WebCore::DocumentLoader::receivedData(const char * data=0x01d3f5c0, int length=0x000001df) 行368    C++
     QtWebKitd4.dll!WebCore::FrameLoader::receivedData(const char * data=0x01d3f5c0, int length=0x000001df) 行2342    C++
     QtWebKitd4.dll!WebCore::MainResourceLoader::addData(const char * data=0x01d3f5c0, int length=0x000001df, bool allAtOnce=false) 行147    C++
     QtWebKitd4.dll!WebCore::ResourceLoader::didReceiveData(const char * data=0x01d3f5c0, int length=0x000001df, __int64 lengthReceived=0x00000000000001df, bool allAtOnce=false) 行267    C++
     QtWebKitd4.dll!WebCore::MainResourceLoader::didReceiveData(const char * data=0x01d3f5c0, int length=0x000001df, __int64 lengthReceived=0x00000000000001df, bool allAtOnce=false) 行342    C++
     QtWebKitd4.dll!WebCore::ResourceLoader::didReceiveData(WebCore::ResourceHandle * __formal=0x00fb9aa0, const char * data=0x01d3f5c0, int length=0x000001df, int lengthReceived=0x000001df) 行418    C++
     QtWebKitd4.dll!WebCore::QNetworkReplyHandler::forwardData() 行341    C++
     QtWebKitd4.dll!WebCore::QNetworkReplyHandler::qt_metacall(QMetaObject::Call _c=InvokeMetaMethod, int _id=0x00000002, void * * _a=0x00fba378) 行74    C++
     QtCored4.dll!QMetaCallEvent::placeMetaCall(QObject * object=0x00f810d0) 行478    C++
     QtCored4.dll!QObject::event(QEvent * e=0x01d3ee18) 行1102 + 0x14 字节    C++
     QtGuid4.dll!QApplicationPrivate::notify_helper(QObject * receiver=0x00f810d0, QEvent * e=0x01d3ee18) 行4065 + 0x11 字节    C++
     QtGuid4.dll!QApplication::notify(QObject * receiver=0x00f810d0, QEvent * e=0x01d3ee18) 行3605 + 0x10 字节    C++
     QtCored4.dll!QCoreApplication::notifyInternal(QObject * receiver=0x00f810d0, QEvent * event=0x01d3ee18) 行610 + 0x15 字节    C++
     QtCored4.dll!QCoreApplication::sendEvent(QObject * receiver=0x00f810d0, QEvent * event=0x01d3ee18) 行213 + 0x39 字节    C++
     QtCored4.dll!QCoreApplicationPrivate::sendPostedEvents(QObject * receiver=0x00000000, int event_type=0x00000000, QThreadData * data=0x00e78f60) 行1247 + 0xd 字节    C++
     QtCored4.dll!QEventDispatcherWin32::processEvents(QFlags<enum QEventLoop::ProcessEventsFlag> flags={...}) 行679 + 0x10 字节    C++
     QtGuid4.dll!QGuiEventDispatcherWin32::processEvents(QFlags<enum QEventLoop::ProcessEventsFlag> flags={...}) 行1182 + 0x15 字节    C++
     QtCored4.dll!QEventLoop::processEvents(QFlags<enum QEventLoop::ProcessEventsFlag> flags={...}) 行150    C++
     QtCored4.dll!QEventLoop::exec(QFlags<enum QEventLoop::ProcessEventsFlag> flags={...}) 行201 + 0x2d 字节    C++
     QtCored4.dll!QCoreApplication::exec() 行888 + 0x15 字节    C++
     QtGuid4.dll!QApplication::exec() 行3526    C++
     previewer.exe!main(int argc=0x00000001, char * * argv=0x00e78e20) 行51 + 0x6 字节    C++
     previewer.exe!WinMain(HINSTANCE__ * instance=0x00400000, HINSTANCE__ * prevInstance=0x00000000, char * __formal=0x001520d9, int cmdShow=0x00000001) 行137 + 0x12 字节    C++
     previewer.exe!__tmainCRTStartup() 行574 + 0x35 字节    C
     previewer.exe!WinMainCRTStartup() 行399    C
     kernel32.dll!7c82f23b()
     [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]

An Analysis of WebKit in Qt (Part 3)

I analyze QWebView in three stages: initialization (fetching the data), HTML parsing, and page display. From the documentation shipped with Qt we know:

QWebView -> QWebPage => QWebFrame (one QWebPage contains multiple QWebFrames)
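
To make the containment relation concrete, here is a minimal sketch in plain C++. The names View, Page, and Frame are hypothetical stand-ins, not the Qt classes; the sketch only assumes that a view owns one page, a page owns a main frame, and frames may own sub-frames.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for QWebFrame: holds a URL and its sub-frames.
struct Frame {
    std::string url;
    std::vector<std::unique_ptr<Frame>> children;  // e.g. created for <frame>/<iframe>
    void setUrl(const std::string& u) { url = u; }
};

// Hypothetical stand-in for QWebPage: owns the main frame.
class Page {
public:
    Page() : m_mainFrame(new Frame) {}
    Frame* mainFrame() { return m_mainFrame.get(); }
    // Number of frames reachable from the main frame, including itself.
    std::size_t frameCount() const { return count(*m_mainFrame); }
private:
    static std::size_t count(const Frame& f) {
        std::size_t n = 1;
        for (const auto& child : f.children)
            n += count(*child);
        return n;
    }
    std::unique_ptr<Frame> m_mainFrame;
};

// Hypothetical stand-in for QWebView: owns the page and only forwards calls.
class View {
public:
    View() : m_page(new Page) {}
    Page* page() { return m_page.get(); }
    void setUrl(const std::string& u) {
        assert(m_page && m_page->mainFrame());
        m_page->mainFrame()->setUrl(u);  // the main frame does the actual work
    }
private:
    std::unique_ptr<Page> m_page;
};
```

The point of the sketch is the direction of delegation: the view holds no URL state of its own, so everything interesting happens at the frame level.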

After choosing Open URL in the UI and entering a URL, the function that gets called is void MainWindow::openUrl():

void MainWindow::openUrl()
{
    bool ok;
    QString url = QInputDialog::getText(this, tr("Enter a URL"),
                  tr("URL:"), QLineEdit::Normal, "http://", &ok);

    if (ok && !url.isEmpty()) {
        centralWidget->webView->setUrl(url);
    }
}

This calls QWebView::setUrl():

void QWebView::setUrl(const QUrl &url)
{
    page()->mainFrame()->setUrl(url);
}

Here page() returns the QWebPage pointer, and QWebPage::mainFrame() returns the QWebFrame pointer.

So the call ends up in QWebFrame::setUrl():

void QWebFrame::setUrl(const QUrl &url)
{
    d->frame->loader()->begin(ensureAbsoluteUrl(url));
    d->frame->loader()->end();
    load(ensureAbsoluteUrl(url));
}

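ensureAbsoluteUrl() makes sure the URL is an absolute (complete) URL. A relative URL here means a web address entered without a prefix such as http:// or https://. Let's look at the first statement first; it involves an implicit conversion from QUrl to KURL.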
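
As a rough illustration of the "has a prefix or not" test described above, here is a small sketch in plain C++. It is not Qt's ensureAbsoluteUrl(); the helper names hasScheme and isRelativeUrl are hypothetical, and the sketch only checks for a URI scheme using the grammar from RFC 3986.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if `s` starts with a URI scheme such as "http:" or "file:".
// Scheme grammar (RFC 3986): ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":"
bool hasScheme(const std::string& s) {
    if (s.empty() || !std::isalpha(static_cast<unsigned char>(s[0])))
        return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c == ':')
            return true;   // reached the end of a well-formed scheme
        if (!std::isalnum(c) && c != '+' && c != '-' && c != '.')
            return false;  // a character that cannot be part of a scheme
    }
    return false;          // no ':' found at all
}

// In this sketch a URL counts as relative when it carries no scheme.
bool isRelativeUrl(const std::string& s) {
    return !hasScheme(s);
}
```

What to do with a relative URL once it is detected is deliberately left out, since the text above does not say how the real function completes it.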

void FrameLoader::begin(const KURL& url, bool dispatch, SecurityOrigin* origin)
{
    // We need to take a reference to the security origin because |clear|
    // might destroy the document that owns it.
    RefPtr<SecurityOrigin> forcedSecurityOrigin = origin;

    bool resetScripting = !(m_isDisplayingInitialEmptyDocument && m_frame->document() && m_frame->document()->securityOrigin()->isSecureTransitionTo(url));
    clear(resetScripting, resetScripting);      // Clear the data from the previous load to prepare for this one
    if (resetScripting)
        m_frame->script()->updatePlatformScriptObjects();    // On Windows this is an empty function
    if (dispatch)
        dispatchWindowObjectAvailable();

    m_needsClear = true;
    m_isComplete = false;
    m_didCallImplicitClose = false;
    m_isLoadingMainResource = true;
    m_isDisplayingInitialEmptyDocument = m_creatingInitialEmptyDocument;

    KURL ref(url);
    ref.setUser(String());
    ref.setPass(String());
    ref.setRef(String());
    m_outgoingReferrer = ref.string();
    m_URL = url;

    RefPtr<Document> document;

    if (!m_isDisplayingInitialEmptyDocument && m_client->shouldUsePluginDocument(m_responseMIMEType))
        document = PluginDocument::create(m_frame);
    else
        // Create the DOM document; the concrete class depends on m_responseMIMEType:
        //   "text/html"                       -> HTMLDocument
        //   "application/xhtml+xml"           -> Document
        //   "application/x-ftp-directory"     -> FTPDirectoryDocument
        //   "text/vnd.wap.wml"                -> WMLDocument (wireless)
        //   "application/pdf" / "text/plain"  -> PluginDocument
        //   MediaPlayer::supportsType(type)   -> MediaDocument
        //   "image/svg+xml"                   -> SVGDocument
        document = DOMImplementation::createDocument(m_responseMIMEType, m_frame, m_frame->inViewSourceMode());

    m_frame->setDocument(document);

    document->setURL(m_URL);
    if (m_decoder)
        document->setDecoder(m_decoder.get());
    if (forcedSecurityOrigin)
        document->setSecurityOrigin(forcedSecurityOrigin.get());

    m_frame->domWindow()->setURL(document->url());
    m_frame->domWindow()->setSecurityOrigin(document->securityOrigin());

    updatePolicyBaseURL(); // Update the policy base URL

    Settings* settings = document->settings();
    document->docLoader()->setAutoLoadImages(settings && settings->loadsImagesAutomatically());

    if (m_documentLoader) {
        String dnsPrefetchControl = m_documentLoader->response().httpHeaderField("X-DNS-Prefetch-Control");
        if (!dnsPrefetchControl.isEmpty())
            document->parseDNSPrefetchControlHeader(dnsPrefetchControl);
    }

#if FRAME_LOADS_USER_STYLESHEET
    KURL userStyleSheet = settings ? settings->userStyleSheetLocation() : KURL();
    if (!userStyleSheet.isEmpty())
        m_frame->setUserStyleSheetLocation(userStyleSheet);
#endif

    restoreDocumentState();

    document->implicitOpen();

    if (m_frame->view())
        m_frame->view()->setContentsSize(IntSize());

#if USE(LOW_BANDWIDTH_DISPLAY)
    // Low bandwidth display is a first pass display without external resources
    // used to give an instant visual feedback. We currently only enable it for
    // HTML documents in the top frame.
    if (document->isHTMLDocument() && !m_frame->tree()->parent() && m_useLowBandwidthDisplay) {
        m_pendingSourceInLowBandwidthDisplay = String();
        m_finishedParsingDuringLowBandwidthDisplay = false;
        m_needToSwitchOutLowBandwidthDisplay = false;
        document->setLowBandwidthDisplay(true);
    }
#endif
}
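
The MIME-type dispatch noted in the comments inside begin() can be sketched as a small lookup. This is only an illustration of the cases listed there, not WebCore's DOMImplementation::createDocument(); the enum DocumentKind and the function documentKindFor are hypothetical names.

```cpp
#include <string>

// Hypothetical tags standing in for the WebCore document classes named above.
enum class DocumentKind {
    Html, Generic, FtpDirectory, Wml, Plugin, Svg, Unknown
};

// Maps a response MIME type to a document kind, following only the cases
// listed in the comments above. The MediaPlayer::supportsType() branch and
// WebCore's real fallback behaviour are left out of this sketch.
DocumentKind documentKindFor(const std::string& mimeType) {
    if (mimeType == "text/html")
        return DocumentKind::Html;
    if (mimeType == "application/xhtml+xml")
        return DocumentKind::Generic;
    if (mimeType == "application/x-ftp-directory")
        return DocumentKind::FtpDirectory;
    if (mimeType == "text/vnd.wap.wml")
        return DocumentKind::Wml;
    if (mimeType == "application/pdf" || mimeType == "text/plain")
        return DocumentKind::Plugin;
    if (mimeType == "image/svg+xml")
        return DocumentKind::Svg;
    return DocumentKind::Unknown;
}
```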

Now look at the code of document->implicitOpen(), which begin() calls:

void Document::implicitOpen()
{
    cancelParsing();

    clear();
    m_tokenizer = createTokenizer();
    setParsing(true);
}

Tokenizer *HTMLDocument::createTokenizer()
{
    bool reportErrors = false;
    if (frame())
        if (Page* page = frame()->page())
            reportErrors = page->inspectorController()->windowVisible();

    return new HTMLTokenizer(this, reportErrors);
}

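The newly created HTMLTokenizer object is the HTML parser.

Back to the second statement of QWebFrame::setUrl(): d->frame->loader()->end();

It only finishes whatever parsing was left over from the previous load: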

void FrameLoader::endIfNotLoadingMainResource()
{
    if (m_isLoadingMainResource || !m_frame->page())
        return;

    // http://bugs.webkit.org/show_bug.cgi?id=10854
    // The frame's last ref may be removed and it can be deleted by checkCompleted(),
    // so we'll add a protective refcount
    RefPtr<Frame> protector(m_frame);

    // make sure nothing's left in there
    if (m_frame->document()) {
        write(0, 0, true);
        m_frame->document()->finishParsing();
    } else
        // WebKit partially uses WebCore when loading non-HTML docs. In these cases doc==nil, but
        // WebCore is enough involved that we need to checkCompleted() in order for m_bComplete to
        // become true. An example is when a subframe is a pure text doc, and that subframe is the
        // last one to complete.
        checkCompleted();
}
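
The "protective refcount" idiom in the function above (RefPtr<Frame> protector(m_frame)) deserves a closer look. Below is a minimal sketch of the same idea using std::shared_ptr instead of WebKit's RefPtr. The Loader class and its members are hypothetical, and unlike the original the sketch protects the object itself rather than a separate frame.

```cpp
#include <functional>
#include <memory>

// Hypothetical object whose completion callback may drop the last outside
// reference to it while one of its member functions is still running.
class Loader : public std::enable_shared_from_this<Loader> {
public:
    std::function<void()> onCompleted;  // the owner may release its pointer in here
    bool finished = false;

    void end() {
        // Take an extra reference for the duration of this call, in the
        // spirit of `RefPtr<Frame> protector(m_frame)` above.
        std::shared_ptr<Loader> protector = shared_from_this();
        if (onCompleted)
            onCompleted();   // may release the owner's reference
        finished = true;     // still safe: `protector` keeps the object alive
    }                        // `protector` goes away here; the object may be destroyed now
};
```

Without the local protector, the write to finished after the callback would touch a destroyed object whenever the callback releases the last reference.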

Now the third statement of QWebFrame::setUrl(): load(ensureAbsoluteUrl(url));

void QWebFrame::load(const QUrl &url)
{
    load(QNetworkRequest(ensureAbsoluteUrl(url)));
}

A new QNetworkRequest object is created, and then this overload is called:

    void load(const QNetworkRequest &request,
              QNetworkAccessManager::Operation operation = QNetworkAccessManager::GetOperation,
              const QByteArray &body = QByteArray());
Its code:

void QWebFrame::load(const QNetworkRequest &req,
                     QNetworkAccessManager::Operation operation,
                     const QByteArray &body)
{
    if (d->parentFrame())
        d->page->d->insideOpenCall = true;

    QUrl url = ensureAbsoluteUrl(req.url());

    WebCore::ResourceRequest request(url);

    switch (operation) {
        case QNetworkAccessManager::HeadOperation:
            request.setHTTPMethod("HEAD");
            break;
        case QNetworkAccessManager::GetOperation:
            request.setHTTPMethod("GET");
            break;
        case QNetworkAccessManager::PutOperation:
            request.setHTTPMethod("PUT");
            break;
        case QNetworkAccessManager::PostOperation:
            request.setHTTPMethod("POST");
            break;
        case QNetworkAccessManager::UnknownOperation:
            // eh?
            break;
    }

    QList<QByteArray> httpHeaders = req.rawHeaderList();
    for (int i = 0; i < httpHeaders.size(); ++i) {
        const QByteArray &headerName = httpHeaders.at(i);
        request.addHTTPHeaderField(QString::fromLatin1(headerName), QString::fromLatin1(req.rawHeader(headerName)));
    }

    if (!body.isEmpty())
        request.setHTTPBody(WebCore::FormData::create(body.constData(), body.size()));

    d->frame->loader()->load(request);

    if (d->parentFrame())
        d->page->d->insideOpenCall = false;
}

Now the key function, FrameLoader::load():

void FrameLoader::load(const ResourceRequest& request)
{
    load(request, SubstituteData());
}

void FrameLoader::load(const ResourceRequest& request, const SubstituteData& substituteData)
{
    if (m_inStopAllLoaders)
        return;

    // FIXME: is this the right place to reset loadType? Perhaps this should be done after loading is finished or aborted.
    m_loadType = FrameLoadTypeStandard;
    load(m_client->createDocumentLoader(request, substituteData).get());
}

The m_client above is a FrameLoaderClientQt instance, and m_client->createDocumentLoader() creates a DocumentLoader object. Looking further, here is the code of FrameLoader::load(DocumentLoader*):

void FrameLoader::load(DocumentLoader* newDocumentLoader)
{
    ResourceRequest& r = newDocumentLoader->request();
    addExtraFieldsToMainResourceRequest(r);
    FrameLoadType type;

    if (shouldTreatURLAsSameAsCurrent(newDocumentLoader->originalRequest().url())) {
        r.setCachePolicy(ReloadIgnoringCacheData);
        type = FrameLoadTypeSame;
    } else
        type = FrameLoadTypeStandard;

    if (m_documentLoader)
        newDocumentLoader->setOverrideEncoding(m_documentLoader->overrideEncoding());

    // When we loading alternate content for an unreachable URL that we're
    // visiting in the history list, we treat it as a reload so the history list
    // is appropriately maintained.
    //
    // FIXME: This seems like a dangerous overloading of the meaning of "FrameLoadTypeReload" ...
    // shouldn't a more explicit type of reload be defined, that means roughly
    // "load without affecting history" ?
    if (shouldReloadToHandleUnreachableURL(newDocumentLoader)) {
        ASSERT(type == FrameLoadTypeStandard);
        type = FrameLoadTypeReload;
    }

    loadWithDocumentLoader(newDocumentLoader, type, 0);
}

An Analysis of WebKit in Qt (Part 4)

Continuing from yesterday's analysis, here is the code of FrameLoader::loadWithDocumentLoader():
void FrameLoader::loadWithDocumentLoader(DocumentLoader* loader, FrameLoadType type, PassRefPtr<FormState> prpFormState)
{
    ASSERT(m_client->hasWebView());

    // Unfortunately the view must be non-nil, this is ultimately due
    // to parser requiring a FrameView. We should fix this dependency.

    ASSERT(m_frame->view());

    m_policyLoadType = type;
    RefPtr<FormState> formState = prpFormState;
    bool isFormSubmission = formState;

    const KURL& newURL = loader->request().url();

    if (shouldScrollToAnchor(isFormSubmission, m_policyLoadType, newURL)) {
        RefPtr<DocumentLoader> oldDocumentLoader = m_documentLoader;
        NavigationAction action(newURL, m_policyLoadType, isFormSubmission);

        oldDocumentLoader->setTriggeringAction(action);
        stopPolicyCheck();
        checkNavigationPolicy(loader->request(), oldDocumentLoader.get(), formState,
            callContinueFragmentScrollAfterNavigationPolicy, this);
    } else {
        if (Frame* parent = m_frame->tree()->parent())
            loader->setOverrideEncoding(parent->loader()->documentLoader()->overrideEncoding());

        stopPolicyCheck();
        setPolicyDocumentLoader(loader);

        checkNavigationPolicy(loader->request(), loader, formState,
            callContinueLoadAfterNavigationPolicy, this);
    }
}
The call to checkNavigationPolicy() is the key step in loadWithDocumentLoader(). Its implementation:
void FrameLoader::checkNavigationPolicy(const ResourceRequest& request, DocumentLoader* loader,
    PassRefPtr<FormState> formState, NavigationPolicyDecisionFunction function, void* argument)
{
    NavigationAction action = loader->triggeringAction();
    if (action.isEmpty()) {
        action = NavigationAction(request.url(), NavigationTypeOther);
        loader->setTriggeringAction(action);
    }

    // Don't ask more than once for the same request or if we are loading an empty URL.
    // This avoids confusion on the part of the client.
    if (equalIgnoringHeaderFields(request, loader->lastCheckedRequest()) || (!request.isNull() && request.url().isEmpty())) {
        function(argument, request, 0, true);
        loader->setLastCheckedRequest(request);
        return;
    }

    // We are always willing to show alternate content for unreachable URLs;
    // treat it like a reload so it maintains the right state for b/f list.
    if (loader->substituteData().isValid() && !loader->substituteData().failingURL().isEmpty()) {
        if (isBackForwardLoadType(m_policyLoadType))
            m_policyLoadType = FrameLoadTypeReload;
        function(argument, request, 0, true);
        return;
    }

    loader->setLastCheckedRequest(request);

    m_policyCheck.set(request, formState.get(), function, argument);

    m_delegateIsDecidingNavigationPolicy = true;
    m_client->dispatchDecidePolicyForNavigationAction(&FrameLoader::continueAfterNavigationPolicy,
        action, request, formState);
    m_delegateIsDecidingNavigationPolicy = false;
}
Here m_client is a pointer to a FrameLoaderClientQt instance:
void FrameLoaderClientQt::dispatchDecidePolicyForNavigationAction(FramePolicyFunction function, const WebCore::NavigationAction& action, const WebCore::ResourceRequest& request, PassRefPtr<WebCore::FormState>)
{
    Q_ASSERT(!m_policyFunction);
    Q_ASSERT(m_webFrame);
    m_policyFunction = function;
#if QT_VERSION < 0x040400
    QWebNetworkRequest r(request);
#else
    QNetworkRequest r(request.toNetworkRequest());
#endif
    QWebPage*page = m_webFrame->page();

    if (!page->d->acceptNavigationRequest(m_webFrame, r, QWebPage::NavigationType(action.type()))) {
        if (action.type() == NavigationTypeFormSubmitted || action.type() == NavigationTypeFormResubmitted)
            m_frame->loader()->resetMultipleFormSubmissionProtection();

        if (action.type() == NavigationTypeLinkClicked && r.url().hasFragment()) {
            ResourceRequest emptyRequest;
            m_frame->loader()->activeDocumentLoader()->setLastCheckedRequest(emptyRequest);
        }

        slotCallPolicyFunction(PolicyIgnore);
        return;
    }
    slotCallPolicyFunction(PolicyUse);
}
void FrameLoaderClientQt::slotCallPolicyFunction(int action)
{
    if (!m_frame || !m_policyFunction)
        return;
    FramePolicyFunction function = m_policyFunction;
    m_policyFunction = 0;
    (m_frame->loader()->*function)(WebCore::PolicyAction(action));
}
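
The way slotCallPolicyFunction() stores a pointer to a member function, clears it, and only then invokes it can be sketched in isolation. The Loader and Client types below are hypothetical stand-ins for FrameLoader and FrameLoaderClientQt; only the calling pattern is taken from the code above.

```cpp
#include <cassert>

enum PolicyAction { PolicyUse, PolicyDownload, PolicyIgnore };

// Hypothetical stand-in for FrameLoader: records what it was told.
struct Loader {
    PolicyAction last = PolicyIgnore;
    int calls = 0;
    void continueAfterPolicy(PolicyAction action) { last = action; ++calls; }
};

// Pointer to a member function of Loader taking a PolicyAction.
typedef void (Loader::*PolicyFunction)(PolicyAction);

// Hypothetical stand-in for FrameLoaderClientQt.
struct Client {
    Loader* loader = nullptr;
    PolicyFunction pending = nullptr;

    void decide(PolicyFunction function, bool accept) {
        assert(!pending);            // only one decision may be outstanding
        pending = function;
        callPolicyFunction(accept ? PolicyUse : PolicyIgnore);
    }

    void callPolicyFunction(PolicyAction action) {
        if (!loader || !pending)
            return;
        PolicyFunction function = pending;
        pending = nullptr;           // clear first, so a repeated call does nothing
        (loader->*function)(action); // invoke the member function on the loader
    }
};
```

Clearing the stored pointer before the call matters because the callee may trigger another policy check, which expects the slot to be empty.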
The callback goes through the function pointer to FrameLoader::continueAfterNavigationPolicy(PolicyAction policy), with PolicyUse as the argument:
void FrameLoader::continueAfterNavigationPolicy(PolicyAction policy)
{
    PolicyCheck check = m_policyCheck;
    m_policyCheck.clear();

    bool shouldContinue = policy == PolicyUse;

    switch (policy) {
        case PolicyIgnore:
            check.clearRequest();
            break;
        case PolicyDownload:
            m_client->startDownload(check.request());
            check.clearRequest();
            break;
        case PolicyUse: {
            ResourceRequest request(check.request());

            if (!m_client->canHandleRequest(request)) {
                handleUnimplementablePolicy(m_client->cannotShowURLError(check.request()));
                check.clearRequest();
                shouldContinue = false;
            }
            break;
        }
    }

    check.call(shouldContinue);
}
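
Stripped of the WebCore types, the decision made by continueAfterNavigationPolicy() comes down to a small table. The following sketch restates it as a pure function; the Outcome struct and the name resolvePolicy are hypothetical, and canHandleRequest stands in for the result of m_client->canHandleRequest(request).

```cpp
enum PolicyAction { PolicyUse, PolicyDownload, PolicyIgnore };

struct Outcome {
    bool shouldContinue;  // value handed on to the continuation
    bool startDownload;   // true only for PolicyDownload
    bool clearRequest;    // the stored request is dropped
};

// Mirrors the switch above: only PolicyUse with a request the client can
// handle lets the load continue; every other path clears the request.
Outcome resolvePolicy(PolicyAction policy, bool canHandleRequest) {
    Outcome out = { policy == PolicyUse, false, false };
    switch (policy) {
    case PolicyIgnore:
        out.clearRequest = true;
        break;
    case PolicyDownload:
        out.startDownload = true;
        out.clearRequest = true;
        break;
    case PolicyUse:
        if (!canHandleRequest) {
            out.clearRequest = true;
            out.shouldContinue = false;
        }
        break;
    }
    return out;
}
```

In the traced run the policy is PolicyUse and the request can be handled, so the continuation is called with true.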
continueAfterNavigationPolicy() ends by calling PolicyCheck::call(), here with the argument true:
void PolicyCheck::call(bool shouldContinue)
{
    if (m_navigationFunction)
        m_navigationFunction(m_argument, m_request, m_formState.get(), shouldContinue);
    if (m_newWindowFunction)
        m_newWindowFunction(m_argument, m_request, m_formState.get(), m_frameName, shouldContinue);
    ASSERT(!m_contentFunction);
}
m_navigationFunction is again a function pointer; it points to FrameLoader::callContinueLoadAfterNavigationPolicy():
void FrameLoader::callContinueLoadAfterNavigationPolicy(void* argument,
    const ResourceRequest& request, PassRefPtr<FormState> formState, bool shouldContinue)
{
    FrameLoader* loader = static_cast<FrameLoader*>(argument);
    loader->continueLoadAfterNavigationPolicy(request, formState, shouldContinue);
}
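
callContinueLoadAfterNavigationPolicy() is a static "trampoline": it receives the object as an opaque void* and forwards to a member function. A minimal sketch of that pattern follows; Request, Loader, and Check are hypothetical simplifications of ResourceRequest, FrameLoader, and PolicyCheck.

```cpp
struct Request { int id; };

// C-style callback signature: an opaque context pointer plus the payload.
typedef void (*NavigationFunction)(void* argument, const Request& request, bool shouldContinue);

class Loader {
public:
    int continuedId = -1;  // id of the last request that was allowed to continue

    // Static trampoline: recovers the object from the opaque pointer and
    // forwards to the real member function.
    static void callContinue(void* argument, const Request& request, bool shouldContinue) {
        Loader* self = static_cast<Loader*>(argument);
        self->continueLoad(request, shouldContinue);
    }

private:
    void continueLoad(const Request& request, bool shouldContinue) {
        if (shouldContinue)
            continuedId = request.id;
    }
};

// Hypothetical stand-in for PolicyCheck: stores the function and its
// argument now, and calls them once the decision is known.
struct Check {
    NavigationFunction function = nullptr;
    void* argument = nullptr;
    Request request{0};

    void call(bool shouldContinue) {
        if (function)
            function(argument, request, shouldContinue);
    }
};
```

The static_cast is only safe because the same code that registers the callback also passes this as the argument, which is what loadWithDocumentLoader() does above.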

void FrameLoader::continueLoadAfterNavigationPolicy(const ResourceRequest&, PassRefPtr<FormState> formState, bool shouldContinue)
{
    // If we loaded an alternate page to replace an unreachableURL, we'll get in here with a
    // nil policyDataSource because loading the alternate page will have passed
    // through this method already, nested; otherwise, policyDataSource should still be set.
    ASSERT(m_policyDocumentLoader || !m_provisionalDocumentLoader->unreachableURL().isEmpty());

    bool isTargetItem = m_provisionalHistoryItem ? m_provisionalHistoryItem->isTargetItem() : false;

    // Two reasons we can't continue:
    //    1) Navigation policy delegate said we can't so request is nil. A primary case of this
    //       is the user responding Cancel to the form repost nag sheet.
    //    2) User responded Cancel to an alert popped up by the before unload event handler.
    // The "before unload" event handler runs only for the main frame.
    bool canContinue = shouldContinue && (!isLoadingMainFrame() || m_frame->shouldClose());

    if (!canContinue) {
        // If we were waiting for a quick redirect, but the policy delegate decided to ignore it, then we
        // need to report that the client redirect was cancelled.
        if (m_quickRedirectComing)
            clientRedirectCancelledOrFinished(false);

        setPolicyDocumentLoader(0);

        // If the navigation request came from the back/forward menu, and we punt on it, we have the
        // problem that we have optimistically moved the b/f cursor already, so move it back. For sanity,
        // we only do this when punting a navigation for the target frame or top-level frame.
        if ((isTargetItem || isLoadingMainFrame()) && isBackForwardLoadType(m_policyLoadType))
            if (Page* page = m_frame->page()) {
                Frame* mainFrame = page->mainFrame();
                if (HistoryItem* resetItem = mainFrame->loader()->m_currentHistoryItem.get()) {
                    page->backForwardList()->goToItem(resetItem);
                    Settings* settings = m_frame->settings();
                    page->setGlobalHistoryItem((!settings || settings->privateBrowsingEnabled()) ? 0 : resetItem);
                }
            }
        return;
    }

    FrameLoadType type = m_policyLoadType;
    stopAllLoaders();

    // <rdar://problem/6250856> - In certain circumstances on pages with multiple frames, stopAllLoaders()
    // might detach the current FrameLoader, in which case we should bail on this newly defunct load.
    if (!m_frame->page())
        return;

    setProvisionalDocumentLoader(m_policyDocumentLoader.get());
    m_loadType = type;
    setState(FrameStateProvisional);

    setPolicyDocumentLoader(0);

    if (isBackForwardLoadType(type) && loadProvisionalItemFromCachedPage())
        return;

    if (formState)
        m_client->dispatchWillSubmitForm(&FrameLoader::continueLoadAfterWillSubmitForm, formState);
    else
        continueLoadAfterWillSubmitForm();
}

void FrameLoader::continueLoadAfterWillSubmitForm(PolicyAction)
{
    if (!m_provisionalDocumentLoader)
        return;

    // DocumentLoader calls back to our prepareForLoadStart
    m_provisionalDocumentLoader->prepareForLoadStart();

    // The load might be cancelled inside of prepareForLoadStart(), nulling out the m_provisionalDocumentLoader,
    // so we need to null check it again.
    if (!m_provisionalDocumentLoader)
        return;
    // First check the active DocumentLoader: if it is already loading the main resource, stop here
    DocumentLoader* activeDocLoader = activeDocumentLoader();
    if (activeDocLoader && activeDocLoader->isLoadingMainResource())
        return;
    // Then the cache: this load is marked as not coming from a cached page
    m_provisionalDocumentLoader->setLoadingFromCachedPage(false);

    unsigned long identifier = 0;

    if (Page* page = m_frame->page()) {
        identifier = page->progress()->createUniqueIdentifier();
        dispatchAssignIdentifierToInitialRequest(identifier, m_provisionalDocumentLoader.get(), m_provisionalDocumentLoader->originalRequest());
    }

    if (!m_provisionalDocumentLoader->startLoadingMainResource(identifier))
        m_provisionalDocumentLoader->updateLoading();
}
In the loading process above, if this is the first load and only m_provisionalDocumentLoader exists, only the last kind of load is executed.
bool DocumentLoader::startLoadingMainResource(unsigned long identifier)
{
    ASSERT(!m_mainResourceLoader);
    m_mainResourceLoader = MainResourceLoader::create(m_frame);
    m_mainResourceLoader->setIdentifier(identifier);

    // FIXME: Is there any way the extra fields could have not been added by now?
    // If not, it would be great to remove this line of code.
    frameLoader()->addExtraFieldsToMainResourceRequest(m_request);

    if (!m_mainResourceLoader->load(m_request, m_substituteData)) {
        // FIXME: If this should really be caught, we should just ASSERT this doesn't happen;
        // should it be caught by other parts of WebKit or other parts of the app?
        LOG_ERROR("could not create WebResourceHandle for URL %s -- should be caught by policy handler level", m_request.url().string().ascii().data());
        m_mainResourceLoader = 0;
        return false;
    }

    return true;
}
This creates a MainResourceLoader object and calls load():
bool MainResourceLoader::load(const ResourceRequest& r, const SubstituteData& substituteData)
{
    ASSERT(!m_handle);

    m_substituteData = substituteData;

#if ENABLE(OFFLINE_WEB_APPLICATIONS)
    // Check if this request should be loaded from the application cache
    if (!m_substituteData.isValid() && frameLoader()->frame()->settings() && frameLoader()->frame()->settings()->offlineWebApplicationCacheEnabled()) {
        ASSERT(!m_applicationCache);

        m_applicationCache = ApplicationCacheGroup::cacheForMainRequest(r, m_documentLoader.get());

        if (m_applicationCache) {
            // Get the resource from the application cache. By definition, cacheForMainRequest() returns a cache that contains the resource.
            ApplicationCacheResource* resource = m_applicationCache->resourceForRequest(r);
            m_substituteData = SubstituteData(resource->data(),
                                              resource->response().mimeType(),
                                              resource->response().textEncodingName(), KURL());
        }
    }
#endif

    ResourceRequest request(r);
    bool defer = defersLoading();
    if (defer) {
        bool shouldLoadEmpty = shouldLoadAsEmptyDocument(r.url());
        if (shouldLoadEmpty)
            defer = false;
    }
    if (!defer) {
        if (loadNow(request)) {
            // Started as an empty document, but was redirected to something non-empty.
            ASSERT(defersLoading());
            defer = true;
        }
    }
    if (defer)
        m_initialRequest = request;

    return true;
}
Going one level deeper, here is MainResourceLoader::loadNow():
bool MainResourceLoader::loadNow(ResourceRequest& r)
{
    bool shouldLoadEmptyBeforeRedirect = shouldLoadAsEmptyDocument(r.url());

    ASSERT(!m_handle);
    ASSERT(shouldLoadEmptyBeforeRedirect || !defersLoading());

    // Send this synthetic delegate callback since clients expect it, and
    // we no longer send the callback from within NSURLConnection for
    // initial requests.
    willSendRequest(r, ResourceResponse());

    // <rdar://problem/4801066>
    // willSendRequest() is liable to make the call to frameLoader() return NULL, so we need to check that here
    if (!frameLoader())
        return false;

    const KURL& url = r.url();
    bool shouldLoadEmpty = shouldLoadAsEmptyDocument(url) && !m_substituteData.isValid();

    if (shouldLoadEmptyBeforeRedirect && !shouldLoadEmpty && defersLoading())
        return true;

    if (m_substituteData.isValid())
        handleDataLoadSoon(r);
    else if (shouldLoadEmpty || frameLoader()->representationExistsForURLScheme(url.protocol()))
        handleEmptyLoad(url, !shouldLoadEmpty);
    else
        m_handle = ResourceHandle::create(r, this, m_frame.get(), false, true, true);

    return false;
}
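The three-way branch at the end of loadNow() can be condensed into a small sketch. The types below (LoadPath, LoadInputs, MiniMainResourceLoader) are hypothetical stand-ins invented for illustration, not WebKit classes; only the order of the checks is taken from the listing above.

```cpp
#include <cassert>

// Hypothetical labels for the three outcomes of loadNow(); not WebKit names.
enum class LoadPath { SubstituteData, EmptyDocument, NetworkHandle };

// Hypothetical bundle of the conditions loadNow() evaluates.
struct LoadInputs {
    bool hasSubstituteData;       // m_substituteData.isValid()
    bool urlLoadsAsEmptyDocument; // shouldLoadAsEmptyDocument(url)
    bool representationExists;    // representationExistsForURLScheme(url.protocol())
};

class MiniMainResourceLoader {
public:
    // Follows the if / else-if / else chain of loadNow() in the same order.
    LoadPath loadNow(const LoadInputs& in)
    {
        assert(!m_hasHandle); // same role as ASSERT(!m_handle)
        if (in.hasSubstituteData)
            return LoadPath::SubstituteData;
        // Substitute data is ruled out here, so "should load empty"
        // reduces to the URL test alone.
        if (in.urlLoadsAsEmptyDocument || in.representationExists)
            return LoadPath::EmptyDocument;
        m_hasHandle = true; // stands in for ResourceHandle::create()
        return LoadPath::NetworkHandle;
    }

    bool hasHandle() const { return m_hasHandle; }

private:
    bool m_hasHandle = false;
};
```

Only the last outcome creates a network handle, which is why the trace for an ordinary http URL continues into ResourceHandle::create().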
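loadNow() makes two main calls: willSendRequest() and ResourceHandle::create(). The first presumably performs the setup needed before the request is sent; the second actually sends the request. Start with the first: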
void MainResourceLoader::willSendRequest(ResourceRequest& newRequest, const ResourceResponse& redirectResponse)
{
    // Note that there are no asserts here as there are for the other callbacks. This is due to the
    // fact that this "callback" is sent when starting every load, and the state of callback
    // deferrals plays less of a part in this function in preventing the bad behavior deferring
    // callbacks is meant to prevent.
    ASSERT(!newRequest.isNull());

    // The additional processing can do anything including possibly removing the last
    // reference to this object; one example of this is 3266216.
    RefPtr<MainResourceLoader> protect(this);

    // Update cookie policy base URL as URL changes, except for subframes, which use the
    // URL of the main frame which doesn't change when we redirect.
    if (frameLoader()->isLoadingMainFrame())
        newRequest.setMainDocumentURL(newRequest.url());

    // If we're fielding a redirect in response to a POST, force a load from origin, since
    // this is a common site technique to return to a page viewing some data that the POST
    // just modified.
    // Also, POST requests always load from origin, but this does not affect subresources.
    if (newRequest.cachePolicy() == UseProtocolCachePolicy && isPostOrRedirectAfterPost(newRequest, redirectResponse))
        newRequest.setCachePolicy(ReloadIgnoringCacheData);

    ResourceLoader::willSendRequest(newRequest, redirectResponse);

    // Don't set this on the first request. It is set when the main load was started.
    m_documentLoader->setRequest(newRequest);

    // FIXME: Ideally we'd stop the I/O until we hear back from the navigation policy delegate
    // listener. But there's no way to do that in practice. So instead we cancel later if the
    // listener tells us to. In practice that means the navigation policy needs to be decided
    // synchronously for these redirect cases.

    ref(); // balanced by deref in continueAfterNavigationPolicy
    frameLoader()->checkNavigationPolicy(newRequest, callContinueAfterNavigationPolicy, this);
}
This mainly calls ResourceLoader::willSendRequest():
void ResourceLoader::willSendRequest(ResourceRequest& request, const ResourceResponse& redirectResponse)
{
    // Protect this in this delegate method since the additional processing can do
    // anything including possibly derefing this; one example of this is Radar 3266216.
    RefPtr<ResourceLoader> protector(this);

    ASSERT(!m_reachedTerminalState);

    if (m_sendResourceLoadCallbacks) {
        if (!m_identifier) {
            m_identifier = m_frame->page()->progress()->createUniqueIdentifier();
            frameLoader()->assignIdentifierToInitialRequest(m_identifier, request);
        }

        frameLoader()->willSendRequest(this, request, redirectResponse);
    }

    m_request = request;
}
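Both willSendRequest() functions begin by creating a RefPtr to this, so that the object survives even if a callback drops the last outside reference. The sketch below shows the same idea with std::shared_ptr in place of WebKit's RefPtr; the Loader class and demoProtector() are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical loader; std::shared_ptr replaces WebKit's intrusive RefPtr.
class Loader : public std::enable_shared_from_this<Loader> {
public:
    std::function<void()> onWillSend; // callback that may release the owner's reference
    bool sawRequest = false;

    void willSendRequest()
    {
        // Hold an extra reference for the duration of this call, in the
        // spirit of RefPtr<ResourceLoader> protector(this).
        std::shared_ptr<Loader> protector = shared_from_this();
        if (onWillSend)
            onWillSend();  // may reset the external owner
        sawRequest = true; // still safe: protector keeps the object alive
    }
};

// Returns true if the loader outlived the callback and was destroyed only
// after willSendRequest() returned.
bool demoProtector()
{
    auto owner = std::make_shared<Loader>();
    std::weak_ptr<Loader> watcher = owner;
    Loader* raw = owner.get();
    raw->onWillSend = [&] {
        owner.reset();              // the external owner lets go
        assert(!watcher.expired()); // the protector still holds a reference
    };
    raw->willSendRequest();
    return watcher.expired();
}
```

Without the protector, the write to sawRequest after the callback would touch a destroyed object.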
ResourceLoader::willSendRequest() in turn calls FrameLoader::willSendRequest():
void FrameLoader::willSendRequest(ResourceLoader* loader, ResourceRequest& clientRequest, const ResourceResponse& redirectResponse)
{
    applyUserAgent(clientRequest);
    dispatchWillSendRequest(loader->documentLoader(), loader->identifier(), clientRequest, redirectResponse);
}
And there are more calls beneath it:
void FrameLoader::dispatchWillSendRequest(DocumentLoader* loader, unsigned long identifier, ResourceRequest& request, const ResourceResponse& redirectResponse)
{
    StringImpl* oldRequestURL = request.url().string().impl();
    m_documentLoader->didTellClientAboutLoad(request.url());

    m_client->dispatchWillSendRequest(loader, identifier, request, redirectResponse);

    // If the URL changed, then we want to put that new URL in the "did tell client" set too.
    if (oldRequestURL != request.url().string().impl())
        m_documentLoader->didTellClientAboutLoad(request.url());

    if (Page* page = m_frame->page())
        page->inspectorController()->willSendRequest(loader, identifier, request, redirectResponse);
}
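Is there yet another step?
m_client->dispatchWillSendRequest() actually calls FrameLoaderClientQt::dispatchWillSendRequest(), which is currently an empty function (it only prints information when dumping). That leaves InspectorController::willSendRequest():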
void InspectorController::willSendRequest(DocumentLoader*, unsigned long identifier, ResourceRequest& request, const ResourceResponse& redirectResponse)
{
    if (!enabled())
        return;

    InspectorResource* resource = m_resources.get(identifier).get();
    if (!resource)
        return;

    resource->startTime = currentTime();

    if (!redirectResponse.isNull()) {
        updateResourceRequest(resource, request);
        updateResourceResponse(resource, redirectResponse);
    }

    if (resource != m_mainResource && windowVisible()) {
        if (!resource->scriptObject)
            addScriptResource(resource);
        else
            updateScriptResourceRequest(resource);

        updateScriptResource(resource, resource->startTime, resource->responseReceivedTime, resource->endTime);

        if (!redirectResponse.isNull())
            updateScriptResourceResponse(resource);
    }
}
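A start time is recorded here. My first guess was that it is used to detect request timeouts, but it is passed to updateScriptResource() together with responseReceivedTime and endTime, so it looks more like timing data for the inspector's resource display. Where (or whether) a request-timeout timer is set remains to be analyzed.
The rest is just resource bookkeeping, which does not seem very significant, so I will not trace it further. Back in MainResourceLoader::loadNow(), look at the next step, ResourceHandle::create():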
PassRefPtr<ResourceHandle> ResourceHandle::create(const ResourceRequest& request, ResourceHandleClient* client,
    Frame* frame, bool defersLoading, bool shouldContentSniff, bool mightDownloadFromHandle)
{
    RefPtr<ResourceHandle> newHandle(adoptRef(new ResourceHandle(request, client, defersLoading, shouldContentSniff, mightDownloadFromHandle)));

    if (!request.url().isValid()) {
        newHandle->scheduleFailure(InvalidURLFailure);
        return newHandle.release();
    }
    // Check whether the port number is allowed
    if (!portAllowed(request)) {
        newHandle->scheduleFailure(BlockedFailure);
        return newHandle.release();
    }

    if (newHandle->start(frame))
        return newHandle.release();

    return 0;
}
Look at the key call, ResourceHandle::start():
bool ResourceHandle::start(Frame* frame)
{
    if (!frame)
        return false;

    Page *page = frame->page();
    // If we are no longer attached to a Page, this must be an attempted load from an
    // onUnload handler, so let's just block it.
    if (!page)
        return false;

    getInternal()->m_frame = static_cast<FrameLoaderClientQt*>(frame->loader()->client())->webFrame();
#if QT_VERSION < 0x040400
    return QWebNetworkManager::self()->add(this, getInternal()->m_frame->page()->d->networkInterface);
#else
    ResourceHandleInternal *d = getInternal();
    d->m_job = new QNetworkReplyHandler(this, QNetworkReplyHandler::LoadMode(d->m_defersLoading));
    return true;
#endif
}
A new QNetworkReplyHandler object is created, and its constructor calls QNetworkReplyHandler::start():
void QNetworkReplyHandler::start()
{
    m_shouldStart = false;

    ResourceHandleInternal* d = m_resourceHandle->getInternal();

    QNetworkAccessManager* manager = d->m_frame->page()->networkAccessManager();

    const QUrl url = m_request.url();
    const QString scheme = url.scheme();
    // Post requests on files and data don't really make sense, but for
    // fast/forms/form-post-urlencoded.html and for fast/forms/button-state-restore.html
    // we still need to retrieve the file/data, which means we map it to a Get instead.
    if (m_method == QNetworkAccessManager::PostOperation
        && (!url.toLocalFile().isEmpty() || url.scheme() == QLatin1String("data")))
        m_method = QNetworkAccessManager::GetOperation;

    m_startTime = QDateTime::currentDateTime().toTime_t();

    switch (m_method) {
        case QNetworkAccessManager::GetOperation:
            m_reply = manager->get(m_request);
            break;
        case QNetworkAccessManager::PostOperation: {
            FormDataIODevice* postDevice = new FormDataIODevice(d->m_request.httpBody());
            m_reply = manager->post(m_request, postDevice);
            postDevice->setParent(m_reply);
            break;
        }
        case QNetworkAccessManager::HeadOperation:
            m_reply = manager->head(m_request);
            break;
        case QNetworkAccessManager::PutOperation: {
            FormDataIODevice* putDevice = new FormDataIODevice(d->m_request.httpBody());
            m_reply = manager->put(m_request, putDevice);
            putDevice->setParent(m_reply);
            break;
        }
        case QNetworkAccessManager::UnknownOperation: {
            m_reply = 0;
            ResourceHandleClient* client = m_resourceHandle->client();
            if (client) {
                ResourceError error(url.host(), 400 /*bad request*/,
                                    url.toString(),
                                    QCoreApplication::translate("QWebPage", "Bad HTTP request"));
                client->didFail(m_resourceHandle, error);
            }
            return;
        }
    }

    m_reply->setParent(this);

    connect(m_reply, SIGNAL(finished()),
            this, SLOT(finish()), Qt::QueuedConnection);

    // For http(s) we know that the headers are complete upon metaDataChanged() emission, so we
    // can send the response as early as possible
    if (scheme == QLatin1String("http") || scheme == QLatin1String("https"))
        connect(m_reply, SIGNAL(metaDataChanged()),
                this, SLOT(sendResponseIfNeeded()), Qt::QueuedConnection);

    connect(m_reply, SIGNAL(readyRead()),
            this, SLOT(forwardData()), Qt::QueuedConnection);
}
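Here we meet the familiar QNetworkAccessManager and QNetworkReply. At this point in the trace, initialization and the sending of the URL request are essentially complete.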

QT WebKit Analysis (5)

The earlier analysis of WebView initialization showed that QNetworkReplyHandler::start() sets up the handlers for reading data:
    connect(m_reply, SIGNAL(finished()),
            this, SLOT(finish()), Qt::QueuedConnection);

    // For http(s) we know that the headers are complete upon metaDataChanged() emission, so we
    // can send the response as early as possible
    if (scheme == QLatin1String("http") || scheme == QLatin1String("https"))
        connect(m_reply, SIGNAL(metaDataChanged()),
                this, SLOT(sendResponseIfNeeded()), Qt::QueuedConnection);

    connect(m_reply, SIGNAL(readyRead()),
            this, SLOT(forwardData()), Qt::QueuedConnection);

First look at QNetworkReplyHandler::forwardData():
void QNetworkReplyHandler::forwardData()
{
    m_shouldForwardData = (m_loadMode == LoadDeferred);
    if (m_loadMode == LoadDeferred)
        return;

    sendResponseIfNeeded();

    // don't emit the "Document has moved here" type of HTML
    if (m_redirected)
        return;

    if (!m_resourceHandle)
        return;

    QByteArray data = m_reply->read(m_reply->bytesAvailable());

    ResourceHandleClient* client = m_resourceHandle->client();
    if (!client)
        return;

    if (!data.isEmpty())
        client->didReceiveData(m_resourceHandle, data.constData(), data.length(), data.length() /*FixMe*/);
}
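forwardData() really comes down to two calls: read() and didReceiveData(). QNetworkReply::read() was analyzed earlier and is not repeated here.
The client->didReceiveData() call made through the ResourceHandleClient* ends up in MainResourceLoader::didReceiveData():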
void MainResourceLoader::didReceiveData(const char* data, int length, long long lengthReceived, bool allAtOnce)
{
    ASSERT(data);
    ASSERT(length != 0);

    // There is a bug in CFNetwork where callbacks can be dispatched even when loads are deferred.
    // See <rdar://problem/6304600> for more details.
#if !PLATFORM(CF)
    ASSERT(!defersLoading());
#endif

    // The additional processing can do anything including possibly removing the last
    // reference to this object; one example of this is 3266216.
    RefPtr<MainResourceLoader> protect(this);

    ResourceLoader::didReceiveData(data, length, lengthReceived, allAtOnce);
}
Follow the call one level further:
void ResourceLoader::didReceiveData(const char* data, int length, long long lengthReceived, bool allAtOnce)
{
    // Protect this in this delegate method since the additional processing can do
    // anything including possibly derefing this; one example of this is Radar 3266216.
    RefPtr<ResourceLoader> protector(this);

    addData(data, length, allAtOnce);
    // FIXME: If we get a resource with more than 2B bytes, this code won't do the right thing.
    // However, with today's computers and networking speeds, this won't happen in practice.
    // Could be an issue with a giant local file.
    if (m_sendResourceLoadCallbacks && m_frame)
        frameLoader()->didReceiveData(this, data, length, static_cast<int>(lengthReceived));
}
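addData() is a virtual function in the ResourceLoader class, and the object behind the client pointer in client->didReceiveData() is actually a MainResourceLoader, so the addData() call dispatches to MainResourceLoader::addData() first: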
void MainResourceLoader::addData(const char* data, int length, bool allAtOnce)
{
    ResourceLoader::addData(data, length, allAtOnce);
    frameLoader()->receivedData(data, length);
}
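The dispatch described above (a base-class entry point calling a virtual addData() that the derived class overrides and then chains back to) can be shown in isolation. The classes below are hypothetical miniatures with a "Sketch" suffix; a std::vector stands in for the call to frameLoader()->receivedData().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of ResourceLoader.
class ResourceLoaderSketch {
public:
    virtual ~ResourceLoaderSketch() = default;

    // Non-virtual entry point; the virtual call decides which addData() runs.
    void didReceiveData(const std::string& chunk)
    {
        assert(!chunk.empty()); // mirrors ASSERT(length != 0)
        addData(chunk);
    }

    const std::string& buffered() const { return m_buffer; }

protected:
    // Base behaviour: append the chunk to an internal buffer.
    virtual void addData(const std::string& chunk) { m_buffer += chunk; }

private:
    std::string m_buffer;
};

// Hypothetical miniature of MainResourceLoader.
class MainResourceLoaderSketch : public ResourceLoaderSketch {
public:
    std::vector<std::string> forwarded; // stands in for frameLoader()->receivedData()

protected:
    void addData(const std::string& chunk) override
    {
        ResourceLoaderSketch::addData(chunk); // keep the base-class buffering
        forwarded.push_back(chunk);           // then pass the chunk onward
    }
};
```

Calling didReceiveData() through a base-class reference still reaches the derived addData(), which is the behaviour the trace relies on.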
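MainResourceLoader::addData() makes only two calls. The first stores the received data in a buffer, presumably for the later parsing stage (this is a guess); I will not go into it for now. Look at frameLoader()->receivedData():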
void FrameLoader::receivedData(const char* data, int length)
{
    activeDocumentLoader()->receivedData(data, length);
}

void DocumentLoader::receivedData(const char* data, int length)
{
    m_gotFirstByte = true;
    if (doesProgressiveLoad(m_response.mimeType()))
        commitLoad(data, length);
}
doesProgressiveLoad() tests the MIME type; the important part is commitLoad():

void DocumentLoader::commitLoad(const char* data, int length)
{
    // Both unloading the old page and parsing the new page may execute JavaScript which destroys the datasource
    // by starting a new load, so retain temporarily.
    RefPtr<DocumentLoader> protect(this);

    commitIfReady();
    if (FrameLoader* frameLoader = DocumentLoader::frameLoader())
        frameLoader->committedLoad(this, data, length);
}
The first call, commitIfReady(), clears the intermediate data left over from scanning the previous page; committedLoad() is the real subject.

void FrameLoader::committedLoad(DocumentLoader* loader, const char* data, int length)
{
    if (ArchiveFactory::isArchiveMimeType(loader->response().mimeType()))
        return;
    m_client->committedLoad(loader, data, length);
}
Here m_client points to a FrameLoaderClientQt object.

void FrameLoaderClientQt::committedLoad(WebCore::DocumentLoader* loader, const char* data, int length)
{
    if (!m_pluginView) {
        if (!m_frame)
            return;
        FrameLoader *fl = loader->frameLoader();
        if (m_firstData) {
            fl->setEncoding(m_response.textEncodingName(), false);
            m_firstData = false;
        }
        fl->addData(data, length);
    }

    // We re-check here as the plugin can have been created
    if (m_pluginView) {
        if (!m_hasSentResponseToPlugin) {
            m_pluginView->didReceiveResponse(loader->response());
            // didReceiveResponse sets up a new stream to the plug-in. on a full-page plug-in, a failure in
            // setting up this stream can cause the main document load to be cancelled, setting m_pluginView
            // to null
            if (!m_pluginView)
                return;
            m_hasSentResponseToPlugin = true;
        }
        m_pluginView->didReceiveData(data, length);
    }
}

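fl->setEncoding() sets the encoding (for example gb2312 for Chinese) from the text encoding name reported in the server's response, and also takes care of a few other things such as redirects. fl->addData() is the key call: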

void FrameLoader::addData(const char* bytes, int length)
{
    ASSERT(m_workingURL.isEmpty());
    ASSERT(m_frame->document());
    ASSERT(m_frame->document()->parsing());
    write(bytes, length);
}
The FrameLoader::write() call above starts the HTML/JS parsing scan. The next part goes deeper into HTML scanning.

QT WebKit Analysis (6)

Before continuing with FrameLoader::write(), go back to "QT WebKit Analysis (2)", where a complete call stack was saved:
……
QtWebKitd4.dll!WebCore::HTMLTokenizer::write(const WebCore::SegmentedString & str={...}, bool appendData=true) line 1730 + 0x23 bytes    C++
QtWebKitd4.dll!WebCore::FrameLoader::write(const char *
This shows the call order: FrameLoader::write() calls HTMLTokenizer::write().
Here is the declaration of FrameLoader::write():
        void write(const char* str, int len = -1, bool flush = false);
The declaration has two default arguments. In the previous part the call was written as write(bytes, length);
which is effectively write(bytes, length, false);
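A tiny stand-alone function with the same parameter shape makes the default-argument behaviour easy to check. writeSketch() and WriteCall are hypothetical names used only here; the function records what it received instead of doing any decoding.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical record of the arguments a call ended up with.
struct WriteCall {
    int len;
    bool flush;
};

// Same parameter list as the FrameLoader::write() declaration shown above.
WriteCall writeSketch(const char* str, int len = -1, bool flush = false)
{
    assert(str);
    if (len == -1)
        len = static_cast<int>(std::strlen(str)); // same fallback as the real function
    return { len, flush };
}
```

So writeSketch(bytes, length) leaves flush at false, and writeSketch(bytes) additionally falls back to strlen() for the length.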
Now look at the implementation of FrameLoader::write():
void FrameLoader::write(const char* str, int len, bool flush)
{
    if (len == 0 && !flush)
        return;

    if (len == -1)
        len = strlen(str);

    Tokenizer* tokenizer = m_frame->document()->tokenizer();
    if (tokenizer && tokenizer->wantsRawData()) {
        if (len > 0)
            tokenizer->writeRawData(str, len);
        return;
    }

    if (!m_decoder) {
        Settings* settings = m_frame->settings();
        m_decoder = TextResourceDecoder::create(m_responseMIMEType, settings ? settings->defaultTextEncodingName() : String());
        if (m_encoding.isEmpty()) {
            Frame* parentFrame = m_frame->tree()->parent();
            if (parentFrame && parentFrame->document()->securityOrigin()->canAccess(m_frame->document()->securityOrigin()))
                m_decoder->setEncoding(parentFrame->document()->inputEncoding(), TextResourceDecoder::DefaultEncoding);
        } else {
            m_decoder->setEncoding(m_encoding,
                m_encodingWasChosenByUser ? TextResourceDecoder::UserChosenEncoding : TextResourceDecoder::EncodingFromHTTPHeader);
        }
        m_frame->document()->setDecoder(m_decoder.get());
    }

    String decoded = m_decoder->decode(str, len);
    if (flush)
        decoded += m_decoder->flush();
    if (decoded.isEmpty())
        return;

#if USE(LOW_BANDWIDTH_DISPLAY)
    if (m_frame->document()->inLowBandwidthDisplay())
        m_pendingSourceInLowBandwidthDisplay.append(decoded);
#endif

    if (!m_receivedData) {
        m_receivedData = true;
        if (m_decoder->encoding().usesVisualOrdering())
            m_frame->document()->setVisuallyOrdered();
        m_frame->document()->recalcStyle(Node::Force);
    }

    if (tokenizer) {
        ASSERT(!tokenizer->wantsRawData());
        tokenizer->write(decoded, true);
    }
}
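How does this get tied to HTMLTokenizer? The association is made when the Document object is initialized, as covered in "QT WebKit Analysis (3)":
DOMImplementation::createDocument()
FrameLoader::write() does some peripheral work such as setting the encoding (which can be specified in the HTTP headers, in a meta tag in the HTML head, or explicitly in the browser). Its two main jobs are creating a decoder and calling tokenizer->write().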
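The encoding chosen when the decoder is first created follows a simple priority order, which can be summarized as below. This is a sketch under assumptions: chooseInitialEncoding(), EncodingChoice, and EncodingSource are hypothetical names, the parent-frame security check is reduced to "pass an empty string if the parent is not accessible", and later detection inside decode() (BOM, charset declarations) is not modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical counterpart of the TextResourceDecoder encoding-source tags.
enum class EncodingSource { Default, FromHTTPHeader, UserChosen };

struct EncodingChoice {
    std::string name;
    EncodingSource source;
};

// Mirrors the branches taken in FrameLoader::write() when m_decoder is created.
EncodingChoice chooseInitialEncoding(const std::string& settingsDefault,
                                     const std::string& explicitEncoding,
                                     bool chosenByUser,
                                     const std::string& accessibleParentEncoding)
{
    // A "user chose it" flag without an encoding would be a caller bug in this sketch.
    assert(!(chosenByUser && explicitEncoding.empty()));

    if (!explicitEncoding.empty())
        return { explicitEncoding,
                 chosenByUser ? EncodingSource::UserChosen : EncodingSource::FromHTTPHeader };
    if (!accessibleParentEncoding.empty())
        return { accessibleParentEncoding, EncodingSource::Default };
    return { settingsDefault, EncodingSource::Default };
}
```

An explicitly set encoding wins over the parent frame's encoding, which in turn wins over the default from the settings.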

QT WebKit Analysis (7)

Continuing from the previous part, first look at m_decoder->decode(str, len):
String TextResourceDecoder::decode(const char* data, size_t len)
{
    if (!m_checkedForBOM)
        checkForBOM(data, len);  // check for a Unicode byte order mark

    bool movedDataToBuffer = false;

    if (m_contentType == CSS && !m_checkedForCSSCharset)
        if (!checkForCSSCharset(data, len, movedDataToBuffer))  // for CSS, check the CSS charset declaration
            return "";

    if ((m_contentType == HTML || m_contentType == XML) && !m_checkedForHeadCharset) // HTML and XML
        if (!checkForHeadCharset(data, len, movedDataToBuffer))  // check the charset declared in the HTML/XML head
            return "";

    // Do the auto-detect if our default encoding is one of the Japanese ones.
    // FIXME: It seems wrong to change our encoding downstream after we have already done some decoding.
    if (m_source != UserChosenEncoding && m_source != AutoDetectedEncoding && encoding().isJapanese())
        detectJapaneseEncoding(data, len);  // detect Japanese encodings (why is there no check for Chinese encodings?)

    ASSERT(encoding().isValid());

    if (m_buffer.isEmpty())
        return m_decoder.decode(data, len, false, m_contentType == XML, m_sawError);

    if (!movedDataToBuffer) {
        size_t oldSize = m_buffer.size();
        m_buffer.grow(oldSize + len);
        memcpy(m_buffer.data() + oldSize, data, len);
    }

    String result = m_decoder.decode(m_buffer.data(), m_buffer.size(), false, m_contentType == XML, m_sawError);
    m_buffer.clear();
    return result;
}
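The first step of decode(), the byte-order-mark check, is simple enough to sketch on its own. encodingFromBOM() is a hypothetical helper, not the body of checkForBOM(); it recognizes only the UTF-8 and UTF-16 marks and, as a simplification, does not distinguish a UTF-32LE mark from UTF-16LE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the encoding implied by a leading byte order mark, or an empty
// string if none of the recognized marks is present.
std::string encodingFromBOM(const unsigned char* data, std::size_t len)
{
    assert(data || len == 0);
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    return "";
}
```

When a mark is found, the decoder can commit to that encoding before looking at any charset declaration inside the content.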
Now back to tokenizer->write(decoded, true); and its implementation:
bool HTMLTokenizer::write(const SegmentedString& str, bool appendData)
{
    if (!m_buffer)
        return false;

    if (m_parserStopped)
        return false;

    SegmentedString source(str);
    if (m_executingScript)
        source.setExcludeLineNumbers();

    if ((m_executingScript && appendData) || !m_pendingScripts.isEmpty()) {
        // don't parse; we will do this later
        if (m_currentPrependingSrc)
            m_currentPrependingSrc->append(source);
        else {
            m_pendingSrc.append(source);
#if PRELOAD_SCANNER_ENABLED
            if (m_preloadScanner && m_preloadScanner->inProgress() && appendData)
                m_preloadScanner->write(source);
#endif
        }
        return false;
    }

#if PRELOAD_SCANNER_ENABLED
    if (m_preloadScanner && m_preloadScanner->inProgress() && appendData)
        m_preloadScanner->end();
#endif

    if (!m_src.isEmpty())
        m_src.append(source);
    else
        setSrc(source);

    // Once a timer is set, it has control of when the tokenizer continues.
    if (m_timer.isActive())
        return false;

    bool wasInWrite = m_inWrite;
    m_inWrite = true;

#ifdef INSTRUMENT_LAYOUT_SCHEDULING
    if (!m_doc->ownerElement())
        printf("Beginning write at time %d\n", m_doc->elapsedTime());
#endif

    int processedCount = 0;
    double startTime = currentTime();

    Frame* frame = m_doc->frame();

    State state = m_state;

    while (!m_src.isEmpty() && (!frame || !frame->loader()->isScheduledLocationChangePending())) {
        if (!continueProcessing(processedCount, startTime, state))
            break;

        // do we need to enlarge the buffer?
        checkBuffer();

        UChar cc = *m_src;

        bool wasSkipLF = state.skipLF();
        if (wasSkipLF)
            state.setSkipLF(false);

        if (wasSkipLF && (cc == '\n'))
            m_src.advance();
        else if (state.needsSpecialWriteHandling()) {
            // it's important to keep needsSpecialWriteHandling with the flags this block tests
            if (state.hasEntityState())
                state = parseEntity(m_src, m_dest, state, m_cBufferPos, false, state.hasTagState());
            else if (state.inPlainText())
                state = parseText(m_src, state);
            else if (state.inAnySpecial())
                state = parseSpecial(m_src, state);
            else if (state.inComment())
                state = parseComment(m_src, state);
            else if (state.inDoctype())
                state = parseDoctype(m_src, state);
            else if (state.inServer())
                state = parseServer(m_src, state);
            else if (state.inProcessingInstruction())
                state = parseProcessingInstruction(m_src, state);
            else if (state.hasTagState())
                state = parseTag(m_src, state);
            else if (state.startTag()) {
                state.setStartTag(false);

                switch(cc) {
                case '/':
                    break;
                case '!': {
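                    // <!-- comment --> or <!DOCTYPE ...>
                    searchCount = 1; // Look for '<!--' sequence to start comment or '<!DOCTYPE' sequence to start doctype
                    m_doctypeSearchCount = 1;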
                    break;
                }
                case '?': {
                    // xml processing instruction
                    state.setInProcessingInstruction(true);
                    tquote = NoQuote;
                    state = parseProcessingInstruction(m_src, state);
                    continue;

                    break;
                }
                case '%':
                    if (!m_brokenServer) {
                        // <% server stuff, handle as comment %>
                        state.setInServer(true);
                        tquote = NoQuote;
                        state = parseServer(m_src, state);
                        continue;
                    }
                    // else fall through
                default: {
                    if( ((cc >= 'a') && (cc <= 'z')) || ((cc >= 'A') && (cc <= 'Z'))) {
                        // Start of a Start-Tag
                    } else {
                        // Invalid tag
                        // Add as is
                        *m_dest = '<';
                        m_dest++;
                        continue;
                    }
                }
                }; // end case

                processToken();

                m_cBufferPos = 0;
                state.setTagState(TagName);
                state = parseTag(m_src, state);
            }
        } else if (cc == '&' && !m_src.escaped()) {
            m_src.advancePastNonNewline();
            state = parseEntity(m_src, m_dest, state, m_cBufferPos, true, state.hasTagState());
        } else if (cc == '<' && !m_src.escaped()) {
            m_currentTagStartLineNumber = m_lineNumber;
            m_src.advancePastNonNewline();
            state.setStartTag(true);
            state.setDiscardLF(false);
        } else if (cc == '\n' || cc == '\r') {
            if (state.discardLF())
                // Ignore this LF
                state.setDiscardLF(false); // We have discarded 1 LF
            else {
                // Process this LF
                *m_dest++ = '\n';
                if (cc == '\r' && !m_src.excludeLineNumbers())
                    m_lineNumber++;
            }

            /* Check for MS-DOS CRLF sequence */
            if (cc == '\r')
                state.setSkipLF(true);
            m_src.advance(m_lineNumber);
        } else {
            state.setDiscardLF(false);
            *m_dest++ = cc;
            m_src.advancePastNonNewline();
        }
    }

#ifdef INSTRUMENT_LAYOUT_SCHEDULING
    if (!m_doc->ownerElement())
        printf("Ending write at time %d\n", m_doc->elapsedTime());
#endif

    m_inWrite = wasInWrite;

    m_state = state;

    if (m_noMoreData && !m_inWrite && !state.loadingExtScript() && !m_executingScript && !m_timer.isActive()) {
        end(); // this actually causes us to be deleted
        return true;
    }
    return false;
}
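Because the argument decoded is a String, it is implicitly converted to a SegmentedString at the call site. A SegmentedString can carry line numbers or not (this is configurable). The while loop in HTMLTokenizer::write() is the core of the tokenizer: it examines the input one character at a time and dispatches on the current state.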
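The shape of that loop (state carried across calls, one character examined per iteration, CRLF folded into a single newline) can be reproduced in a much smaller form. MiniTokenizer below is a hypothetical toy, not a reduction of HTMLTokenizer: it knows only two states, "in text" and "in tag", and ignores entities, comments, scripts, and attributes.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Token {
    bool isTag;
    std::string text;
};

// Hypothetical two-state tokenizer; state survives between write() calls,
// so input may arrive in arbitrary chunks.
class MiniTokenizer {
public:
    void write(const std::string& chunk)
    {
        assert(!m_ended); // writing after end() would be a caller bug
        for (char cc : chunk) {
            if (m_skipLF) {
                m_skipLF = false;
                if (cc == '\n')
                    continue; // LF of a CRLF pair, already emitted as '\n'
            }
            if (m_inTag) {
                if (cc == '>') {
                    emit(true);
                    m_inTag = false;
                } else
                    m_buffer += cc;
            } else if (cc == '<') {
                emit(false); // flush pending text before the tag starts
                m_inTag = true;
            } else if (cc == '\r' || cc == '\n') {
                m_buffer += '\n';
                m_skipLF = (cc == '\r');
            } else
                m_buffer += cc;
        }
    }

    // Flushes trailing text; an unterminated tag is dropped.
    void end()
    {
        if (!m_inTag)
            emit(false);
        m_ended = true;
    }

    const std::vector<Token>& tokens() const { return m_tokens; }

private:
    void emit(bool isTag)
    {
        if (!m_buffer.empty())
            m_tokens.push_back({ isTag, m_buffer });
        m_buffer.clear();
    }

    std::vector<Token> m_tokens;
    std::string m_buffer;
    bool m_inTag = false;
    bool m_skipLF = false;
    bool m_ended = false;
};
```

Feeding it "<p>Hi\r", then "\nthere</p", then ">" yields a tag token, one text token with a single newline, and a closing tag token, even though both the CRLF pair and the closing tag are split across chunks.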

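QT WebKit Analysis (8)

Having reached HTML parsing, I came across a blog by a PhD whose dissection of WebKit's structure is remarkably incisive. It is reposted below:

Deng Kan's blog

http://blog.sina.com.cn/s/blog_46d0a3930100d5pt.html
[20] The Structure and Deconstruction of WebKit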

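Going from a given HTML text file to a rendered web page with a complex layout, varied fonts, and embedded images, audio, video and other multimedia content is a complicated process. Everything WebKit does in this process revolves around two core structures: the DOM Tree and the Rendering Tree. The previous chapter discussed what each of the two trees is for. In this chapter we use a simple HTML file to show what the DOM Tree and the Rendering Tree actually consist of, and to dissect how WebKit constructs them.

Figure 1. From HTML to webpage, and the underlying DOM tree and rendering tree.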
Courtesy http://farm4.static.flickr.com/3351/3556972420_23a30366c2_o.jpg

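1. Structure of the DOM Tree and the Rendering Tree

The upper left of Figure 1 shows a simple HTML text file, and the upper right shows the page drawn by the WebKit rendering engine. The page contains a heading, "AI", a line of body text, "Ape's Intelligence", and a photo. The page is split into a front layer and a back layer: the heading and body text are drawn in the front layer, and the photo sits in the back layer. L and I traced, step by step, the whole flow from parsing this HTML file to generating the DOM Tree and the Rendering Tree, in order to understand what the two trees are made of and the steps by which they are built.

Start with the DOM Tree in the lower left of Figure 1. Essentially every tag in an HTML text file has a corresponding class in webkit/webcore/html. For example, the <HTML> tag corresponds to HTMLHtmlElement, the <HEAD> tag to HTMLHeadElement, the <STYLE> tag to HTMLStyleElement, and so on. The special case is the root of the DOM Tree, HTMLDocument, which has no corresponding tag in the HTML file. The role of HTMLDocument is described a little later. The overall structure of the DOM Tree also corresponds one-to-one with the nesting of the tags in the HTML file. In short, the DOM Tree is the HTML text file translated into a tree of objects.

It should be stressed that the DOM Tree is a general-purpose data structure: any XML text file can be translated into a DOM Tree, not just HTML. The many HTML classes in webkit/webcore/html are basically all subclasses of some class in webkit/webcore/dom; in other words, /html is a special case of /dom. This design lays the groundwork for extending WebKit to lay out and render page formats other than HTML. Strictly speaking, then, the DOM Tree in the lower left of Figure 1 is an HTML DOM Tree.

Now look at the Rendering Tree. Its notable characteristics are:

a. The tree structure of the Rendering Tree largely mirrors that of the HTML DOM Tree. That is, almost every node in the HTML DOM Tree has a corresponding node in the Rendering Tree, and the parent-child and sibling relationships between nodes correspond as well.

The exception is that the HTML DOM Tree has an HTMLStyleElement leaf node, while the Rendering Tree has no corresponding leaf. The reason is that every node of the Rendering Tree is involved in the layout and rendering of some region of the page. HTMLStyleElement does not directly involve the layout or rendering of any region; the content of the HTMLStyleElement leaf in the HTML DOM Tree has already been folded into the properties of the RenderImage leaf in the Rendering Tree. In addition, because the Rendering Tree has no node corresponding to HTMLStyleElement, there is no need for a node corresponding to HTMLHeadElement either.

b. The classes in webkit/webcore/rendering do not correspond one-to-one with HTML tags.

The Rendering Tree is a general mechanism for planning page layout and rendering. It can serve HTML pages, but it is not limited to them; it can be used to plan the layout and rendering of pages in other formats. The WebKit rendering engine, built around the DOM Tree and the Rendering Tree, is a powerful, extensible, general-purpose renderer. It can draw HTML pages, and it can also render pages in other formats: for example, it could be used to build an email reader and manager, a database administration tool, or even a game interface.

What is slightly surprising is that HTMLHtmlElement, HTMLBodyElement, HTMLHeadingElement and HTMLParagraphElement are all matched by RenderBlock in the Rendering Tree. HTMLHeadingElement and HTMLParagraphElement differ only slightly, in font and alignment, so it is understandable that the Rendering Tree handles both with RenderBlock. But HTMLHtmlElement and HTMLBodyElement are containers that always appear in the middle of the DOM Tree and never as leaves. Why does the Rendering Tree not define a separate class for such container nodes, distinct from RenderBlock? Then again, this is not a big problem; at most it is a matter of aesthetics.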

Figure 2. The construction sequence of the root of the DOM tree.
Courtesy http://farm4.static.flickr.com/3010/3554310018_e34d271344_o.jpg

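2. Root Nodes of the DOM Tree and the Rendering Tree

The previous section mentioned that HTMLDocument is a rather special class: it is the root of the whole HTML DOM Tree but corresponds to no HTML tag. The document object that appears so often in JavaScript refers to this root. For example: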

“document.getElementById(x).style.background="yellow";”

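An HTML text file usually begins with <HTML> and ends with </HTML>. But the <HTML> tag does not correspond to the root of the DOM Tree; it corresponds to the first child below the root, the HTMLHtmlElement node.

Figure 2 is a little surprising at first glance. When the user opens a blank page in the browser, the DOM Tree root HTMLDocument and the Rendering Tree root RenderView are created immediately. At this point the user has not supplied a URL, which means that, as far as the browser is concerned, no concrete HTML file exists yet. This decoupling of the root nodes from any concrete HTML content perhaps hints at two design ideas in WebKit:

a. The DOM Tree root HTMLDocument and the Rendering Tree root RenderView can be reused.

When the user opens two different URLs, that is, two different HTML files, one after the other in the same browser page, the two root nodes HTMLDocument and RenderView do not change. What changes is the subtree below HTMLHtmlElement and the corresponding subtree of the Rendering Tree.

Why this design? Because HTMLDocument and RenderView are governed by the settings of the browser page, such as the size of the page and its position on the screen. These settings have nothing to do with what content the page displays. HTMLDocument is also bound to HTMLTokenizer and HTMLParser, and these two components are likewise independent of any particular HTML content.

b. One DOM Tree root can hold multiple HTML subtrees, and one Rendering Tree root can hold multiple RenderBlock subtrees.

In the browsers we see today, each page usually displays a single HTML file. An HTML file can be split into several frames, each carrying an independent HTML file, but in terms of DOM Tree structure the HTMLDocument root has only one child. That child is the HTMLHtmlElement, which heads the subtree for one HTML text file. The Rendering Tree is the same: in the pages we see today, a RenderView root has only one RenderBlock child.

WebKit's design, however, allows multiple HTML subtrees to hang from the same root. Although we have not yet seen a page in which several HTML files, with several layout and rendering styles, coexist, WebKit leaves room for this in the future. The personalized, multi-skin, multi-viewpoint page rendering imagined earlier would not be hard to implement with WebKit.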

Figure 3. The construction sequence of the DOM Tree and the Rendering Tree.
Courtesy http://farm4.static.flickr.com/3627/3554182242_b0bec88534_b.jpg

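3. Construction of the DOM Tree and the Rendering Tree

The most important component held by the HTMLDocument root is HTMLTokenizer, and HTMLTokenizer in turn contains HTMLParser. HTMLTokenizer reads every character of the HTML text file from front to back and extracts the HTML tags and their content. HTMLParser is responsible not only for building the HTML DOM Tree but also, at the same time, for building the Rendering Tree.

In Figure 3, steps 8 through 11 show HTMLParser creating an HTML DOM Tree node from an HTML tag. Steps 12 through 17 create the corresponding Rendering Tree node and link it to the HTML DOM Tree node. The figure contains too much detail to read easily, so Figure 4 illustrates steps 8 through 17.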

Figure 4. An illustration of the construction of a DOM tree node and its corresponding Rendering tree node.
Courtesy http://farm4.static.flickr.com/3306/3554259140_3deb9736ea_o.jpg

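Note that whenever HTMLParser creates a DOM Tree node, it also creates the corresponding Rendering Tree node and then links the two new nodes together. In other words, the Rendering Tree grows in step with the DOM Tree.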
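The lock-step growth described above can be pictured with two very small node types. Everything in this sketch is hypothetical (Node, RenderObject, makeDocument(), appendChild(), and the rule that only head and style produce no renderer); it is meant to show the shape of the two trees from the example page, not WebCore's actual attach logic.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a render tree node.
struct RenderObject {
    std::string name;
    std::vector<RenderObject*> children; // non-owning; each DOM node owns its renderer
};

// Hypothetical, heavily simplified stand-in for a DOM node.
struct Node {
    std::string tag;
    std::unique_ptr<RenderObject> renderer; // stays null for nodes that draw nothing
    std::vector<std::unique_ptr<Node>> children;
};

// The root pair that exists before any markup has been parsed.
std::unique_ptr<Node> makeDocument()
{
    auto doc = std::make_unique<Node>();
    doc->tag = "#document";
    doc->renderer = std::make_unique<RenderObject>();
    doc->renderer->name = "RenderView";
    return doc;
}

// What a parser would do per start tag: create the DOM node and, in the
// same step, its render object if the element is rendered at all.
Node* appendChild(Node& parent, const std::string& tag)
{
    assert(!tag.empty());
    auto node = std::make_unique<Node>();
    node->tag = tag;
    const bool rendered = tag != "head" && tag != "style";
    if (rendered && parent.renderer) {
        node->renderer = std::make_unique<RenderObject>();
        node->renderer->name = (tag == "img") ? "RenderImage" : "RenderBlock";
        parent.renderer->children.push_back(node->renderer.get());
    }
    parent.children.push_back(std::move(node));
    return parent.children.back().get();
}
```

Appending html, head, style, body, and img in document order leaves head and style without renderers, so the render object for html ends up with body's renderer as its only child.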

Webkit 值得赞赏的地方非常多，但是HTMLParser让DOM Tree和Rendering Tree同步生长的做法，却值得商榷。如果同步生长，那么Rendering Tree必然平铺直叙地刻板地忠实于DOM Tree。假设先生成DOM Tree，再生成Rendering Tree，把两者割裂开，就有机会让Webkit发挥更加奇妙的布局和渲染。平铺直叙固然符合大多数人在大多数时间里的阅读习惯，但是离经叛道的设计，也会有市场。一个例子就是上一章末尾处那张多视点的地图。如果让DOM Tree与Rendering Tree同步生长，这样的布局和渲染是难以想像的。

QT之webkit 分析（九）

WebKit的显示，继续转邓侃博士的blog。
【21】WebKit，为了布局，忙并美丽着

如果没有1440年以后活字印刷术的大规模普及，或许就不会有文艺复兴运动，更不会有后来的启蒙运动。如果没有这两个运动的开展，或许就不会有世界范围的工业化。

在活字印刷术出现以前，每出版一本书，都必须先刻制一套模版，称为雕版，每套雕版上的每一个字，都是手工雕刻的。不仅制作雕版费时费力，而且有了错误不容易更改。活字印刷术的进步在于，可以预先批量生产各种样式和大小的字体，称为活字。需要出版某一本书籍时，先制作该书的页面模版，模版做好以后，只需要把这些活字摆放在模版上即可。如果出现错误，只需要调换某些活字，既省时又省力。如果某本书的模版不需要长期保存，还可以把模版中摆放的活字拆解下来，在印刷其它图书时用，节约成本。

活字印刷术没有解决的问题，1. 图像的印刷。起初不能印刷笔触丰富，层次复杂的图像，一直到1796年，石板印刷术(lithography)出现以后，才能印刷表现手段丰富的图像。 2. 灵活的布局排版。纸张大小不同，布局排版也不同，布局变了，需要重新摆放活字，而且有时候还需要改变字体和大小。

灵活的布局排版对于纸质书籍来说，或许并不太重要，但是对于电脑浏览器来说，却必须实现完全的自动化。否则，每当用户改变浏览器窗口的大小的时候，页面内容就不能正确显示。对于手机浏览器来说，布局排版的自动化尤其重要，因为不同手机的屏幕不一致，而且屏幕分辨率也不同。

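Yet even browsers have not broken free of traditional typesetting, which is essentially horizontal and vertical, seen from a single bird's-eye viewpoint.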


Figure 1. Incunabulum, the end of the 15th century
Courtesy http://www.citrinitas.com/history_of_viscom/images/printing/venice-1505.jpg

Figure 2. City of Words, by Vito Acconci, 1999
Courtesy http://upload.wikimedia.org/wikipedia/en/6/63/%27City_of_Words%27%2C_lithograph_by_Vito_Acconci%2C_1999.jpg

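Figure 1 shows a book from the 1490s. It is easy to see that the double columns, marginal notes, page numbers, enlarged initial capitals, and so on that are common in modern books and newspapers all continue practices more than 500 years old. The CSS specification covers all of these page design elements.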

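In today's information explosion, page layout faces questions that cannot be avoided: how to carry more content in a limited page area, how to highlight what the reader cares about, and how to make the page visually appealing. A mobile shopping UI, for example, has to include a product summary, user feedback, photos of the actual item, and the prices at different stores. A good page design must be concise and clear without leaving out relevant content, which is genuinely difficult. It is fair to say that one reason mobile shopping has not caught on is that its UI design is clumsy and ugly.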

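To design mobile application UIs ingeniously, one ultimately has to break out of the traditional single bird's-eye viewpoint, and Figure 2 is an attempt in that direction. Can Webkit do this? In principle yes, but only by modifying the source code. Before changing anything, though, we should first study carefully how Webkit's layout works internally. Only by fully understanding its strengths can we improve on its weaknesses.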

Before reading the concrete implementation of layout and painting in Webkit, the first thing to be clear about is that Webkit handles layout and paint separately.

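Layout determines the position of every leaf and intermediate node in the Render Tree. Each node is displayed on screen as a rectangle. Position here means the (X,Y) coordinates of the rectangle's top-left corner together with its width and height. The rectangle of an intermediate node contains a number of smaller nested rectangles, which correspond to that node's descendants.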

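After Layout finishes, Webkit starts the Paint process, which draws each leaf node of the Render Tree at its assigned position. Webkit hands the actual drawing to a third-party graphics library. Commonly used graphics libraries include QT, GTK+, Wx, Skia, Cairo, and others.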
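The two passes can be illustrated with a small sketch. It is not WebKit's implementation: `Box`, `Canvas`, `layout`, and `paint` are hypothetical names, children are simply stacked vertically, and the `Canvas` only records draw calls in place of a real graphics library.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Rect { int x = 0, y = 0, width = 0, height = 0; };

// Hypothetical render tree node, far simpler than WebKit's render objects.
struct Box {
    std::string name;
    int intrinsicHeight = 0;   // only meaningful for leaves
    bool visible = true;       // false models an element that CSS hides
    Rect rect;
    std::vector<std::unique_ptr<Box>> children;
};

// Layout pass: assign a rectangle to every node. Children are stacked
// vertically, and a parent's height is the sum of its children's heights.
// Hidden boxes take up no space in this sketch.
int layout(Box& box, int x, int y, int width) {
    box.rect.x = x;
    box.rect.y = y;
    box.rect.width = width;
    if (!box.visible) {
        box.rect.height = 0;
    } else if (box.children.empty()) {
        box.rect.height = box.intrinsicHeight;
    } else {
        int cursorY = y;
        for (auto& child : box.children)
            cursorY += layout(*child, x, cursorY, width);
        box.rect.height = cursorY - y;
    }
    return box.rect.height;
}

// Stand-in for a graphics library: it only logs what it was asked to draw.
struct Canvas {
    std::vector<std::string> log;
    void drawBox(const Box& box) {
        log.push_back(box.name + "@" + std::to_string(box.rect.y));
    }
};

// Paint pass: depth-first, left to right, drawing visible leaves at the
// positions that the layout pass has already computed.
void paint(const Box& box, Canvas& canvas) {
    if (!box.visible) return;
    if (box.children.empty()) {
        canvas.drawBox(box);
        return;
    }
    for (const auto& child : box.children)
        paint(*child, canvas);
}
```

Because `paint` only reads the rectangles, it can be run again, for instance when a hidden window is restored, without repeating `layout`.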

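By analogy, the graphics library corresponds to the movable type and to the lithographic stone used for images: it is responsible for paint. Strictly speaking, Webkit's main job is layout, which corresponds to the page forme.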

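On graphics libraries, the Taiwanese open-source expert Jim Huang (黄敬群, jserv) wrote an article introducing Google's Skia graphics library (http://blog.linux.org.tw/~jserv/archives/002095.html). The article makes the following points.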

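To build the Android platform, Google acquired the company Android in August 2005. In November of the same year, Google also acquired Skia. In November 2007 Google announced Android and released part of its source code. While people were busy exploring the secrets of the Android Dalvik VM, they overlooked the significance of Skia.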

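In September 2008 Google released the Chrome desktop browser, built around a modified Webkit. While people were busy exploring the V8 JavaScript engine and other modules, they once again overlooked the significance of Skia.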

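Skia is a 2D graphics library. Its distinguishing feature is that it can render high-quality 2D graphics on phones and other mobile devices with low memory and CPU usage.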

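Skia's founder, Mike Reed, is a leading figure in graphics technology. Early in his career Mike worked at Apple on the QuickDraw GX project, handling fonts and image display. He later moved to OpenWave to develop mobile browsers. While at OpenWave he worked with Benoit Schillings to provide alpha-blended previews between layers, full-featured vector matrix transformations, and more, all within 50-300KB of memory, a real feat of working wonders in a cramped space. Benoit Schillings later left OpenWave to become CTO of Trolltech, whose flagship product is the well-known QT. Trolltech was subsequently acquired by Nokia, and Benoit joined Nokia with it. Not long after Benoit Schillings left OpenWave, Mike Reed also left to found Skia.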

Figure 3. Layout implementation in Webkit
Courtesy http://www.flickr.com/photos/87209438%40N00/3609632247/sizes/l/

Figure 4. Paint implementation in Webkit
Courtesy http://www.flickr.com/photos/87209438%40N00/3609632249/sizes/l/

Figure 3 and Figure 4 show the two processes Webkit runs for layout and for paint. A close look at these two sequence diagrams reveals the following points.

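1. Layout and Paint are completely separate processes. Layout must already have run before Paint starts, otherwise the graphics library would not know where to draw text and images. That does not mean Paint runs immediately after Layout ends. In practice, the end of Layout fires an event, and that event starts Paint. Paint can also be triggered by other events, such as a change of screen content or the restoring of a hidden browser window.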

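2. Layout covers all the layout elements defined by CSS, including the space between the page edge and the content, text flowing around inserted images (floating), single and multiple columns, stacking of layers (z-index), and so on.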

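3. Images, video player plugins, Applets, and the like are called Replaced Render Objects in Layout. The width and height of these Replaced elements can be set by CSS. If CSS does not set them, Webkit parses the element's data stream; the metadata of a JPG photo, for example, gives the width and height of the original image. If the element itself does not give a width and height either, Webkit's default values are used.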

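4. The width of text is determined by the page layout. If a page contains several columns of text, for example, all columns have the same width. The column width multiplied by the number of columns, plus the gaps between columns, plus the page margins, should equal the total page width. If the total page width, the margins, and the gaps between columns are known, the text width can be derived from them.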

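5. Every node of the Render Tree is displayed on screen as a rectangle. Points 3 and 4 above describe how the width is determined. The height of an intermediate node depends on the sum of the heights of all its descendants. For a Replaced element the height is relatively easy to determine, whereas the height of a text paragraph has to be computed from the number of characters, the typeface, and the font size.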

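6. Sub-processes whose names begin with Repaint, such as repaintAfterLayoutIfNeeded(), appear repeatedly during Layout. Their purpose is to adjust the positions of a node's ancestors and of its left and right siblings once that node's height and width have been determined. Strictly speaking, this is relayout rather than repaint.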

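7. Compared with Layout, the logic of Paint is much simpler. Paint traverses the whole RenderTree in roughly depth-first order. That is, starting from the leftmost leaf, it draws all displayable leaf nodes of the RenderTree one by one from left to right. The qualifier "displayable" is needed because CSS can specify that certain leaves are not shown.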
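The width rules in points 3 and 4 above can be written down as a short sketch. The function names are hypothetical and this is not WebKit's code. The 300x150 fallback is an assumption taken from the CSS default for replaced elements; the text above only says that Webkit supplies some default.

```cpp
struct Size { int width; int height; };

// Assumed fallback when neither CSS nor the resource itself gives a size.
const Size kFallbackSize = {300, 150};

// Point 3: CSS wins, then the size stored in the resource (for example JPG
// metadata), then the default. A null pointer means "not specified".
Size resolveReplacedSize(const Size* cssSize, const Size* intrinsicSize) {
    if (cssSize) return *cssSize;
    if (intrinsicSize) return *intrinsicSize;
    return kFallbackSize;
}

// Point 4: pageWidth = 2 * margin + columns * columnWidth + (columns - 1) * gap,
// solved for columnWidth. Returns 0 when the inputs leave no room for text.
int columnWidth(int pageWidth, int margin, int gap, int columns) {
    if (columns <= 0) return 0;
    int available = pageWidth - 2 * margin - (columns - 1) * gap;
    return available > 0 ? available / columns : 0;
}
```

For a page 1000 units wide with margins of 50, gaps of 20, and three columns, 860 units remain for text, so `columnWidth(1000, 50, 20, 3)` gives 286 per column after integer division.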

After studying the Layout and Paint processes above repeatedly, we have the following observations.

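1. Layout is computationally heavy. The cost comes mainly from the fact that after the width, and especially the height, of each RenderTree node has been estimated, the positions of that node's ancestors and neighboring siblings have to be adjusted accordingly. For a text paragraph the height depends on the number of characters, the typeface, and the font size, so it is hard to estimate accurately.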

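Could Layout be merged with the first Paint pass? A single traversal of all the RenderTree leaves would draw the images and set the text. Once that Paint pass ends, the (X,Y) origin, width, and height of each leaf's rectangle would follow directly. Then, working upward from the leaves, the origin, width, and height of each intermediate node in the RenderTree could be determined step by step. The benefit would be a much lower cost for Layout.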

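2. Layout assumes that every RenderTree node corresponds to a rectangular screen area. Under that constraint, an effect like the one in Figure 2 cannot be displayed. Could the constraint be lifted? SVG offers not only powerful drawing capabilities but also powerful layout capabilities. Could CSS be treated as a subset of the SVG format?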