Hyperlink Analysis for the Web

        Information retrieval is a computer science subfield whose goal is to find all documents relevant to a user query in a given collection of documents. As such, information retrieval should really be called document retrieval. Before the advent of the Web, IR systems were typically installed in libraries for use mostly by reference librarians. The retrieval algorithm for these systems was usually based exclusively on analysis of the words in the document.
        The Web changed all this. Now each Web user has access to various search engines whose retrieval algorithms often use not only the words in the documents but also information like the hyperlink structure of the Web or markup language tags.
        How are hyperlinks useful? The hyperlink functionality alone (that is, the hyperlink to Web page B that is contained in Web page A) is not directly useful in information retrieval. However, the way Web page authors use hyperlinks gives the links valuable information content. Authors usually create hyperlinks they think will be useful to readers. Some may be navigational aids that, for example, take the reader back to the site's home page; others provide access to documents that augment the content of the current page. The latter tend to point to high-quality pages that might be on the same topic as the page containing the hyperlink. Web information retrieval systems can exploit this information to refine searches for relevant documents.
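        To make the distinction between navigational links and content links concrete, here is a minimal Python sketch. It is not the method of any particular search engine. It assumes, as a rough heuristic, that a link between two pages on the same host is navigational and that a link arriving from a different host is a recommendation by another author. The function names and the example URLs are hypothetical, invented for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def host(url: str) -> str:
    """Return the lowercase host part of a URL (empty string if absent)."""
    return (urlsplit(url).hostname or "").lower()


def cross_site_indegree(links):
    """Count, for each target page, the links arriving from a different host.

    `links` is an iterable of (source_url, target_url) pairs. Links within
    one host are treated as navigational and skipped; this is a simplifying
    assumption, not a rule stated in the article.
    """
    counts = Counter()
    seen = set()
    for source, target in links:
        if host(source) == host(target):
            continue  # likely navigational, e.g. a link back to the home page
        if (source, target) in seen:
            continue  # count each source page at most once per target
        seen.add((source, target))
        counts[target] += 1
    return counts


# Hypothetical link data, made up for this example.
links = [
    ("http://a.example/page1", "http://a.example/"),       # navigational
    ("http://a.example/page1", "http://b.example/paper"),
    ("http://c.example/notes", "http://b.example/paper"),
    ("http://c.example/notes", "http://c.example/index"),  # navigational
    ("http://b.example/paper", "http://d.example/data"),
]

scores = cross_site_indegree(links)
for page, count in scores.most_common():
    print(page, count)
```

        A retrieval system could use such a count as one extra signal next to word-based relevance. Real systems would need a better test for navigational links than a host comparison, since one organization can span several hosts and one host can carry pages by many authors.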
        Hyperlink analysis significantly improves the relevance of the search results, so much so that all major Web search engines claim to use some type of hyperlink analysis. However, the search engines do not disclose details about the type of hyperlink analysis they perform—mostly to avoid manipulation of search results by Web-positioning companies.
        In this article, I discuss how hyperlink analysis can be applied to ranking algorithms, and survey other ways Web search engines can use this analysis.

This excerpt is from Monika R. Henzinger's article "Hyperlink Analysis for the Web"; the text above is only its overview. Interested readers can search for the full article on Google.