A Detailed Look at N-Gram Models in Natural Language Processing

<body>    <div id="container">        <div id="header">    <div class="header">        <div id="blog_title">
<h2>                <a href="http://blog.csdn.net/baimafujinji">白马负金羁</a></h2>            <h3>Data Mining | Statistical Analysis | Image Processing | Programming</h3>            <div class="clear">            </div>        </div>        <div class="clear">        </div>                 </div></div>
<div id="body">            <div id="main">                <div class="main">
<link href="http://static.blog.csdn.net/css/style1.css" type="text/css" rel="stylesheet"><link rel="stylesheet" href="http://static.blog.csdn.net/public/res-min/markdown_views.css?v=1.0">
<div id="article_details" class="details">    <div class="article_title">            <span class="ico ico_type_Original"></span>    <h1>        <span class="link_title"><a href="/baimafujinji/article/details/51281816">A Detailed Look at N-Gram Models in Natural Language Processing</a></span>    </h1></div>
<div class="article_manage clearfix">        <div class="article_l">            <span class="link_categories">Tags: <a href="http://www.csdn.net/tag/NLP" target="_blank">NLP</a> <a href="http://www.csdn.net/tag/N-Gram" target="_blank">N-Gram</a> <a href="http://www.csdn.net/tag/%e8%87%aa%e7%84%b6%e8%af%ad%e8%a8%80%e5%a4%84%e7%90%86" target="_blank">Natural Language Processing</a> <a href="http://www.csdn.net/tag/%e6%a8%a1%e7%b3%8a%e5%8c%b9%e9%85%8d" target="_blank">Fuzzy Matching</a> <a href="http://www.csdn.net/tag/%e7%bc%96%e8%be%91%e8%b7%9d%e7%a6%bb" target="_blank">Edit Distance</a></span>        </div>        <div class="article_r">            <span class="link_postdate">2016-04-29 21:32</span>            <span class="link_view" title="View count">21293 views</span>            <span class="link_comments" title="Comment count"> <a href="#comments">Comments</a>(1)</span>        </div>    </div>
<div class="category clearfix">        <div class="category_l">            <span>Category:</span>        </div>        <div class="category_r">            <span>Natural Language Processing and Information Retrieval<em>(15)</em></span>        </div>    </div>
<div class="bog_copyright">                     <p class="copyright_p">Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.</p>        </div>
<div style="clear:both"></div><div style="border:solid 1px #ccc; background:#eee; float:left; min-width:200px;padding:4px 10px;"><p style="margin:0;">Contents</p><ol style="margin-left:14px;padding-left:14px;line-height:160%;"><li><a href="#t0">String Distance Defined via the N-Gram Model</a><ol><li><a href="#t1">N-Gram in Fuzzy Matching</a></li><li><a href="#t2">A Java Example of Computing the Distance Between Strings with N-Gram</a></li></ol></li><li><a href="#t3">Using the N-Gram Model to Assess Whether a Sentence Is Plausible</a></li><li><a href="#t4">Data Smoothing Algorithms for N-Gram Models</a></li><li><a href="#t5">A Final Word</a></li><li><a href="#t6">Recommended Reading and References</a></li></ol></div><div style="clear:both"></div>
<div id="article_content" class="article_content">        <div class="markdown_views"><p>The N-Gram (sometimes also called the N-gram model) is a very important concept in natural language processing. In NLP, given a corpus, N-Grams can be used to predict or assess whether a sentence is plausible. N-Grams can also be used to measure how different two strings are, which is a common technique in fuzzy matching. This article starts from that second use and then shows the reader a range of powerful applications of N-Grams in natural language processing.</p>
<ul><li><strong>String Distance Defined via the N-Gram Model</strong></li><li><strong>Using the N-Gram Model to Assess Whether a Sentence Is Plausible</strong></li><li><strong>Data Smoothing Algorithms for N-Gram Models</strong></li></ul>
<p>You are welcome to follow 白马负金羁's blog at <a href="http://blog.csdn.net/baimafujinji" target="_blank">http://blog.csdn.net/baimafujinji</a>. To make sure that formulas and figures display correctly, it is strongly recommended that you read the original post at that address. The blog's <strong>main areas of focus</strong> include digital image processing, algorithm design and analysis, data structures, machine learning, data mining, statistical analysis methods, and natural language processing.</p>
<hr><h2 id="基于n-gram模型定义的字符串距离"><a name="t0"></a>String Distance Defined via the N-Gram Model</h2>
<blockquote>  <p>In natural language processing, one of the most common and most basic operations is "pattern matching", also called "string searching". Pattern matching (string searching) comes in two kinds: <strong>exact matching</strong> and <strong>fuzzy matching</strong>.</p></blockquote>
<p>Exact matching should be familiar to everyone. For example, counting how many times the keyword "<em>information</em>" occurs in an article is exact pattern matching. There are many algorithms for this, and they are usually covered in the required foundational courses of computer science programs, for example the KMP, BM, and BMH algorithms.</p>
<p>The other kind is fuzzy matching, and its applications are also everywhere. For example, word processors (such as Microsoft Word) generally provide spell checking. When you type a misspelled word such as "<em>informtaion</em>", the system asks whether the word you meant to type is actually "<em>information</em>". The technique used to map a possibly misspelled word to a suggested correct spelling is fuzzy matching.</p>
<p>The key to fuzzy matching is how to measure the "difference" between two words (or strings) that look very similar. This difference is usually called a "distance". There are many concrete algorithms for it. For example, building on the concept of edit distance, people designed the Smith-Waterman algorithm and the Needleman-Wunsch algorithm; the latter was one of the earliest applications of dynamic programming to comparing biological sequences. Both algorithms remain important in bioinformatics, where researchers often use them to compute the "difference" (or "distance") between two DNA sequence fragments. There is even a LeetCode problem, <a href="https://leetcode.com/problems/edit-distance/" target="_blank">"No.72  Edit Distance"</a>, which at its core tests the dynamic-programming technique that these two algorithms are built on. Clearly, such problems are never far away from us.</p>
<h3 id="n-gram在模糊匹配中的应用"><a 
name="t1"></a><strong>N-Gram在模糊匹配中的应用</strong></h3><p>事实上,笔者在新出版的<a href="http://blog.csdn.net/baimafujinji/article/details/50484348" target="_blank">《算法之美——隐匿在数据结构背后的原理》</a>一书中已经详细介绍了包括Needleman-Wunsch算法、Smith-Waterman算法、N-Gram算法、Soundex算法、Phonix算法等在内的多种距离定义算法(或模糊匹配算法)。而今天为了引出N-Gram模型在NLP中的其他应用,我们首先来介绍一下如何利用N-Gram来定义字符串之间的距离。</p><p>我们除了可以定义两个字符串之间的编辑距离(通常利用Needleman-Wunsch算法或Smith-Waterman算法)之外,还可以定义它们之间的N-Gram距离。N-Gram(有时也称为N元模型)是自然语言处理中一个非常重要的概念。假设有一个字符串 <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-1-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-1" style="width: 0.539em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.42em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-2"><span class="mi" id="MathJax-Span-3" style="font-family: STIXGeneral-Italic;">s</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-1">s</script>,那么该字符串的N-Gram就表示按长度 N 切分原词得到的词段,也就是 <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-2-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-4" style="width: 0.539em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.42em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-5"><span class="mi" id="MathJax-Span-6" style="font-family: STIXGeneral-Italic;">s</span></span><span style="display: inline-block; width: 
0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-2">s</script> 中所有长度为 N 的子字符串。设想如果有两个字符串,然后分别求它们的N-Gram,那么就可以从它们的共有子串的数量这个角度去定义两个字符串间的N-Gram距离。但是仅仅是简单地对共有子串进行计数显然也存在不足,这种方案显然忽略了两个字符串长度差异可能导致的问题。比如字符串 girl 和 girlfriend,二者所拥有的公共子串数量显然与 girl 和其自身所拥有的公共子串数量相等,但是我们并不能据此认为 girl 和girlfriend 是两个等同的匹配。</p><p>为了解决该问题,有学者便提出以非重复的N-Gram分词为基础来定义 N-Gram距离这一概念,可以用下面的公式来表述: <br><span class="MathJax_Preview"></span></p><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-3-Frame"><nobr><span class="math" id="MathJax-Span-7" style="width: 18.574em; display: inline-block;"><span style="display: inline-block; position: relative; width: 15.479em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-8"><span class="texatom" id="MathJax-Span-9"><span class="mrow" id="MathJax-Span-10"><span class="mo" id="MathJax-Span-11" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-12"><span style="display: inline-block; position: relative; width: 1.313em; height: 0px;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-13" style="font-family: STIXGeneral-Italic;">G</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.717em;"><span class="mi" id="MathJax-Span-14" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">N<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: 
inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-15" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-16" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-17" style="font-family: STIXGeneral-Regular;">)</span><span class="texatom" id="MathJax-Span-18"><span class="mrow" id="MathJax-Span-19"><span class="mo" id="MathJax-Span-20" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-21" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">+</span><span class="texatom" id="MathJax-Span-22" style="padding-left: 0.241em;"><span class="mrow" id="MathJax-Span-23"><span class="mo" id="MathJax-Span-24" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-25"><span style="display: inline-block; position: relative; width: 1.313em; height: 0px;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-26" style="font-family: STIXGeneral-Italic;">G</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.717em;"><span class="mi" id="MathJax-Span-27" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">N<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-28" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-29" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-30" style="font-family: STIXGeneral-Regular;">)</span><span class="texatom" 
id="MathJax-Span-31"><span class="mrow" id="MathJax-Span-32"><span class="mo" id="MathJax-Span-33" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-34" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">−</span><span class="mn" id="MathJax-Span-35" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">2</span><span class="mo" id="MathJax-Span-36" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="texatom" id="MathJax-Span-37" style="padding-left: 0.241em;"><span class="mrow" id="MathJax-Span-38"><span class="mo" id="MathJax-Span-39" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-40"><span style="display: inline-block; position: relative; width: 1.313em; height: 0px;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-41" style="font-family: STIXGeneral-Italic;">G</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.717em;"><span class="mi" id="MathJax-Span-42" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">N<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-43" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-44" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-45" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-46" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">∩</span><span class="msubsup" id="MathJax-Span-47" style="padding-left: 0.241em;"><span style="display: inline-block; position: relative; width: 1.313em; height: 
0px;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-48" style="font-family: STIXGeneral-Italic;">G</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.717em;"><span class="mi" id="MathJax-Span-49" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">N<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-50" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-51" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-52" style="font-family: STIXGeneral-Regular;">)</span><span class="texatom" id="MathJax-Span-53"><span class="mrow" id="MathJax-Span-54"><span class="mo" id="MathJax-Span-55" style="font-family: STIXGeneral-Regular;">|</span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-3"> |G_N(s)| +  |G_N(t)| - 2\times |G_N(s)\cap G_N(t)|</script> <br>Here, <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-4-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-56" style="width: 3.336em; display: inline-block;"><span style="display: inline-block; position: relative; width: 2.741em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: 
-2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-57"><span class="texatom" id="MathJax-Span-58"><span class="mrow" id="MathJax-Span-59"><span class="mo" id="MathJax-Span-60" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-61"><span style="display: inline-block; position: relative; width: 1.313em; height: 0px;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-62" style="font-family: STIXGeneral-Italic;">G</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.717em;"><span class="mi" id="MathJax-Span-63" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">N<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-64" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-65" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-66" style="font-family: STIXGeneral-Regular;">)</span><span class="texatom" id="MathJax-Span-67"><span class="mrow" id="MathJax-Span-68"><span class="mo" id="MathJax-Span-69" style="font-family: STIXGeneral-Regular;">|</span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-4"> |G_N(s)| </script> is the number of N-grams in the set built from the string <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-5-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-70" 
style="width: 0.539em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.42em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-71"><span class="mi" id="MathJax-Span-72" style="font-family: STIXGeneral-Italic;">s</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-5">s</script>, and N is usually taken to be 2 or 3. Taking N = 2 as an example, splitting the strings Gorbachev and Gorbechyov into segments gives the result below (the shared substrings are underlined). <br><p></p><center> <br><img src="http://img.blog.csdn.net/20160429144714271" width="250"> <br></center> <br>Gorbachev yields 8 bigrams, Gorbechyov yields 9, and 4 of them (Go, or, rb, ch) are shared. Plugging these counts into the formula above, the distance between the two strings is 8 + 9 − 2 × 4 = 9. Clearly, the smaller the distance between two strings, the closer they are. When the two strings are identical, the distance is 0.<p></p><h3 id="利用n-gram计算字符串间距离的java实例"><a name="t2"></a><strong>A Java Example: Computing the Distance Between Strings with N-Grams</strong></h3><p>In the book <a href="http://blog.csdn.net/baimafujinji/article/details/50484348" target="_blank">《算法之美——隐匿在数据结构背后的原理》</a> we gave a C++ function that computes the N-Gram distance between two strings. Since all of the book's code has already been published on this blog, it is not repeated here. In fact, the libraries or toolboxes of many languages already provide a ready-made function for computing an N-Gram distance. The example below shows how to use the N-Gram distance in <a href="http://lib.csdn.net/base/javase" class="replace_word" title="Java SE知识库" target="_blank" style="color:#df3434; font-weight:bold;">Java</a>.</p><p>Two notes on this example:</p><ul><li>Calling the function requires the Lucene JAR on the classpath; I used lucene-suggest-5.0.0.jar.</li><li>The algorithm given earlier produces an absolute distance score. The Java function returns a normalized score instead: the result is a floating-point number between 0 and 1, where 0 means the two strings are completely different and 1 means they are identical.</li></ul><pre class="prettyprint" name="code"><code class="language-java hljs  has-numbering"><span class="hljs-keyword">import</span> org.apache.lucene.search.spell.*;

<span class="hljs-keyword">public</span> <span class="hljs-class"><span class="hljs-keyword">class</span> <span 
class="hljs-title">NGram_distance</span> {</span>
    <span class="hljs-keyword">public</span> <span class="hljs-keyword">static</span> <span class="hljs-keyword">void</span> <span class="hljs-title">main</span>(String[] args) {
        // The no-argument constructor compares bigrams (N = 2).
        NGramDistance ng = <span class="hljs-keyword">new</span> NGramDistance();

        // getDistance returns a normalized score in [0, 1]; 1 means identical strings.
        <span class="hljs-keyword">float</span> score1 = ng.getDistance(<span class="hljs-string">"Gorbachev"</span>, <span class="hljs-string">"Gorbechyov"</span>);
        System.out.println(score1);

        <span class="hljs-keyword">float</span> score2 = ng.getDistance(<span class="hljs-string">"girl"</span>, <span class="hljs-string">"girlfriend"</span>);
        System.out.println(score2);
    }
}</code></pre><p>Interested readers can add the JAR to the build path and run the program above in Eclipse. As expected, Gorbachev and Gorbechyov receive a fairly high score (= 0.7), which indicates that the two strings are close, whereas girl and girlfriend receive a low score (= 0.3999), which indicates that they are not very close.</p><hr><h2 id="利用n-gram模型评估语句是否合理"><a name="t3"></a>Using the N-Gram Model to Evaluate Whether a Sentence Is Plausible</h2><p>From here on, the N-Gram model we discuss looks quite different on the surface from the one covered above, but note the underlying connection between them (in essence, they are still the same concept).</p><p>To introduce this application of N-Grams, let us start with a few examples. <br>First, from a statistical point of view, a sentence <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-6-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-73" style="width: 0.539em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.42em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em 
-0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-74"><span class="mi" id="MathJax-Span-75" style="font-family: STIXGeneral-Italic;">s</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-6">s</script> in a natural language can be made up of any string of words, but its probability <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-7-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-76" style="width: 2.027em; display: inline-block;"><span style="display: inline-block; position: relative; width: 1.67em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-77"><span class="mi" id="MathJax-Span-78" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-79" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-80" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-81" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-7">P(s)</script> may be large or small. For example:</p><ul><li><span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-8-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-82" style="width: 1.015em; display: inline-block;"><span style="display: inline-block; position: relative; width: 
0.836em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.42em 1000em 1.372em -0.533em); top: -1.009em; left: 0.003em;"><span class="mrow" id="MathJax-Span-83"><span class="msubsup" id="MathJax-Span-84"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-85" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-86" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 1.015em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.861em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-8">s_1</script> = 我刚吃过晚饭</li><li><span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-9-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-87" style="width: 1.015em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.42em 1000em 1.372em -0.533em); top: -1.009em; left: 0.003em;"><span class="mrow" id="MathJax-Span-88"><span class="msubsup" id="MathJax-Span-89"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-90" style="font-family: 
STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-91" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 1.015em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.861em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-9">s_2</script> = 刚我过晚饭吃</li></ul><p>Clearly, in Chinese, <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-10-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-92" style="width: 1.015em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.42em 1000em 1.372em -0.533em); top: -1.009em; left: 0.003em;"><span class="mrow" id="MathJax-Span-93"><span class="msubsup" id="MathJax-Span-94"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-95" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-96" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 1.015em;"></span></span></span><span 
style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.861em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-10">s_1</script> ("I have just had dinner") is a fluent and meaningful sentence, whereas <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-11-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-97" style="width: 1.015em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.42em 1000em 1.372em -0.533em); top: -1.009em; left: 0.003em;"><span class="mrow" id="MathJax-Span-98"><span class="msubsup" id="MathJax-Span-99"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-100" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-101" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 1.015em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.861em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-11">s_2</script> (the same characters in scrambled order) is not, so for Chinese, <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-12-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-102" style="width: 6.61em; display: inline-block;"><span 
style="display: inline-block; position: relative; width: 5.479em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-103"><span class="mi" id="MathJax-Span-104" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-105" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-106"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-107" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-108" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-109" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-110" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mi" id="MathJax-Span-111" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">P</span><span class="mo" id="MathJax-Span-112" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-113"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-114" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" 
id="MathJax-Span-115" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-116" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-12">P(s_1)>P(s_2)</script>. For a different language, however, the ordering of these two probabilities may be reversed.</p><p>As a second example, given a fragment of a sentence, we can often guess what the next word should be. For example:</p><ul><li>the large green <strong><em>__</em></strong> .    Possible answers might be “mountain” or “tree”?</li><li>Kate swallowed the large green  <strong><em>__</em></strong> .    Possible answers might be “pill” or “broccoli”?</li></ul><p>Clearly, the more of the preceding text we know, the more accurate our guess becomes. This tells us that the more preceding (history) information we have, the more strongly it constrains the unknown information that follows.</p><p>Suppose we have a sequence of <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-13-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-117" style="width: 0.896em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.717em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-118"><span class="mi" id="MathJax-Span-119" style="font-family: STIXGeneral-Italic;">m</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-13">m</script> words (in other words, a sentence) and want to compute the probability <span 
class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-14-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-120" style="width: 8.515em; display: inline-block;"><span style="display: inline-block; position: relative; width: 7.086em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-121"><span class="mi" id="MathJax-Span-122" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-123" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-124"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-125" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-126" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-127" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-128" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-129" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-130" style="font-size: 70.7%; font-family: 
STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-131" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-132" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-133" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-134" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-135" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-136" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-137" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-14">P(w_1, w_2, \cdots, w_m)</script>. By the chain rule, we obtain <br><span class="MathJax_Preview"></span></p><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-15-Frame"><nobr><span class="math" id="MathJax-Span-138" style="width: 34.289em; display: inline-block;"><span style="display: inline-block; position: relative; width: 28.574em; height: 0px; font-size: 
120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-139"><span class="mi" id="MathJax-Span-140" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-141" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-142"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-143" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-144" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-145" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-146" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-147" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-148" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-149" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-150" style="font-family: STIXGeneral-Regular; padding-left: 
0.182em;">⋯</span><span class="mo" id="MathJax-Span-151" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-152" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-153" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-154" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-155" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-156" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mi" id="MathJax-Span-157" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">P</span><span class="mo" id="MathJax-Span-158" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-159"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-160" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-161" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-162" style="font-family: STIXGeneral-Regular;">)</span><span 
class="mi" id="MathJax-Span-163" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-164" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-165"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-166" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-167" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-168"><span class="mrow" id="MathJax-Span-169"><span class="mo" id="MathJax-Span-170" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-171"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-172" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-173" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-174" style="font-family: STIXGeneral-Regular;">)</span><span class="mi" id="MathJax-Span-175" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-176" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" 
id="MathJax-Span-177"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-178" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-179" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">3</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-180"><span class="mrow" id="MathJax-Span-181"><span class="mo" id="MathJax-Span-182" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-183"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-184" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-185" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-186" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-187" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-188" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 
0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-189" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-190" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-191" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mi" id="MathJax-Span-192" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-193" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-194"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-195" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-196" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-197"><span class="mrow" id="MathJax-Span-198"><span class="mo" id="MathJax-Span-199" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-200"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-201" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: 
absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-202" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-203" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-204" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-205" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-206" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 2.086em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-207" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-208"><span class="mrow" id="MathJax-Span-209"><span class="mi" id="MathJax-Span-210" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span class="mo" id="MathJax-Span-211" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-212" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-213" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" 
id="MathJax-Element-15">P(w_1, w_2, \cdots, w_m)=P(w_1)P(w_2|w_1)P(w_3|w_1,w_2)\cdots P(w_m|w_1,\cdots ,w_{m-1})</script> <br>This probability is clearly hard to compute directly. We can instead apply the Markov assumption: the current word depends only on a limited number of the words immediately before it, so there is no need to trace back to the very first word, which greatly shortens the expression above. That is, <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-16-Frame"><nobr><span class="math" id="MathJax-Span-214" style="width: 20.658em; display: inline-block;"><span style="display: inline-block; position: relative; width: 17.205em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-215"><span class="mi" id="MathJax-Span-216" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-217" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-218"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-219" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-220" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-221"><span class="mrow" id="MathJax-Span-222"><span class="mo" id="MathJax-Span-223" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-224"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; 
left: 0.003em;"><span class="mi" id="MathJax-Span-225" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-226" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-227" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-228" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-229" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-230" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-231" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-232"><span class="mrow" id="MathJax-Span-233"><span class="mi" id="MathJax-Span-234" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-235" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-236" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-237" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-238" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mi" 
id="MathJax-Span-239" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">P</span><span class="mo" id="MathJax-Span-240" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-241"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-242" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-243" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-244"><span class="mrow" id="MathJax-Span-245"><span class="mo" id="MathJax-Span-246" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-247"><span style="display: inline-block; position: relative; width: 2.622em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-248" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-249"><span class="mrow" id="MathJax-Span-250"><span class="mi" id="MathJax-Span-251" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-252" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mi" id="MathJax-Span-253" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-254" style="font-size: 70.7%; font-family: 
STIXGeneral-Regular;">+</span><span class="mn" id="MathJax-Span-255" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-256" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-257" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-258" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-259" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-260" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-261"><span class="mrow" id="MathJax-Span-262"><span class="mi" id="MathJax-Span-263" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-264" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-265" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-266" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" 
id="MathJax-Element-16">P(w_i| w_1, \cdots, w_{i-1})=P(w_i|w_{i-n+1},\cdots ,w_{i-1})</script> <br>In particular, when <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-17-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-267" style="width: 0.598em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.479em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-268"><span class="mi" id="MathJax-Span-269" style="font-family: STIXGeneral-Italic;">n</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-17">n</script> takes a small value: <br>When <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-18-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-270" style="width: 2.741em; display: inline-block;"><span style="display: inline-block; position: relative; width: 2.265em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-271"><span class="mi" id="MathJax-Span-272" style="font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-273" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-274" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">1</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; 
height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-18">n=1</script>, the unigram model is <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-19-Frame"><nobr><span class="math" id="MathJax-Span-275" style="width: 14.527em; display: inline-block;"><span style="display: inline-block; position: relative; width: 12.086em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.777em 1000em 3.991em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-276"><span class="mi" id="MathJax-Span-277" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-278" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-279"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-280" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-281" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-282" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-283" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-284" style="font-family: 
STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-285" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-286" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-287" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-288" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-289" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-290" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-291" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-292" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-293" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="munderover" id="MathJax-Span-294" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 1.372em; height: 0px;"><span style="position: absolute; clip: rect(1.848em 1000em 3.634em -0.473em); top: -2.973em; left: 0.003em;"><span class="mo" id="MathJax-Span-295" style="font-family: STIXSizeOneSym; vertical-align: -0.533em;">∏</span><span style="display: 
inline-block; width: 0px; height: 2.979em;"></span></span><span style="position: absolute; clip: rect(1.432em 1000em 2.384em -0.473em); top: -0.949em; left: 0.182em;"><span class="texatom" id="MathJax-Span-296"><span class="mrow" id="MathJax-Span-297"><span class="mi" id="MathJax-Span-298" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-299" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">=</span><span class="mn" id="MathJax-Span-300" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span><span style="position: absolute; clip: rect(1.491em 1000em 2.265em -0.533em); top: -3.271em; left: 0.42em;"><span class="mi" id="MathJax-Span-301" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mi" id="MathJax-Span-302" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-303" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-304"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-305" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-306" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-307" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 
2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 3.575em; vertical-align: -1.568em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-19">P(w_1, w_2, \cdots, w_m)=\prod_{i=1}^mP(w_i)</script> <br>When <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-20-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-308" style="width: 2.741em; display: inline-block;"><span style="display: inline-block; position: relative; width: 2.265em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-309"><span class="mi" id="MathJax-Span-310" style="font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-311" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-312" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">2</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-20">n=2</script>, the bigram model is <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-21-Frame"><nobr><span class="math" id="MathJax-Span-313" style="width: 16.908em; display: inline-block;"><span style="display: inline-block; position: relative; width: 14.051em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.777em 1000em 3.991em -0.533em); top: -2.557em; left: 
0.003em;"><span class="mrow" id="MathJax-Span-314"><span class="mi" id="MathJax-Span-315" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-316" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-317"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-318" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-319" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-320" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-321" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-322" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-323" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-324" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-325" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-326" style="font-family: STIXGeneral-Regular; padding-left: 
0.182em;">,</span><span class="msubsup" id="MathJax-Span-327" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-328" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-329" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-330" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-331" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="munderover" id="MathJax-Span-332" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 1.372em; height: 0px;"><span style="position: absolute; clip: rect(1.848em 1000em 3.634em -0.473em); top: -2.973em; left: 0.003em;"><span class="mo" id="MathJax-Span-333" style="font-family: STIXSizeOneSym; vertical-align: -0.533em;">∏</span><span style="display: inline-block; width: 0px; height: 2.979em;"></span></span><span style="position: absolute; clip: rect(1.432em 1000em 2.384em -0.473em); top: -0.949em; left: 0.182em;"><span class="texatom" id="MathJax-Span-334"><span class="mrow" id="MathJax-Span-335"><span class="mi" id="MathJax-Span-336" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-337" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">=</span><span class="mn" id="MathJax-Span-338" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span><span 
style="position: absolute; clip: rect(1.491em 1000em 2.265em -0.533em); top: -3.271em; left: 0.42em;"><span class="mi" id="MathJax-Span-339" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mi" id="MathJax-Span-340" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-341" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-342"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-343" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-344" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-345"><span class="mrow" id="MathJax-Span-346"><span class="mo" id="MathJax-Span-347" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-348"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-349" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-350"><span class="mrow" id="MathJax-Span-351"><span class="mi" id="MathJax-Span-352" style="font-size: 70.7%; font-family: 
STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-353" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-354" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-355" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 3.575em; vertical-align: -1.568em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-21">P(w_1, w_2, \cdots, w_m)=\prod_{i=1}^mP(w_i|w_{i-1})</script> <br>When <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-22-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-356" style="width: 2.741em; display: inline-block;"><span style="display: inline-block; position: relative; width: 2.265em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-357"><span class="mi" id="MathJax-Span-358" style="font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-359" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-360" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">3</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-22">n=3</script>, 
the trigram model is <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-23-Frame"><nobr><span class="math" id="MathJax-Span-361" style="width: 19.051em; display: inline-block;"><span style="display: inline-block; position: relative; width: 15.836em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(0.777em 1000em 3.991em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-362"><span class="mi" id="MathJax-Span-363" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-364" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-365"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-366" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-367" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-368" style="font-family: STIXGeneral-Regular;">,</span><span class="msubsup" id="MathJax-Span-369" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-370" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 
0.658em;"><span class="mn" id="MathJax-Span-371" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-372" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-373" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-374" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-375" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.253em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-376" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-377" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-378" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-379" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="munderover" id="MathJax-Span-380" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 1.372em; height: 0px;"><span style="position: absolute; clip: rect(1.848em 1000em 3.634em -0.473em); top: -2.973em; left: 0.003em;"><span class="mo" id="MathJax-Span-381" style="font-family: STIXSizeOneSym; vertical-align: -0.533em;">∏</span><span style="display: inline-block; width: 0px; height: 2.979em;"></span></span><span style="position: absolute; clip: rect(1.432em 1000em 2.384em -0.473em); top: -0.949em; left: 
0.182em;"><span class="texatom" id="MathJax-Span-382"><span class="mrow" id="MathJax-Span-383"><span class="mi" id="MathJax-Span-384" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-385" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">=</span><span class="mn" id="MathJax-Span-386" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span><span style="position: absolute; clip: rect(1.491em 1000em 2.265em -0.533em); top: -3.271em; left: 0.42em;"><span class="mi" id="MathJax-Span-387" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">m</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mi" id="MathJax-Span-388" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-389" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-390"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-391" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-392" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-393"><span class="mrow" id="MathJax-Span-394"><span class="mo" id="MathJax-Span-395" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-396"><span style="display: inline-block; position: relative; width: 1.789em; height: 
0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-397" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-398"><span class="mrow" id="MathJax-Span-399"><span class="mi" id="MathJax-Span-400" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-401" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-402" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">2</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="msubsup" id="MathJax-Span-403"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-404" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-405"><span class="mrow" id="MathJax-Span-406"><span class="mi" id="MathJax-Span-407" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-408" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-409" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-410" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 
2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 3.575em; vertical-align: -1.568em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-23">P(w_1, w_2, \cdots, w_m)=\prod_{i=1}^mP(w_i|w_{i-2}w_{i-1})</script> <br>The next step is straightforward: use maximum likelihood estimation to find the set of parameters that maximizes the probability of the training samples.<p></p><ul><li>For the unigram model, let <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-24-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-411" style="width: 5.955em; display: inline-block;"><span style="display: inline-block; position: relative; width: 4.943em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-412"><span class="mi" id="MathJax-Span-413" style="font-family: STIXGeneral-Italic;">C</span><span class="mo" id="MathJax-Span-414" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-415"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-416" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-417" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-418" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-419" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">.</span><span class="mo" 
id="MathJax-Span-420" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">.</span><span class="mo" id="MathJax-Span-421" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-422" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-423" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-424" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-425" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.146em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-24">C(w_1,..,w_n)</script> denote the number of times the n-gram <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-25-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-426" style="width: 4.646em; display: inline-block;"><span style="display: inline-block; position: relative; width: 3.872em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-427"><span class="msubsup" id="MathJax-Span-428"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; 
clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-429" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mn" id="MathJax-Span-430" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-431" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-432" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">.</span><span class="mo" id="MathJax-Span-433" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">.</span><span class="mo" id="MathJax-Span-434" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-435" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.074em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-436" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-437" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.861em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-25">w_1,..,w_n</script> 
occurs in the training corpus, and let <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-26-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-438" style="width: 1.074em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-439"><span class="mi" id="MathJax-Span-440" style="font-family: STIXGeneral-Italic;">M<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.932em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-26">M</script> be the total number of words in the corpus (for example, for the corpus yes no no no yes, <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-27-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-441" style="width: 3.217em; display: inline-block;"><span style="display: inline-block; position: relative; width: 2.682em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.67em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-442"><span class="mi" id="MathJax-Span-443" style="font-family: STIXGeneral-Italic;">M<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span class="mo" id="MathJax-Span-444" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-445" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">5</span></span><span style="display: inline-block; width: 0px; height: 
2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-27">M=5</script>). Then <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-28-Frame"><nobr><span class="math" id="MathJax-Span-446" style="width: 7.265em; display: inline-block;"><span style="display: inline-block; position: relative; width: 6.015em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.015em 1000em 3.455em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-447"><span class="mi" id="MathJax-Span-448" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-449" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-450"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-451" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-452" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-453" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-454" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-455" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; 
width: 2.324em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -3.211em; left: 50%; margin-left: -1.068em;"><span class="mrow" id="MathJax-Span-456"><span class="mi" id="MathJax-Span-457" style="font-family: STIXGeneral-Italic;">C<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-458" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-459"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-460" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-461" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-462" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.414em;"><span class="mi" id="MathJax-Span-463" style="font-family: STIXGeneral-Italic;">M<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.063em;"></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 2.324em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 
0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.575em; vertical-align: -0.925em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-28">P(w_i)=\frac{C(w_i)}{M}</script></li><li>For the bigram model, <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-29-Frame"><nobr><span class="math" id="MathJax-Span-464" style="width: 11.729em; display: inline-block;"><span style="display: inline-block; position: relative; width: 9.765em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.015em 1000em 3.574em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-465"><span class="mi" id="MathJax-Span-466" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-467" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-468"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-469" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-470" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-471"><span class="mrow" id="MathJax-Span-472"><span 
class="mo" id="MathJax-Span-473" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-474"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-475" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-476"><span class="mrow" id="MathJax-Span-477"><span class="mi" id="MathJax-Span-478" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-479" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-480" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-481" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-482" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-483" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 4.11em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -3.211em; left: 50%; margin-left: -1.961em;"><span class="mrow" id="MathJax-Span-484"><span class="mi" id="MathJax-Span-485" style="font-family: STIXGeneral-Italic;">C<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-486" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-487"><span style="display: 
inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-488" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-489"><span class="mrow" id="MathJax-Span-490"><span class="mi" id="MathJax-Span-491" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-492" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-493" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="msubsup" id="MathJax-Span-494"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-495" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-496" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-497" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -1.842em; left: 50%; margin-left: -1.545em;"><span class="mrow" id="MathJax-Span-498"><span class="mi" id="MathJax-Span-499" style="font-family: 
STIXGeneral-Italic;">C<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-500" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-501"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-502" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-503"><span class="mrow" id="MathJax-Span-504"><span class="mi" id="MathJax-Span-505" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-506" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-507" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-508" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 4.11em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.789em; vertical-align: -1.068em; color: rgb(255, 255, 
255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-29">P(w_i|w_{i-1})=\frac{C(w_{i-1}w_i)}{C(w_{i-1})}</script></li><li>对于<span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-30-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-509" style="width: 0.598em; display: inline-block;"><span style="display: inline-block; position: relative; width: 0.479em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-510"><span class="mi" id="MathJax-Span-511" style="font-family: STIXGeneral-Italic;">n</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 0.718em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-30">n</script>-gram model而言, <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-31-Frame"><nobr><span class="math" id="MathJax-Span-512" style="width: 21.67em; display: inline-block;"><span style="display: inline-block; position: relative; width: 18.039em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.015em 1000em 3.574em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-513"><span class="mi" id="MathJax-Span-514" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-515" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-516"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em 
-0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-517" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="mi" id="MathJax-Span-518" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="texatom" id="MathJax-Span-519"><span class="mrow" id="MathJax-Span-520"><span class="mo" id="MathJax-Span-521" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="msubsup" id="MathJax-Span-522"><span style="display: inline-block; position: relative; width: 2.622em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-523" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-524"><span class="mrow" id="MathJax-Span-525"><span class="mi" id="MathJax-Span-526" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-527" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mi" id="MathJax-Span-528" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-529" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-530" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-531" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-532" 
style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-533" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-534" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-535" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-536"><span class="mrow" id="MathJax-Span-537"><span class="mi" id="MathJax-Span-538" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-539" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-540" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-541" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-542" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-543" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 7.801em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -3.211em; left: 50%; margin-left: -3.39em;"><span class="mrow" id="MathJax-Span-544"><span class="mi" id="MathJax-Span-545" style="font-family: STIXGeneral-Italic;">C<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-546" 
style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-547"><span style="display: inline-block; position: relative; width: 2.622em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-548" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-549"><span class="mrow" id="MathJax-Span-550"><span class="mi" id="MathJax-Span-551" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-552" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mi" id="MathJax-Span-553" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-554" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-555" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-556" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-557" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-558" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-559" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 0.896em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-560" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 
2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-561"><span class="mrow" id="MathJax-Span-562"><span class="mi" id="MathJax-Span-563" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-564" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -1.842em; left: 50%; margin-left: -3.807em;"><span class="mrow" id="MathJax-Span-565"><span class="mi" id="MathJax-Span-566" style="font-family: STIXGeneral-Italic;">C<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-567" style="font-family: STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-568"><span style="display: inline-block; position: relative; width: 2.622em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-569" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-570"><span class="mrow" id="MathJax-Span-571"><span class="mi" id="MathJax-Span-572" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-573" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mi" id="MathJax-Span-574" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">n</span><span class="mo" id="MathJax-Span-575" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" 
id="MathJax-Span-576" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-577" style="font-family: STIXGeneral-Regular;">,</span><span class="mo" id="MathJax-Span-578" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">⋯</span><span class="mo" id="MathJax-Span-579" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="msubsup" id="MathJax-Span-580" style="padding-left: 0.182em;"><span style="display: inline-block; position: relative; width: 1.789em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-581" style="font-family: STIXGeneral-Italic;">w</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.658em;"><span class="texatom" id="MathJax-Span-582"><span class="mrow" id="MathJax-Span-583"><span class="mi" id="MathJax-Span-584" style="font-size: 70.7%; font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-585" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">−</span><span class="mn" id="MathJax-Span-586" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span></span></span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-587" style="font-family: STIXGeneral-Regular;">)</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 7.801em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 
0px; height: 1.074em;"></span></span></span></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.789em; vertical-align: -1.068em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-31">P(w_i|w_{i-n+1},\cdots, w_{i-1})=\frac{C(w_{i-n+1},\cdots, w_{i})}{C(w_{i-n+1},\cdots, w_{i-1})}</script></li></ul><p>Consider a concrete example. Suppose we have the following corpus, in which <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-32-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-588" style="width: 6.908em; display: inline-block;"><span style="display: inline-block; position: relative; width: 5.717em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-589"><span class="mo" id="MathJax-Span-590" style="font-family: STIXGeneral-Regular;"><</span><span class="mi" id="MathJax-Span-591" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-592" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-593" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="mi" id="MathJax-Span-594" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-595" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-596" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; 
overflow: hidden; width: 0px; height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-32"><s1><s2></script> are the sentence-start markers and <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-33-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-597" style="width: 7.503em; display: inline-block;"><span style="display: inline-block; position: relative; width: 6.253em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.473em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-598"><span class="mo" id="MathJax-Span-599" style="font-family: STIXGeneral-Regular;"><</span><span class="texatom" id="MathJax-Span-600" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-601"><span class="mo" id="MathJax-Span-602" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-603" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-604" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-605" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="texatom" id="MathJax-Span-606" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-607"><span class="mo" id="MathJax-Span-608" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-609" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-610" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-611" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 
0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.004em; vertical-align: -0.068em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-33"></s2></s1></script> are the sentence-end markers: <br><span class="MathJax_Preview"></span></p><div class="MathJax_Display" role="textbox" aria-readonly="true"><span class="MathJax" id="MathJax-Element-34-Frame"><nobr><span class="math" id="MathJax-Span-612" style="width: 100%; display: inline-block; min-width: 32.682em;"><span style="display: inline-block; position: relative; width: 100%; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(3.158em 1000em 5.717em -0.533em); top: -3.985em; left: 0.003em; width: 100%;"><span class="mrow" id="MathJax-Span-613"><span style="display: inline-block; position: relative; width: 100%; height: 0px;"><span style="position: absolute; clip: rect(3.158em 1000em 4.348em -0.473em); top: -3.985em; left: 50%; margin-left: -12.378em;"><span class="mo" id="MathJax-Span-614" style="font-family: STIXGeneral-Regular;"><</span><span class="mi" id="MathJax-Span-615" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-616" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-617" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="mi" id="MathJax-Span-618" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-619" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-620" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mi" id="MathJax-Span-621" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">y</span><span class="mi" id="MathJax-Span-622" style="font-family: 
STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-623" style="font-family: STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-624" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-625" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-626" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-627" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-628" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-629" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-630" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-631" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-632" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-633" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-634" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-635" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-636" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-637" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-638" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-639" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-640" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span 
class="texatom" id="MathJax-Span-641" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-642"><span class="mo" id="MathJax-Span-643" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-644" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-645" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-646" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="texatom" id="MathJax-Span-647" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-648"><span class="mo" id="MathJax-Span-649" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-650" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-651" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-652" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span><span style="position: absolute; clip: rect(3.158em 1000em 4.348em -0.533em); top: -2.616em; left: 50%; margin-left: -13.568em;"><span class="mspace" id="MathJax-Span-653" style="height: 0.003em; vertical-align: 0.003em; width: 0.003em; display: inline-block; overflow: hidden;"></span><span class="mo" id="MathJax-Span-654" style="font-family: STIXGeneral-Regular;"><</span><span class="mi" id="MathJax-Span-655" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-656" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-657" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="mi" 
id="MathJax-Span-658" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-659" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-660" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mi" id="MathJax-Span-661" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">n</span><span class="mi" id="MathJax-Span-662" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-663" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-664" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-665" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-666" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-667" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-668" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-669" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-670" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-671" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-672" style="font-family: STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-673" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-674" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-675" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-676" style="font-family: 
STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-677" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-678" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-679" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-680" style="font-family: STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-681" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-682" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-683" style="font-family: STIXGeneral-Italic;">o</span><span class="mo" id="MathJax-Span-684" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="texatom" id="MathJax-Span-685" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-686"><span class="mo" id="MathJax-Span-687" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-688" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-689" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-690" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="texatom" id="MathJax-Span-691" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-692"><span class="mo" id="MathJax-Span-693" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-694" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-695" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-696" style="font-family: STIXGeneral-Regular; padding-left: 
0.301em;">></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.861em; vertical-align: -1.925em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-34"><s1><s2>yes \quad no \quad  no  \quad no \quad  no  \quad yes </s2></s1>\\<s1><s2>no  \quad no  \quad no  \quad yes \quad  yes  \quad yes  \quad no </s2></s1></script> <br>Our task now is to estimate the probability of the following sentence: <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-35-Frame"><nobr><span class="math" id="MathJax-Span-697" style="width: 24.646em; display: inline-block;"><span style="display: inline-block; position: relative; width: 20.539em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.473em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-698"><span class="mo" id="MathJax-Span-699" style="font-family: STIXGeneral-Regular;"><</span><span class="mi" id="MathJax-Span-700" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-701" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-702" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="mi" id="MathJax-Span-703" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-704" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-705" style="font-family: STIXGeneral-Regular; padding-left: 
0.301em;">></span><span class="mi" id="MathJax-Span-706" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">y</span><span class="mi" id="MathJax-Span-707" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-708" style="font-family: STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-709" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-710" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-711" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-712" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-713" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-714" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-715" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-716" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-717" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-718" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-719" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="texatom" id="MathJax-Span-720" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-721"><span class="mo" id="MathJax-Span-722" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-723" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-724" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-725" style="font-family: STIXGeneral-Regular; padding-left: 
0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="texatom" id="MathJax-Span-726" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-727"><span class="mo" id="MathJax-Span-728" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-729" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-730" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-731" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.218em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-35"><s1><s2> yes \quad no\quad no\quad yes </s2></s1></script> <br>Using a trigram model, the conditional probabilities estimated from the corpus counts are: <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true"><span class="MathJax" id="MathJax-Element-36-Frame"><nobr><span class="math" id="MathJax-Span-732" style="width: 100%; display: inline-block; min-width: 29.646em;"><span style="display: inline-block; position: relative; width: 100%; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(6.551em 1000em 14.051em -0.533em); top: -8.092em; left: 0.003em; width: 100%;"><span class="mrow" id="MathJax-Span-733"><span style="display: inline-block; position: relative; width: 100%; height: 0px;"><span style="position: absolute; clip: rect(2.443em 1000em 4.884em -0.533em); top: -3.985em; left: 50%; margin-left: -11.366em;"><span class="mi" id="MathJax-Span-734" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-735" style="font-family: STIXGeneral-Regular;">(</span><span 
class="mi" id="MathJax-Span-736" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-737" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-738" style="font-family: STIXGeneral-Italic;">s</span><span class="texatom" id="MathJax-Span-739"><span class="mrow" id="MathJax-Span-740"><span class="mo" id="MathJax-Span-741" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-742" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="mi" id="MathJax-Span-743" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-744" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-745" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">><span style="font-family: STIXGeneral-Regular; font-style: normal; font-weight: normal;"><</span></span><span class="mi" id="MathJax-Span-746" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-747" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-748" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mo" id="MathJax-Span-749" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-750" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-751" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-752" style="font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span 
style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-753" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-754" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="mspace" id="MathJax-Span-755" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-756" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-757" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-758" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-759" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-760" style="font-family: STIXGeneral-Italic;">o</span><span class="texatom" id="MathJax-Span-761"><span class="mrow" id="MathJax-Span-762"><span class="mo" id="MathJax-Span-763" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-764" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="mi" id="MathJax-Span-765" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mn" id="MathJax-Span-766" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-767" style="font-family: 
STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mi" id="MathJax-Span-768" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">y</span><span class="mi" id="MathJax-Span-769" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-770" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-771" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-772" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-773" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">1</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span><span style="position: absolute; clip: rect(2.443em 1000em 4.884em -0.533em); top: -1.485em; left: 50%; margin-left: -9.402em;"><span class="mspace" id="MathJax-Span-774" style="height: 0.003em; vertical-align: 0.003em; width: 0.003em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-775" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-776" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-777" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-778" style="font-family: STIXGeneral-Italic;">o</span><span class="texatom" id="MathJax-Span-779"><span class="mrow" id="MathJax-Span-780"><span class="mo" id="MathJax-Span-781" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-782" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-783" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-784" style="font-family: STIXGeneral-Italic;">s</span><span class="mspace" id="MathJax-Span-785" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" 
id="MathJax-Span-786" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-787" style="font-family: STIXGeneral-Italic;">o</span><span class="mo" id="MathJax-Span-788" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-789" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-790" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-791" style="font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-792" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-793" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="mspace" id="MathJax-Span-794" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-795" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-796" style="font-family: 
STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-797" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-798" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-799" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-800" style="font-family: STIXGeneral-Italic;">s</span><span class="texatom" id="MathJax-Span-801"><span class="mrow" id="MathJax-Span-802"><span class="mo" id="MathJax-Span-803" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-804" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-805" style="font-family: STIXGeneral-Italic;">o</span><span class="mspace" id="MathJax-Span-806" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-807" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-808" style="font-family: STIXGeneral-Italic;">o</span><span class="mo" id="MathJax-Span-809" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-810" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-811" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-812" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.67em 1000em 2.741em -0.473em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-813" style="font-family: 
STIXGeneral-Regular;">5</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span><span style="position: absolute; clip: rect(2.443em 1000em 4.884em -0.533em); top: 1.074em; left: 50%; margin-left: -12.378em;"><span class="mspace" id="MathJax-Span-814" style="height: 0.003em; vertical-align: 0.003em; width: 0.003em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-815" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-816" style="font-family: STIXGeneral-Regular;">(</span><span class="mo" id="MathJax-Span-817" style="font-family: STIXGeneral-Regular;"><</span><span class="texatom" id="MathJax-Span-818" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-819"><span class="mo" id="MathJax-Span-820" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-821" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-822" style="font-family: STIXGeneral-Regular;">2</span><span class="mo" id="MathJax-Span-823" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="texatom" id="MathJax-Span-824" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-825"><span class="mo" id="MathJax-Span-826" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-827" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-828" style="font-family: STIXGeneral-Italic;">o</span><span 
class="mspace" id="MathJax-Span-829" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-830" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-831" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-832" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-833" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-834" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mfrac" id="MathJax-Span-835" style="padding-left: 0.301em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-836" style="font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-837" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-838" style="font-family: STIXGeneral-Regular; padding-left: 0.182em;">,</span><span class="mspace" id="MathJax-Span-839" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: 
inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-840" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-841" style="font-family: STIXGeneral-Italic; padding-left: 0.182em;">P</span><span class="mo" id="MathJax-Span-842" style="font-family: STIXGeneral-Regular;">(</span><span class="mo" id="MathJax-Span-843" style="font-family: STIXGeneral-Regular;"><</span><span class="texatom" id="MathJax-Span-844" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-845"><span class="mo" id="MathJax-Span-846" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-847" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-848" style="font-family: STIXGeneral-Regular;">1</span><span class="mo" id="MathJax-Span-849" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="texatom" id="MathJax-Span-850" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-851"><span class="mo" id="MathJax-Span-852" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-853" style="font-family: STIXGeneral-Italic;">y</span><span class="mi" id="MathJax-Span-854" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-855" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-856" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="texatom" id="MathJax-Span-857" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-858"><span class="mo" id="MathJax-Span-859" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-860" style="font-family: STIXGeneral-Italic;">s</span><span class="mn" id="MathJax-Span-861" style="font-family: STIXGeneral-Regular;">2</span><span 
class="mo" id="MathJax-Span-862" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mo" id="MathJax-Span-863" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-864" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-865" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">1</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span></span><span style="display: inline-block; width: 0px; height: 8.098em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 8.646em; vertical-align: -6.996em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-36">P(yes | <s1><s2>)=\frac{1}{2}, \quad\quad P(no | <s2>yes)=1\\P(no | yes\quad no)=\frac{1}{2}, \quad\quad P(yes | no\quad no)=\frac{2}{5}\\P(</s2> | no\quad yes)=\frac{1}{2}, \quad\quad P(</s1> | yes</s2>)=1</script> <br>The probability we are looking for is therefore the product of these six conditional probabilities: <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true" style="text-align: center;"><span class="MathJax" id="MathJax-Element-37-Frame"><nobr><span class="math" id="MathJax-Span-866" style="width: 15.598em; display: inline-block;"><span style="display: inline-block; position: relative; width: 12.979em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.015em 1000em 3.455em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-867"><span class="mfrac" id="MathJax-Span-868"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-869" style="font-family: 
STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-870" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-871" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-872" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">1</span><span class="mo" id="MathJax-Span-873" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mfrac" id="MathJax-Span-874" style="padding-left: 0.241em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-875" style="font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-876" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 
0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-877" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mfrac" id="MathJax-Span-878" style="padding-left: 0.241em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-879" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.67em 1000em 2.741em -0.473em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-880" style="font-family: STIXGeneral-Regular;">5</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-881" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mfrac" id="MathJax-Span-882" style="padding-left: 0.241em;"><span style="display: inline-block; position: relative; width: 0.598em; height: 0px; margin-right: 0.122em; margin-left: 0.122em;"><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.414em); top: -3.211em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-883" 
style="font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(1.729em 1000em 2.741em -0.533em); top: -1.842em; left: 50%; margin-left: -0.235em;"><span class="mn" id="MathJax-Span-884" style="font-family: STIXGeneral-Regular;">2</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; clip: rect(0.836em 1000em 1.253em -0.533em); top: -1.307em; left: 0.003em;"><span style="border-left: 0.598em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.25px; vertical-align: 0.003em;"></span><span style="display: inline-block; width: 0px; height: 1.074em;"></span></span></span></span><span class="mo" id="MathJax-Span-885" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-886" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">1</span><span class="mo" id="MathJax-Span-887" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-888" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.05</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.575em; vertical-align: -0.925em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-37">\frac{1}{2} \times 1 \times \frac{1}{2} \times \frac{2}{5}\times \frac{1}{2}\times 1=0.05</script><p></p><p>Here is another example, taken from reference [1]. Suppose we have a corpus and have counted how many times each of the following words occurs in it.</p><p></p><center> <br><img src="http://img.blog.csdn.net/20160429205913410" width="400"> <br></center> <br>The following probabilities are also given as known values: <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true"><span class="MathJax" 
id="MathJax-Element-38-Frame"><nobr><span class="math" id="MathJax-Span-889" style="width: 100%; display: inline-block; min-width: 25.955em;"><span style="display: inline-block; position: relative; width: 100%; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(3.098em 1000em 5.777em -0.533em); top: -3.985em; left: 0.003em; width: 100%;"><span class="mrow" id="MathJax-Span-890"><span style="display: inline-block; position: relative; width: 100%; height: 0px;"><span style="position: absolute; clip: rect(3.098em 1000em 4.348em -0.533em); top: -3.985em; left: 50%; margin-left: -10.592em;"><span class="mi" id="MathJax-Span-891" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-892" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-893" style="font-family: STIXGeneral-Italic;">i</span><span class="texatom" id="MathJax-Span-894"><span class="mrow" id="MathJax-Span-895"><span class="mo" id="MathJax-Span-896" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-897" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="mi" id="MathJax-Span-898" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mo" id="MathJax-Span-899" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mo" id="MathJax-Span-900" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-901" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-902" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.25</span><span class="mspace" id="MathJax-Span-903" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-904" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: 
inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-905" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-906" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-907" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-908" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-909" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-910" style="font-family: STIXGeneral-Italic;">g</span><span class="mi" id="MathJax-Span-911" style="font-family: STIXGeneral-Italic;">l<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mi" id="MathJax-Span-912" style="font-family: STIXGeneral-Italic;">i</span><span class="mi" id="MathJax-Span-913" style="font-family: STIXGeneral-Italic;">s</span><span class="mi" id="MathJax-Span-914" style="font-family: STIXGeneral-Italic;">h</span><span class="texatom" id="MathJax-Span-915"><span class="mrow" id="MathJax-Span-916"><span class="mo" id="MathJax-Span-917" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-918" style="font-family: STIXGeneral-Italic;">w</span><span class="mi" id="MathJax-Span-919" style="font-family: STIXGeneral-Italic;">a</span><span class="mi" id="MathJax-Span-920" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-921" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-922" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-923" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-924" style="font-family: STIXGeneral-Regular; 
padding-left: 0.301em;">0.0011</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span><span style="position: absolute; clip: rect(3.098em 1000em 4.348em -0.533em); top: -2.616em; left: 50%; margin-left: -10.83em;"><span class="mspace" id="MathJax-Span-925" style="height: 0.003em; vertical-align: 0.003em; width: 0.003em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-926" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-927" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-928" style="font-family: STIXGeneral-Italic;">f<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.122em;"></span></span><span class="mi" id="MathJax-Span-929" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-930" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-931" style="font-family: STIXGeneral-Italic;">d<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="texatom" id="MathJax-Span-932"><span class="mrow" id="MathJax-Span-933"><span class="mo" id="MathJax-Span-934" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-935" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-936" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-937" style="font-family: STIXGeneral-Italic;">g</span><span class="mi" id="MathJax-Span-938" style="font-family: STIXGeneral-Italic;">l<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mi" id="MathJax-Span-939" style="font-family: STIXGeneral-Italic;">i</span><span class="mi" id="MathJax-Span-940" style="font-family: STIXGeneral-Italic;">s</span><span class="mi" id="MathJax-Span-941" style="font-family: 
STIXGeneral-Italic;">h</span><span class="mo" id="MathJax-Span-942" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-943" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-944" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.5</span><span class="mspace" id="MathJax-Span-945" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-946" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mspace" id="MathJax-Span-947" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-948" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-949" style="font-family: STIXGeneral-Regular;">(</span><span class="mo" id="MathJax-Span-950" style="font-family: STIXGeneral-Regular;"><</span><span class="texatom" id="MathJax-Span-951" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-952"><span class="mo" id="MathJax-Span-953" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-954" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-955" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="texatom" id="MathJax-Span-956" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-957"><span class="mo" id="MathJax-Span-958" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-959" style="font-family: STIXGeneral-Italic;">f<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.122em;"></span></span><span class="mi" id="MathJax-Span-960" style="font-family: STIXGeneral-Italic;">o</span><span 
class="mi" id="MathJax-Span-961" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-962" style="font-family: STIXGeneral-Italic;">d<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-963" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-964" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-965" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.68</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.861em; vertical-align: -1.996em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" id="MathJax-Element-38">P(i|<s>)=0.25\quad \quad \quad P(english|want)=0.0011\\P(food|english)=0.5\quad \quad \quad P(</s>|food)=0.68</script> <br>The table below gives the bigram counts, that is, how often each pair of adjacent words occurs in the corpus. <br><center> <br><img src="http://img.blog.csdn.net/20160429210335363" width="450"> <br></center> <br>For example, the entry in the first row, second column says that when the previous word is “i”, the current word is “want” a total of 827 times. Dividing each count by the total count of its previous word gives the corresponding table of relative frequencies, shown below. <br><center> <br><img src="http://img.blog.csdn.net/20160429210352254" width="450"> <br></center> <br>From Table 1 we know that “i” occurs 2533 times in total, and that it is followed by “want” 827 times, so <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-39-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-966" style="width: 14.17em; display: inline-block;"><span style="display: inline-block; position: relative; width: 11.789em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.67em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-967"><span class="mi" 
id="MathJax-Span-968" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-969" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-970" style="font-family: STIXGeneral-Italic;">w</span><span class="mi" id="MathJax-Span-971" style="font-family: STIXGeneral-Italic;">a</span><span class="mi" id="MathJax-Span-972" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-973" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="texatom" id="MathJax-Span-974"><span class="mrow" id="MathJax-Span-975"><span class="mo" id="MathJax-Span-976" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-977" style="font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-978" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-979" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-980" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">827</span><span class="texatom" id="MathJax-Span-981"><span class="mrow" id="MathJax-Span-982"><span class="mo" id="MathJax-Span-983" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mn" id="MathJax-Span-984" style="font-family: STIXGeneral-Regular;">2533</span><span class="mo" id="MathJax-Span-985" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">≈</span><span class="mn" id="MathJax-Span-986" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.33</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.218em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script 
type="math/tex" id="MathJax-Element-39">P(want|i)=827/2533\approx 0.33</script> <br>Now let <span class="MathJax_Preview"></span><span class="MathJax" id="MathJax-Element-40-Frame" role="textbox" aria-readonly="true"><nobr><span class="math" id="MathJax-Span-987" style="width: 22.801em; display: inline-block;"><span style="display: inline-block; position: relative; width: 18.991em; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(1.729em 1000em 2.92em -0.533em); top: -2.557em; left: 0.003em;"><span class="mrow" id="MathJax-Span-988"><span class="msubsup" id="MathJax-Span-989"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-990" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-991" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-992" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mo" id="MathJax-Span-993" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">“</span><span class="mo" id="MathJax-Span-994" style="font-family: STIXGeneral-Regular;"><</span><span class="mi" id="MathJax-Span-995" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mo" id="MathJax-Span-996" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mi" id="MathJax-Span-997" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">i</span><span class="mspace" id="MathJax-Span-998" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: 
inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-999" style="font-family: STIXGeneral-Italic;">w</span><span class="mi" id="MathJax-Span-1000" style="font-family: STIXGeneral-Italic;">a</span><span class="mi" id="MathJax-Span-1001" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1002" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mspace" id="MathJax-Span-1003" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-1004" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-1005" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1006" style="font-family: STIXGeneral-Italic;">g</span><span class="mi" id="MathJax-Span-1007" style="font-family: STIXGeneral-Italic;">l<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mi" id="MathJax-Span-1008" style="font-family: STIXGeneral-Italic;">i</span><span class="mi" id="MathJax-Span-1009" style="font-family: STIXGeneral-Italic;">s</span><span class="mi" id="MathJax-Span-1010" style="font-family: STIXGeneral-Italic;">h</span><span class="mspace" id="MathJax-Span-1011" style="height: 0.003em; vertical-align: 0.003em; width: 1.134em; display: inline-block; overflow: hidden;"></span><span class="mi" id="MathJax-Span-1012" style="font-family: STIXGeneral-Italic;">f<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.122em;"></span></span><span class="mi" id="MathJax-Span-1013" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1014" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1015" style="font-family: STIXGeneral-Italic;">d<span style="display: inline-block; 
overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-1016" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="texatom" id="MathJax-Span-1017" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-1018"><span class="mo" id="MathJax-Span-1019" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-1020" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-1021" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mo" id="MathJax-Span-1022" style="font-family: STIXGeneral-Regular;">”</span></span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 1.218em; vertical-align: -0.282em; color: rgb(255, 255, 255);"></span></span></nobr></span><script type="math/tex" id="MathJax-Element-40">s_1=“<s> i\quad want\quad english\quad food </s>”</script>. Multiplying the bigram probabilities along the sentence then gives <br><span class="MathJax_Preview"></span><div class="MathJax_Display" role="textbox" aria-readonly="true"><span class="MathJax" id="MathJax-Element-41-Frame"><nobr><span class="math" id="MathJax-Span-1023" style="width: 100%; display: inline-block; min-width: 36.729em;"><span style="display: inline-block; position: relative; width: 100%; height: 0px; font-size: 120%;"><span style="position: absolute; clip: rect(3.158em 1000em 5.598em -0.533em); top: -3.985em; left: 0.003em; width: 100%;"><span class="mrow" id="MathJax-Span-1024"><span style="display: inline-block; position: relative; width: 100%; height: 0px;"><span style="position: absolute; clip: rect(3.158em 1000em 4.348em -0.533em); top: -3.985em; left: 50%; margin-left: -15.295em;"><span class="mi" id="MathJax-Span-1025" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-1026" style="font-family: 
STIXGeneral-Regular;">(</span><span class="msubsup" id="MathJax-Span-1027"><span style="display: inline-block; position: relative; width: 0.836em; height: 0px;"><span style="position: absolute; clip: rect(1.967em 1000em 2.741em -0.533em); top: -2.557em; left: 0.003em;"><span class="mi" id="MathJax-Span-1028" style="font-family: STIXGeneral-Italic;">s</span><span style="display: inline-block; width: 0px; height: 2.562em;"></span></span><span style="position: absolute; top: -1.902em; left: 0.42em;"><span class="mn" id="MathJax-Span-1029" style="font-size: 70.7%; font-family: STIXGeneral-Regular;">1</span><span style="display: inline-block; width: 0px; height: 2.086em;"></span></span></span></span><span class="mo" id="MathJax-Span-1030" style="font-family: STIXGeneral-Regular;">)</span><span class="mo" id="MathJax-Span-1031" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mi" id="MathJax-Span-1032" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">P</span><span class="mo" id="MathJax-Span-1033" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-1034" style="font-family: STIXGeneral-Italic;">i</span><span class="texatom" id="MathJax-Span-1035"><span class="mrow" id="MathJax-Span-1036"><span class="mo" id="MathJax-Span-1037" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mo" id="MathJax-Span-1038" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;"><</span><span class="mi" id="MathJax-Span-1039" style="font-family: STIXGeneral-Italic; padding-left: 0.301em;">s</span><span class="mo" id="MathJax-Span-1040" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="mo" id="MathJax-Span-1041" style="font-family: STIXGeneral-Regular;">)</span><span class="mi" id="MathJax-Span-1042" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-1043" style="font-family: 
STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-1044" style="font-family: STIXGeneral-Italic;">w</span><span class="mi" id="MathJax-Span-1045" style="font-family: STIXGeneral-Italic;">a</span><span class="mi" id="MathJax-Span-1046" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1047" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="texatom" id="MathJax-Span-1048"><span class="mrow" id="MathJax-Span-1049"><span class="mo" id="MathJax-Span-1050" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-1051" style="font-family: STIXGeneral-Italic;">i</span><span class="mo" id="MathJax-Span-1052" style="font-family: STIXGeneral-Regular;">)</span><span class="mi" id="MathJax-Span-1053" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-1054" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-1055" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-1056" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1057" style="font-family: STIXGeneral-Italic;">g</span><span class="mi" id="MathJax-Span-1058" style="font-family: STIXGeneral-Italic;">l<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mi" id="MathJax-Span-1059" style="font-family: STIXGeneral-Italic;">i</span><span class="mi" id="MathJax-Span-1060" style="font-family: STIXGeneral-Italic;">s</span><span class="mi" id="MathJax-Span-1061" style="font-family: STIXGeneral-Italic;">h</span><span class="texatom" id="MathJax-Span-1062"><span class="mrow" id="MathJax-Span-1063"><span class="mo" id="MathJax-Span-1064" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-1065" style="font-family: 
STIXGeneral-Italic;">w</span><span class="mi" id="MathJax-Span-1066" style="font-family: STIXGeneral-Italic;">a</span><span class="mi" id="MathJax-Span-1067" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1068" style="font-family: STIXGeneral-Italic;">t<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-1069" style="font-family: STIXGeneral-Regular;">)</span><span class="mi" id="MathJax-Span-1070" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-1071" style="font-family: STIXGeneral-Regular;">(</span><span class="mi" id="MathJax-Span-1072" style="font-family: STIXGeneral-Italic;">f<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.122em;"></span></span><span class="mi" id="MathJax-Span-1073" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1074" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1075" style="font-family: STIXGeneral-Italic;">d<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="texatom" id="MathJax-Span-1076"><span class="mrow" id="MathJax-Span-1077"><span class="mo" id="MathJax-Span-1078" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-1079" style="font-family: STIXGeneral-Italic;">e</span><span class="mi" id="MathJax-Span-1080" style="font-family: STIXGeneral-Italic;">n</span><span class="mi" id="MathJax-Span-1081" style="font-family: STIXGeneral-Italic;">g</span><span class="mi" id="MathJax-Span-1082" style="font-family: STIXGeneral-Italic;">l<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mi" id="MathJax-Span-1083" style="font-family: STIXGeneral-Italic;">i</span><span class="mi" id="MathJax-Span-1084" style="font-family: 
STIXGeneral-Italic;">s</span><span class="mi" id="MathJax-Span-1085" style="font-family: STIXGeneral-Italic;">h</span><span class="mo" id="MathJax-Span-1086" style="font-family: STIXGeneral-Regular;">)</span><span class="mi" id="MathJax-Span-1087" style="font-family: STIXGeneral-Italic;">P</span><span class="mo" id="MathJax-Span-1088" style="font-family: STIXGeneral-Regular;">(</span><span class="mo" id="MathJax-Span-1089" style="font-family: STIXGeneral-Regular;"><</span><span class="texatom" id="MathJax-Span-1090" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-1091"><span class="mo" id="MathJax-Span-1092" style="font-family: STIXGeneral-Regular;">/</span></span></span><span class="mi" id="MathJax-Span-1093" style="font-family: STIXGeneral-Italic;">s</span><span class="mo" id="MathJax-Span-1094" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">></span><span class="texatom" id="MathJax-Span-1095" style="padding-left: 0.301em;"><span class="mrow" id="MathJax-Span-1096"><span class="mo" id="MathJax-Span-1097" style="font-family: STIXGeneral-Regular;">|</span></span></span><span class="mi" id="MathJax-Span-1098" style="font-family: STIXGeneral-Italic;">f<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.122em;"></span></span><span class="mi" id="MathJax-Span-1099" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1100" style="font-family: STIXGeneral-Italic;">o</span><span class="mi" id="MathJax-Span-1101" style="font-family: STIXGeneral-Italic;">d<span style="display: inline-block; overflow: hidden; height: 1px; width: 0.003em;"></span></span><span class="mo" id="MathJax-Span-1102" style="font-family: STIXGeneral-Regular;">)</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span><span style="position: absolute; clip: rect(3.098em 1000em 4.17em -0.533em); top: -2.616em; left: 50%; margin-left: -9.878em;"><span class="mspace" 
id="MathJax-Span-1103" style="height: 0.003em; vertical-align: 0.003em; width: 0.003em; display: inline-block; overflow: hidden;"></span><span class="mo" id="MathJax-Span-1104" style="font-family: STIXGeneral-Regular;">=</span><span class="mn" id="MathJax-Span-1105" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.25</span><span class="mo" id="MathJax-Span-1106" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-1107" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">0.33</span><span class="mo" id="MathJax-Span-1108" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-1109" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">0.0011</span><span class="mo" id="MathJax-Span-1110" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-1111" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">0.5</span><span class="mo" id="MathJax-Span-1112" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">×</span><span class="mn" id="MathJax-Span-1113" style="font-family: STIXGeneral-Regular; padding-left: 0.241em;">0.68</span><span class="mo" id="MathJax-Span-1114" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">=</span><span class="mn" id="MathJax-Span-1115" style="font-family: STIXGeneral-Regular; padding-left: 0.301em;">0.000031</span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span></span><span style="display: inline-block; width: 0px; height: 3.991em;"></span></span></span><span style="border-left: 0.004em solid; display: inline-block; overflow: hidden; width: 0px; height: 2.646em; vertical-align: -1.782em; color: rgb(255, 255, 255);"></span></span></nobr></span></div><script type="math/tex; mode=display" 
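id="MathJax-Element-41">P(s_1)=P(i|<s>)P(want|i)P(english|want)P(food|english)P(</s>|food)\\=0.25\times 0.33\times 0.0011\times 0.5\times 0.68=0.000031</script><p></p><hr><h2 id="使用n-gram模型时的数据平滑算法"><a name="t4"></a>Data Smoothing Algorithms for N-Gram Models</h2><p>In one study, researchers trained a trigram model on a 1.5-million-word training corpus and then validated it on test data from the same source. They found that 23% of the trigrams in the test data had never appeared in the training corpus. This means that some of the probabilities we computed in the previous section may well be 0, and Table 3 does indeed contain zero entries. Because the probability of a sentence is a product of such terms, a single zero makes the whole product zero. This is the data sparsity problem. For language, because of data sparsity, maximum likelihood estimation is not a good way to estimate the parameters.</p><p>The remedy is known as “smoothing” or “discounting”. The main strategy is to slightly reduce the probability of events that were observed in the training sample and to redistribute the probability mass freed up in this way to events that never appeared in the training corpus. Many smoothing algorithms are used in practice, for example: <br>  ▸ Laplacian (add-one) smoothing <br>  ▸ Add-k smoothing <br>  ▸ Jelinek-Mercer interpolation <br>  ▸ Katz backoff <br>  ▸ Absolute discounting <br>  ▸ Kneser-Ney</p><p>We will discuss these algorithms in detail, with worked examples, in later articles.</p><hr><h2 id="a-final-word"><a name="t5"></a>A Final Word</h2><p>If you have made it through all the tedious, complicated concepts and formulas above, congratulations: you now have a working understanding of the N-Gram model. We have not yet had time to explore smoothing algorithms (they will appear in my next post, if you still want more), but you have already picked up a fairly powerful tool. You may ask what concrete applications the N-Gram model has in practice. To close this article, here are a few examples that you have seen before, or whose implementation you may have wondered about.</p><p><strong>Eg.1</strong> <br>Query suggestions in search engines (Google or Baidu) and candidate prompts in input methods. When you type one or more words into Baidu, the search box usually offers a drop-down list of candidates like the ones in the figure below; these candidates are guesses at the word string you intend to search for. Likewise, when you type a single Chinese character with an input method, it can usually suggest a complete word: if I type “刘” (Liu), the input method will typically ask whether I mean “刘备” (Liu Bei). After the discussion above, you have probably noticed that this is in fact built on N-Gram models. If that thought had already occurred to you, congratulations: you are answering before the question is even asked!</p><p></p><center> <br><img src="http://img.blog.csdn.net/20160501123851358" width="260"> <br></center><p></p><p><strong>Eg.2</strong> <br>Automatically generating text in the style of a particular author or corpus. This is a rather fun topic. Consider the following passage (the example is taken from reference [1]):</p><blockquote>  <p>“You are uniformly charming!” cried he, with a smile of associating and now and then I bowed and they perceived a chaise and four to wish for.</p></blockquote><p>You probably did not notice anything odd about it. In fact, this sentence was not written by a human: it was generated by a computer with a trigram model trained on a corpus of Jane Austen's works. (Jane Austen was a famous English novelist whose works include Pride and Prejudice.)</p><p>Here are two more examples. Can you tell which literary giant (or corpus) each one imitates?</p><ul><li>This shall forbid it should be branded, if renown made it empty.</li><li>They also point to ninety nine point six billion dollars from two hundred four oh three percent of the rates of interest stores as Mexico and Brazil on market conditions.</li></ul><p>The first is Shakespeare and the second is the Wall Street Journal. One last question is left for the reader: what do you think n should be in the n-gram models used to generate the two passages above?</p><hr><h2 id="推荐阅读和参考文献"><a name="t6"></a>Recommended Reading and References:</h2><p>[1] Daniel Jurafsky & James H. Martin. Speech and Language Processing, 3rd ed., Chapter 4. <br>[2] Some of the examples and descriptions in this article come from slides by Chang Baobao (常宝宝) of Peking University and from the University of Melbourne course “Web Search and Text Analysis”.</p></div>        <script type="text/javascript">            $(function () {                $('pre.prettyprint code').each(function () {                    var lines = $(this).text().split('\n').length;                    var $numbering = $('<ul></ul>').addClass('pre-numbering').hide();                    $(this).addClass('has-numbering').parent().append($numbering);                    for (i = 1; i <= lines; i++) {                        $numbering.append($('<li></li>').text(i));                    };                    $numbering.fadeIn(1700);                });            });        </script>   </div><!-- Baidu Button BEGIN --><div class="bdsharebuttonbox tracking-ad bdshare-button-style0-16" style="float: right;" data-mod="popu_172" data-bd-bind="1495108490888"><a href="#" class="bds_more" data-cmd="more" style="background-position:0 0 !important; background-image: url(http://bdimg.share.baidu.com/static/api/img/share/icons_0_16.png?v=d754dcc0.png) !important" target="_blank"></a><a href="#" class="bds_qzone" data-cmd="qzone" title="分享到QQ空间" style="background-position:0 -52px !important" target="_blank"></a><a href="#" class="bds_tsina" data-cmd="tsina" title="分享到新浪微博" style="background-position:0 -104px !important" target="_blank"></a><a href="#" class="bds_tqq" data-cmd="tqq" 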
title="分享到腾讯微博" style="background-position:0 -260px !important" target="_blank"></a><a href="#" class="bds_renren" data-cmd="renren" title="分享到人人网" style="background-position:0 -208px !important" target="_blank"></a><a href="#" class="bds_weixin" data-cmd="weixin" title="分享到微信" style="background-position:0 -1612px !important" target="_blank"></a></div><script>window._bd_share_config = { "common": { "bdSnsKey": {}, "bdText": "", "bdMini": "1", "bdMiniList": false, "bdPic": "", "bdStyle": "0", "bdSize": "16" }, "share": {} }; with (document) 0[(getElementsByTagName('head')[0] || body).appendChild(createElement('script')).src = 'http://bdimg.share.baidu.com/static/api/js/share.js?v=89860593.js?cdnversion=' + ~(-new Date() / 36e5)];</script><!-- Baidu Button END -->   <link rel="stylesheet" href="http://static.blog.csdn.net/css/blog_detail.css">    <!--172.16.140.12--><!-- Baidu Button BEGIN --><script type="text/javascript" id="bdshare_js" data="type=tools&uid=1536434" src="http://bdimg.share.baidu.com/static/js/bds_s_v2.js?cdnversion=415308"></script><script type="text/javascript">    document.getElementById("bdshell_js").src = "http://bdimg.share.baidu.com/static/js/shell_v2.js?cdnversion=" + Math.ceil(new Date()/3600000)</script><!-- Baidu Button END --><script type="text/javascript">    var fromjs = $("#fromjs");    if (fromjs.length > 0) {            $("#fromjs .markdown_views pre").addClass("prettyprint");            prettyPrint();            $('pre.prettyprint code').each(function () {                var lines = $(this).text().split('\n').length;                var $numbering = $('<ul/>').addClass('pre-numbering').hide();                $(this).addClass('has-numbering').parent().append($numbering);                for (i = 1; i <= lines; i++) {                    $numbering.append($('<li/>').text(i));                };                $numbering.fadeIn(1700);            });            $('.pre-numbering li').css("color", "#999");        }        $(".markdown_views 
a[target!='_blank']").attr("target", "_blank");    $(".toc a[target='_blank']").attr("target", "");</script>         <div id="digg" articleid="51281816">            <dl id="btnDigg" class="digg digg_disable" onclick="btndigga();">                                <dt>顶</dt>                <dd>7</dd>            </dl>                                     <dl id="btnBury" class="digg digg_disable" onclick="btnburya();">                                <dt>踩</dt>                <dd>0</dd>                           </dl>                    </div>     <div class="tracking-ad" data-mod="popu_222"><a href="javascript:void(0);" target="_blank"> </a>   </div>    <div class="tracking-ad" data-mod="popu_223"> <a href="javascript:void(0);" target="_blank"> </a></div>    <script type="text/javascript">                function btndigga() {                    $(".tracking-ad[data-mod='popu_222'] a").click();                }                function btnburya() {                    $(".tracking-ad[data-mod='popu_223'] a").click();                }            </script>   <ul class="article_next_prev">                <li class="prev_article"><span onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);location.href='/baimafujinji/article/details/51281367';">上一篇</span><a href="/baimafujinji/article/details/51281367" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian'])">从小蝌蚪找妈妈谈“机器学习VS数据挖掘”</a></li>                <li class="next_article"><span onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_xiayipian']);location.href='/baimafujinji/article/details/51285082';">下一篇</span><a href="/baimafujinji/article/details/51285082" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_xiayipian'])">机器学习中的隐马尔科夫模型(HMM)详解</a></li>    </ul>    <div style="clear:both; height:10px;"></div>    <div id="articlecommend" style="display:none"><li><em>•</em><a 
href="http://blog.csdn.net/GarfieldEr007/article/details/50845665" title="MIT自然语言处理第三讲概率语言模型第四五六部分" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">MIT自然语言处理第三讲概率语言模型第四五六部分</a></li><li><em>•</em><a href="http://blog.csdn.net/ahmanz/article/details/51273500" title="N-Gram语言模型" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">N-Gram语言模型</a></li><li><em>•</em><a href="http://blog.csdn.net/hadoopX/article/details/57075203" title="CS224d-Day 2深度学习与自然语言处理" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">CS224d-Day 2深度学习与自然语言处理</a></li><li><em>•</em><a href="http://blog.csdn.net/zhangf666/article/details/56485067" title="深度学习概览之自然语言处理从基本概念到前沿研究" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">深度学习概览之自然语言处理从基本概念到前沿研究</a></li><li><em>•</em><a href="http://blog.csdn.net/sxh850297968/article/details/41383651" title="实词和虚词的区别自然语言处理要用到" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">实词和虚词的区别自然语言处理要用到</a></li><li><em>•</em><a href="http://blog.csdn.net/artemisrj/article/details/50813031" title="自然语言处理的一些工具" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">自然语言处理的一些工具</a></li><li><em>•</em><a href="http://blog.csdn.net/jinxiaoqiang0608/article/details/38143169" title="智能技术与自然语言处理研究室" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">智能技术与自然语言处理研究室</a></li><li><em>•</em><a href="http://blog.csdn.net/u013378306/article/details/55509784" title="自然语言处理工具 nltk 安装使用" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">自然语言处理工具 nltk 安装使用</a></li><li><em>•</em><a href="http://blog.csdn.net/u011415481/article/details/51131888" title="中文信息处理 N-gram模型" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">中文信息处理 N-gram模型</a></li><li><em>•</em><a href="http://blog.csdn.net/sinat_21062543/article/details/56670115" title="统计自然语言处理基础学习笔记1" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">统计自然语言处理基础学习笔记1</a></li></div>        <div 
class="similar_article">                <h4></h4>                <div class="similar_c" style="margin:20px 0px 0px 0px">                    <div class="similar_c_t">                        相关文章推荐                    </div>                                       <div class="similar_wrap tracking-ad" data-mod="popu_36" style="max-height:195px;">                                               <ul class="similar_list fl">                                                  <li><em>•</em><a href="http://blog.csdn.net/GarfieldEr007/article/details/50845665" title="MIT自然语言处理第三讲概率语言模型第四五六部分" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">MIT自然语言处理第三讲概率语言模型第四五六部分</a></li><li><em>•</em><a href="http://blog.csdn.net/ahmanz/article/details/51273500" title="N-Gram语言模型" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">N-Gram语言模型</a></li><li><em>•</em><a href="http://blog.csdn.net/hadoopX/article/details/57075203" title="CS224d-Day 2深度学习与自然语言处理" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">CS224d-Day 2深度学习与自然语言处理</a></li><li><em>•</em><a href="http://blog.csdn.net/zhangf666/article/details/56485067" title="深度学习概览之自然语言处理从基本概念到前沿研究" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">深度学习概览之自然语言处理从基本概念到前沿研究</a></li><li><em>•</em><a href="http://blog.csdn.net/sxh850297968/article/details/41383651" title="实词和虚词的区别自然语言处理要用到" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">实词和虚词的区别自然语言处理要用到</a></li></ul>                          <ul class="similar_list fr">                                                  <li><em>•</em><a href="http://blog.csdn.net/artemisrj/article/details/50813031" title="自然语言处理的一些工具" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">自然语言处理的一些工具</a></li><li><em>•</em><a href="http://blog.csdn.net/jinxiaoqiang0608/article/details/38143169" title="智能技术与自然语言处理研究室" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">智能技术与自然语言处理研究室</a></li><li><em>•</em><a 
href="http://blog.csdn.net/u013378306/article/details/55509784" title="自然语言处理工具 nltk 安装使用" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">自然语言处理工具 nltk 安装使用</a></li><li><em>•</em><a href="http://blog.csdn.net/u011415481/article/details/51131888" title="中文信息处理 N-gram模型" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">中文信息处理 N-gram模型</a></li><li><em>•</em><a href="http://blog.csdn.net/sinat_21062543/article/details/56670115" title="统计自然语言处理基础学习笔记1" strategy="SearchAlgorithm" target="_blank" style="width: 290px;">统计自然语言处理基础学习笔记1</a></li></ul>                    </div>                </div>            </div>          </div>     <div>                    <script type="text/javascript">             /*博客内容页下方Banner1-960*90,创建于2016-12-13*/             var cpro_id = "u2843955";        </script>        <script type="text/javascript" src="http://cpro.baidustatic.com/cpro/ui/c.js"></script><div id="BAIDU_SSP__wrapper_u2843955_0"><iframe id="iframeu2843955_0" src="http://pos.baidu.com/lcbm?rdid=2843955&dc=3&di=u2843955&dri=0&dis=0&dai=1&ps=8021x275&dcb=___adblockplus&dtm=HTML_POST&dvi=0.0&dci=-1&dpt=none&tsr=0&tpr=1495108489699&ti=%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B8%AD%E7%9A%84N-Gram%E6%A8%A1%E5%9E%8B%E8%AF%A6%E8%A7%A3%20-%20%E7%99%BD%E9%A9%AC%E8%B4%9F%E9%87%91%E7%BE%81%20-%20%E5%8D%9A%E5%AE%A2%E9%A2%91%E9%81%93%20-%20CSDN.NET&ari=2&dbv=2&drs=1&pcs=1261x665&pss=1265x8062&cfv=0&cpl=5&chi=1&cce=true&cec=UTF-8&tlm=1495108489&rw=680&ltu=http%3A%2F%2Fblog.csdn.net%2Fbaimafujinji%2Farticle%2Fdetails%2F51281816&ltr=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DdW7R5yW4K-ZUObYxux76x1LwsDUpgMJ6yMHy_q46XqssVoNrnlbQX428ZtV99Pw7IECXxdPvjlafV7rqtkqiapQKxhrSoFhuKk59_Hb53iC%26wd%3D%26eqid%3Dbc559f330003952500000002591d727a&ecd=1&uc=1860x1057&pis=-1x-1&sr=1920x1080&tcn=1495108490&qn=f4a405c6ac03db7f&tt=1495108489680.20.121.123" width="960" height="90" align="center,center" vspace="0" hspace="0" marginwidth="0" marginheight="0" 
scrolling="no" frameborder="0" style="border:0;vertical-align:bottom;margin:0;width:960px;height:90px" allowtransparency="true"></iframe></div>    </div><div id="suggest"></div>         <script language="javascript" type="text/javascript">                  $(function(){                 $.get("/baimafujinji/svc/GetSuggestContent/51281816",function(data){                     $("#suggest").html(data);                 });                  });                      </script>  <style>.blog-ass-articl dd {color: #369;width: 99%; /*修改行*/float: left;overflow: hidden;font: normal normal 12px/23px "SimSun";height: 23px;margin: 0;padding: 0 0 0 10px;margin-right: 30px;background: url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px;}</style> <link rel="stylesheet" href="http://static.blog.csdn.net/css/replace.css"><div id="relate" data-mod="popu_218" class="tracking-ad" style="display: block;">        <div class="relate_t">            <h3><span>参考知识库</span></h3>        </div>        <div class="relate_c"><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/searchengine"><img src="http://img.knowledge.csdn.net/upload/base/1490785469420_420.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/searchengine">搜索引擎知识库</a></h4><p><label><span>316</span><em>关注</em><i>|</i><span>9</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/machinelearning"><img src="http://img.knowledge.csdn.net/upload/base/1490757082721_721.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/machinelearning">机器学习知识库</a></h4><p><label><span>18042</span><em>关注</em><i>|</i><span>2164</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/dotnet"><img src="http://img.knowledge.csdn.net/upload/base/1470876331285_285.jpg" alt="img"></a></dt><dd><h4><a target="_blank" 
href="http://lib.csdn.net/base/dotnet">.NET知识库</a></h4><p><label><span>3889</span><em>关注</em><i>|</i><span>839</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/nlp"><img src="http://img.knowledge.csdn.net/upload/base/1490351555268_268.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/nlp">自然语言理解和处理知识库</a></h4><p><label><span>542</span><em>关注</em><i>|</i><span>97</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/aimachinelearning"><img src="http://img.knowledge.csdn.net/upload/base/1490775139619_619.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/aimachinelearning">人工智能机器学习知识库</a></h4><p><label><span>879</span><em>关注</em><i>|</i><span>321</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/softwaretest"><img src="http://img.knowledge.csdn.net/upload/base/1467193268346_346.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/softwaretest">软件测试知识库</a></h4><p><label><span>4704</span><em>关注</em><i>|</i><span>318</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/javase"><img src="http://img.knowledge.csdn.net/upload/base/1453169124297_297.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/javase">Java SE知识库</a></h4><p><label><span>26207</span><em>关注</em><i>|</i><span>578</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/javaee"><img src="http://img.knowledge.csdn.net/upload/base/1456818035722_722.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/javaee">Java EE知识库</a></h4><p><label><span>18299</span><em>关注</em><i>|</i><span>1334</span><em>收录</em></label></p></dd></dl><dl 
class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/java"><img src="http://img.knowledge.csdn.net/upload/base/1453701371636_636.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/java">Java 知识库</a></h4><p><label><span>26727</span><em>关注</em><i>|</i><span>1476</span><em>收录</em></label></p></dd></dl><dl class="relate_list"><dt><a target="_blank" href="http://lib.csdn.net/base/datastructure"><img src="http://img.knowledge.csdn.net/upload/base/1461035533512_512.jpg" alt="img"></a></dt><dd><h4><a target="_blank" href="http://lib.csdn.net/base/datastructure">算法与数据结构知识库</a></h4><p><label><span>16254</span><em>关注</em><i>|</i><span>2320</span><em>收录</em></label></p></dd></dl></div></div> <dl class="blog-ass-articl tracking-ad" id="res-relatived" data-mod="popu_84">     <div class="embody embody_b" id="libkeyparent" style="display:none">            <span class="embody_t">更多资料请参考:</span>            <div class="embody_c" id="libkey"></div>    </div>     <dt><span>猜你在找</span></dt>           <div id="adcollegedata" style="display:none"><div class="tracking-ad" data-mod="popu_84"><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/662" title="Java Swing、JDBC开发桌面级应用" strategy="v4:content" target="_blank">Java Swing、JDBC开发桌面级应用</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/4179" title="深入Javascript字符串实战视频课程" strategy="v4:content" target="_blank">深入Javascript字符串实战视频课程</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/4282" title="java数据库连接技术JDBC" strategy="v4:content" target="_blank">java数据库连接技术JDBC</a></dd><dd 
style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/1198" title="《C语言/C++学习指南》加密解密篇(安全相关算法)" strategy="v4:content" target="_blank">《C语言/C++学习指南》加密解密篇(安全相关算法)</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/1144" title="C语言系列之 字符串相关算法" strategy="v4:content" target="_blank">C语言系列之 字符串相关算法</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/1145" title="C语言系列之 字符串压缩算法与结构体初探" strategy="v4:content" target="_blank">C语言系列之 字符串压缩算法与结构体初探</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/759" title="Java经典算法讲解" strategy="v4:content" target="_blank">Java经典算法讲解</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/3359" title="数据结构与算法在实战项目中的应用" strategy="v4:content" target="_blank">数据结构与算法在实战项目中的应用</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/913" title="零基础实战HTML、XHTML、CSS3应用开发" strategy="v4:content" target="_blank">零基础实战HTML、XHTML、CSS3应用开发</a></dd><dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px; white-space: nowrap;"><a href="http://edu.csdn.net/course/detail/1118" title="C语言系列之 递归算法示例与 Windows 趣味小项目" strategy="v4:content" target="_blank">C语言系列之 递归算法示例与 Windows 趣味小项目</a></dd></div></div>    <div id="adCollege" style="width: 42%;float: 
left;">     <dd><a href="http://edu.csdn.net/course/detail/662" title="Java Swing、JDBC开发桌面级应用" strategy="v4:content" target="_blank">Java Swing、JDBC开发桌面级应用</a></dd><dd><a href="http://edu.csdn.net/course/detail/4179" title="深入Javascript字符串实战视频课程" strategy="v4:content" target="_blank">深入Javascript字符串实战视频课程</a></dd><dd><a href="http://edu.csdn.net/course/detail/4282" title="java数据库连接技术JDBC" strategy="v4:content" target="_blank">java数据库连接技术JDBC</a></dd><dd><a href="http://edu.csdn.net/course/detail/1198" title="《C语言/C++学习指南》加密解密篇(安全相关算法)" strategy="v4:content" target="_blank">《C语言/C++学习指南》加密解密篇(安全相关算法)</a></dd><dd><a href="http://edu.csdn.net/course/detail/1144" title="C语言系列之 字符串相关算法" strategy="v4:content" target="_blank">C语言系列之 字符串相关算法</a></dd></div>           <div id="resforAd" style="width: 42%;float: left;margin-right: 30px;"><dd><a href="http://edu.csdn.net/course/detail/1145" title="C语言系列之 字符串压缩算法与结构体初探" strategy="v4:content" target="_blank">C语言系列之 字符串压缩算法与结构体初探</a></dd><dd><a href="http://edu.csdn.net/course/detail/759" title="Java经典算法讲解" strategy="v4:content" target="_blank">Java经典算法讲解</a></dd><dd><a href="http://edu.csdn.net/course/detail/3359" title="数据结构与算法在实战项目中的应用" strategy="v4:content" target="_blank">数据结构与算法在实战项目中的应用</a></dd><dd><a href="http://edu.csdn.net/course/detail/913" title="零基础实战HTML、XHTML、CSS3应用开发" strategy="v4:content" target="_blank">零基础实战HTML、XHTML、CSS3应用开发</a></dd><dd><a href="http://edu.csdn.net/course/detail/1118" title="C语言系列之 递归算法示例与 Windows 趣味小项目" strategy="v4:content" target="_blank">C语言系列之 递归算法示例与 Windows 趣味小项目</a></dd></div>     <script src="http://csdnimg.cn/jobreco/job_reco.js" type="text/javascript"></script>      <script type="text/javascript">         csdn.position.showEdu({             sourceType: "blog",             searchType: "detail",             searchKey: "51281816",                username: "",                recordcount: "10",                containerId: "adcollegedata" //容器DIV的id。             });            
//setEduLoc();            //function setEduLoc() {            //    var edus = $("#adCollege div dd a");            //    if (edus.length == 0) {            //        setTimeout(function () {            //            setEduLoc();            //        }, 500);            //    }            //    else {            //        var eduLoc = "?ref=blog&loc=0";            //        $.each(edus, function (index, item) {            //            var href = $(this).attr("href") + eduLoc;            //            $(this).attr("href", href);            //        });            //    }            //}            setTimeout(function () {                var adcolleges = $("#adcollegedata div dd");                for (var i = 0; i < adcolleges.length; i++) {                    if (i < 5) {                        $("#adCollege").append("<dd>" + $(adcolleges[i]).html() + "</dd");                    }                    else {                        $("#resforAd").append("<dd>" + $(adcolleges[i]).html() + "</dd");                    }                }            }, 1500);                            </script>    </dl>    <div id="ad_cen">                        <div>                                    <div class="J_adv" data-view="true" data-mod="ad_popu_199" data-mtp="43" data-order="114" data-con="ad_content_1843" style="width: 960px; height: 90px; display: none;">                                             <script type="text/javascript">                                                 /*博客内容页下方Banner2-960*90,创建于,2016-11-28*/                                                 var cpro_id = "u2831143";                                            </script>                                            <script type="text/javascript" src="http://cpro.baidustatic.com/cpro/ui/c.js"></script><div id="BAIDU_SSP__wrapper_u2831143_0"><iframe id="iframeu2831143_0" 
src="http://pos.baidu.com/lcbm?rdid=2831143&dc=3&di=u2831143&dri=0&dis=0&dai=2&ps=8108x305&dcb=___adblockplus&dtm=HTML_POST&dvi=0.0&dci=-1&dpt=none&tsr=0&tpr=1495108489699&ti=%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B8%AD%E7%9A%84N-Gram%E6%A8%A1%E5%9E%8B%E8%AF%A6%E8%A7%A3%20-%20%E7%99%BD%E9%A9%AC%E8%B4%9F%E9%87%91%E7%BE%81%20-%20%E5%8D%9A%E5%AE%A2%E9%A2%91%E9%81%93%20-%20CSDN.NET&ari=2&dbv=2&drs=1&pcs=1261x665&pss=1265x8264&cfv=0&cpl=5&chi=1&cce=true&cec=UTF-8&tlm=1495108489&rw=680&ltu=http%3A%2F%2Fblog.csdn.net%2Fbaimafujinji%2Farticle%2Fdetails%2F51281816&ltr=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DdW7R5yW4K-ZUObYxux76x1LwsDUpgMJ6yMHy_q46XqssVoNrnlbQX428ZtV99Pw7IECXxdPvjlafV7rqtkqiapQKxhrSoFhuKk59_Hb53iC%26wd%3D%26eqid%3Dbc559f330003952500000002591d727a&ecd=1&uc=1860x1057&pis=-1x-1&sr=1920x1080&tcn=1495108490&qn=6bb3632421859614&tt=1495108489680.47.47.48" width="960" height="90" align="center,center" vspace="0" hspace="0" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" style="border:0;vertical-align:bottom;margin:0;width:960px;height:90px" allowtransparency="true"></iframe></div>                                   </div>                    </div>    </div>          <!-- 广告位开始 -->        <!-- 广告位结束 --><div class="J_adv" data-view="true" data-mod="ad_popu_72" data-mtp="62" data-order="40" data-con="ad_content_2072" style="display: none;">                 <script id="popuLayer_js_q" src="http://ads.csdn.net/js/popuLayer.js" defer="defer" type="text/javascript"></script>            <div id="layerd" style="position: fixed; bottom: 0px; right: 0px; line-height: 0px; z-index: 1000; width: 300px; height: 278px;">                <div class="J_close layer_close" style="display:;background-color:#efefef;padding:0px;color:#333;font:12px/24px Helvetica,Tahoma,Arial,sans-serif;text-align:right;">关闭</div><!-- 广告占位容器 --><div id="cpro_u2895327"><iframe id="iframeu2895327_0" 
src="http://pos.baidu.com/lcbm?rdid=2895327&dc=3&di=u2895327&dri=0&dis=0&dai=3&ps=665x1237&dcb=___adblockplus&dtm=HTML_POST&dvi=0.0&dci=-1&dpt=none&tsr=0&tpr=1495108489699&ti=%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B8%AD%E7%9A%84N-Gram%E6%A8%A1%E5%9E%8B%E8%AF%A6%E8%A7%A3%20-%20%E7%99%BD%E9%A9%AC%E8%B4%9F%E9%87%91%E7%BE%81%20-%20%E5%8D%9A%E5%AE%A2%E9%A2%91%E9%81%93%20-%20CSDN.NET&ari=2&dbv=2&drs=1&pcs=1261x665&pss=1265x8264&cfv=0&cpl=5&chi=1&cce=true&cec=UTF-8&tlm=1495108489&rw=680&ltu=http%3A%2F%2Fblog.csdn.net%2Fbaimafujinji%2Farticle%2Fdetails%2F51281816&ltr=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DdW7R5yW4K-ZUObYxux76x1LwsDUpgMJ6yMHy_q46XqssVoNrnlbQX428ZtV99Pw7IECXxdPvjlafV7rqtkqiapQKxhrSoFhuKk59_Hb53iC%26wd%3D%26eqid%3Dbc559f330003952500000002591d727a&ecd=1&uc=1860x1057&pis=-1x-1&sr=1920x1080&tcn=1495108490&qn=a306fcab771b1319&tt=1495108489680.71.71.73" width="300" height="250" align="center,center" vspace="0" hspace="0" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" style="border:0;vertical-align:bottom;margin:0;width:300px;height:250px" allowtransparency="true"></iframe></div></div>            <script>  document.getElementById("popuLayer_js_q").onload=function(){      var styObjd=styObj={width:"300px","height":parseInt(250)+28};window.CSDN.Layer.PopuLayer("#layerd",{storageName:"layerd",styleObj:styObjd,total:50,expoire:1000*60});  }</script><!-- 投放代码 --><script type="text/javascript">                /*服务器频道首页置顶Banner960*90,创建于2014-7-3*/    (window.cproArray = window.cproArray || []).push({        id: "u2895327"      });  </script>  <script src="http://cpro.baidustatic.com/cpro/ui/c.js" type="text/javascript"></script>     </div><div class="comment_class">    <div id="comment_title" class="panel_head">        <span class="see_comment">查看评论</span><a name="comments"></a></div>    <div id="comment_list"><dl class="comment_item comment_topic" id="comment_item_6358114"><dt class="comment_head" floor="1">1楼 <span class="user"><a 
class="username" href="/zuchengkaoshi" target="_blank">独上高楼望天涯</a> <span class="ptime">Posted 2016-10-25 11:21</span>  <a href="#reply" class="cmt_btn reply" title="回复">[Reply]</a> <span class="comment_manage" style="display:none;" commentid="6358114" username="zuchengkaoshi"> <a href="#quote" class="cmt_btn quote" title="引用">[Quote]</a> <a href="#report" class="cmt_btn report" title="举报">[Report]</a></span></span></dt><dd class="comment_userface"><a href="/zuchengkaoshi" target="_blank"><img src="http://avatar.csdn.net/6/0/3/3_zuchengkaoshi.jpg" width="40" height="40"></a></dd><dd class="comment_body">On the last question: I think N=2~3 for Shakespeare and N=3~5 for the Wall Street Journal would probably be about right. In any case, the N for the latter should be somewhat larger than the N for the former.</dd></dl><div class="clear"></div></div>    <div id="comment_bar" style="display: none;">    </div>    <div id="comment_form"><div class="guest_link">You are not logged in. Please <a href="javascript:void(0);" onclick="javascript:loginbox();">[log in]</a> or <a href="http://passport.csdn.net/account/register?from=http%3A%2F%2Fblog.csdn.net%2Fbaimafujinji%2Farticle%2Fdetails%2F51281816">[register]</a></div></div>    <div class="announce">        * The user comments above represent only the personal views of their authors, not the views or position of the CSDN website<a name="reply"></a><a name="quote"></a></div></div><script type="text/javascript">    var fileName = '51281816';    var commentscount = 1;    var islock = false</script>    <div id="ad_bot">    </div><div id="report_dialog"></div><div id="d-top" style="bottom:60px;">        <a id="quick-reply" class="btn btn-top q-reply" title="快速回复" style="display:none;">            <img src="http://static.blog.csdn.net/images/blog-icon-reply.png" alt="快速回复">        </a>        <a id="d-top-a" class="btn btn-top backtop" style="display: none;" title="返回顶部" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_huidaodingbu'])">                  <img src="http://static.blog.csdn.net/images/top.png" alt="TOP">    </a></div><script type="text/javascript">    $(function ()    {        $("#ad_frm_0").height("90px");                setTimeout(function(){            
$("#ad_frm_2").height("200px");        },1000);        });  </script><style type="text/css">    .tag_list    {        background: none repeat scroll 0 0 #FFFFFF;        border: 1px solid #D7CBC1;        color: #000000;        font-size: 12px;        line-height: 20px;        list-style: none outside none;        margin: 10px 2% 0 1%;        padding: 1px;    }    .tag_list h5    {        background: none repeat scroll 0 0 #E0DBD3;        color: #47381C;        font-size: 12px;        height: 24px;        line-height: 24px;        padding: 0 5px;        margin: 0;    }    .tag_list h5 a    {        color: #47381C;    }    .classify    {        margin: 10px 0;        padding: 4px 12px 8px;    }    .classify a    {        margin-right: 20px;        white-space: nowrap;    }</style><div class="tag_list" style="">    <h5>        <a href="http://www.csdn.net/tag/" target="_blank">核心技术类目</a></h5>    <div class="classify"><a title="全部主题" href="http://www.csdn.net/tag" target="_blank" onclick="LogClickCount(this,336);">全部主题</a><a title="Hadoop" href="http://g.csdn.net/5272865" target="_blank" onclick="LogClickCount(this,336);">Hadoop</a><a title="AWS" href="http://g.csdn.net/5272866" target="_blank" onclick="LogClickCount(this,336);">AWS</a><a title="移动游戏" href="http://g.csdn.net/5272870" target="_blank" onclick="LogClickCount(this,336);">移动游戏</a><a title="Java" href="http://g.csdn.net/5272871" target="_blank" onclick="LogClickCount(this,336);">Java</a><a title="Android" href="http://g.csdn.net/5272872" target="_blank" onclick="LogClickCount(this,336);">Android</a><a title="iOS" href="http://g.csdn.net/5272873" target="_blank" onclick="LogClickCount(this,336);">iOS</a><a title="Swift" href="http://g.csdn.net/5272868" target="_blank" onclick="LogClickCount(this,336);">Swift</a><a title="智能硬件" href="http://g.csdn.net/5272869" target="_blank" onclick="LogClickCount(this,336);">智能硬件</a><a title="Docker" href="http://g.csdn.net/5272867" target="_blank" 
onclick="LogClickCount(this,336);">Docker</a><a title="OpenStack" href="http://g.csdn.net/5272925" target="_blank" onclick="LogClickCount(this,336);">OpenStack</a><a title="VPN" href="http://www.csdn.net/tag/vpn" target="_blank" onclick="LogClickCount(this,336);">VPN</a><a title="Spark" href="http://g.csdn.net/5272924" target="_blank" onclick="LogClickCount(this,336);">Spark</a><a title="ERP" href="http://www.csdn.net/tag/erp" target="_blank" onclick="LogClickCount(this,336);">ERP</a><a title="IE10" href="http://www.csdn.net/tag/ie10" target="_blank" onclick="LogClickCount(this,336);">IE10</a><a title="Eclipse" href="http://www.csdn.net/tag/eclipse" target="_blank" onclick="LogClickCount(this,336);">Eclipse</a><a title="CRM" href="http://www.csdn.net/tag/crm" target="_blank" onclick="LogClickCount(this,336);">CRM</a><a title="JavaScript" href="http://www.csdn.net/tag/javascript" target="_blank" onclick="LogClickCount(this,336);">JavaScript</a><a title="数据库" href="http://www.csdn.net/tag/数据库" target="_blank" onclick="LogClickCount(this,336);">数据库</a><a title="Ubuntu" href="http://www.csdn.net/tag/ubuntu" target="_blank" onclick="LogClickCount(this,336);">Ubuntu</a><a title="NFC" href="http://www.csdn.net/tag/nfc" target="_blank" onclick="LogClickCount(this,336);">NFC</a><a title="WAP" href="http://www.csdn.net/tag/wap" target="_blank" onclick="LogClickCount(this,336);">WAP</a><a title="jQuery" href="http://www.csdn.net/tag/jquery" target="_blank" onclick="LogClickCount(this,336);">jQuery</a><a title="BI" href="http://www.csdn.net/tag/bi" target="_blank" onclick="LogClickCount(this,336);">BI</a><a title="HTML5" href="http://www.csdn.net/tag/html5" target="_blank" onclick="LogClickCount(this,336);">HTML5</a><a title="Spring" href="http://www.csdn.net/tag/spring" target="_blank" onclick="LogClickCount(this,336);">Spring</a><a title="Apache" href="http://www.csdn.net/tag/apache" target="_blank" onclick="LogClickCount(this,336);">Apache</a><a title=".NET" 
href="http://www.csdn.net/tag/.net" target="_blank" onclick="LogClickCount(this,336);">.NET</a><a title="API" href="http://www.csdn.net/tag/api" target="_blank" onclick="LogClickCount(this,336);">API</a><a title="HTML" href="http://www.csdn.net/tag/html" target="_blank" onclick="LogClickCount(this,336);">HTML</a><a title="SDK" href="http://www.csdn.net/tag/sdk" target="_blank" onclick="LogClickCount(this,336);">SDK</a><a title="IIS" href="http://www.csdn.net/tag/iis" target="_blank" onclick="LogClickCount(this,336);">IIS</a><a title="Fedora" href="http://www.csdn.net/tag/fedora" target="_blank" onclick="LogClickCount(this,336);">Fedora</a><a title="XML" href="http://www.csdn.net/tag/xml" target="_blank" onclick="LogClickCount(this,336);">XML</a><a title="LBS" href="http://www.csdn.net/tag/lbs" target="_blank" onclick="LogClickCount(this,336);">LBS</a><a title="Unity" href="http://www.csdn.net/tag/unity" target="_blank" onclick="LogClickCount(this,336);">Unity</a><a title="Splashtop" href="http://www.csdn.net/tag/splashtop" target="_blank" onclick="LogClickCount(this,336);">Splashtop</a><a title="UML" href="http://www.csdn.net/tag/uml" target="_blank" onclick="LogClickCount(this,336);">UML</a><a title="components" href="http://www.csdn.net/tag/components" target="_blank" onclick="LogClickCount(this,336);">components</a><a title="Windows Mobile" href="http://www.csdn.net/tag/windowsmobile" target="_blank" onclick="LogClickCount(this,336);">Windows Mobile</a><a title="Rails" href="http://www.csdn.net/tag/rails" target="_blank" onclick="LogClickCount(this,336);">Rails</a><a title="QEMU" href="http://www.csdn.net/tag/qemu" target="_blank" onclick="LogClickCount(this,336);">QEMU</a><a title="KDE" href="http://www.csdn.net/tag/kde" target="_blank" onclick="LogClickCount(this,336);">KDE</a><a title="Cassandra" href="http://www.csdn.net/tag/cassandra" target="_blank" onclick="LogClickCount(this,336);">Cassandra</a><a title="CloudStack" 
href="http://www.csdn.net/tag/cloudstack" target="_blank" onclick="LogClickCount(this,336);">CloudStack</a><a title="FTC" href="http://www.csdn.net/tag/ftc" target="_blank" onclick="LogClickCount(this,336);">FTC</a><a title="coremail" href="http://www.csdn.net/tag/coremail" target="_blank" onclick="LogClickCount(this,336);">coremail</a><a title="OPhone " href="http://www.csdn.net/tag/ophone " target="_blank" onclick="LogClickCount(this,336);">OPhone </a><a title="CouchBase" href="http://www.csdn.net/tag/couchbase" target="_blank" onclick="LogClickCount(this,336);">CouchBase</a><a title="云计算" href="http://www.csdn.net/tag/云计算" target="_blank" onclick="LogClickCount(this,336);">云计算</a><a title="iOS6" href="http://www.csdn.net/tag/iOS6" target="_blank" onclick="LogClickCount(this,336);">iOS6</a><a title="Rackspace " href="http://www.csdn.net/tag/rackspace " target="_blank" onclick="LogClickCount(this,336);">Rackspace </a><a title="Web App" href="http://www.csdn.net/tag/webapp" target="_blank" onclick="LogClickCount(this,336);">Web App</a><a title="SpringSide" href="http://www.csdn.net/tag/springside" target="_blank" onclick="LogClickCount(this,336);">SpringSide</a><a title="Maemo" href="http://www.csdn.net/tag/maemo" target="_blank" onclick="LogClickCount(this,336);">Maemo</a><a title="Compuware" href="http://www.csdn.net/tag/compuware" target="_blank" onclick="LogClickCount(this,336);">Compuware</a><a title="大数据" href="http://www.csdn.net/tag/大数据" target="_blank" onclick="LogClickCount(this,336);">大数据</a><a title="aptech" href="http://www.csdn.net/tag/aptech" target="_blank" onclick="LogClickCount(this,336);">aptech</a><a title="Perl" href="http://www.csdn.net/tag/perl" target="_blank" onclick="LogClickCount(this,336);">Perl</a><a title="Tornado" href="http://www.csdn.net/tag/tornado" target="_blank" onclick="LogClickCount(this,336);">Tornado</a><a title="Ruby" href="http://www.csdn.net/tag/ruby" target="_blank" onclick="LogClickCount(this,336);">Ruby</a><a 
title="Hibernate" href="http://www.csdn.net/tag/hibernate" target="_blank" onclick="LogClickCount(this,336);">Hibernate</a><a title="ThinkPHP" href="http://www.csdn.net/tag/thinkphp" target="_blank" onclick="LogClickCount(this,336);">ThinkPHP</a><a title="HBase" href="http://www.csdn.net/tag/hbase" target="_blank" onclick="LogClickCount(this,336);">HBase</a><a title="Pure" href="http://www.csdn.net/tag/pure" target="_blank" onclick="LogClickCount(this,336);">Pure</a><a title="Solr" href="http://www.csdn.net/tag/solr" target="_blank" onclick="LogClickCount(this,336);">Solr</a><a title="Angular" href="http://www.csdn.net/tag/angular" target="_blank" onclick="LogClickCount(this,336);">Angular</a><a title="Cloud Foundry" href="http://www.csdn.net/tag/cloudfoundry" target="_blank" onclick="LogClickCount(this,336);">Cloud Foundry</a><a title="Redis" href="http://www.csdn.net/tag/redis" target="_blank" onclick="LogClickCount(this,336);">Redis</a><a title="Scala" href="http://www.csdn.net/tag/scala" target="_blank" onclick="LogClickCount(this,336);">Scala</a><a title="Django" href="http://www.csdn.net/tag/django" target="_blank" onclick="LogClickCount(this,336);">Django</a><a title="Bootstrap" href="http://www.csdn.net/tag/bootstrap" target="_blank" onclick="LogClickCount(this,336);">Bootstrap</a>    </div></div>  <script type="text/javascript">           $(function(){              setTimeout(function(){                  $.get("/baimafujinji/svc/GetTagContent",function(data){                      $(".tag_list").html(data).show();                  });                   });          },500);                        </script> <div id="pop_win" style="display:none ;position: absolute; z-index: 10000; border: 1px solid rgb(220, 220, 220); top: 222.5px; left: 630px; opacity: 1; background: none 0px 0px repeat scroll rgb(255, 255, 255);">    </div><div id="popup_mask"></div><style>    #popup_mask    {        position: absolute;        width: 100%;        height: 100%;        
background: #000;        z-index: 9999;        left: 0px;        top: 0px;        opacity: 0.3;        filter: alpha(opacity=30);        display: none;    }</style><script type="text/javascript">    $(function(){                        setTimeout(function(){            $(".comment_body:contains('回复')").each(function(index,item){                var u=$(this).text().split(':')[0].toString().replace("回复","")                var thisComment=$(this);                if(u)                {                    $.getJSON("https://passport.csdn.net/get/nick?callback=?", {users: u}, function(a) {                        if(a!=null&&a.data!=null&&a.data.length>0)                        {                            nick=a.data[0].n;                             if(u!=nick)                            {                                thisComment.text(thisComment.text().replace(u,nick));                              }                        }                           });                  }            });                 },200);          setTimeout(function(){            $(".math").each(function(index,value){$(this).find("span").last().css("color","#fff"); })        },5000);        setTimeout(function(){            $(".math").each(function(index,value){$(this).find("span").last().css("color","#fff"); })        },10000);        setTimeout(function(){            $(".math").each(function(index,value){$(this).find("span").last().css("color","#fff"); })        },15000);                setTimeout(function(){            $("a img[src='http://js.tongji.linezing.com/stats.gif']").parent().css({"position":"absolute","left":"50%"});        },300);    });    function loginbox(){        var $logpop=$("#pop_win");        $logpop.html('<iframe src="https://passport.csdn.net/account/loginbox?service=http://static.blog.csdn.net/callback.htm" frameborder="0" height="600" width="400" scrolling="no"></iframe>');        $('#popup_mask').css({            opacity: 0.5,            width: $( document ).width() 
+ 'px',            height:  $( document ).height() + 'px'        });        $('#popup_mask').css("display","block");         $logpop.css( {            top: ($( window ).height() - $logpop.height())/ 2  + $( window        ).scrollTop() + 'px',            left:($( window ).width() - $logpop.width())/ 2        } );         setTimeout( function () {            $logpop.show();            $logpop.css( {                opacity: 1            } );        }, 200 );         $('#popup_mask').unbind("click");        $('#popup_mask').bind("click", function(){            $('#popup_mask').hide();            var $clopop = $("#pop_win");            $("#common_ask_div_sc").css("display","none");            $clopop.css( {                opacity: 0            } );            setTimeout( function () {                $clopop.hide();            }, 350 );            return false;        });    }       var articletitle='自然语言处理中的N-Gram模型详解';</script>                        <div class="clear">                        </div>                    </div>                                               </div>                              <div id="side">                   <div class="side"><div id="panel_Profile" class="panel"><ul class="panel_head"><span>个人资料</span></ul><ul class="panel_body profile"><div id="blog_userface">    <a href="http://my.csdn.net/baimafujinji" target="_blank">    <img src="http://avatar.csdn.net/D/4/E/1_baimafujinji.jpg" title="访问我的空间" style="max-width:90%">    </a>    <br>    <span><a href="http://my.csdn.net/baimafujinji" class="user_name" target="_blank">白马负金羁</a></span></div><div class="interact">    <a href="javascript:void(0);" class="attent" id="span_add_follow" title="[加关注]"></a> <a href="javascript:void(0);" class="letter" title="[发私信]" onclick="window.open('http://msg.csdn.net/letters/model?receiver=baimafujinji','_blank','height=350,width=700');_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_sixin'])"></a>  </div><div id="blog_medal">                 
      <div class="ico_expert" onclick="javascript:location='http://blog.csdn.net/experts/rule.html'" title="CSDN认证专家" style="cursor:pointer;width:60px;height:60px;background:url('http://c.csdnimg.cn/jifen/images/xunzhang/xunzhang/bokezhuanjiamiddle.png') no-repeat"></div>                <div id="bms_box">                                            <a target="_blank">                                                    <img src="http://c.csdnimg.cn/jifen/images/xunzhang/xunzhang/zhuanlandaren.png" onmouseover="m_over_m(this,2)" onmouseout="m_out_m()" alt="3">                                            </a>                                            <a target="_blank">                                                    <img src="http://c.csdnimg.cn/jifen/images/xunzhang/xunzhang/chizhiyiheng.png" onmouseover="m_over_m(this,4)" onmouseout="m_out_m()" alt="3">                                            </a>                                            <a target="_blank">                                                    <img src="http://c.csdnimg.cn/jifen/images/xunzhang/xunzhang/bokezhixing.png" onmouseover="m_over_m(this,6)" onmouseout="m_out_m()" alt="1">                                            </a>               </div></div><ul id="blog_rank">    <li>访问:<span>1566060次</span></li>    <li>积分:<span>21669</span> </li>        <li>等级: <span style="position:relative;display:inline-block;z-index:1">            <img src="http://c.csdnimg.cn/jifen/images/xunzhang/jianzhang/blog7.png" alt="" style="vertical-align: middle;" id="leveImg">            <div id="smallTittle" style=" position: absolute;  left: -24px;  top: 25px;  text-align: center;  width: 101px;  height: 32px;  background-color: #fff;  line-height: 32px;  border: 2px #DDDDDD solid;  box-shadow: 0px 2px 2px rgba (0,0,0,0.1);  display: none;   z-index: 999;">            <div style="left: 42%;  top: -8px;  position: absolute;  width: 0;  height: 0;  border-left: 10px solid transparent;  border-right: 10px solid 
transparent;  border-bottom: 8px solid #EAEAEA;"></div>            积分:21669 </div>        </span>  </li>    <li>排名:<span>第294名</span></li></ul><ul id="blog_statistics">    <li>原创:<span>313篇</span></li>    <li>转载:<span>11篇</span></li>    <li>译文:<span>0篇</span></li>    <li>评论:<span>3779条</span></li></ul></ul></div><div id="custom_column_40913695" class="panel"><ul class="panel_head"><span>联系方式</span></ul><ul class="panel_body"><li>1. 在博客文章下留言,<font color="blue">博客私信一律不回</font>。</li><li>2. 发邮件至fzuo#foxmail.com,将#换成@。</li><li>3. 算法与数据结构QQ群:<font color="blue">495573865</font>,<font color="red">仅限算法之美读者交流之用</font>。</li><li>4. <a href="http://blog.csdn.net/baimafujinji/article/details/54602338">图像处理算法研究学习群(单击链接进入加群通道)</a>。</li></ul></div><div id="panel_Category" class="panel">    <ul class="panel_head"><span>博客专栏</span></ul>    <ul class="panel_body" id="sp_column">    <table cellpadding="0" cellspacing="0"><tbody><tr>    <td style="padding:10px 10px 0 0;">    <a href="http://blog.csdn.net/column/details/14749.html" target="_blank"><img src="http://img.blog.csdn.net/20170308193632872" style="width:75px;height:75px;"></a>    </td>    <td style="padding:10px 0; vertical-align:top;">    <a href="http://blog.csdn.net/column/details/14749.html" target="_blank">跳脱旧我:心智砥砺之旅</a>    <p>文章:12篇</p>    <span>阅读:31725</span>    </td>    </tr></tbody></table>    <table cellpadding="0" cellspacing="0"><tbody><tr>    <td style="padding:10px 10px 0 0;">    <a href="http://blog.csdn.net/column/details/math-imageprocess.html" target="_blank"><img src="http://img.blog.csdn.net/20151123180557677" style="width:75px;height:75px;"></a>    </td>    <td style="padding:10px 0; vertical-align:top;">    <a href="http://blog.csdn.net/column/details/math-imageprocess.html" target="_blank">图像处理中的数学原理详解</a>    <p>文章:34篇</p>    <span>阅读:270404</span>    </td>    </tr></tbody></table>    </ul></div><div id="custom_column_41382567" class="panel"><ul class="panel_head"><span>图像处理</span></ul><ul 
class="panel_body"><ul class="panel_body"><center><img src="http://img.my.csdn.net/uploads/201512/31/1451555303_6476.png"></center><br><h3 align="right"><br><center><b>数字图像处理原理与实践<br>(MATLAB版)</b></center></h3><a href="http://item.jd.com/11572050.html" target="_blank"><b></b></a><center><a href="http://item.jd.com/11572050.html" target="_blank"><b><font color="red">京东网有售</font></b></a>,<a href="http://product.dangdang.com/23594911.html" target="_blank"><b><font color="red">当当网有售</font></b></a></center></ul></ul></div><div id="panel_Category" class="panel"><ul class="panel_head"><span>文章分类</span></ul><ul class="panel_body">                     <li>                    <a href="/baimafujinji/article/category/1608089" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">编程语言与程序设计</a><span>(24)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/1608093" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">图像与信号处理</a><span>(33)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/1608099" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">数据结构与算法</a><span>(36)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/1608131" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">其他杂文</a><span>(22)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/5937321" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">应用技巧</a><span>(18)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6048234" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">经济研究</a><span>(15)</span>  
              </li>                 <li>                    <a href="/baimafujinji/article/category/6048259" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">数据挖掘与机器学习</a><span>(43)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6277600" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">自然语言处理与信息检索</a><span>(16)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6277601" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">图像处理中的数学</a><span>(38)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6277677" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">线性代数</a><span>(17)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6411216" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">多核编程与并行计算</a><span>(15)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6418785" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">废言集</a><span>(25)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6435757" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">文学与诗歌</a><span>(9)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6435759" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">学习方法与方法论</a><span>(15)</span>                </li>                 <li>                    <a href="/baimafujinji/article/category/6741736" 
onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_wenzhangfenlei']); ">空</a><span>(0)</span>                </li></ul></div><div id="hotarticls" class="panel"><ul class="panel_head">    <span>       阅读排行    </span></ul><ul class="panel_body itemlist"><li><a href="/baimafujinji/article/details/38026421" title="在Eclipse中进行C/C++开发的配置方法(20140721最新版)">在Eclipse中进行C/C++开发的配置方法(20140721最新版)</a><span>(30936)</span></li><li><a href="/baimafujinji/article/details/27206237" title="暗通道优先的图像去雾算法(上)">暗通道优先的图像去雾算法(上)</a><span>(27984)</span></li><li><a href="/baimafujinji/article/details/6485778" title="常见C/C++笔试题目整理(含答案)2">常见C/C++笔试题目整理(含答案)2</a><span>(22380)</span></li><li><a href="/baimafujinji/article/details/30060161" title="暗通道优先的图像去雾算法(下)">暗通道优先的图像去雾算法(下)</a><span>(22144)</span></li><li><a href="/baimafujinji/article/details/50500757" title="自己动手用C++写的图像处理软件(不调用外部包)">自己动手用C++写的图像处理软件(不调用外部包)</a><span>(21260)</span></li><li><a href="/baimafujinji/article/details/51281816" title="自然语言处理中的N-Gram模型详解">自然语言处理中的N-Gram模型详解</a><span>(21233)</span></li><li><a href="/baimafujinji/article/details/46787837" title="图像的泊松(Poisson)编辑、泊松融合完全详解(3) ——完结篇">图像的泊松(Poisson)编辑、泊松融合完全详解(3) ——完结篇</a><span>(19484)</span></li><li><a href="/baimafujinji/article/details/49885643" title="在R中使用支持向量机(SVM)进行数据挖掘(下)">在R中使用支持向量机(SVM)进行数据挖掘(下)</a><span>(19068)</span></li><li><a href="/baimafujinji/article/details/50467970" title="机器学习与数据挖掘网上资源搜罗——良心推荐">机器学习与数据挖掘网上资源搜罗——良心推荐</a><span>(18852)</span></li><li><a href="/baimafujinji/article/details/49885481" title="在R中使用支持向量机(SVM)进行数据挖掘(上)">在R中使用支持向量机(SVM)进行数据挖掘(上)</a><span>(18510)</span></li></ul></div><div id="newcomments" class="panel"><ul class="panel_head"><span>最新评论</span></ul><ul class="panel_body itemlist">    <li>            <a href="/baimafujinji/article/details/54602338#comments">【欢迎加入图像处理算法交流群】群规贴</a>    <p style="margin:0px;"><a href="/Yingyingjia" class="user_name">Yingyingjia</a>:我已知晓群规,请求加入QQ:2464458611    </p>    </li>    <li> 
           <a href="/baimafujinji/article/details/49891221#comments">机器学习与数据挖掘的学习路线图</a>    <p style="margin:0px;"><a href="/baimafujinji" class="user_name">白马负金羁</a>:@sinat_27721839:http://blog.csdn.net/baimafujinji/...    </p>    </li>    <li>            <a href="/baimafujinji/article/details/49891221#comments">机器学习与数据挖掘的学习路线图</a>    <p style="margin:0px;"><a href="/sinat_27721839" class="user_name">润泽2016</a>:老师好,请问可否推荐一些比较通俗易懂的微积分、概率论、线性代数、统计学、信息论等数学书籍?谢谢!    </p>    </li>    <li>            <a href="/baimafujinji/article/details/49891221#comments">【欢迎加入图像处理算法交流群】群规贴</a>    <p style="margin:0px;"><a href="/tengjuan576290366" class="user_name">tengjuan576290366</a>:我已经知晓群规,我的QQ:576290366    </p>    </li>    <li>            <a href="/baimafujinji/article/details/51179381#comments">牛顿法解机器学习中的Logistic回归</a>    <p style="margin:0px;"><a href="/baimafujinji" class="user_name">白马负金羁</a>:@mcjh_2016416:这里用Cholesky矩阵分解法是更好的选择,高斯赛德尔迭代法还需要讨论...    </p>    </li>    <li>            <a href="/baimafujinji/article/details/51179381#comments">牛顿法解机器学习中的Logistic回归</a>    <p style="margin:0px;"><a href="/mcjh_2016416" class="user_name">斌_视野</a>:@baimafujinji:谢谢!    </p>    </li>    <li>            <a href="/baimafujinji/article/details/51179381#comments">牛顿法解机器学习中的Logistic回归</a>    <p style="margin:0px;"><a href="/baimafujinji" class="user_name">白马负金羁</a>:@mcjh_2016416:1. 原文有笔误,有时间会把这里改过来。2. 原文说:解方程组可以用高斯...    </p>    </li>    <li>            <a href="/baimafujinji/article/details/51179381#comments">牛顿法解机器学习中的Logistic回归</a>    <p style="margin:0px;"><a href="/mcjh_2016416" class="user_name">斌_视野</a>:1.计算(H')-1U时,应该求解方程组(H')X=U。2.在求解方程组(H')X=U时,当H'为正...    </p>    </li>    <li>            <a href="/baimafujinji/article/details/50603686#comments">Poisson image editing算法实现的Matlab代码解析</a>    <p style="margin:0px;"><a href="/baimafujinji" class="user_name">白马负金羁</a>:@baidu_16024721:可是我并没有你的 calcAdjancency 函数代码啊,我怎么知...    
</p>    </li>    <li>            <a href="/baimafujinji/article/details/50603686#comments">Poisson image editing算法实现的Matlab代码解析</a>    <p style="margin:0px;"><a href="/baidu_16024721" class="user_name">baidu_16024721</a>:Error in calcAdjancency (line 3)      = size(Mask)...    </p>    </li></ul></div><div id="custom_column_42906555" class="panel"><ul class="panel_body"><script>$(document).ready(function(){      setInterval(function(){        $("#cpro_u2392861_closebtn").trigger("click");        $("#bd-hl-content").css("display","none");        $(".J_adv").css("display","none");        $("#cpro_u2392861").css("display","none");        $("#adJs52b5334").css("display","none");       },3000);    });</script></ul></div>    </div>    <div class="clear">    </div>                 <!-- 广告位开始 -->                 <!-- 广告位结束 -->                   <div class="J_adv" data-view="true" data-mod="ad_popu_189" data-mtp="63" data-order="40" data-con="ad_content_1259" style="width: 250px; height: 250px; display: none;">                        <div id="nav_show_top_stop" style="width: 250px; height: 250px; z-index: 1000; position: fixed; top: 3187px;"><div id="cpro_u2734133"><iframe id="iframeu2734133_0" 
src="http://pos.baidu.com/lcbm?rdid=2734133&dc=3&di=u2734133&dri=0&dis=0&dai=4&ps=3187x0&dcb=___adblockplus&dtm=HTML_POST&dvi=0.0&dci=-1&dpt=none&tsr=0&tpr=1495108489699&ti=%E8%87%AA%E7%84%B6%E8%AF%AD%E8%A8%80%E5%A4%84%E7%90%86%E4%B8%AD%E7%9A%84N-Gram%E6%A8%A1%E5%9E%8B%E8%AF%A6%E8%A7%A3%20-%20%E7%99%BD%E9%A9%AC%E8%B4%9F%E9%87%91%E7%BE%81%20-%20%E5%8D%9A%E5%AE%A2%E9%A2%91%E9%81%93%20-%20CSDN.NET&ari=2&dbv=2&drs=1&pcs=1261x665&pss=1265x8462&cfv=0&cpl=5&chi=1&cce=true&cec=UTF-8&tlm=1495108489&rw=680&ltu=http%3A%2F%2Fblog.csdn.net%2Fbaimafujinji%2Farticle%2Fdetails%2F51281816&ltr=https%3A%2F%2Fwww.baidu.com%2Flink%3Furl%3DdW7R5yW4K-ZUObYxux76x1LwsDUpgMJ6yMHy_q46XqssVoNrnlbQX428ZtV99Pw7IECXxdPvjlafV7rqtkqiapQKxhrSoFhuKk59_Hb53iC%26wd%3D%26eqid%3Dbc559f330003952500000002591d727a&ecd=1&uc=1860x1057&pis=-1x-1&sr=1920x1080&tcn=1495108490&qn=8f2aba239dcb078b&tt=1495108489680.196.197.197" width="250" height="250" align="center,center" vspace="0" hspace="0" marginwidth="0" marginheight="0" scrolling="no" frameborder="0" style="border:0;vertical-align:bottom;margin:0;width:250px;height:250px" allowtransparency="true"></iframe></div></div>                   </div>                    <script>                                                    setTimeout(function () {                            var naviga_offsetTop = 0; function naviga_stay_top() {                                var scrollTop = jQuery(document).scrollTop();                                if (scrollTop > naviga_offsetTop) {                                    jQuery("#nav_show_top_stop").css({ "position": "fixed" });                                    jQuery("#nav_show_top_stop").css({ "top": "0px" });                                } else { jQuery("#nav_show_top_stop").css({ "position": "fixed" }); jQuery("#nav_show_top_stop").css({ "top": naviga_offsetTop - scrollTop + "px" }); }                            }                            function onload_function() {                                naviga_offsetTop = 
jQuery("#nav_show_top_stop").position().top;                                jQuery(window).bind("scroll", naviga_stay_top); jQuery(window).bind("mousewheel", naviga_stay_top);                                jQuery(document).bind("scroll", naviga_stay_top); jQuery(document).bind("mousewheel", naviga_stay_top);                            } jQuery(document).ready(onload_function);                        },200);                                                       </script>                    <script type="text/javascript">    (window.cproArray = window.cproArray || []).push({ id: "u2734133" });  </script>                   <script src="http://cpro.baidustatic.com/cpro/ui/c.js" type="text/javascript"></script>           </div>               <div class="clear">            </div>        </div>        <script type="text/javascript" src="http://c.csdnimg.cn/rabbit/cnick/cnick.js"></script><script type="text/javascript" src="http://static.blog.csdn.net/scripts/newblog.min.js"></script><script type="text/javascript" src="http://medal.blog.csdn.net/showblogmedal.ashx?blogid=598388"></script><script type="text/javascript" src="http://static.blog.csdn.net/scripts/JavaScript1.js"></script><link rel="stylesheet" type="text/css" href="//csdnimg.cn/pubfooter/css/pub_footer_2014.css"><div class="pub_fo"><div id="pub_footerall" class="pub_footer_new"><dl><dt></dt> <dd class="foot_sub_menu"><a href="http://www.csdn.net/company/about.html" target="_blank">公司简介</a><span>|</span><a href="http://www.csdn.net/company/recruit.html" target="_blank">招贤纳士</a><span>|</span><a href="http://www.csdn.net/company/marketing.html" target="_blank">广告服务</a><span>|</span><a href="http://www.csdn.net/company/contact.html" target="_blank">联系方式</a><span>|</span><a href="http://www.csdn.net/company/statement.html" target="_blank">版权声明</a><span>|</span><a href="http://www.csdn.net/company/layer.html" target="_blank">法律顾问</a><span>|</span><a href="mailto:webmaster@csdn.net">问题报告</a><span>|</span><a 
target="_blank" href="http://www.csdn.net/friendlink.html">合作伙伴</a><span>|</span><a href="http://bbs.csdn.net/forums/Service" target="_blank">论坛反馈</a></dd><dd class="foot_contact"><a href="javascript:void(0);" target="_blank" class="qq">网站客服</a><a href="http://wpa.qq.com/msgrd?v=3&uin=2251809102&site=qq&menu=yes" target="_blank" class="qq">杂志客服</a><a href="http://e.weibo.com/csdnsupport/profile" target="_blank" class="weibo">微博客服</a><a href="mailto:webmaster@csdn.net" class="email" title="联系邮箱">webmaster@csdn.net</a><span class="phone" title="服务热线">400-600-2320</span><span class="interval">|</span><span>北京创新乐知信息技术有限公司 版权所有</span><span class="interval">|</span><span>江苏知之为计算机有限公司</span><span class="interval">|</span><span>江苏乐知网络技术有限公司</span></dd><dd class="foot_copyright"><span>京 ICP 证 09002463 号</span><span class="interval">|</span><span>Copyright © 1999-2017, CSDN.NET, All Rights Reserved </span><a href="http://www.hd315.gov.cn/beian/view.asp?bianhao=010202001032100010" target="_blank"><img src="http://c.csdnimg.cn/pubfooter/images/gongshang_logos.gif" alt="GongshangLogo" title=""></a></dd></dl></div></div><div id="note1" class="csdn_note" style="display:none; position:absolute; z-index:9999; width:440px">  <span class="notice_top_arrow"><span class="inner"></span></span>  <div class="box"></div></div><div class="csdn_notice_tip" style="display:none; position:absolute; z-index:9990; width:170px">  <iframe src="about:blank" frameborder="0" scrolling="no" style="z-index:-1;position:absolute;top:0;left:0;width:100%;height:100%;background:transparent"></iframe>  <div class="tip_text">您有<strong>0</strong>条新通知</div>  <a href="javascript:void 0" class="close2"></a></div><script id="noticeScript" type="text/javascript" btnid="header_notice_num" wrapid="note1" count="5" subcount="5" src="//csdnimg.cn/rabbit/notev2/js/notify.js?9d86d94"></script>    <script type="text/javascript" src="http://passport.csdn.net/content/loginbox/login.js"></script><script 
type="text/javascript">    $(function () {        function __get_code_toolbar(snippet_id) {            return $("<span class='tracking-ad' data-mod='popu_167'><a href='https://code.csdn.net/snippets/"                    + snippet_id                    + "' target='_blank' title='在CODE上查看代码片'  style='text-indent:0;'><img src='https://code.csdn.net/assets/CODE_ico.png' width=12 height=12 alt='在CODE上查看代码片' style='position:relative;top:1px;left:2px;'/></a></span>"                    + "<span class='tracking-ad' data-mod='popu_170'><a href='https://code.csdn.net/snippets/"                    + snippet_id                    + "/fork' target='_blank' title='派生到我的代码片' style='text-indent:0;'><img src='https://code.csdn.net/assets/ico_fork.svg' width=12 height=12 alt='派生到我的代码片' style='position:relative;top:2px;left:2px;'/></a></span>");        }                $("[code_snippet_id]").each(function () {            __s_id = $(this).attr("code_snippet_id");            if (__s_id != null && __s_id != "" && __s_id != 0 && parseInt(__s_id) > 70020) {                __code_tool = __get_code_toolbar(__s_id);                $(this).prev().find(".tools").append(__code_tool);            }        });        $(".bar").show();    });</script>    </div><input type="hidden" id="aa_g_data_ids">      <!--new top-->        <script type="text/javascript" src="http://c.csdnimg.cn/pubfooter/js/tracking.js" charset="utf-8"></script>         <script id="csdn-toolbar-id" btnid="header_notice_num" wrapid="note1" count="5" subcount="5" type="text/javascript" src="http://c.csdnimg.cn/public/common/toolbar/js/toolbar.js"></script>     <!--new top-->       <link href="http://c.csdnimg.cn/comm_ask/css/ask_float_block.css" type="text/css" rel="stylesheet">    <script language="JavaScript" type="text/javascript" src="http://c.csdnimg.cn/comm_ask/js/libs/wmd.js"></script>    <script language="JavaScript" type="text/javascript" src="http://c.csdnimg.cn/comm_ask/js/libs/showdown.js"></script>        <script 
language="JavaScript" type="text/javascript" src="http://c.csdnimg.cn/comm_ask/js/apps/ask_float_block.js"></script>              <script type="text/javascript" src="http://ads.csdn.net/js/async_new.js"></script>            <script type="text/javascript" src="http://static.blog.csdn.net/scripts/comment.js"></script>        <script type="text/javascript" src="http://static.blog.csdn.net/public/res/bower-libs/MathJax/MathJax.js?config=TeX-AMS_HTML"></script>        <link rel="stylesheet" href="http://static.blog.csdn.net/code/prettify.css">        <script type="text/javascript" src="http://static.blog.csdn.net/code/prettify.js"></script>        <script type="text/javascript" src="http://c.csdnimg.cn/rabbit/search-service/main.js"></script>          <script type="text/javascript">              //$(function () {              //    setTimeout(function () {              //        var searchtitletags = articletitle + ',' + $("#tags").html();              //        searchService({              //            index: 'blog',              //            query: searchtitletags,              //            from: 5,              //            size: 5,              //            appendTo: '#res',              //            url: 'recommend',              //            his: 2,              //            client: "blog_cf_enhance",              //            tmpl: '<dd style="background:url(http://static.blog.csdn.net/skin/default/images/blog-dot-red3.gif) no-repeat 0 10px;"><a href="#{ url }" title="#{ title }" strategy="#{ strategy }">#{ title }</a></dd>'              //        });              //    }, 1000);              //});         </script>             <script type="text/javascript">              $(function () {                  setTimeout(function () {                      var searchtitletags = articletitle + ',' + $("#tags").html();                      searchService({                          index: 'blog',                          query: searchtitletags,                      
    from: 0,                          size: 5,                          appendTo: '#articlecommend',                          url: 'recommend',                          his: 2,                          client: "blog_cf_enhance",                          tmpl: '<li><em>•</em><a href="#{ url }" title="#{ title }" strategy="#{ strategy }">#{ title }</a></li>'                      });                      setTimeout(function () {                          var articles=$("#articlecommend li");                          for (var i = 0; i < articles.length; i++)                          {                              $(articles[i]).find("a").css("width", "290px");                              if (i < 5) {                                  $(".similar_list.fl").append("<li>" + $(articles[i]).html() + "</li");                              }                              else {                                  $(".similar_list.fr").append("<li>" + $(articles[i]).html() + "</li");                              }                          }                      }, 2000);                                       }, 1000);              });         </script>           <script type="text/javascript" src="http://static.blog.csdn.net/scripts/web-storage-cache.min.js"></script>        <script type="text/javascript" src="http://static.blog.csdn.net/scripts/replace.min.js"></script>      <div id="a52b5334d" style="width: 1px; height: 1px; display: none;">                    <script id="adJs52b5334" src="http://ads.csdn.net/js/opt/52b5334.js?t=0.3547480548771431" style="display: none;"></script>                    <script>document.getElementById("adJs52b5334").src = "http://ads.csdn.net/js/opt/52b5334.js?t=" + Math.random();</script>   <div><iframe src="http://ads.csdn.net/skip.php?subject=UzoKIgwzUzcCJgdbD2RUYFY/BTBTMwQyACYLalBmACQNbg4mCSYBaVJ3VTMGW1ZvBjYDPwRiX21VY1dxAzgFM1MwCjEMCFM7AjAHOQ8/VDdWNQUyUyIEdgBsC2pQbAANDXsOIglvATlSNlVwBnBWfwYiA2cEbl8r&r=0.2564005848278399" style="width: 1px; height: 
1px; position: absolute; visibility: hidden;"></iframe></div></div>    <link rel="stylesheet" href="http://static.blog.csdn.net/css/blog_code.css">    <script type="text/javascript" src="http://static.blog.csdn.net/scripts/saveToCode.js"></script>      <script type="text/javascript" src="//csdnimg.cn/rabbit/tracking-ad/main.js?75eacd8"></script>    <link rel="stylesheet" href="http://static.blog.csdn.net/css/fa.css">              <div class="pop_CA_cover" style="display:none"></div>    <div class="pop pop_CA" style="display:none">          <div class="CA_header">            收藏助手            <span class="cancel_icon" id="fapancle" onclick="$('.pop_CA').hide();$('.pop_CA_cover').hide();"></span>          </div>          <iframe src="" id="fa" frameborder="0" width="100%" height="360" scrolling="no"></iframe>    </div>    <div id="tag-suggest-pop">  <div class="relative">    <div class="close"></div>    <div class="content"></div>  </div></div><link rel="stylesheet" type="text/css" media="screen" href="http://ask.csdn.net/assets/ask_float_fonts_css-6b30a53970eb5c3a2a045e3df585b475.css"><div data-mod="popu_64" class="csdn-tracking-statistics" chg-blk="0"><a id="com-quick-reply" title="快速回复" style="top:290px"></a><a id="com-quick-collect" title="我要收藏" style="top:328px"></a><a id="com-d-top-a" style="top: 366px; display: none;" title="返回顶部" onclick=""></a> </div><div class="pop_edit ask_second comm_ask_second"><h3>提问</h3><span class="ask_float_span">您的问题将会被发布在“<a class="ask_float_channel" href="//ask.csdn.net" target="_blank" style="cursor:pointer">技术问答</a>”频道</span><a href="#" nodetype="close" class="close">×</a><div class="context"><div class="err_div"><span class="err_ico"></span><span class="err_txt">该问题已存在,请勿重复提问</span></div><div class="input_div"><input id="askInputSecond" type="text" style="font-size:14px;" placeholder="问题标题"></div><div class="cm_box"><div class="cm_dialog"></div> <div class="pop_cm cm_add_link"><input type="text" placeholder="链接内容" 
id="af_cm_link_txt"><input type="text" placeholder="链接地址" id="af_cm_link_url"><input type="text" placeholder="链接提示" id="af_cm_link_tit"><div class="text-right"><span class="btn btn-default btn-sm" id="add_link_btn">插入链接</span> </div> </div><div class="pop_cm cm_add_img"><div class="nav-tabs"><a class="img_tab active" href="#tab_upload">本地上传</a><a class="img_tab" href="#tab_weburl">网络图片</a></div><div class="tab_panel active" id="tab_upload"><div class="set_img"><iframe src="http://ask.csdn.net/upload.html"></iframe></div></div><div class="tab_panel" id="tab_weburl"><input type="text" placeholder="图片地址" id="af_cm_img_url"><input type="text" placeholder="图片说明" id="af_cm_img_alt"><div class="text-right"><span class="btn btn-default btn-sm" id="add_img_btn">插入图片</span> </div></div> </div></div> <textarea id="editor_all" rows="8" style="display: none;"></textarea><div class="editor-toolbar"><i class="separator">|</i><a class="icon-headline" title="标题一(Ctrl+Alt+1)"></a><a class="icon-heading" title="标题二(Ctrl+Alt+2)"></a><a class="icon-bold" title="粗体(Ctrl+B)"></a><a class="icon-italic" title="斜体(Ctrl+I)"></a><i class="separator">|</i><a class="icon-quote-left" title="引用(Ctrl+’)"></a><a class="icon-code" title="插入代码片(Ctrl+,)"></a><a class="icon-list-ul" title="无序列表(Ctrl+L)"></a><a class="icon-list-ol" title="有序列表(Ctrl+Alt+L)"></a><i class="separator">|</i><a class="icon-link" title="添加链接(Ctrl+K)"></a><a class="icon-picture" title="添加图片(Ctrl+Alt+I)"></a><i class="separator">|</i><a class="icon-reply" title="撤退(Ctrl+Z)"></a><a class="icon-share-alt" title="前进(Ctrl+Shift+Z)"></a><i class="separator">|</i><a class="icon-info" href="http://ask.csdn.net/pages/markdown" target="_blank" title="markdown语法参考"></a><a class="icon-preview" title="预览"></a><i class="separator">|</i></div><div class="CodeMirror cm-s-paper CodeMirror-focused"><div style="overflow: hidden; position: relative; width: 3px; height: 0px;"><textarea autocorrect="off" autocapitalize="off" spellcheck="false" 
style="position: absolute; padding: 0px; width: 1000px; height: 1em; outline: none; font-size: 4px;" tabindex="0"></textarea></div><div class="CodeMirror-hscrollbar"><div style="height: 1px;"></div></div><div class="CodeMirror-vscrollbar"><div style="width: 1px;"></div></div><div class="CodeMirror-scrollbar-filler"></div><div class="CodeMirror-gutter-filler"></div><div class="CodeMirror-scroll" tabindex="-1"><div class="CodeMirror-sizer" style="min-width: 33px;"><div style="position: relative;"><div class="CodeMirror-lines"><div style="position: relative; outline: none;"><div class="CodeMirror-measure"><pre> <span style="display: inline-block; width: 1px; margin-right: -1px;"> </span></pre></div><div style="position: relative; z-index: 1;"></div><div class="CodeMirror-code"></div><div class="CodeMirror-cursor" style="visibility: hidden;"> </div><div class="CodeMirror-cursor CodeMirror-secondarycursor" style="visibility: hidden;"> </div></div></div></div></div><div style="position: absolute; height: 30px; width: 1px;"></div><div class="CodeMirror-gutters" style="display: none;"></div></div></div><div class="editor-statusbar"><span class="lines">0</span><span class="words">0</span><span class="cursor">0:0</span></div><div class="div_tags clearfix"><div id="divSearchTags" class="tags_con"><input type="text"></div><input type="hidden" name="txtSearchTags"></div><div id="ask2_tagRecomm_div" class="drt_tagRecomm tracking-ad" data-mod="popu_73"><span class="drt_tit">推荐标签:</span></div></div><div class="success"><div class="left_area"><input id="chk_cb" type="checkbox"><span class="wyxs">我要悬赏</span><input id="cb_num" class="cb_num" readonly="true"><span class="phib_rii"><span> 币</span></span></div><a href="#" nodetype="cancel" class="cancel">取消</a><a href="#" nodetype="ok" class="ok">发布</a></div></div><div id="common_ask_div_sc" class="searchContainer"><div class="sTitle">可能存在类似的问题:</div><div class="sFooter"><a class="sFirstNewAsk">我想提一个新问题</a></div></div><div 
id="mask_code"></div><div class="gist_edit"><div class="save_snippets clearfix"><div class="tit"><h3>保存代码片</h3><span>整理和分享保存的代码片,请访问<a href="https://code.csdn.net/snippets_manage" target="_blank">代码笔记</a></span></div><div class="con_form"><ul class="gist_edit_list clearfix"><li><span class="red">*</span><span class="txt">标题</span><input id="form_title" class="form-input" placeholder="自然语言处理中的N-Gram模型详解" type="text"></li><li><span class="red">*</span><span class="txt">描述</span><textarea id="form-textarea" class="form-textarea" placeholder="自然语言处理中的N-Gram模型详解: http://blog.csdn.net/baimafujinji/article/details/51281816"></textarea></li><li><span class="red"> </span><span class="txt">标签</span><div id="divSearchTags"><span class="label blog_tag"><span>NLP</span><a title="Removing tag" href="javascript:;">x</a></span><span class="label blog_tag"><span>N-Gram</span><a title="Removing tag" href="javascript:;">x</a></span><span class="label blog_tag"><span>自然语言处理</span><a title="Removing tag" href="javascript:;">x</a></span><span class="label blog_tag"><span>模糊匹配</span><a title="Removing tag" href="javascript:;">x</a></span><span class="label blog_tag"><span>编辑距离</span><a title="Removing tag" href="javascript:;">x</a></span><input id="insertTag" class="insertTag" placeholder="请输入标签,按Enter生成(最多5项)" type="text" value="" name="insertTag" maxlength="21" style="color: rgb(51, 51, 51);"><input id="OrganTag" class="OrganTag" type="hidden" name="OrganTag" value="NLP,N-Gram,自然语言处理,模糊匹配,编辑距离,"><input id="OldOrganTag" class="OldOrganTag" type="hidden" name="OldOrganTag" value=""><input type="hidden" name="txtSearchTags"></div></li></ul></div><div class="bottom-bar"><a href="javascript:;" class="btn-submit btn-cancel">取消</a><span class="tracking-ad" data-mod="popu_250"><a class="btn-submit btn-confirm" href="javascript:;" target="_blank">确定</a></span></div></div></div><div style="position: absolute; width: 0px; height: 0px; overflow: hidden; padding: 0px; border: 0px; margin: 
0px;"><div id="MathJax_Font_Test" style="position: absolute; visibility: hidden; top: 0px; left: 0px; width: auto; padding: 0px; border: 0px; margin: 0px; white-space: nowrap; text-align: left; text-indent: 0px; text-transform: none; line-height: normal; letter-spacing: normal; word-spacing: normal; font-size: 40px; font-weight: normal; font-style: normal; font-family: STIXSizeOneSym, sans-serif;"></div></div></body>
