Rspamd_rule_Html.lua: My Own Understanding

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 10:21

Below are my notes on the rules in Rspamd's Html.lua, with some detailed examples, written down so that my forgetful self can look them up later.

**MIME_HTML_ONLY:**

    reconf['MIME_HTML_ONLY'] = {
      re = 'has_only_html_part()',
      score = 0.2,
      description = 'Messages that have only HTML part',
      group = 'header'
    }

**rspamd_config.HTML_SHORT_LINK_IMG_1:** description = 'Short html part (0..1K) with a link to an image'. The image must be embedded and satisfy height + width >= 210. score = 2.0, group = 'html'. An example the rule handles:

    <a href="http://index.html"><img src="logo.jpg" /></a>

**rspamd_config.HTML_SHORT_LINK_IMG_2:** description = 'Short html part (1K..1.5K) with a link to an image', score = 1.0, group = 'html'.

**rspamd_config.HTML_SHORT_LINK_IMG_3:** description = 'Short html part (1.5K..2K) with a link to an image', score = 0.5, group = 'html'.

**rspamd_config.R_EMPTY_IMAGE:** If an HTML part is shorter than 50 bytes, the rule checks whether that nearly empty part contains images. score = 2.0, group = 'html'.

**rspamd_config.R_SUSPICIOUS_IMAGES:** Fires when the message looks suspicious because of its images. From the rule: with l = html_part:get_words_count(), and provided pic_words + l > 0, it computes rel = pic_words / (l + pic_words); if rel > 0.5 it returns true, (rel - 0.5) * 2. score = 5.0, group = 'html'.

**rspamd_config.R_WHITE_ON_WHITE:** Handles 'Message contains low contrast text'. The rule gets the HTML context and walks the 'font', 'span', 'div' and 'p' tags in the HTML tree. If a tag has both color and bgcolor and is visible, it takes the RGB triples of color and bgcolor and computes:

    local diff_r = math.abs(color[1] - bgcolor[1]) / 255.0
    local diff_g = math.abs(color[2] - bgcolor[2]) / 255.0
    local diff_b = math.abs(color[3] - bgcolor[3]) / 255.0
    diff = (diff_r + diff_g + diff_b) / 3.0

If the resulting diff is less than 0.1, the rule updates transp_len and normal_len and builds the formatted string arg:

    transp_len = (transp_len + tag:get_content_length()) * (0.1 - diff) * 5.0
    normal_len = normal_len - tag:get_content_length()
    arg = string.format('%s color #%x%x%x bgcolor #%x%x%x',
      tostring(tag:get_type()),
      color[1], color[2], color[3],
      bgcolor[1], bgcolor[2], bgcolor[3])

Finally it adjusts transp_rate and returns true, (transp_rate * 2.0), arg.
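To get a feel for the contrast arithmetic in R_WHITE_ON_WHITE, here is a minimal sketch in plain Lua that can be run outside Rspamd. The helper names `color_diff`, `is_low_contrast` and `format_arg` are my own, hypothetical names and not part of the Rspamd API. The sample colors are made up for illustration.

```lua
-- Sketch only: plain Lua, no Rspamd API involved.
-- color_diff, is_low_contrast and format_arg are hypothetical helper names.

-- Average per-channel distance between two RGB triples, scaled to 0..1.
local function color_diff(color, bgcolor)
  local diff_r = math.abs(color[1] - bgcolor[1]) / 255.0
  local diff_g = math.abs(color[2] - bgcolor[2]) / 255.0
  local diff_b = math.abs(color[3] - bgcolor[3]) / 255.0
  return (diff_r + diff_g + diff_b) / 3.0
end

-- The notes above use 0.1 as the cut-off for "low contrast".
local function is_low_contrast(color, bgcolor)
  return color_diff(color, bgcolor) < 0.1
end

-- Same format string as in the notes; tag_type stands in for
-- tostring(tag:get_type()).
local function format_arg(tag_type, color, bgcolor)
  return string.format('%s color #%x%x%x bgcolor #%x%x%x',
    tag_type,
    color[1], color[2], color[3],
    bgcolor[1], bgcolor[2], bgcolor[3])
end

local white = {255, 255, 255}
local near_white = {250, 250, 250}
local black = {0, 0, 0}

print(is_low_contrast(white, near_white)) -- true: almost invisible text
print(is_low_contrast(white, black))      -- false: normal contrast
print(format_arg('font', white, near_white))
```

Note that `%x` does not zero-pad, so a channel value below 16 prints as a single hex digit. The string is useful as a diagnostic, not as a strict CSS color.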
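score = 6.0, group = 'html'.

**rspamd_config.EXT_CSS:** description = 'Message contains external CSS reference'. The rule gets the HTML context, walks the link tags in the HTML tree and reads each tag's extra data; if that data refers to CSS, the symbol is added. score = 1.0, group = 'html'. An example the rule handles:

    <link rel="stylesheet" type="text/css" href="theme.css" />

**rspamd_config.HTTP_TO_HTTPS:** description = 'Anchor text contains different scheme to target URL'. (Every occurrence of an <a> tag in an HTML document creates an Anchor object.) What it detects: anchor text whose URL scheme differs from the scheme of the target URL, for example an http link disguised as https, or an https link disguised as http. Algorithm: for each <a> tag in the HTML tree, let c be the tag's content data (the anchor text) and u be the tag's extra data (the target URL). If the following condition holds, the rule returns true, meaning the symbol is added:

    if ((c:match('^http:') and u:match('^https:')) or
        (c:match('^https:') and u:match('^http:')))

score = 2.0, group = 'html'. Examples the rule handles:

    <a href="http://www.test.com">https://www.test.com</a>
    <a href="https://www.test.com">http://www.test.com</a>

**rspamd_config.HTTP_TO_IP:** description = 'Anchor points to an IP address'. If an anchor points to an IP address, the rule returns true, meaning the symbol is added. score = 1.0, group = 'html'.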
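The scheme comparison described for HTTP_TO_HTTPS can be tried out in plain Lua without Rspamd. This is only a sketch: `scheme_mismatch` is a hypothetical helper name, and lowercasing both strings before matching is my own assumption, not something stated in the notes above.

```lua
-- Sketch only: plain Lua string matching, no Rspamd API involved.
-- scheme_mismatch is a hypothetical helper name.
-- c: anchor text (tag content), u: target URL (tag extra data).
local function scheme_mismatch(c, u)
  -- Assumption: compare case-insensitively.
  c = c:lower()
  u = u:lower()
  if (c:match('^http:') and u:match('^https:')) or
     (c:match('^https:') and u:match('^http:')) then
    return true
  end
  return false
end

-- Anchor text claims https, link goes to http.
print(scheme_mismatch('https://www.test.com', 'http://www.test.com'))
-- Anchor text claims http, link goes to https.
print(scheme_mismatch('http://www.test.com', 'https://www.test.com'))
-- Same scheme on both sides: no mismatch.
print(scheme_mismatch('https://www.test.com', 'https://www.test.com'))
-- Anchor text is not a URL at all: no mismatch.
print(scheme_mismatch('click here', 'http://www.test.com'))
```

The `^` in the Lua patterns pins the match to the start of the string, and the trailing `:` keeps `^http:` from matching `https:`, which is what makes the two branches mutually exclusive.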