5.4 The extract_addr function: parsing mail addresses
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 07:40
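By default, Postfix accepts mail addresses in RFC 822 form and does not force clients to supply addresses in RFC 821 form. The addresses we usually see look like zhangsan@163.com. The address syntax defined by RFC 822 is considerably more complex, however; all of the following forms are valid: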
TheBoss<zhangsan@163.com>
"TheBoss"<zhangsan@163.com>
zhangsan@163.com(TheBoss)
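The following is what Foxmail shows when viewing the raw source of a message (see Figure 5-4):
Figure 5-4 Addresses in the raw source of a message
To parse mail addresses, Postfix defines the TOK822 structure: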
/*
 * Internal address representation: a token tree.
 */
typedef struct TOK822 {
    int     type;               /* token value, see below */
    VSTRING *vstr;              /* token contents */
    struct TOK822 *prev;        /* peer */
    struct TOK822 *next;        /* peer */
    struct TOK822 *head;        /* group members */
    struct TOK822 *tail;        /* group members */
    struct TOK822 *owner;       /* group owner */
} TOK822;
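The type field of this structure gives the node type and the vstr field holds the node's value; all the other fields are links that build the tree. Because mail addresses can take complex forms, several node types are defined: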
/*
 * Token values for multi-character objects. Single-character operators are
 * represented by their own character value.
 */
#define TOK822_MINTOK   256
#define TOK822_ATOM     256     /* non-special character sequence */
#define TOK822_QSTRING  257     /* stuff between "", not nesting */
#define TOK822_COMMENT  258     /* comment including (), may nest */
#define TOK822_DOMLIT   259     /* stuff between [] not nesting */
#define TOK822_ADDR     260     /* actually a token group */
#define TOK822_STARTGRP 261     /* start of named group */
#define TOK822_MAXTOK   261
The tok822_parse function lives in /global/tok822_parse.c, which also contains a unit-test main function. Running it shows that the address "zhangsan"zhangsan@163.com is organized into the following tree (see Figure 5-5):
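Figure 5-5 extract_addr function test result
The type of this tree is address, that is, the macro TOK822_ADDR.
Users do not normally supply complex mail addresses to a mail server from the command line, but MUA software may do so. The extract_addr function therefore has to extract the real mail address from every address form that may occur: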
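To make the shape of such a tree concrete, here is a minimal sketch that hand-builds an address group for zhangsan@163.com. The NODE type, the helper names, and the exact split into atoms and operators are assumptions made for illustration; this is not Postfix code, and it uses plain C strings instead of VSTRING.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the numeric values mirror TOK822_ATOM/TOK822_ADDR. */
enum { NODE_ATOM = 256, NODE_ADDR = 260 };

typedef struct NODE {
    int     type;               /* token value, or an operator character */
    char   *text;               /* token contents, NULL for operators/groups */
    struct NODE *prev;          /* peer */
    struct NODE *next;          /* peer */
    struct NODE *head;          /* first group member */
    struct NODE *tail;          /* last group member */
    struct NODE *owner;         /* group owner */
} NODE;

/* node_new - allocate a node; the sketch simply asserts on allocation failure */
static NODE *node_new(int type, const char *text)
{
    NODE   *np = calloc(1, sizeof(*np));

    assert(np != NULL);
    np->type = type;
    if (text != NULL) {
        np->text = malloc(strlen(text) + 1);
        assert(np->text != NULL);
        strcpy(np->text, text);
    }
    return np;
}

/* node_add_member - append a member to the end of a group's member list */
static void node_add_member(NODE *group, NODE *member)
{
    member->owner = group;
    member->prev = group->tail;
    if (group->tail != NULL)
        group->tail->next = member;
    else
        group->head = member;
    group->tail = member;
}

/* build_example_addr - assumed token split: zhangsan @ 163 . com */
static NODE *build_example_addr(void)
{
    NODE   *addr = node_new(NODE_ADDR, NULL);

    node_add_member(addr, node_new(NODE_ATOM, "zhangsan"));
    node_add_member(addr, node_new('@', NULL));
    node_add_member(addr, node_new(NODE_ATOM, "163"));
    node_add_member(addr, node_new('.', NULL));
    node_add_member(addr, node_new(NODE_ATOM, "com"));
    return addr;
}

/* node_count_members - number of direct members of a group */
static int node_count_members(const NODE *group)
{
    const NODE *np;
    int     n = 0;

    for (np = group->head; np != NULL; np = np->next)
        n++;
    return n;
}

/* node_free - free a node together with all of its members */
static void node_free(NODE *np)
{
    NODE   *member;
    NODE   *next;

    for (member = np->head; member != NULL; member = next) {
        next = member->next;
        node_free(member);
    }
    free(np->text);
    free(np);
}
```

The group node itself carries no text; its head and tail fields delimit the member list, and each member points back at the group through owner.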
/smtpd/smtpd.c
2122 /* extract_addr - extract address from rubble */
2123
2124 static int extract_addr(SMTPD_STATE *state, SMTPD_TOKEN *arg,
2125                         int allow_empty_addr, int strict_rfc821,
2126                         int smtputf8)
2127 {
2128     const char *myname = "extract_addr";
2129     TOK822 *tree;
2130     TOK822 *tp;
2131     TOK822 *addr = 0;
2132     int     naddr;
2133     int     non_addr;
2134     int     err = 0;
2135     char   *junk = 0;
2136     char   *text;
2137     char   *colon;
2138
2139     /*
2140      * Special case.
2141      */
2142 #define PERMIT_EMPTY_ADDR 1
2143 #define REJECT_EMPTY_ADDR 0
2144
2145     /*
2146      * Some mailers send RFC822-style address forms (with comments and such)
2147      * in SMTP envelopes. We cannot blame users for this: the blame is with
2148      * programmers violating the RFC, and with sendmail for being permissive.
2149      *
2150      * XXX The SMTP command tokenizer must leave the address in externalized
2151      * (quoted) form, so that the address parser can correctly extract the
2152      * address from surrounding junk.
2153      *
2154      * XXX We have only one address parser, written according to the rules of
2155      * RFC 822. That standard differs subtly from RFC 821.
2156      */
2157     if (msg_verbose)
2158         msg_info("%s: input: %s", myname, STR(arg->vstrval));
2159     if (STR(arg->vstrval)[0] == '<'
2160         && STR(arg->vstrval)[LEN(arg->vstrval) - 1] == '>') {
2161         junk = text = mystrndup(STR(arg->vstrval) + 1, LEN(arg->vstrval) - 2);
2162     } else
2163         text = STR(arg->vstrval);
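2159-2163 The client may supply two kinds of address: an RFC 821-compliant address enclosed in angle brackets, or an address that does not conform to RFC 821. In the first case we take a copy of the text between the angle brackets; in the second case we keep the argument as it is for now.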
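The bracket handling can be tried out in isolation with standard C only. The sketch below is an approximation under stated assumptions: strip_angle_brackets is a hypothetical helper name, and it uses malloc and memcpy where Postfix uses mystrndup and myfree. Unlike the original, it always returns a fresh copy so that the caller has a single ownership rule.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * strip_angle_brackets - hypothetical helper, not a Postfix function.
 * Returns a newly allocated copy of the text between an enclosing '<' and
 * '>', or a copy of the whole input when it is not bracketed. Returns NULL
 * if memory allocation fails. The caller frees the result.
 */
static char *strip_angle_brackets(const char *arg)
{
    const char *start = arg;
    size_t  len;
    char   *copy;

    assert(arg != NULL);
    len = strlen(arg);
    if (len >= 2 && arg[0] == '<' && arg[len - 1] == '>') {
        start = arg + 1;                /* skip the leading '<' */
        len -= 2;                       /* drop both brackets */
    }
    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, start, len);
    copy[len] = '\0';
    return copy;
}
```

Only an argument that both starts with '<' and ends with '>' is unwrapped, so a form such as TheBoss<zhangsan@163.com> passes through unchanged and is left for the address parser to sort out.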
2164
2165     /*
2166      * Truncate deprecated route address form.
2167      */
2168     if (*text == '@' && (colon = strchr(text, ':')) != 0)
2169         text = colon + 1;
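2168-2169 Skip the deprecated route address form: if the text begins with '@' and contains a ':', parsing starts at the character after the first colon.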
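The same two-line test can be wrapped in a small function to see its effect on sample input. This is only a sketch; skip_route_prefix is a hypothetical name, and the sample route used in the note below is made up.

```c
#include <assert.h>
#include <string.h>

/*
 * skip_route_prefix - hypothetical helper mirroring lines 2168-2169.
 * If the text starts with '@' and contains a ':', return a pointer to the
 * character after the first ':'. Otherwise return the text unchanged.
 * The result points into the caller's buffer; nothing is copied.
 */
static const char *skip_route_prefix(const char *text)
{
    const char *colon;

    assert(text != NULL);
    if (*text == '@' && (colon = strchr(text, ':')) != NULL)
        return colon + 1;
    return text;
}
```

For example, skip_route_prefix("@relay1,@relay2:zhangsan@163.com") returns a pointer to "zhangsan@163.com", while an address that has no leading '@' or no ':' is returned as is.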
2170     tree = tok822_parse(text);
2170 Parse the address into a TOK822 tree.
2171
2172     if (junk)
2173         myfree(junk);
2174
2175     /*
2176      * Find trouble.
2177      */
2178     for (naddr = non_addr = 0, tp = tree; tp != 0; tp = tp->next) {
2179         if (tp->type == TOK822_ADDR) {
2180             addr = tp;
2181             naddr += 1;                 /* count address forms */
2182         } else if (tp->type == '<' || tp->type == '>') {
2183             /* void */ ;                /* ignore brackets */
2184         } else {
2185             non_addr += 1;              /* count non-address forms */
2186         }
2187     }
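2178-2187 Walk the top-level nodes of the tree, remember the address node, and count the address and non-address nodes. Angle brackets are not counted.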
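The counting loop does not depend on anything Postfix-specific, so it can be reproduced over a stripped-down token list. In the sketch below, TOKEN, SCAN_RESULT and scan_top_level are hypothetical names, and the token sequence suggested in the note afterwards is an assumption about how a form like TheBoss<zhangsan@163.com> might be tokenized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values mirror TOK822_ATOM and TOK822_ADDR. */
enum { TOK_ATOM = 256, TOK_ADDR = 260 };

typedef struct TOKEN {
    int     type;               /* token value, or an operator character */
    struct TOKEN *next;         /* next top-level peer */
} TOKEN;

typedef struct SCAN_RESULT {
    const TOKEN *addr;          /* last address group seen, or NULL */
    int     naddr;              /* number of address groups */
    int     non_addr;           /* number of other tokens, brackets excluded */
} SCAN_RESULT;

/* scan_top_level - count address and non-address forms among the peers */
static SCAN_RESULT scan_top_level(const TOKEN *tree)
{
    SCAN_RESULT result = {NULL, 0, 0};
    const TOKEN *tp;

    for (tp = tree; tp != NULL; tp = tp->next) {
        if (tp->type == TOK_ADDR) {
            result.addr = tp;
            result.naddr += 1;
        } else if (tp->type == '<' || tp->type == '>') {
            continue;                   /* brackets are neither */
        } else {
            result.non_addr += 1;
        }
    }
    return result;
}
```

With the assumed sequence atom, '<', address group, '>', the scan reports one address and one non-address token. In strict RFC 821 mode that non-address token alone is enough for extract_addr to flag the argument.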
2188
2189     /*
2190      * Report trouble. XXX Should log a warning only if we are going to
2191      * sleep+reject so that attackers can't flood our logfiles.
2192      *
2193      * XXX Unfortunately, the sleep-before-reject feature had to be abandoned
2194      * (at least for small error counts) because servers were DOS-ing
2195      * themselves when flooded by backscatter traffic.
2196      */
2197     if (naddr > 1
2198         || (strict_rfc821 && (non_addr || *STR(arg->vstrval) != '<'))) {
2199         msg_warn("Illegal address syntax from %s in %s command: %s",
2200                  state->namaddr, state->where,
2201                  printable(STR(arg->vstrval), '?'));
2202         err = 1;
2203     }
2204
2205     /*
2206      * Don't overwrite the input with the extracted address. We need the
2207      * original (external) form in case the client does not send ORCPT
2208      * information; and error messages are more accurate if we log the
2209      * unmodified form. We need the internal form for all other purposes.
2210      */
2211     if (addr)
2212         tok822_internalize(state->addr_buf, addr->head, TOK822_STR_DEFL);
2213     else
2214         vstring_strcpy(state->addr_buf, "");
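2211-2214 The tok822_internalize function converts the members of the address node into a string, which is stored in the addr_buf field of SMTPD_STATE. If no address node was found, addr_buf is set to the empty string. The original address supplied by the client is still needed later, so the parsed address goes into addr_buf instead of overwriting the original.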
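The idea of flattening the group members into a separate buffer while leaving the client's input alone can be sketched as follows. PIECE and join_pieces are hypothetical names; the sketch only concatenates member texts into a fixed-size buffer and makes no attempt to reproduce the quoting rules that tok822_internalize applies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified group member: just its text and the next peer. */
typedef struct PIECE {
    const char *text;
    struct PIECE *next;
} PIECE;

/*
 * join_pieces - concatenate the member texts into dst (capacity dstlen).
 * Returns 0 on success, -1 if dst is too small. dst always holds a
 * NUL-terminated string on return.
 */
static int join_pieces(char *dst, size_t dstlen, const PIECE *head)
{
    size_t  used = 0;

    assert(dst != NULL && dstlen > 0);
    dst[0] = '\0';
    for (; head != NULL; head = head->next) {
        size_t  n = strlen(head->text);

        if (n >= dstlen - used)         /* no room for text plus NUL */
            return -1;
        memcpy(dst + used, head->text, n + 1);
        used += n;
    }
    return 0;
}
```

Because the result is written to its own buffer, the string the client sent stays available unchanged for logging, which is the property the source comment asks for.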
2215
2216     /*
2217      * Report trouble. XXX Should log a warning only if we are going to
2218      * sleep+reject so that attackers can't flood our logfiles. Log the
2219      * original address.
2220      */
2221     if (err == 0)
2222         if ((STR(state->addr_buf)[0] == 0 && !allow_empty_addr)
2223             || (strict_rfc821 && STR(state->addr_buf)[0] == '@')
2224             || (SMTPD_STAND_ALONE(state) == 0
2225                 && smtpd_check_addr(STR(state->addr_buf), smtputf8) != 0)) {
2226             msg_warn("Illegal address syntax from %s in %s command: %s",
2227                      state->namaddr, state->where,
2228                      printable(STR(arg->vstrval), '?'));
2229             err = 1;
2230         }
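2221-2230 If no error has been found so far, check the extracted address: an empty result is rejected unless allow_empty_addr is set, a result that begins with '@' is rejected in strict RFC 821 mode, and, when smtpd is not running stand-alone, smtpd_check_addr performs a sanity check on the address. Any failure is logged as illegal address syntax together with the original argument.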
2231
2232     /*
2233      * Cleanup.
2234      */
2235     tok822_free_tree(tree);
2235 Free the TOK822 tree.
2236     if (msg_verbose)
2237         msg_info("%s: in: %s, result: %s",
2238                  myname, STR(arg->vstrval), STR(state->addr_buf));
2239     return (err);
2240 }