17 Regular Expressions for ASP.NET

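Source: Internet  Published by: java高级工程师书籍  Editor: 程序博客网  Time: 2024/04/29 10:26

"^\d+$"  // non-negative integers (positive integers + 0)
"^[0-9]*[1-9][0-9]*$"  // positive integers
"^((-\d+)|(0+))$"  // non-positive integers (negative integers + 0)
"^-[0-9]*[1-9][0-9]*$"  // negative integers
"^-?\d+$"  // integers
"^\d+(\.\d+)?$"  // non-negative floating-point numbers (positive floating-point numbers + 0)
"^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$"  // positive floating-point numbers
"^((-\d+(\.\d+)?)|(0+(\.0+)?))$"  // non-positive floating-point numbers (negative floating-point numbers + 0)
"^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$"  // negative floating-point numbers
"^(-?\d+)(\.\d+)?$"  // floating-point numbers
"^[A-Za-z]+$"  // strings made up of the 26 English letters
"^[A-Z]+$"  // strings made up of the 26 uppercase English letters
"^[a-z]+$"  // strings made up of the 26 lowercase English letters
"^[A-Za-z0-9]+$"  // strings made up of digits and the 26 English letters
"^\w+$"  // strings made up of digits, the 26 English letters, or underscores
"^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$"  // email address
"^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$"  // url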

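As a rough sketch of how these patterns might be called from C#, the helper below wraps three of them in Regex.IsMatch. The class name InputPatterns and its method names are made up for illustration and are not part of the original list. Two assumptions are baked in: the patterns are written as verbatim strings (@"...") so the backslashes reach the regex engine unchanged, and RegexOptions.ECMAScript is set because .NET's \d and \w otherwise also match non-ASCII digits and letters, which is not what the comments in the list describe.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original list.
public static class InputPatterns
{
    // RegexOptions.ECMAScript restricts \d to [0-9] and \w to [a-zA-Z_0-9].
    private static readonly Regex NonNegativeInteger =
        new Regex(@"^\d+$", RegexOptions.ECMAScript);

    private static readonly Regex Float =
        new Regex(@"^(-?\d+)(\.\d+)?$", RegexOptions.ECMAScript);

    private static readonly Regex Email =
        new Regex(@"^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$", RegexOptions.ECMAScript);

    // Each method returns true when the whole input matches the pattern.
    public static bool IsNonNegativeInteger(string s) => NonNegativeInteger.IsMatch(s);

    public static bool IsFloat(string s) => Float.IsMatch(s);

    public static bool IsEmail(string s) => Email.IsMatch(s);
}
```

For example, InputPatterns.IsFloat("-3.14") returns true, while InputPatterns.IsEmail("user@example") returns false because the pattern requires at least one dot-separated part after the @.
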
Additions follow

1. Content edited in the HTML editor must be converted to the following format before it is stored in the database:
<content>
   <item type="text"> hello World! </item>
   <item type="textfile"> http://www.humyu.com/App01/text01.txt </item>
   <item type="textfile"> /App01/text01.txt</item>
   <item type="image"> http://www.humyu.com/app01/image/image01.jpg </item>
   <item type="image"> /app01/image/image02.gif </item>
   <item type="voice"> /app01/voice/voice01.amr </item>
   <item type="voice"> http://www.humyu.com/app01/voice/voice02.amr </item>
</content>
Details:

1. Example
The HTML editor contains the following:

你好、你好吗？<IMG src="http://localhost/aaa/20058169483.gif">我好我好<IMG src="http://localhost/aaa/200581694817.gif"><IMG src="http://localhost/aaa/200581694822.gif">他好他好<TXT src="http://www.humyu.com/App01/text01.txt">都好。都好

After replacement, the content is:
<item type="text"> 你好、你好吗？</item>
<item type="image">http://localhost/aaa/20058169483.gif </item>
<item type="text"> 我好我好</item>
<item type="image">http://localhost/aaa/200581694817.gif </item>
<item type="image">http://localhost/aaa/200581694822.gif </item>
<item type="text"> 他好他好</item>
<item type="textfile">http://www.humyu.com/App01/text01.txt</item>
<item type="text"> 都好。都好</item>

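2. <item type="voice"> /app01/voice/voice01.amr </item> is the easy part: it is just a single file placed at the end, and the content contains at most one of them. I can handle this one myself.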

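3. The number and position of <item type="text">, <item type="image">, and <item type="textfile"> are not fixed. Take <item type="text">: it may be the first item in the content, it may come later, it may be absent, or it may occur several times in a row.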

4. In short, the task is to convert the content to XML format.

I hope that is clear. If not, please point it out and I will reply right away. Thanks in advance, everyone!

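Reply from: fancyf(凡瑞) (two stars, intermediate)  Reputation: 122  2005-8-16 14:53:37  Score: 148

Regular expressions are getting so complex these days that they are turning into whole systems. What a headache~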

using System;
using System.Text.RegularExpressions;

string content = @"你好、你好吗？<IMG src=""http://localhost/aaa/20058169483.gif"">我好我好<IMG src=""http://localhost/aaa/200581694817.gif""><IMG src=""http://localhost/aaa/200581694822.gif"">他好他好<TXT src=""http://www.humyu.com/App01/text01.txt"">都好。都好";
// Wrap each run of text (at the start, between two tags, or at the end) in a text item.
Regex htmlRegex = new Regex(@"(^(?<text>[^<]+?)(?<e><))|((?<s>>)(?<text>[^<]+?)(?<e><))|((?<s>>)(?<text>[\S]+?)$)", RegexOptions.IgnoreCase | RegexOptions.Compiled);
content = htmlRegex.Replace(content, @"${s}<item type=""text"">${text}</item>${e}");
// Turn each <IMG src="..."> tag into an image item.
htmlRegex = new Regex(@"<img\s*?src=""(?<img>[^""]*?)"">", RegexOptions.IgnoreCase | RegexOptions.Compiled);
content = htmlRegex.Replace(content, @"<item type=""image"">${img}</item>");
// Turn each <TXT src="..."> tag into a textfile item.
htmlRegex = new Regex(@"<txt\s*?src=""(?<txt>[^""]*?)"">", RegexOptions.IgnoreCase | RegexOptions.Compiled);
content = htmlRegex.Replace(content, @"<item type=""textfile"">${txt}</item>");

// Put each item on its own line.
content = content.Replace("</item>", "</item>\r\n");
Console.WriteLine(content);

Output:
<item type="text">你好、你好吗？</item>
<item type="image">http://localhost/aaa/20058169483.gif</item>
<item type="text">我好我好</item>
<item type="image">http://localhost/aaa/200581694817.gif</item>
<item type="image">http://localhost/aaa/200581694822.gif</item>
<item type="text">他好他好</item>
<item type="textfile">http://www.humyu.com/App01/text01.txt</item>
<item type="text">都好。都好</item>
That looks pretty close to the real thing.
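
The three Replace passes above can also be folded into a single scan. What follows is only a sketch under stated assumptions, not fancyf's code: ContentConverter and ToItems are hypothetical names, only <IMG> and <TXT> tags with a double-quoted src are recognized, and any other markup in the input is not handled. It also escapes each value with SecurityElement.Escape so that characters such as & do not break the resulting XML, which the reply above does not attempt.

```csharp
using System;
using System.Security;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical single-pass variant of the conversion.
public static class ContentConverter
{
    // Either an <img>/<txt> tag with a double-quoted src, or a run of plain text.
    private static readonly Regex Token = new Regex(
        @"<(?<tag>img|txt)\s+src=""(?<src>[^""]*)""[^>]*>|(?<text>[^<]+)",
        RegexOptions.IgnoreCase);

    public static string ToItems(string html)
    {
        var sb = new StringBuilder();
        foreach (Match m in Token.Matches(html))
        {
            string type;
            string value;
            if (m.Groups["tag"].Success)
            {
                // IMG becomes "image", TXT becomes "textfile".
                bool isImage = m.Groups["tag"].Value.ToLowerInvariant() == "img";
                type = isImage ? "image" : "textfile";
                value = m.Groups["src"].Value;
            }
            else
            {
                type = "text";
                value = m.Groups["text"].Value;
            }

            // SecurityElement.Escape replaces <, >, &, " and ' with XML entities.
            sb.Append("<item type=\"").Append(type).Append("\">")
              .Append(SecurityElement.Escape(value))
              .Append("</item>\r\n");
        }
        return sb.ToString();
    }
}
```

Calling ContentConverter.ToItems(content) on the sample string should give the same eight items, one per line, because the matches are emitted in document order.
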

This expression matches the first table in a web page: ＜table.*(?=Headline)(.|\n)*?＜/table＞
For example:
＜table border="0" width="11%" class="Headline"＞
＜tr＞
＜td width="100%"＞
＜p align="center"＞这是第一个表格＜/td＞
<td>
<A href="http://www.163.com" target=_blank>网易</A></td>
＜/tr＞
＜/table＞
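But I need to extract the text “这是第一个表格”, the link URL, and the link text into three separate fields. How can that be done with a regular expression? Any advice is appreciated, thanks!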

Reply from: fancyf(凡瑞) (two stars, intermediate)  Reputation: 133  2005-8-29 14:04:17  Score: 56

＜table[^＞]*＞[\s\S]*?<a[\s\S]*?href=("(?<href>[^"]*)"|'(?<href>[^']*)'|(?<href>[^>\s]*))[^>]*?>(?<title>[\s\S]*?)</a>[\s\S]*?＜/table＞

Test:
using System;
using System.Text.RegularExpressions;

string content = @"＜table border=""0"" width=""11%"" class=""Headline""＞
＜tr＞
＜td width=""100%""＞
＜p align=""center""＞这是第一个表格＜/td＞
<td>
<A href=""http://www.163.com"" target=_blank>网易</A></td>
＜/tr＞
＜/table＞";
// href may be double-quoted, single-quoted, or unquoted; all three alternatives share one group name.
Regex htmlRegex = new Regex(
    @"＜table[^＞]*＞[\s\S]*?<a[\s\S]*?href=(""(?<href>[^""]*)""|'(?<"
    + @"href>[^']*)'|(?<href>[^>\s]*))[^>]*?>(?<title>[\s\S]*?)</a>["
    + @"\s\S]*?＜/table＞",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);

MatchCollection mc = htmlRegex.Matches(content);
for (int i = 0; i < mc.Count; i++)
{
    Console.WriteLine(mc[i].Groups[0].Value);        // the whole table
    Console.WriteLine(mc[i].Groups["href"].Value);   // URL
    Console.WriteLine(mc[i].Groups["title"].Value);  // link text
}
Output:
＜table border="0" width="11%" class="Headline"＞
＜tr＞
＜td width="100%"＞
＜p align="center"＞这是第一个表格＜/td＞
<td>
<A href="http://www.163.com" target=_blank>网易</A></td>
＜/tr＞
＜/table＞
http://www.163.com
网易
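
The reply prints the whole table, the URL, and the link text, but the question also asked for the cell text “这是第一个表格” as its own field. One possible way to capture it is sketched below. This is an illustration rather than fancyf's answer: TableFields, TryExtract, and the caption group are hypothetical names, the pattern assumes the sample's layout (a full-width ＜p ...＞ element before the link, with the table tags in full-width brackets and the link in ASCII brackets), and only a double-quoted href is handled.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that pulls three fields out of the sample table.
public static class TableFields
{
    private static readonly Regex TableRegex = new Regex(
        // caption: the text that follows the first ＜p ...＞ up to the next ＜
        @"＜table[^＞]*＞[\s\S]*?＜p[^＞]*＞(?<caption>[^＜]*)＜"
        // href: the double-quoted link target of the first <a ...> after it
        + @"[\s\S]*?<a\s[^>]*?href=""(?<href>[^""]*)""[^>]*>"
        // title: the link text, then skip ahead to the closing table tag
        + @"(?<title>[\s\S]*?)</a>[\s\S]*?＜/table＞",
        RegexOptions.IgnoreCase);

    // Returns false and empty strings when the input does not match.
    public static bool TryExtract(string html, out string caption,
                                  out string href, out string title)
    {
        Match m = TableRegex.Match(html);
        caption = m.Success ? m.Groups["caption"].Value.Trim() : string.Empty;
        href = m.Success ? m.Groups["href"].Value : string.Empty;
        title = m.Success ? m.Groups["title"].Value.Trim() : string.Empty;
        return m.Success;
    }
}
```

With the test string from the reply, TryExtract should fill the three out parameters with 这是第一个表格, http://www.163.com, and 网易.
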