C++ regex && sed: using regular expressions to strip console control sequences
Source: Internet · Editor: 程序博客网 · Date: 2024/05/17 01:53
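C++ offers three regex options: C++ regex (std::regex), C regex, and boost regex. When developing C++ on Windows, the latter two are not available by default, so C++ regex is clearly the most convenient choice for getting started quickly. This article covers how to use C++ regex.
C++ regex provides three functions: regex_match, regex_search, and regex_replace.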
regex_match
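regex_match tests whether a regular expression matches a string; the example below shows how it is used. For a systematic treatment, see the regex_match reference.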
// regex_match example
#include <iostream>
#include <string>
#include <regex>

int main ()
{
  if (std::regex_match ("subject", std::regex("(sub)(.*)") ))
    std::cout << "string literal matched\n";

  std::string s ("subject");
  std::regex e ("(sub)(.*)");
  if (std::regex_match (s,e))
    std::cout << "string object matched\n";

  if ( std::regex_match ( s.begin(), s.end(), e ) )
    std::cout << "range matched\n";

  std::cmatch cm;    // same as std::match_results<const char*> cm;
  std::regex_match ("subject",cm,e);
  std::cout << "string literal with " << cm.size() << " matches\n";

  std::smatch sm;    // same as std::match_results<string::const_iterator> sm;
  std::regex_match (s,sm,e);
  std::cout << "string object with " << sm.size() << " matches\n";

  std::regex_match ( s.cbegin(), s.cend(), sm, e);
  std::cout << "range with " << sm.size() << " matches\n";

  // using explicit flags:
  std::regex_match ( "subject", cm, e, std::regex_constants::match_default );

  std::cout << "the matches were: ";
  for (unsigned i=0; i<sm.size(); ++i) {
    std::cout << "[" << sm[i] << "] ";
  }
  std::cout << std::endl;

  return 0;
}
Output:
string literal matched
string object matched
range matched
string literal with 3 matches
string object with 3 matches
range with 3 matches
the matches were: [subject] [sub] [ject]
regex_search
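regex_search is the other matching function; an example follows. The main difference between the two is that regex_match must match the entire string, whereas regex_search looks for a matching substring inside it. For a systematic treatment, see the regex_search reference.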
// regex_search example
#include <iostream>
#include <regex>
#include <string>

int main()
{
  std::string s ("this subject has a submarine as a subsequence");
  std::smatch m;
  std::regex e ("\\b(sub)([^ ]*)");   // matches words beginning by "sub"

  std::cout << "Target sequence: " << s << std::endl;
  std::cout << "Regular expression: /\\b(sub)([^ ]*)/" << std::endl;
  std::cout << "The following matches and submatches were found:" << std::endl;

  while (std::regex_search (s,m,e)) {
    for (auto x=m.begin();x!=m.end();x++)
      std::cout << x->str() << " ";
    std::cout << "--> ([^ ]*) match " << m.format("$2") <<std::endl;
    s = m.suffix().str();
  }
}
Output:
Target sequence: this subject has a submarine as a subsequence
Regular expression: /\b(sub)([^ ]*)/
The following matches and submatches were found:
subject sub ject --> ([^ ]*) match ject
submarine sub marine --> ([^ ]*) match marine
subsequence sub sequence --> ([^ ]*) match sequence
regex_replace
regex_replace replaces the text matched by a regular expression; an example follows. For a systematic treatment, see the regex_replace reference.
#include <cstring>   // strlen
#include <iostream>
#include <regex>
#include <string>

int main()
{
    char buf[20];
    const char *first = "axayaz";
    const char *last = first + strlen(first);
    std::regex rx("a");
    std::string fmt("A");
    // format_first_only: replace only the first match
    std::regex_constants::match_flag_type fonly =
        std::regex_constants::format_first_only;

    // The iterator overload returns the end of the written output,
    // so the result is terminated there.
    *std::regex_replace(&buf[0], first, last, rx, fmt) = '\0';
    std::cout << &buf[0] << std::endl;

    *std::regex_replace(&buf[0], first, last, rx, fmt, fonly) = '\0';
    std::cout << &buf[0] << std::endl;

    std::string str("adaeaf");
    std::cout << std::regex_replace(str, rx, fmt) << std::endl;

    std::cout << std::regex_replace(str, rx, fmt, fonly) << std::endl;

    return 0;
}
Output:
AxAyAz
Axayaz
AdAeAf
Adaeaf
The regular expression rules in C++ regex are much the same as in other programming languages:
Special characters (for matching characters that are hard to write literally):
\cletter (control code): for example, \ca is the same as \u0001, \cb the same as \u0002, and so on.
\xhh (ASCII character): a character whose code unit value has a hex value equivalent to the two hex digits hh. For example, \x4c is the same as L, and \x23 the same as #.
\uhhhh (unicode character): a character whose code unit value has a hex value equivalent to the four hex digits hhhh.
\0 (null): a null character (same as \u0000).
\int (backreference): the result of the submatch whose opening parenthesis is the int-th (int shall begin with a digit other than 0). See groups below for more info.
\d (digit): a decimal digit character.
\D (not digit): any character that is not a decimal digit character.
\s (whitespace): a whitespace character.
\S (not whitespace): any character that is not a whitespace character.
\w (word): an alphanumeric or underscore character.
\W (not word): any character that is not an alphanumeric or underscore character.
\character (character): the character as it is, without interpreting its special meaning within a regex. Any character can be escaped except those which form one of the special character sequences above. Needed for: ^ $ \ . * + ? ( ) [ ] { } |
[class] (character class): the target character is part of the class.
[^class] (negated character class): the target character is not part of the class.
Note: in a C++ string literal the backslash (\) is itself an escape character, so every backslash in the pattern has to be written twice:
std::regex e1 ("\\d");    // regex \d -> matches a digit character
std::regex e2 ("\\\\");   // regex \\ -> matches a backslash character
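Quantifiers:
Note: quantifiers are greedy by default, and appending ? makes them lazy. When the pattern "(a+).*" is matched against "aardvark", the group captures aa; with the pattern "(a+?).*" it captures only a.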
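Groups (for matching several consecutive characters):
Individual characters
[abc] matches a, b or c.
[^xyz] matches any character other than x, y or z.
Ranges
[a-z] matches any lowercase letter (a, b, c, ..., z).
[abc1-5] matches a, b, c, or a digit from 1 to 5.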
C++ regex also supports a POSIX-like notation for character classes.
sed examples:
Remove color codes (special characters) with sed
sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[m|K]//g"
Remove ( color / special / escape / ANSI ) codes, from text, with sed
Credit to the original folks who I've copied this command from.
The difference here is:
Theirs: [m|K]
Theirs is supposed to remove \E[NUMBERS;NUMBERS[m OR K]
That pattern is incorrect in two ways.
1. The letters m and K are only two of more than 20 possible letters that can end these sequences.
2. Inside [] , OR is already implied, so the pattern also matches sequences ending with a literal |, which is not correct.
This: [a-zA-Z]
This resolves the "OR" issue noted above and takes care of all of these sequences, as they all end with a lowercase or uppercase letter.
Every escape sequence of the form \E[ NUMBERS-and-semicolons LETTER is removed, not only the ones ending in m or K.
sed "s,\x1B\[[0-9;]*[a-zA-Z],,g"