<Boost> Regular expressions with boost::regex

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 01:05

1. Building Boost.Regex

Before the regex component of the Boost library can be used, it has to be built, as follows:

    C:\Users\Administrator>cd "C:\Program Files\boost_1_57_0"
    C:\Program Files\boost_1_57_0>bootstrap
    C:\Program Files\boost_1_57_0>.\b2

2. Using regex

regex is short for "regular expression". The Boost library uses Perl regular expression syntax by default.

Usage notes:

1. Creating a regex object:

    #include <boost/regex.hpp>
    boost::regex reg("(.*)");
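A malformed pattern is reported when the regex object is constructed, by throwing regex_error. The following is a minimal sketch of checking a pattern up front. The helper name is_valid_pattern and the USE_BOOST_REGEX macro are assumptions made for this illustration, not part of Boost. Without the macro, the sketch falls back to std::regex, which offers the same names for the calls used here.

```cpp
#include <string>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Returns true if `pattern` compiles; construction throws on a malformed one.
bool is_valid_pattern(const std::string& pattern)
{
    try {
        re::regex reg(pattern);   // compiles the expression
        (void)reg;                // only the construction matters here
        return true;
    } catch (const re::regex_error&) {
        return false;             // e.g. an unbalanced parenthesis
    }
}
```

For example, is_valid_pattern("(.*)") succeeds, while a pattern with an unclosed group such as "(abc" is rejected.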

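2. regex_match

This function tests whether an entire string matches the regular expression, which makes it widely useful for validating input. See the test code at the end for a usage example.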
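A short sketch of whole-string validation with regex_match, reusing the SQL-like pattern from the test code. The function name is_simple_select and the USE_BOOST_REGEX macro are assumptions for illustration. Without the macro, std::regex is used as a stand-in.

```cpp
#include <string>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// True only when the WHOLE string has the form "select <letters> from <letters>".
bool is_simple_select(const std::string& text)
{
    static const re::regex reg("select ([a-zA-Z]*) from ([a-zA-Z]*)");
    return re::regex_match(text, reg);
}
```

Because regex_match requires the entire input to match, "select me from dest" passes while "my select me from dest oh baby" does not, even though it contains a matching substring.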

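3. regex_search

Before discussing this function, boost::match_results needs a mention. When regex_search performs a search, it reports the matched sub-expressions through an object of type match_results.

match_results mainly wraps an object of type std::vector<sub_match<…> >, and the sub_match class derives from std::pair. It records the details of each match.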
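The sketch below shows how the pieces of a match_results object might be read after a successful regex_search. Index 0 is the whole match, higher indices are the parenthesised groups, and prefix() and suffix() give the text around the match. The SearchResult struct, the find_select function and the USE_BOOST_REGEX macro are hypothetical names introduced here. Without the macro, std::regex is used as a stand-in.

```cpp
#include <string>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Plain holder for the parts of one match (illustrative type).
struct SearchResult {
    bool found;
    std::string prefix, whole, first_group, second_group, suffix;
};

// Finds the first "select X from Y" anywhere inside `text`.
SearchResult find_select(const std::string& text)
{
    static const re::regex reg("select ([a-zA-Z]*) from ([a-zA-Z]*)");
    SearchResult r;
    r.found = false;
    re::smatch what;                              // match_results over std::string
    if (re::regex_search(text, what, reg)) {
        r.found = true;
        r.prefix = what.prefix().str();           // text before the match
        r.whole = what[0].str();                  // the whole match
        r.first_group = what[1].str();            // first (...) group
        r.second_group = what[2].str();           // second (...) group
        r.suffix = what.suffix().str();           // text after the match
    }
    return r;
}
```

The match results are only read after regex_search has returned true, since prefix() and suffix() are not meaningful for a failed search.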

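4. regex_replace

This function formats each substring matched by the regular expression according to the given fmt string. Note that it does not modify the original string. It only returns the formatted result. See the test source at the end for a usage example.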
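As a sketch of the point that regex_replace returns a new string and leaves its input alone, the helper below rewrites "colour" to "color" regardless of case by keeping groups 1 and 3 in the format string. The name drop_u and the USE_BOOST_REGEX macro are assumptions for illustration. Without the macro, std::regex is used as a stand-in.

```cpp
#include <string>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Returns a copy of `text` in which every "colour" (any case) loses its "u".
std::string drop_u(const std::string& text)
{
    static const re::regex reg("(colo)(u)(r)", re::regex::icase);
    // "$1$3" keeps the first and third groups and drops the second.
    return re::regex_replace(text, reg, "$1$3");
}
```

Calling drop_u on "Colour, colours, color, colourize" yields "Color, colors, color, colorize", and the argument string is unchanged afterwards.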

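5. regex_iterator

By calling regex_search repeatedly we can process every matching substring. However, the Regex library also offers a more elegant way: regex_iterator. Constructing a regex_iterator from a string and a regular expression creates a match_results object that holds the match information, and the overloaded ++ operator then advances through all the matches.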
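A minimal sketch of walking all matches with sregex_iterator (the std::string flavour of regex_iterator). Dereferencing the iterator gives a match_results object, so group 1 can be read from each match. The function name extract_numbers and the USE_BOOST_REGEX macro are assumptions for illustration. Without the macro, std::regex is used as a stand-in.

```cpp
#include <string>
#include <vector>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Collects every run of digits in `text`; an optional trailing comma is consumed.
std::vector<std::string> extract_numbers(const std::string& text)
{
    static const re::regex reg("(\\d+),?");
    std::vector<std::string> out;
    re::sregex_iterator it(text.begin(), text.end(), reg);
    const re::sregex_iterator end;            // default-constructed = end of sequence
    for (; it != end; ++it)
        out.push_back((*it)[1].str());        // group 1: the digits only
    return out;
}
```

The iterator keeps iterators into the input, so the string passed in must outlive the loop. Taking it by const reference, as here, satisfies that.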

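6. regex_token_iterator

Similar to regex_iterator, regex_token_iterator enumerates selected sub-expressions of each match. When it is given the sub-expression index -1, it instead enumerates the pieces of the input that do not match the regular expression. Like comparable components in the STL, it is implemented as an iterator adapter. This feature makes splitting a string easy.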
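The splitting use case can be sketched as follows: with the sub-match index -1, sregex_token_iterator yields the text between delimiter matches. The function name split and the USE_BOOST_REGEX macro are assumptions for illustration. Without the macro, std::regex is used as a stand-in.

```cpp
#include <string>
#include <vector>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Splits `text` on every match of `delimiter_pattern` (a regular expression).
std::vector<std::string> split(const std::string& text,
                               const std::string& delimiter_pattern)
{
    const re::regex reg(delimiter_pattern);
    std::vector<std::string> out;
    // -1 selects the parts of the input that lie between matches.
    re::sregex_token_iterator it(text.begin(), text.end(), reg, -1);
    const re::sregex_token_iterator end;
    for (; it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

For instance, split("Split/Vulue/Teather", "/") returns the three fields "Split", "Vulue" and "Teather". Because the delimiter is itself a regular expression, a pattern such as "[,;]" would split on either character.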

The sample code follows:

    #include <boost/lambda/lambda.hpp>
    #include <boost/regex.hpp>
    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <string>

    using std::cout;
    using std::endl;

    // Function object that prints each sub-match on its own line.
    class regex_callback
    {
    public:
        template <typename T>
        void operator()(const T& what)
        {
            std::cout << what << std::endl;
        }
    };

    void BoostRegex()
    {
        using namespace boost::lambda;

        // ---- regex object ----
        boost::regex reg("select ([a-zA-Z]*) from ([a-zA-Z]*)");
        cout << "Status: " << reg.empty() << endl;          // 0 means the object holds an expression
        cout << "Mark count: " << reg.mark_count() << endl; // number of marked sub-expressions

        // ---- full match ----
        boost::regex reg1("select ([a-zA-Z]*) from ([a-zA-Z]*)");
        boost::cmatch match1;
        std::string str1 = "select me from dest";
        bool bRet = boost::regex_match(str1, reg1);             // test only, results are not stored
        cout << (bRet ? "match" : "no match") << endl;
        bRet = boost::regex_match(str1.c_str(), match1, reg1);  // test and store the results
        // Alternative with Boost.Lambda: std::cout << _1 << " "
        std::for_each(match1.begin(), match1.end(), regex_callback());
        cout << "-----------------------------" << endl;

        // ---- partial match ----
        boost::cmatch match2;
        std::string str2 = "my select me from dest oh baby";
        bRet = boost::regex_search(str2.c_str(), match2, reg1);
        if (bRet)   // prefix() and suffix() are only meaningful after a successful search
        {
            cout << match2.prefix() << endl;   // text before the matched part
            cout << match2.suffix() << endl;   // text after the matched part
            std::for_each(match2.begin(), match2.end(), regex_callback());
        }
        cout << "-----------------------------" << endl;

        // ---- replace ----
        boost::regex reg3("(Colo)(u)(r)", boost::regex::icase | boost::regex::perl); // case-insensitive
        std::string str3 = "Colour, colours, color, colourize";
        // Of the three groups (Colo)(u)(r), keep only the first and the third.
        std::string sRslt = boost::regex_replace(str3, reg3, "$1$3");
        cout << sRslt << endl;
        cout << "-----------------------------" << endl;

        // ---- regex_iterator ----
        boost::regex reg4("(\\d+),?");
        std::string str4 = "1,2,3,4,5,6,7,85,ad2348(,hj";
        boost::sregex_iterator it(str4.begin(), str4.end(), reg4);
        boost::sregex_iterator itend;
        std::for_each(it, itend, cout << _1 << " ");
        cout << "\n-----------------------------" << endl;

        // ---- regex_token_iterator: split a string ----
        boost::regex reg5("/");
        std::string str5 = "Split/Vulue/Teather/Neusoft/Write/By/Lanwei";
        boost::sregex_token_iterator tit(str5.begin(), str5.end(), reg5, -1);
        boost::sregex_token_iterator titend;
        while (tit != titend)
        {
            cout << *tit << " ";
            ++tit;
        }
        cout << "\n-----------------------------" << endl;
    }

    int main()
    {
        BoostRegex();
        return 0;
    }

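The regular expressions used above:

"select ([a-zA-Z]*) from ([a-zA-Z]*)": matches a simple SQL query statement. ([a-zA-Z]*) matches zero or more letters. For example, in select me from dest, the first "([a-zA-Z]*)" matches "me" and the second matches "dest".

"(\d+),?" (written "(\\d+),?" in C++ source): finds the numbers in the string. (\d+) matches one or more digits, and ",?" matches an optional comma after them. The "?" here makes the comma optional (zero or one occurrence). It does not request non-greedy matching, which applies only when "?" follows another quantifier, as in "\d+?".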
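To make the two roles of "?" concrete, the sketch below returns the first match of a pattern so the behaviours can be compared side by side. The helper name first_match and the USE_BOOST_REGEX macro are assumptions for illustration. Without the macro, std::regex is used as a stand-in. The patterns used here behave the same way in both syntaxes.

```cpp
#include <string>
#ifdef USE_BOOST_REGEX            // hypothetical switch, not a Boost macro
#include <boost/regex.hpp>
namespace re = boost;
#else
#include <regex>                  // std::regex stand-in with the same names
namespace re = std;
#endif

// Returns the text of the first match of `pattern` in `text`, or "" if none.
std::string first_match(const std::string& pattern, const std::string& text)
{
    const re::regex reg(pattern);
    re::smatch what;
    return re::regex_search(text, what, reg) ? what[0].str() : std::string();
}
```

With "(\\d+),?" the match on "85,ad" is "85," (comma present and consumed), and on "2348(" it is "2348" (comma absent, still a match). By contrast, "\\d+" on "2348" matches all four digits, while the non-greedy "\\d+?" stops after the first digit and matches only "2".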
