A Casual Discussion of C/C++ Multithreaded Programming and Lock-Free Data Structures

Source: Internet  Editor: 程序博客网  Time: 2024/05/18 02:52

This article focuses on C/C++ on Linux. Wherever it draws on other people's material, the borrowed passage is marked as such.

Scenario:

thread 1, 2, 3, ...: multiple threads traversing the container

thread a, b, c, ...: multiple threads inserting, deleting, and modifying

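As is well known, STL containers are not thread-safe. Why doesn't the STL provide thread-safe data structures? I can only guess: the STL aims for top performance, and making container data structures thread-safe is genuinely complicated.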

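Most thread-safety write-ups found online deal with the simplest kind of container, the queue. Why do blog posts that show off theory nearly always pick a queue? Presumably because a queue normally involves no iteration, which strikes me as a convincing reason. Threads typically only push to and pop from a queue concurrently. That is different from the case where one or more threads only read (query) while one or more other threads write (update, delete, insert). The latter requirement is tricky to implement.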

So let's first look at how a queue-type data structure can be made lock-free.

lock-free queue

The CAS operation

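Lock-free queues all rely on the CAS operation. CAS is short for compare-and-swap. It needs support from both the CPU and the compiler; most current CPUs provide a CAS instruction. With GCC, you need GCC 4.1.0 or newer. GCC exposes CAS as two atomic builtins. Most lock-free data structures use the first of the two functions below, which returns a bool indicating whether the atomic operation succeeded; the second returns a value of the operand type, namely the contents of *ptr before the operation.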

bool __sync_bool_compare_and_swap (type *ptr, type oldval, type newval, ...)
type __sync_val_compare_and_swap (type *ptr, type oldval, type newval, ...)


The following macro defines our own CAS function:

/// @brief Compare And Swap
///        If the current value of *a_ptr is a_oldVal, then write a_newVal into *a_ptr
/// @return true if the comparison is successful and a_newVal was written
#define CAS(a_ptr, a_oldVal, a_newVal) __sync_bool_compare_and_swap(a_ptr, a_oldVal, a_newVal)
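To make the retry pattern concrete, here is a minimal sketch of a shared counter updated through the CAS macro. The helper names cas_add and run_counter_demo are made up for this illustration and are not part of GCC or of the original post; the plain volatile read of the counter is a simplification that assumes an aligned long is read atomically on the target platform.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Same wrapper macro as in the text (GCC/Clang __sync builtin).
#define CAS(a_ptr, a_oldVal, a_newVal) \
    __sync_bool_compare_and_swap(a_ptr, a_oldVal, a_newVal)

// Hypothetical helper: atomically add delta to *counter with a CAS retry loop.
// Returns the value this call managed to store.
long cas_add(volatile long* counter, long delta) {
    for (;;) {
        long old_val = *counter;         // snapshot of the current value
        long new_val = old_val + delta;  // value we would like to publish
        if (CAS(counter, old_val, new_val)) {
            return new_val;              // nobody interfered: done
        }
        // Another thread changed *counter after the snapshot: try again.
    }
}

// Hypothetical driver: `threads` threads each add 1 to a shared counter
// `iters` times; returns the final counter value.
long run_counter_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    volatile long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i) cas_add(&counter, 1);
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling run_counter_demo(4, 10000) should give 40000: a failed CAS never writes anything, so no increment is lost.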

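The quoted passages below, on enqueueing to and dequeueing from a CAS-based queue, follow the description in this blog post. I strongly recommend reading the original post, which helps with understanding. However, I also strongly advise against using this code commercially: it has bugs and does not carry over to map, list, and similar containers.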

EnQueue(x) // enqueue
{
    // prepare the new node
    q = new record();
    q->value = x;
    q->next = NULL;

    do {
        p = tail; // take a snapshot of the tail pointer
    } while( CAS(p->next, NULL, q) != TRUE); // retry if the node was not linked after the tail

    CAS(tail, p, q); // set the tail node
}
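Note the do-while retry loop in the code. It is quite possible that, just as I am about to append a node at the tail, another thread has already appended one, so the tail has changed, my CAS returns false, and the code retries until it succeeds. It is much like redialing a busy hotline over and over.
You may ask why the "set the tail node" step (the final CAS(tail, p, q)) does not check whether it succeeded. The reason is:
1. If some thread T1 succeeds in the CAS inside its while, the CAS of every thread that follows fails, and those threads loop again.
2. At this point, if T1 has not yet updated the tail pointer, the other threads keep failing, because tail->next is no longer NULL.
3. Only once T1 has updated the tail pointer can one of the other threads pick up the new tail pointer and make progress.
There is a latent problem here: if thread T1 stalls or dies before its CAS updates the tail pointer, the other threads spin forever. Below is an improved EnQueue():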
EnQueue(x) // enqueue, improved version
{
    q = new record();
    q->value = x;
    q->next = NULL;

    p = tail;
    oldp = p;
    do {
        while (p->next != NULL)
            p = p->next;
    } while( CAS(p->next, NULL, q) != TRUE); // retry if the node was not linked at the end

    CAS(tail, oldp, q); // set the tail node
}
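Here each thread walks its own pointer p to the end of the list. That walk hurts performance, though. In practice threads almost never stall (99.9% of the time they do not), so a better approach is to combine the two versions: if the number of retries exceeds some threshold (say 3), walk the pointer to the end yourself.
With EnQueue solved, let's look at the DeQueue code: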
DeQueue() // dequeue
{
    do {
        p = head;
        if (p->next == NULL) {
            return ERR_EMPTY_QUEUE;
        }
    } while( CAS(head, p, p->next) != TRUE );
    return p->next->value;
}
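The pseudocode above can be turned into something compilable. What follows is only a sketch under stated assumptions, not the quoted blog's code: it swaps the GCC __sync builtins for std::atomic (C++11), and it handles the stalled-thread problem by letting any thread that sees a lagging tail move it forward, rather than by walking the list. The names CasQueue and produce_and_drain are invented for the example. To stay clear of use-after-free and ABA problems, removed nodes are parked on a retired list and freed only in the destructor, so memory grows until the queue is destroyed; a production queue would need a real reclamation scheme.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical illustration type; the name CasQueue is not from the post.
template <typename T>
class CasQueue {
    struct Node {
        T value;
        std::atomic<Node*> next;
        Node* retired_next;  // link used only by the retired list
        explicit Node(const T& v = T())
            : value(v), next(nullptr), retired_next(nullptr) {}
    };
    std::atomic<Node*> head_;     // always points at a dummy node
    std::atomic<Node*> tail_;     // a hint: may lag behind the real tail
    std::atomic<Node*> retired_;  // old dummies, freed only in the destructor

    // Push-only lock-free stack, so no node is freed while threads run.
    void retire(Node* n) {
        n->retired_next = retired_.load();
        while (!retired_.compare_exchange_weak(n->retired_next, n)) {}
    }

public:
    CasQueue() : head_(new Node()), tail_(head_.load()), retired_(nullptr) {}
    CasQueue(const CasQueue&) = delete;
    CasQueue& operator=(const CasQueue&) = delete;

    // Must run only after all producer and consumer threads have finished.
    ~CasQueue() {
        for (Node* n = head_.load(); n != nullptr;) {
            Node* next = n->next.load();
            delete n;
            n = next;
        }
        for (Node* n = retired_.load(); n != nullptr;) {
            Node* next = n->retired_next;
            delete n;
            n = next;
        }
    }

    void enqueue(const T& x) {
        Node* q = new Node(x);
        for (;;) {
            Node* p = tail_.load();     // snapshot of the tail
            Node* next = p->next.load();
            if (next != nullptr) {      // tail lags: help move it forward
                tail_.compare_exchange_strong(p, next);
                continue;
            }
            Node* expected = nullptr;
            if (p->next.compare_exchange_strong(expected, q)) {
                tail_.compare_exchange_strong(p, q);  // may fail; others help
                return;
            }
        }
    }

    // Returns false if the queue is empty, otherwise stores the value in out.
    bool dequeue(T& out) {
        for (;;) {
            Node* p = head_.load();
            Node* next = p->next.load();
            if (next == nullptr) return false;
            if (head_.compare_exchange_strong(p, next)) {
                out = next->value;  // next is the new dummy; p is retired
                retire(p);
                return true;
            }
        }
    }
};

// Hypothetical driver: several producers enqueue, then one thread drains.
long produce_and_drain(int producers, int per_producer, int* drained) {
    assert(producers > 0 && per_producer >= 0 && drained != nullptr);
    CasQueue<int> queue;
    std::vector<std::thread> pool;
    for (int t = 0; t < producers; ++t)
        pool.emplace_back([&queue, t, per_producer] {
            for (int i = 0; i < per_producer; ++i)
                queue.enqueue(t * per_producer + i);
        });
    for (auto& th : pool) th.join();
    long sum = 0;
    int value = 0;
    *drained = 0;
    while (queue.dequeue(value)) { sum += value; ++*drained; }
    return sum;
}
```

Because the tail pointer is treated as a hint that any thread may advance, a producer that stalls between linking its node and updating the tail no longer blocks the others.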

On general-purpose lock-free data structures

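If you implement your own linked list or similar structure in C, you can skip this section on general-purpose lock-free data structures, because its content is C++-specific.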

Option 1: STL + locks

STL/Boost plus a lock is one of the most conventional options. If it fits your requirements (one writer thread, many reader threads), consider boost::shared_mutex.
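As a rough illustration of this option, the sketch below wraps a std::map behind a reader-writer lock. It assumes a C++17 compiler and uses std::shared_mutex in place of boost::shared_mutex, which serves the same purpose here; the class SharedTable and the function run_shared_table_demo are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical wrapper: a std::map guarded by a reader-writer lock.
class SharedTable {
    mutable std::shared_mutex mutex_;
    std::map<std::string, int> data_;

public:
    // Writers take the lock exclusively.
    void set(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        data_[key] = value;
    }
    bool erase(const std::string& key) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        return data_.erase(key) > 0;
    }
    // Readers share the lock, so lookups can run in parallel.
    bool get(const std::string& key, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }
    // Traversal holds the shared lock for the whole iteration.
    // The callback must not call set() or erase(), or it will deadlock.
    template <typename Fn>
    void for_each(Fn fn) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        for (const auto& kv : data_) fn(kv.first, kv.second);
    }
};

// Hypothetical driver: one writer inserts `items` entries while `readers`
// threads traverse repeatedly; returns the sum of all values at the end.
long run_shared_table_demo(int readers, int items) {
    assert(readers > 0 && items >= 0);
    SharedTable table;
    std::thread writer([&table, items] {
        for (int i = 0; i < items; ++i) table.set("k" + std::to_string(i), i);
    });
    std::vector<std::thread> pool;
    for (int r = 0; r < readers; ++r)
        pool.emplace_back([&table] {
            for (int i = 0; i < 100; ++i) {
                long partial = 0;
                table.for_each([&partial](const std::string&, int v) { partial += v; });
            }
        });
    writer.join();
    for (auto& th : pool) th.join();
    long total = 0;
    table.for_each([&total](const std::string&, int v) { total += v; });
    return total;
}
```

The trade-off is coarse granularity: a single writer blocks every reader for the duration of its update, and a long traversal blocks every writer.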

Option 2: TBB

Like many of Intel's other libraries, TBB does not seem to be widely known. TBB stands for Threading Building Blocks, from Intel.

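TBB's concurrent containers achieve a high degree of parallelism through the following techniques:
Fine-grained locking: with fine-grained locks, concurrent operations on a container block one another only when they access the same location; accesses to different locations proceed in parallel.
Lock-free algorithms: with lock-free algorithms, each thread accounts for and corrects the effects of interference from other threads.
Like std::map, concurrent_hash_map is a container of std::pair<const Key,T>. To avoid races, we do not access the elements stored in the hash table directly; we go through an accessor or a const_accessor instead.
An accessor is a smart pointer to a std::pair. It is responsible for updating an element of the hash table: as long as it points at an element, any other attempt to operate on that element blocks until the accessor is done. A const_accessor is similar but read-only; multiple const_accessors can point at the same element, which greatly improves concurrency when reads are frequent and updates are rare.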
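The fine-grained locking idea described above can be sketched with the standard library alone. This is not how TBB implements concurrent_hash_map; it is a simplified illustration in which the key space is split into a fixed number of stripes, each with its own mutex, so threads touching keys in different stripes do not block one another. StripedCounter, kStripes, and run_striped_demo are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical illustration of per-stripe locking; NOT TBB's implementation.
class StripedCounter {
    static constexpr std::size_t kStripes = 16;
    struct Stripe {
        std::mutex mutex;
        std::unordered_map<std::string, int> map;
    };
    std::array<Stripe, kStripes> stripes_;

    // Pick the stripe that owns this key.
    Stripe& stripe_for(const std::string& key) {
        return stripes_[std::hash<std::string>()(key) % kStripes];
    }

public:
    // Only the stripe owning `key` is locked; other stripes stay available.
    void increment(const std::string& key) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);
        ++s.map[key];
    }
    // Returns 0 for keys that were never incremented.
    int get(const std::string& key) {
        Stripe& s = stripe_for(key);
        std::lock_guard<std::mutex> lock(s.mutex);
        auto it = s.map.find(key);
        return it == s.map.end() ? 0 : it->second;
    }
};

// Hypothetical driver: every thread bumps one shared key and one private key
// `iters` times; returns the final count of the shared key.
int run_striped_demo(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    StripedCounter counter;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&counter, t, iters] {
            const std::string own = "t" + std::to_string(t);
            for (int i = 0; i < iters; ++i) {
                counter.increment("shared");
                counter.increment(own);
            }
        });
    for (auto& th : pool) th.join();
    return counter.get("shared");
}
```

The accessor in concurrent_hash_map goes further by locking a single element instead of a whole stripe, and const_accessor lets readers of the same element share access.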

The following code is the concurrent_hash_map usage example from TBB's samples:

/*
    Copyright 2005-2014 Intel Corporation.  All Rights Reserved.

    This file is part of Threading Building Blocks. Threading Building Blocks is free software;
    you can redistribute it and/or modify it under the terms of the GNU General Public License
    version 2  as  published  by  the  Free Software Foundation.  Threading Building Blocks is
    distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the
    implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    See  the GNU General Public License for more details.   You should have received a copy of
    the  GNU General Public License along with Threading Building Blocks; if not, write to the
    Free Software Foundation, Inc.,  51 Franklin St,  Fifth Floor,  Boston,  MA 02110-1301 USA

    As a special exception,  you may use this file  as part of a free software library without
    restriction.  Specifically,  if other files instantiate templates  or use macros or inline
    functions from this file, or you compile this file and link it with other files to produce
    an executable,  this file does not by itself cause the resulting executable to be covered
    by the GNU General Public License. This exception does not however invalidate any other
    reasons why the executable file might be covered by the GNU General Public License.
*/

// Workaround for ICC 11.0 not finding __sync_fetch_and_add_4 on some of the Linux platforms.
#if __linux__ && defined(__INTEL_COMPILER)
#define __sync_fetch_and_add(ptr,addend) _InterlockedExchangeAdd(const_cast<void*>(reinterpret_cast<volatile void*>(ptr)), addend)
#endif
#include <string>
#include <cstring>
#include <cctype>
#include <cstdlib>
#include <cstdio>
#include "tbb/concurrent_hash_map.h"
#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"
#include "tbb/tick_count.h"
#include "tbb/task_scheduler_init.h"
#include "tbb/tbb_allocator.h"
#include "../../common/utility/utility.h"

//! String type with scalable allocator.
/** On platforms with non-scalable default memory allocators, the example scales
    better if the string allocator is changed to tbb::tbb_allocator<char>. */
typedef std::basic_string<char,std::char_traits<char>,tbb::tbb_allocator<char> > MyString;

using namespace tbb;
using namespace std;

//! Set to true to counts.
static bool verbose = false;
static bool silent = false;
//! Problem size
long N = 1000000;
const int size_factor = 2;

//! A concurrent hash table that maps strings to ints.
typedef concurrent_hash_map<MyString,int> StringTable;

//! Function object for counting occurrences of strings.
struct Tally {
    StringTable& table;
    Tally( StringTable& table_ ) : table(table_) {}
    void operator()( const blocked_range<MyString*> range ) const {
        for( MyString* p=range.begin(); p!=range.end(); ++p ) {
            StringTable::accessor a;
            table.insert( a, *p );
            a->second += 1;
        }
    }
};

static MyString* Data;

static void CountOccurrences(int nthreads) {
    StringTable table;

    tick_count t0 = tick_count::now();
    parallel_for( blocked_range<MyString*>( Data, Data+N, 1000 ), Tally(table) );
    tick_count t1 = tick_count::now();

    int n = 0;
    for( StringTable::iterator i=table.begin(); i!=table.end(); ++i ) {
        if( verbose && nthreads )
            printf("%s %d\n",i->first.c_str(),i->second);
        n += i->second;
    }

    if ( !silent ) printf("total = %d  unique = %u  time = %g\n", n, unsigned(table.size()), (t1-t0).seconds());
}

/// Generator of random words

struct Sound {
    const char *chars;
    int rates[3];// begining, middle, ending
};
Sound Vowels[] = {
    {"e", {445,6220,1762}}, {"a", {704,5262,514}}, {"i", {402,5224,162}}, {"o", {248,3726,191}},
    {"u", {155,1669,23}}, {"y", {4,400,989}}, {"io", {5,512,18}}, {"ia", {1,329,111}},
    {"ea", {21,370,16}}, {"ou", {32,298,4}}, {"ie", {0,177,140}}, {"ee", {2,183,57}},
    {"ai", {17,206,7}}, {"oo", {1,215,7}}, {"au", {40,111,2}}, {"ua", {0,102,4}},
    {"ui", {0,104,1}}, {"ei", {6,94,3}}, {"ue", {0,67,28}}, {"ay", {1,42,52}},
    {"ey", {1,14,80}}, {"oa", {5,84,3}}, {"oi", {2,81,1}}, {"eo", {1,71,5}},
    {"iou", {0,61,0}}, {"oe", {2,46,9}}, {"eu", {12,43,0}}, {"iu", {0,45,0}},
    {"ya", {12,19,5}}, {"ae", {7,18,10}}, {"oy", {0,10,13}}, {"ye", {8,7,7}},
    {"ion", {0,0,20}}, {"ing", {0,0,20}}, {"ium", {0,0,10}}, {"er", {0,0,20}}
};
Sound Consonants[] = {
    {"r", {483,1414,1110}}, {"n", {312,1548,1114}}, {"t", {363,1653,251}}, {"l", {424,1341,489}},
    {"c", {734,735,260}}, {"m", {732,785,161}}, {"d", {558,612,389}}, {"s", {574,570,405}},
    {"p", {519,361,98}}, {"b", {528,356,30}}, {"v", {197,598,16}}, {"ss", {3,191,567}},
    {"g", {285,430,42}}, {"st", {142,323,180}}, {"h", {470,89,30}}, {"nt", {0,350,231}},
    {"ng", {0,117,442}}, {"f", {319,194,19}}, {"ll", {1,414,83}}, {"w", {249,131,64}},
    {"k", {154,179,47}}, {"nd", {0,279,92}}, {"bl", {62,235,0}}, {"z", {35,223,16}},
    {"sh", {112,69,79}}, {"ch", {139,95,25}}, {"th", {70,143,39}}, {"tt", {0,219,19}},
    {"tr", {131,104,0}}, {"pr", {186,41,0}}, {"nc", {0,223,2}}, {"j", {184,32,1}},
    {"nn", {0,188,20}}, {"rt", {0,148,51}}, {"ct", {0,160,29}}, {"rr", {0,182,3}},
    {"gr", {98,87,0}}, {"ck", {0,92,86}}, {"rd", {0,81,88}}, {"x", {8,102,48}},
    {"ph", {47,101,10}}, {"br", {115,43,0}}, {"cr", {92,60,0}}, {"rm", {0,131,18}},
    {"ns", {0,124,18}}, {"sp", {81,55,4}}, {"sm", {25,29,85}}, {"sc", {53,83,1}},
    {"rn", {0,100,30}}, {"cl", {78,42,0}}, {"mm", {0,116,0}}, {"pp", {0,114,2}},
    {"mp", {0,99,14}}, {"rs", {0,96,16}}, /*{"q", {52,57,1}},*/ {"rl", {0,97,7}},
    {"rg", {0,81,15}}, {"pl", {56,39,0}}, {"sn", {32,62,1}}, {"str", {38,56,0}},
    {"dr", {47,44,0}}, {"fl", {77,13,1}}, {"fr", {77,11,0}}, {"ld", {0,47,38}},
    {"ff", {0,62,20}}, {"lt", {0,61,19}}, {"rb", {0,75,4}}, {"mb", {0,72,7}},
    {"rc", {0,76,1}}, {"gg", {0,74,1}}, {"pt", {1,56,10}}, {"bb", {0,64,1}},
    {"sl", {48,17,0}}, {"dd", {0,59,2}}, {"gn", {3,50,4}}, {"rk", {0,30,28}},
    {"nk", {0,35,20}}, {"gl", {40,14,0}}, {"wh", {45,6,0}}, {"ntr", {0,50,0}},
    {"rv", {0,47,1}}, {"ght", {0,19,29}}, {"sk", {23,17,5}}, {"nf", {0,46,0}},
    {"cc", {0,45,0}}, {"ln", {0,41,0}}, {"sw", {36,4,0}}, {"rp", {0,36,4}},
    {"dn", {0,38,0}}, {"ps", {14,19,5}}, {"nv", {0,38,0}}, {"tch", {0,21,16}},
    {"nch", {0,26,11}}, {"lv", {0,35,0}}, {"wn", {0,14,21}}, {"rf", {0,32,3}},
    {"lm", {0,30,5}}, {"dg", {0,34,0}}, {"ft", {0,18,15}}, {"scr", {23,10,0}},
    {"rch", {0,24,6}}, {"rth", {0,23,7}}, {"rh", {13,15,0}}, {"mpl", {0,29,0}},
    {"cs", {0,1,27}}, {"gh", {4,10,13}}, {"ls", {0,23,3}}, {"ndr", {0,25,0}},
    {"tl", {0,23,1}}, {"ngl", {0,25,0}}, {"lk", {0,15,9}}, {"rw", {0,23,0}},
    {"lb", {0,23,1}}, {"tw", {15,8,0}}, /*{"sq", {15,8,0}},*/ {"chr", {18,4,0}},
    {"dl", {0,23,0}}, {"ctr", {0,22,0}}, {"nst", {0,21,0}}, {"lc", {0,22,0}},
    {"sch", {16,4,0}}, {"ths", {0,1,20}}, {"nl", {0,21,0}}, {"lf", {0,15,6}},
    {"ssn", {0,20,0}}, {"xt", {0,18,1}}, {"xp", {0,20,0}}, {"rst", {0,15,5}},
    {"nh", {0,19,0}}, {"wr", {14,5,0}}
};
const int VowelsNumber = sizeof(Vowels)/sizeof(Sound);
const int ConsonantsNumber = sizeof(Consonants)/sizeof(Sound);
int VowelsRatesSum[3] = {0,0,0}, ConsonantsRatesSum[3] = {0,0,0};

int CountRateSum(Sound sounds[], const int num, const int part)
{
    int sum = 0;
    for(int i = 0; i < num; i++)
        sum += sounds[i].rates[part];
    return sum;
}

const char *GetLetters(int type, const int part)
{
    Sound *sounds; int rate, i = 0;
    if(type & 1)
        sounds = Vowels, rate = rand() % VowelsRatesSum[part];
    else
        sounds = Consonants, rate = rand() % ConsonantsRatesSum[part];
    do {
        rate -= sounds[i++].rates[part];
    } while(rate > 0);
    return sounds[--i].chars;
}

static void CreateData() {
    for(int i = 0; i < 3; i++) {
        ConsonantsRatesSum[i] = CountRateSum(Consonants, ConsonantsNumber, i);
        VowelsRatesSum[i] = CountRateSum(Vowels, VowelsNumber, i);
    }
    for( int i=0; i<N; ++i ) {
        int type = rand();
        Data[i] = GetLetters(type++, 0);
        for( int j = 0; j < type%size_factor; ++j )
            Data[i] += GetLetters(type++, 1);
        Data[i] += GetLetters(type, 2);
    }
    MyString planet = Data[12]; planet[0] = toupper(planet[0]);
    MyString helloworld = Data[0]; helloworld[0] = toupper(helloworld[0]);
    helloworld += ", "+Data[1]+" "+Data[2]+" "+Data[3]+" "+Data[4]+" "+Data[5];
    if ( !silent ) printf("Message from planet '%s': %s!\nAnalyzing whole text...\n", planet.c_str(), helloworld.c_str());
}

int main( int argc, char* argv[] ) {
    try {
        tbb::tick_count mainStartTime = tbb::tick_count::now();
        srand(2);

        //! Working threads count
        // The 1st argument is the function to obtain 'auto' value; the 2nd is the default value
        // The example interprets 0 threads as "run serially, then fully subscribed"
        utility::thread_number_range threads(tbb::task_scheduler_init::default_num_threads,0);

        utility::parse_cli_arguments(argc,argv,
            utility::cli_argument_pack()
            //"-h" option for displaying help is present implicitly
            .positional_arg(threads,"n-of-threads",utility::thread_number_range_desc)
            .positional_arg(N,"n-of-strings","number of strings")
            .arg(verbose,"verbose","verbose mode")
            .arg(silent,"silent","no output except elapsed time")
            );

        if ( silent ) verbose = false;

        Data = new MyString[N];
        CreateData();

        if ( threads.first ) {
            for(int p = threads.first;  p <= threads.last; p = threads.step(p)) {
                if ( !silent ) printf("threads = %d  ", p );
                task_scheduler_init init( p );
                CountOccurrences( p );
            }
        } else { // Number of threads wasn't set explicitly. Run serial and parallel version
            { // serial run
                if ( !silent ) printf("serial run   ");
                task_scheduler_init init_serial(1);
                CountOccurrences(1);
            }
            { // parallel run (number of threads is selected automatically)
                if ( !silent ) printf("parallel run ");
                task_scheduler_init init_parallel;
                CountOccurrences(0);
            }
        }

        delete[] Data;

        utility::report_elapsed_time((tbb::tick_count::now() - mainStartTime).seconds());

        return 0;
    } catch(std::exception& e) {
        std::cerr<<"error occurred. error text is :\"" <<e.what()<<"\"\n";
    }
}

