Parallel Programming



    1. Identify where the parallelism is, and which variable should serve as the loop variable of the parallel for.
    2. Balance the load across threads.
    3. Prefer atomic operations to mutexes and signals whenever possible.
    4. Try to use map-reduce and parallel sort to organize the data.
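
As a minimal sketch of items 1 to 3, assuming plain C++ threads rather than any particular parallel-for framework: the array index is the parallel variable, the index range is split into near-equal chunks for load balance, and each worker publishes its partial result with a single atomic add instead of locking a mutex. The name `parallel_sum` and the chunking scheme are illustrative assumptions, not part of the original notes.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sums `data` using `num_threads` worker threads.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads) {
    assert(num_threads > 0);
    std::atomic<long long> total{0};

    // The index i is the parallel variable: each worker owns one
    // contiguous index range of (almost) equal size.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &total, begin, end] {
            // Accumulate locally so the shared counter is touched once per thread.
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            // One atomic add replaces a mutex-protected update.
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }
    // join() makes every worker's atomic add visible before the final load.
    for (auto& w : workers) w.join();
    return total.load();
}
```

Calling `parallel_sum(v, 4)` on a vector holding 1 through 1000 returns 500500. Equal-sized chunks balance the load only when every element costs about the same to process; for uneven work, smaller chunks handed out dynamically would be the usual alternative.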

     

     

    For GPU

    1. Check the shared memory used per thread to see whether the GPU's SMs (streaming multiprocessors) can be fully utilized.
    2. Check the number of registers and the amount of shared memory used.
    3. Optimize memory layout: pack your data structures; use block storage for large uniform data structures (1D to nD matrices); if two variables are frequently read together, place them next to each other in memory.
    4. Use a memory pool to reduce the cost of memory allocation.
    5. Bit operations are important.
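
The memory-pool idea in item 4 can be sketched as follows. This is a host-side C++ illustration under stated assumptions, not a CUDA allocator: the class name `FixedPool`, its fixed capacity, and the free-list design are all hypothetical choices made for the example. All slots are allocated once up front, so later `acquire` and `release` calls never hit the general-purpose allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-capacity pool: one upfront allocation, then O(1) reuse.
// T must be default-constructible; slots are reused without being re-initialized.
template <typename T>
class FixedPool {
public:
    explicit FixedPool(std::size_t capacity) : storage_(capacity) {
        free_.reserve(capacity);
        // Push in reverse so the first acquire() returns slot 0.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(&storage_[i - 1]);
    }

    // Returns an unused slot, or nullptr when the pool is exhausted.
    T* acquire() {
        if (free_.empty()) return nullptr;
        T* p = free_.back();
        free_.pop_back();
        return p;
    }

    // Hands a slot back; p must have come from acquire() on this pool.
    void release(T* p) {
        assert(free_.size() < storage_.size());  // catches extra releases
        free_.push_back(p);
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<T> storage_;  // all slots, allocated once
    std::vector<T*> free_;    // stack of currently unused slots
};
```

A pool created as `FixedPool<int> pool(2)` hands out two distinct slots, returns `nullptr` on the third request, and reuses a slot as soon as it is released. On a GPU the same pattern would wrap a single large device allocation, which is where the saving is typically largest, since device allocations tend to be expensive.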