how to use cudaMallocPitch
Source: Internet | Editor: 程序博客网 | Time: 2024/05/20 02:30
by Steven Mark Ford
http://www.stevenmarkford.com/allocating-2d-arrays-in-cuda/
Allocating 2D arrays in CUDA can be a little confusing at first. There are a couple of mistakes you may make while trying to allocate your first 2D array.
Wrong Way #1:
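This approach allocates the first dimension of the array (the table of row pointers) with cudaMalloc and then calls cudaMalloc again, from host code, on each element of that table. The problem is that cudaMalloc allocates memory on the device, and once the memory is on the device your main (host) thread loses direct access to it; it can only be accessed within kernels. So when you try to call cudaMalloc on the 2nd dimension of the array, the host has to write the new pointer into device memory, and it throws an "Access violation writing location" exception.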
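A minimal sketch of the pattern described above, for illustration only. The helper name alloc_row_table and the array sizes are assumptions, not taken from the original post, and the faulty call is left commented out so that the program itself runs cleanly.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: allocates only the table of row pointers, on the device.
cudaError_t alloc_row_table(float ***d_rows, int rows) {
    return cudaMalloc((void **)d_rows, rows * sizeof(float *));
}

int main() {
    const int rows = 4, cols = 8;  // illustrative sizes
    (void)cols;                    // only used by the commented-out call below
    float **d_rows = nullptr;

    // The table of row pointers now lives in device memory.
    if (alloc_row_table(&d_rows, rows) != cudaSuccess) return 1;

    for (int i = 0; i < rows; ++i) {
        // BUG (commented out on purpose): &d_rows[i] is a device address,
        // and cudaMalloc would store the new pointer through it from host
        // code, which is what triggers the access violation.
        // cudaMalloc((void **)&d_rows[i], cols * sizeof(float));
    }

    cudaFree(d_rows);
    return 0;
}
```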
Wrong Way #2:
This approach allocates the 1st dimension of the array on the host with malloc and then allocates each row on the device with cudaMalloc. The issue is that the 1st dimension of the array now belongs to the host and the 2nd dimension belongs to the device. This can also cause access violations on the host and other issues in the kernel.
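A minimal sketch of this mixed host/device pattern, again for illustration only. The helper alloc_mixed, the kernel touch, and the sizes are hypothetical, and the two problematic uses are commented out so the sketch runs without crashing.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: expects a table of row pointers it can dereference.
__global__ void touch(float **rows) {
    rows[0][0] = 1.0f;
}

// Hypothetical helper: row table in host memory, each row in device memory.
float **alloc_mixed(int rows, int cols) {
    float **h_rows = (float **)malloc(rows * sizeof(float *));  // host
    if (!h_rows) return nullptr;
    for (int i = 0; i < rows; ++i) {
        h_rows[i] = nullptr;
        cudaMalloc((void **)&h_rows[i], cols * sizeof(float));  // device
    }
    return h_rows;
}

int main() {
    const int rows = 4, cols = 8;  // illustrative sizes
    float **h_rows = alloc_mixed(rows, cols);
    if (!h_rows) return 1;

    // BUG (commented out): the host cannot dereference a device row pointer.
    // h_rows[0][0] = 1.0f;

    // BUG (commented out): the kernel receives the address of a host-side
    // table, which it typically cannot read.
    // touch<<<1, 1>>>(h_rows);

    for (int i = 0; i < rows; ++i) cudaFree(h_rows[i]);
    free(h_rows);
    return 0;
}
```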
The Semi-Correct Way:
Flatten your 2D array into a 1D array and use pointer arithmetic to access the part of the array you want. This works, but it will probably cost some performance because of classical data structure alignment issues (for the theory, see Wikipedia's article: http://en.wikipedia.org/wiki/Data_structure_alignment). The CUDA documentation also discusses memory alignment for optimal performance. For CUDA-specific alignment and padding requirements, see "CUDA_C_Programming_Guide Version 4.0", page 94.
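A minimal sketch of the flattened layout, assuming a row-major float array; the kernel name fill_flat and the sizes are made up for illustration. Element (r, c) sits at index r * cols + c, with no padding between rows.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Writes each element's own flat index into a row-major rows x cols array.
__global__ void fill_flat(float *data, int rows, int cols) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    if (r < rows && c < cols)
        data[r * cols + c] = (float)(r * cols + c);
}

int main() {
    const int rows = 4, cols = 8;  // illustrative sizes
    float *d = nullptr;

    // One contiguous allocation for the whole 2D array.
    if (cudaMalloc((void **)&d, rows * cols * sizeof(float)) != cudaSuccess)
        return 1;

    dim3 block(8, 4);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    fill_flat<<<grid, block>>>(d, rows, cols);

    float h[rows * cols];
    if (cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost) != cudaSuccess)
        return 1;

    printf("element (1, 2) = %.0f\n", h[1 * cols + 2]);
    cudaFree(d);
    return 0;
}
```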
The Recommended Way:
Use the built-in CUDA array allocation functions, e.g. cudaMallocPitch() and cudaMalloc3D(). These are also optimised for performance.
A quote from the "CUDA_C_Programming_Guide Version 4.0", page 21: "These functions are recommended for allocations of 2D or 3D arrays as it makes sure that the allocation is appropriately padded to meet the alignment requirements described in Section 5.3.2.1, therefore ensuring best performance when accessing the row addresses or performing copies between 2D arrays and other regions of device memory".
Example:
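A minimal sketch of a pitched allocation, assuming a small rows x cols float array; the kernel name fill_pitched and the sizes are hypothetical and not taken from the original post. Each row starts pitch bytes after the previous one, so the kernel steps between rows through a char pointer before indexing columns as floats.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Writes r * cols + c into element (r, c) of a pitched array.
__global__ void fill_pitched(float *base, size_t pitch, int rows, int cols) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    if (r < rows && c < cols) {
        // pitch is in bytes, so do the row step on a char pointer.
        float *row = (float *)((char *)base + r * pitch);
        row[c] = (float)(r * cols + c);
    }
}

int main() {
    const int rows = 4, cols = 8;  // illustrative sizes
    float *d = nullptr;
    size_t pitch = 0;

    // Width is given in bytes, height in rows; pitch is returned in bytes.
    cudaError_t err = cudaMallocPitch((void **)&d, &pitch,
                                      cols * sizeof(float), rows);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocPitch: %s\n", cudaGetErrorString(err));
        return 1;
    }

    dim3 block(8, 4);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    fill_pitched<<<grid, block>>>(d, pitch, rows, cols);

    // Copy back into a tightly packed host array: the host "pitch" is just
    // the row width, while the device side uses the returned pitch.
    float h[rows * cols];
    err = cudaMemcpy2D(h, cols * sizeof(float), d, pitch,
                       cols * sizeof(float), rows, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpy2D: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("pitch = %zu bytes, element (1, 2) = %.0f\n", pitch, h[1 * cols + 2]);
    cudaFree(d);
    return 0;
}
```

The returned pitch depends on the device and driver, so it should always be read from cudaMallocPitch rather than hard-coded.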
[Figure: a picture explaining the meaning of pitch (the numbers in it aren't realistic).]
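The meaning of pitch can also be sketched numerically: it is the number of bytes from the start of one row to the start of the next, that is, the requested row width plus any padding the runtime adds. The helper pitched_offset and the 3 x 5 array size below are assumptions made for illustration, and the actual pitch printed will vary by device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: byte offset of element (r, c) in a pitched float array.
__host__ __device__ inline size_t pitched_offset(size_t pitch, int r, int c) {
    return (size_t)r * pitch + (size_t)c * sizeof(float);
}

int main() {
    const size_t width_bytes = 5 * sizeof(float);  // real data per row
    const size_t height = 3;                       // number of rows
    float *d = nullptr;
    size_t pitch = 0;

    if (cudaMallocPitch((void **)&d, &pitch, width_bytes, height) != cudaSuccess)
        return 1;

    printf("requested width : %zu bytes\n", width_bytes);
    printf("returned pitch  : %zu bytes\n", pitch);
    printf("padding per row : %zu bytes\n", pitch - width_bytes);
    printf("offset of (2, 1): %zu bytes\n", pitched_offset(pitch, 2, 1));

    cudaFree(d);
    return 0;
}
```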