CUDA图像旋转的实现

来源：互联网发布：pdf加密软件编辑：程序博客网时间：2024/05/20 18:55

CUDA图像旋转的实现

由于最近工作比较忙，有一段时间没写博客了，今天就将以前基于CUDA做的图像旋转的demo奉献给大家。

在网上看到很多图像旋转的博客文章，可以说大部分做的只是图像旋转后还保持原来图像的大小，那么这就带来一个问题，旋转后图像有一部分内容会造成缺失，所以为了保持图像的完整性，需要确定旋转后图像的大小，即多少行和列。

由于本人主要从事3S方面的研发工作，读取图像我就用GDAL开源库，图像就采用我所熟悉的遥感影像。这里我采用图像绕着图像的中心旋转，那么旋转之后究竟图像有多大，其实很简单，只要求出以图像中心点为坐标原点，求出原始图像四个角点变换之后所在的行列号，然后求出最大和最小坐标就OK了，具体的代码如下：

// 打开原始图像GDALDataset* poInDs = (GDALDataset*)GDALOpen(pszFileName, GA_ReadOnly);if (NULL == poInDs){return ;}double dbGeonTran[6];poInDs->GetGeoTransform(dbGeonTran);//获得原始图像的行数和列数int nXsize = poInDs->GetRasterXSize();int nYsize =  poInDs->GetRasterYSize();unsigned char* poDataIn = new unsigned char[nXsize*nYsize];int n = 1;poInDs->RasterIO(GF_Read,0,0,nXsize,nYsize,poDataIn,nXsize,nYsize,GDT_Byte,1,&n,0,0,0);//设置旋转的角度float fAngle = 45.0;float sinTheta = sin(fAngle*M_PI/180.0);float cosTheta = cos(fAngle*M_PI/180.0);//求出旋转后图像的长和宽float x0 = nXsize/2.0f;float y0 = nYsize/2.0f;float dfPixel[4];float dfLine[4];dfPixel[0] = (int) ( cosTheta*(0-x0) + sinTheta*(0-y0) + x0 );dfLine[0] = (int) ( -sinTheta*(0-x0) + cosTheta*(0-y0) + y0 );dfPixel[1] = (int) ( cosTheta*(nXsize-1-x0) + sinTheta*(0-y0) + x0 );dfLine[1] = (int) ( -sinTheta*(nXsize-1-x0) + cosTheta*(0-y0) + y0 );dfPixel[2] = (int) ( cosTheta*(nXsize-1-x0) + sinTheta*(nYsize-1-y0) + x0 );dfLine[2] = (int) ( -sinTheta*(nXsize-1-x0) + cosTheta*(nYsize-1-y0) + y0 );dfPixel[3] = (int) ( cosTheta*(0-x0) + sinTheta*(nYsize-1-y0) + x0 );dfLine[3] = (int) ( -sinTheta*(0-x0) + cosTheta*(nYsize-1-y0) + y0 );float fminx = *( std::min_element(dfPixel,dfPixel+4) );float fmaxx = *( std::max_element(dfPixel,dfPixel+4) );float fminy = *( std::min_element(dfLine,dfLine+4) );float fmaxy = *( std::max_element(dfLine,dfLine+4) );int nNewXsize = int(fmaxx-fminx);int nNewYsize = int(fmaxy-fminy);

这样，上面代码中nNewXsize和nNewYsize就是求出来的图像大小，然后分配内存，读取原始图像，我这里只拿第一个波段以及8位的数据作为实验。到这里数据都准备好了，下面只管往显卡里面送数据，图像旋转可以借助点的旋转公式解决，可以参考我的博客二维图形旋转公式的推导，其中核心的GPU内核函数如下：

__global__ void ImageRotate_kernel(unsigned char* poDataIn,  unsigned char* poDataOut,  int nWidth,  int nHeight,  int nNewWidth,  int nNewHeight,  float sinTheta,   float cosTheta){int idy = blockIdx.y*blockDim.y + threadIdx.y;//行int idx = blockIdx.x*blockDim.x + threadIdx.x;//列float x0 = nNewWidth/2.0f;float y0 = nNewHeight/2.0f;float x1 = nWidth/2.0f;float y1 = nHeight/2.0f;//求出idx，idy所在原始图像上的坐标int x2 = (int) ( cosTheta*(idx-x0) + sinTheta*(idy-y0) + x0 );int y2 = (int) ( -sinTheta*(idx-x0) + cosTheta*(idy-y0) + y0 );//还要减去偏移量x2 -= (x0-x1);y2 -= (y0-y1);if (idx < nNewWidth && idy < nNewHeight){if (x2 < 0 || x2 >= nWidth || y2 < 0 || y2 >= nHeight){poDataOut[ idy*nNewWidth + idx] = 0;}else{poDataOut[ idy*nNewWidth + idx] = poDataIn[y2*nWidth + x2];}}}

上面函数中poDataOut变量就存储计算好的新图像的数据，然后将结果拷贝到主机端，再写回到文件。当然，这中间还需要一个中间的函数，用来启动CUDA设备，调用核函数以及主机端和GPU设备端数据的来回拷贝，该函数如下：

void ImageRotateCUDA(unsigned char* poDataIn, unsigned char* poDataOut, int nWidth, int nHeight, int nNewWidth, int nNewHeight, float sinTheta,  float cosTheta){unsigned char* poDataIn_d = NULL;cudaMalloc(&poDataIn_d,nWidth*nHeight*sizeof(unsigned char));cudaMemcpy(poDataIn_d,poDataIn,nWidth*nHeight*sizeof(unsigned char),cudaMemcpyHostToDevice);unsigned char* poDataOut_d = NULL;cudaMalloc(&poDataOut_d,nNewWidth*nNewHeight*sizeof(unsigned char));//调用核函数dim3 dimBlock(32,32,1);dim3 dimGrid(ceil(nNewWidth/32.0),ceil(nNewHeight/32.0),1);ImageRotate_kernel<<<dimGrid,dimBlock>>>(poDataIn_d,poDataOut_d,nWidth,nHeight,nNewWidth,nNewHeight,sinTheta,cosTheta);cudaMemcpy(poDataOut,poDataOut_d,nNewWidth*nNewHeight*sizeof(unsigned char),cudaMemcpyDeviceToHost);//释放显卡显存cudaFree(poDataOut_d);cudaFree(poDataOut_d);}

下面的图像时原始图像：

旋转后的图像如下：

可能有些读者有些疑问，为什么旋转45度后成了这个样子，那是因为在遥感中图像一般都是以左上角为原点，如果你不习惯，可以自己画一个坐标系，然后上下颠倒看就明白了。

当然，这个只是一个简单的demo，如果图像非常大，那么就需要分块来做。这样就不会导致程序崩溃。

1 0