OpenCL 1.0 Specification Reading Notes (7)


Executing Kernels

 

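global_work_offset (an argument of clEnqueueNDRangeKernel()) must currently be NULL.

global_work_size gives the total number of work-items (threads), one entry per dimension.

local_work_size gives the number of work-items in one work-group. All work-items of a group must run on the same processor (compute unit), while different groups may run on different processors, so synchronizing across groups is very expensive. local_work_size may be NULL, in which case the driver chooses how to partition the work; this is not recommended.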
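
The relationship between the two sizes can be sketched in plain C. In OpenCL 1.0 each global_work_size entry has to be evenly divisible by the matching local_work_size entry, so a common idiom is to round the problem size up to the next multiple of the group size and let the kernel ignore the surplus work-items. The helper names below (round_up_global_size, work_group_count) are hypothetical and not part of the OpenCL API; this is a minimal sketch for one dimension, assuming local_size is nonzero.

```c
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call): round problem_size up to the
   next multiple of local_size so the result can be used as one entry of
   global_work_size. local_size must be nonzero. */
static size_t round_up_global_size(size_t problem_size, size_t local_size)
{
    size_t remainder = problem_size % local_size;
    if (remainder == 0) {
        return problem_size;                 /* already a multiple */
    }
    return problem_size + (local_size - remainder);
}

/* Hypothetical helper: number of work-groups in one dimension.
   Only meaningful when global_size is a multiple of local_size. */
static size_t work_group_count(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}
```

For example, 1000 elements with a group size of 64 would be padded to 1024 work-items, which form 16 work-groups; the kernel then needs a bounds check so that the extra 24 work-items do nothing.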

 

 

 

Event Objects

 

Event objects can be used to refer to a kernel execution command or read, write, map and copy commands on memory objects

 

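An event object can be used to track the execution status of a command. Each time a command is successfully added to a command queue, an event object can be returned for it. The possible states are:

(1) CL_QUEUED: the command has been enqueued in the command queue

(2) CL_SUBMITTED: the enqueued command has been submitted to the device

(3) CL_RUNNING: the command is currently executing on the device

(4) CL_COMPLETE: the command has completed; if it terminated abnormally, the status is an error code (a negative value) instead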
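
The four states above, plus the negative error case, can be decoded with a small helper. The sketch below does not include the OpenCL headers; it redefines the status values locally under an EX_ prefix (they mirror the constants in the Khronos cl.h header) so that it compiles without an SDK. The function names are hypothetical, and in real code the status would come from clGetEventInfo().

```c
/* Local copies of the execution-status values from cl.h, prefixed with EX_
   so this sketch builds without an OpenCL SDK. Note that the value
   decreases as the command makes progress. */
enum {
    EX_CL_COMPLETE  = 0x0,
    EX_CL_RUNNING   = 0x1,
    EX_CL_SUBMITTED = 0x2,
    EX_CL_QUEUED    = 0x3
};

/* Hypothetical helper: human-readable name for an execution status. */
static const char *execution_status_name(int status)
{
    if (status < 0) {
        return "ERROR";                      /* abnormal termination */
    }
    switch (status) {
    case EX_CL_QUEUED:    return "CL_QUEUED";
    case EX_CL_SUBMITTED: return "CL_SUBMITTED";
    case EX_CL_RUNNING:   return "CL_RUNNING";
    case EX_CL_COMPLETE:  return "CL_COMPLETE";
    default:              return "UNKNOWN";
    }
}

/* Hypothetical helper: nonzero once the command has stopped, either
   normally (CL_COMPLETE) or with a negative error code. */
static int command_finished(int status)
{
    return status <= EX_CL_COMPLETE;
}
```

Because the status counts down toward zero, a single comparison is enough to tell whether a command is still pending.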

 

clWaitForEvents(): waits on the host thread for commands identified by event objects in event_list to complete

clGetEventInfo(): returns information about the event object

clRetainEvent(): increments the event reference count

clReleaseEvent(): decrements the event reference count

 

Out of order Execution

 

clEnqueueMarker(): enqueues a marker command to command_queue. The marker command returns an event which can be used to queue a wait on this event, i.e. wait for all commands queued before the marker command to complete. In effect, it inserts a fence.

 

clEnqueueWaitForEvents(): enqueues a wait for a specific event or a list of events to complete before any further commands queued in the command-queue are executed. In effect, it inserts a wait.

 

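clEnqueueBarrier(): directly inserts a synchronization point: all commands queued before the barrier must complete before any command queued after it starts executing.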

 

Profiling Operations on Memory Objects and Kernels

 

Profiling must be enabled when the command queue is created (the CL_QUEUE_PROFILING_ENABLE property).

 

clGetEventProfilingInfo(): returns profiling information for the command associated with event

Only four timestamps are available: when the command was queued, submitted, started, and finished. All are in nanoseconds.
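
Given those four timestamps, the useful numbers are differences between them. The sketch below is plain C and makes no OpenCL calls: the profiling_times struct and the helper functions are hypothetical, and the fields are assumed to have been filled from clGetEventProfilingInfo() using CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END, each of which yields a 64-bit unsigned counter.

```c
#include <stdint.h>

/* Hypothetical container for the four profiling counters of one command,
   all in nanoseconds. */
typedef struct {
    uint64_t queued;   /* command was enqueued by the host   */
    uint64_t submit;   /* command was submitted to the device */
    uint64_t start;    /* command started executing           */
    uint64_t end;      /* command finished executing          */
} profiling_times;

/* Time spent waiting between enqueue and start of execution. */
static uint64_t queue_delay_ns(const profiling_times *t)
{
    return t->start - t->queued;
}

/* Time the command actually spent executing on the device. */
static uint64_t execution_ns(const profiling_times *t)
{
    return t->end - t->start;
}

/* Convert nanoseconds to milliseconds for reporting. */
static double ns_to_ms(uint64_t ns)
{
    return (double)ns / 1.0e6;
}
```

The end minus start difference is usually the figure of interest for kernel timing, while the queued-to-start gap shows how long the command sat in the queue.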

 

Flush and Finish