Video4Linux2 part 6b: Streaming I/O


The previous installment in this series discussed how to transfer video frames with the read() and write() system calls. Such an implementation can get the basic job done, but it is not normally the preferred method for performing video I/O. For the highest performance and the best information transfer, video drivers should support the V4L2 streaming I/O API.

With the read() and write() methods, each video frame is copied between user and kernel space as part of the I/O operation. When streaming I/O is being used, this copying does not happen; instead, the application and the driver exchange pointers to buffers. These buffers will be mapped into the application's address space, making it possible to perform zero-copy frame I/O. There are two different types of streaming I/O buffers:

 

  • Memory-mapped buffers (type V4L2_MEMORY_MMAP) are allocated in kernel space; the application maps them into its address space with the mmap() system call. The buffers can be large, contiguous DMA buffers, virtual buffers created with vmalloc(), or, if the hardware supports it, they can be located directly in the video device's I/O memory.

     

  • User-space buffers (V4L2_MEMORY_USERPTR) are allocated by the application in user space. Clearly, in this situation, no mmap() call is required, but the driver may have to work harder to support efficient I/O to user-space buffers.

Note that drivers are not required to support streaming I/O, and, if they do support streaming, they do not have to handle both buffer types. A driver which is more flexible will support more applications; in practice, it seems that most applications are written to use memory-mapped buffers. It is not possible to use both types of buffer simultaneously.

We will now delve into the numerous grungy details involved in supporting streaming I/O. Any Video4Linux2 driver writer will need to understand this API; it is worth noting, however, that there is a higher-level API which can help in the writing of streaming drivers. That layer (called video-buf) can make life easier when the underlying device can support scatter/gather I/O. The video-buf API will be discussed in a future installment.

Drivers which support streaming I/O should inform the application of that fact by setting the V4L2_CAP_STREAMING flag in their vidioc_querycap() method. Note that there is no way to describe which buffer types are supported; that comes later.

 

The v4l2_buffer structure

When streaming I/O is active, frames are passed between the application and the driver in the form of struct v4l2_buffer. This structure is a complicated beast which will take a while to describe. A good starting point is to note that there are three fundamental states that a buffer can be in:

 

  • In the driver's incoming queue. Buffers are placed in this queue by the application in the expectation that the driver will do something useful with them. For a video capture device, buffers in the incoming queue will be empty, waiting for the driver to fill them with video data. For an output device, these buffers will have frame data to be sent to the device.

     

  • In the driver's outgoing queue. These buffers have been processed by the driver and are waiting for the application to claim them. For capture devices, outgoing buffers will have new frame data; for output devices, these buffers are empty.

     

  • In neither queue. In this state, the buffer is owned by user space and will not normally be touched by the driver. This is the only time that the application should do anything with the buffer. We'll call this the "user space" state.

These states, and the operations which cause transitions between them, come together as shown in the diagram below:

 

[Buffer states]
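The three states and their transitions can also be sketched as a small state machine. This is only an illustrative user-space model, not driver code: the `model_*` names are hypothetical, and the ioctl names in the comments are the ones described later in this article.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model types; these names do not exist in the kernel. */
enum model_state {
    MODEL_USER,      /* in neither queue; owned by the application */
    MODEL_INCOMING,  /* queued, waiting for the driver */
    MODEL_OUTGOING   /* processed, waiting to be dequeued */
};

struct model_buf {
    enum model_state state;
};

/* VIDIOC_QBUF: user space -> incoming queue. */
int model_qbuf(struct model_buf *b)
{
    if (b->state != MODEL_USER)
        return -EINVAL;          /* already on one of the queues */
    b->state = MODEL_INCOMING;
    return 0;
}

/* The driver finishes a transfer: incoming -> outgoing queue. */
int model_complete(struct model_buf *b)
{
    if (b->state != MODEL_INCOMING)
        return -EINVAL;
    b->state = MODEL_OUTGOING;
    return 0;
}

/* VIDIOC_DQBUF: outgoing queue -> user space. */
int model_dqbuf(struct model_buf *b)
{
    if (b->state != MODEL_OUTGOING)
        return -EINVAL;
    b->state = MODEL_USER;
    return 0;
}

/* VIDIOC_STREAMOFF: every buffer returns to the user-space state. */
void model_streamoff(struct model_buf *bufs, size_t n)
{
    size_t i;

    assert(bufs != NULL || n == 0);
    for (i = 0; i < n; i++)
        bufs[i].state = MODEL_USER;
}
```

A buffer that is queued twice, or dequeued before the driver has completed it, is rejected in this model; a real driver would enforce the same rules using the flags field.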

The actual v4l2_buffer structure looks like this:

 

    struct v4l2_buffer
    {
        __u32                   index;
        enum v4l2_buf_type      type;
        __u32                   bytesused;
        __u32                   flags;
        enum v4l2_field         field;
        struct timeval          timestamp;
        struct v4l2_timecode    timecode;
        __u32                   sequence;

        /* memory location */
        enum v4l2_memory        memory;
        union {
            __u32           offset;
            unsigned long   userptr;
        } m;
        __u32                   length;
        __u32                   input;
        __u32                   reserved;
    };

The index field is a sequence number identifying the buffer; it is only used with memory-mapped buffers. Like other objects which can be enumerated in the V4L2 interface, memory-mapped buffers start with index 0 and go up sequentially from there. The type field describes the type of the buffer, usually V4L2_BUF_TYPE_VIDEO_CAPTURE or V4L2_BUF_TYPE_VIDEO_OUTPUT.

The size of the buffer is given by length, which is in bytes. The size of the image data contained within the buffer is found in bytesused; obviously bytesused <= length. For capture devices, the driver will set bytesused; for output devices the application must set this field.

field describes which field of an image is stored in the buffer; fields were discussed in part 5a of this series.

The timestamp field, for input devices, tells when the frame was captured. For output devices, the driver should not send the frame out before the time found in this field; a timestamp of zero means "as soon as possible." The driver will set timestamp to the time that the first byte of the frame was transferred to the device - or as close to that time as it can get. timecode can be used to hold a timecode value, useful for video editing applications; see this table for details on timecodes.

The driver maintains an incrementing count of frames passing through the device; it stores the current sequence number in sequence as each frame is transferred. For input devices, the application can watch this field to detect dropped frames.
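As a rough illustration of how an application might use sequence to notice dropped frames, the sketch below compares the sequence numbers of two consecutively dequeued buffers. The function name is hypothetical, and it assumes the counter increases by one per frame and wraps around as an unsigned 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of frames missed between two consecutively dequeued buffers.
 * Unsigned arithmetic makes the result correct across counter wraparound.
 * Hypothetical helper; not part of the V4L2 API.
 */
uint32_t frames_dropped(uint32_t prev_seq, uint32_t cur_seq)
{
    /* Two distinct frames are assumed never to share a sequence number. */
    assert(cur_seq != prev_seq);
    return cur_seq - prev_seq - 1u;
}
```

A result of zero means the two frames were adjacent; anything larger is the number of frames that never reached the application.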

memory tells whether the buffer is memory-mapped or user-space. For memory-mapped buffers, m.offset describes where the buffer is to be found. The specification describes it as "the offset of the buffer from the start of the device memory," but the truth of the matter is that it is simply a magic cookie that the application can pass to mmap() to specify which buffer is being mapped. For user-space buffers, instead, m.userptr is the user-space address of the buffer.

The input field can be used to quickly switch between inputs on a capture device - assuming the device supports quick switching between frames. The reserved field should be set to zero.

Finally, there are several flags defined:

 

  • V4L2_BUF_FLAG_MAPPED indicates that the buffer has been mapped into user space. It is only applicable to memory-mapped buffers.

     

  • V4L2_BUF_FLAG_QUEUED: the buffer is in the driver's incoming queue.

     

  • V4L2_BUF_FLAG_DONE: the buffer is in the driver's outgoing queue.

     

  • V4L2_BUF_FLAG_KEYFRAME: the buffer holds a key frame - useful in compressed streams.

     

  • V4L2_BUF_FLAG_PFRAME and V4L2_BUF_FLAG_BFRAME are also used with compressed streams; they indicate predicted or difference frames.

     

  • V4L2_BUF_FLAG_TIMECODE: the timecode field is valid.

     

  • V4L2_BUF_FLAG_INPUT: the input field is valid.

 

Buffer setup

Once a streaming application has performed its basic setup, it will turn to the task of organizing its I/O buffers. The first step is to establish a set of buffers with the VIDIOC_REQBUFS ioctl(), which is turned by V4L2 into a call to the driver's vidioc_reqbufs() method:

 

    int (*vidioc_reqbufs) (struct file *file, void *private_data,
                           struct v4l2_requestbuffers *req);

Everything of interest will be in the v4l2_requestbuffers structure, which looks like this:

 

    struct v4l2_requestbuffers
    {
        __u32                   count;
        enum v4l2_buf_type      type;
        enum v4l2_memory        memory;
        __u32                   reserved[2];
    };

The type field describes the type of I/O to be done; it will usually be either V4L2_BUF_TYPE_VIDEO_CAPTURE for a video acquisition device or V4L2_BUF_TYPE_VIDEO_OUTPUT for an output device. There are other types, but they are beyond the scope of this article.

If the application wants to use memory-mapped buffers, it will set memory to V4L2_MEMORY_MMAP and count to the number of buffers it wants to use. If the driver does not support memory-mapped buffers, it should return -EINVAL. Otherwise, it should allocate the requested buffers internally and return zero. On return, the application will expect the buffers to exist, so any part of the task which could fail (memory allocation, for example) should be done at this stage.

Note that the driver is not required to allocate exactly the requested number of buffers. In many cases there is a minimum number of buffers which makes sense; if the application requests fewer than the minimum, it may actually get more buffers than it asked for. In your editor's experience, for example, the mplayer application will request two buffers, which makes it susceptible to overruns (and thus lost frames) if things slow down in user space. By enforcing a higher minimum buffer count (adjustable with a module parameter), the cafe_ccic driver is able to make the streaming I/O path a little more robust. The count field should be set to the number of buffers actually allocated before the method returns.
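The count adjustment described above might look something like the following sketch. The limits and the function name are hypothetical values chosen for illustration; they are not taken from cafe_ccic or any other driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limits; a real driver might expose the minimum as a module parameter. */
#define MODEL_MIN_BUFFERS 3u
#define MODEL_MAX_BUFFERS 32u

/*
 * Returns the number of buffers the driver would actually allocate for a
 * given request; this is the value to store back into the count field.
 * A request of zero means "release all buffers" and is passed through.
 */
uint32_t model_adjust_count(uint32_t requested)
{
    uint32_t granted = requested;

    if (requested == 0)
        return 0;
    if (granted < MODEL_MIN_BUFFERS)
        granted = MODEL_MIN_BUFFERS;
    if (granted > MODEL_MAX_BUFFERS)
        granted = MODEL_MAX_BUFFERS;

    /* Postcondition: any non-zero request lands inside the limits. */
    assert(granted >= MODEL_MIN_BUFFERS && granted <= MODEL_MAX_BUFFERS);
    return granted;
}
```

With these limits, an application asking for two buffers would be given three, and would learn that from the count value on return.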

Setting count to zero is a way for the application to request that all existing buffers be released. In this case, the driver must stop any DMA operations before freeing the buffers or terrible things could happen. It is also not possible to free buffers if they are currently mapped into user space.

If, instead, user-space buffers are to be used, the only fields which matter are the buffer type and a value of V4L2_MEMORY_USERPTR in the memory field. The application need not specify the number of buffers that it intends to use; since the allocation will be happening in user space, the driver need not care. If the driver supports user-space buffers, it need only note that the application will be using this feature and return zero; otherwise the usual -EINVAL return is called for.

The VIDIOC_REQBUFS command is the only way for an application to discover which types of streaming I/O buffer are supported by a given driver.

 

Mapping buffers into user space

If user-space buffers are being used, the driver will not see any more buffer-related calls until the application starts putting buffers on the incoming queue. Memory-mapped buffers require more setup, though. The application will typically step through each allocated buffer and map it into its address space. The first stop is the VIDIOC_QUERYBUF command, which becomes a call to the driver's vidioc_querybuf() method:

 

    int (*vidioc_querybuf)(struct file *file, void *private_data,
                           struct v4l2_buffer *buf);

On entry to this method, the only fields of buf which will be set are type (which should be checked against the type specified when the buffers were allocated) and index, which identifies the specific buffer. The driver should make sure that index makes sense and fill in the rest of the fields in buf. Typically drivers store an array of v4l2_buffer structures internally, so the core of a vidioc_querybuf() method is just a structure assignment.

The only way for an application to access memory-mapped buffers is to map them into its address space, so a vidioc_querybuf() call will typically be followed by a call to the driver's mmap() method - this method, remember, is stored in the fops field of the video_device structure associated with this device. How the driver handles mmap() will depend on just how the buffers are set up in the kernel. If the buffer can be mapped up front with remap_pfn_range() or remap_vmalloc_range(), that should be done at this time. For buffers in kernel space, pages can also be mapped individually at page-fault time by setting up a nopage() method in the usual way. A good discussion of handling mmap() can be found in Linux Device Drivers for those who need it.

When mmap() is called, the VMA structure passed in should have the address of one of your buffers in the vm_pgoff field - right-shifted by PAGE_SHIFT, of course. It should, in particular, be the offset value that your driver returned in response to a VIDIOC_QUERYBUF call. Please iterate through your list of buffers and be sure that the incoming address matches one of them; video drivers should not be a means by which hostile programs can map arbitrary regions of memory.

The offset value you provide can be almost anything, incidentally. Some drivers just return (index<<PAGE_SHIFT), meaning that the incoming vm_pgoff field should just be the buffer index. The one thing you should not do is store the actual kernel-space address of the buffer in offset; leaking kernel addresses into user space is never a good idea.
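The (index<<PAGE_SHIFT) cookie scheme and the matching check in mmap() can be modeled in plain C as below. This is a sketch, not kernel code: MODEL_PAGE_SHIFT stands in for the kernel's PAGE_SHIFT (assumed here to be 12, for 4096-byte pages), and both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for the kernel's PAGE_SHIFT; assumes 4096-byte pages. */
#define MODEL_PAGE_SHIFT 12

/* The cookie a driver could hand back in m.offset for a given buffer. */
unsigned long model_offset_for(unsigned int index)
{
    /* The shift must not overflow an unsigned long. */
    assert(index <= (ULONG_MAX >> MODEL_PAGE_SHIFT));
    return (unsigned long)index << MODEL_PAGE_SHIFT;
}

/*
 * Given the vm_pgoff value seen in mmap(), walk the buffer list and return
 * the matching buffer index, or -1 if the offset matches no buffer (in which
 * case the mapping request should be refused).
 */
int model_index_for_pgoff(unsigned long pgoff, unsigned int nbufs)
{
    unsigned int i;

    for (i = 0; i < nbufs; i++)
        if ((model_offset_for(i) >> MODEL_PAGE_SHIFT) == pgoff)
            return (int)i;
    return -1;
}
```

With four buffers, a vm_pgoff of 2 selects buffer 2, while a vm_pgoff of 4 matches nothing and is rejected.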

When user space maps a buffer, the driver should set the V4L2_BUF_FLAG_MAPPED flag in the associated v4l2_buffer structure. It must also set up open() and close() VMA operations so that it can track the number of processes which have the buffer mapped. As long as this buffer remains mapped somewhere, it cannot be released back to the kernel. If the mapping count of one or more buffers drops to zero, the driver should also stop any in-progress I/O, as there will be no process which can make use of it.
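The bookkeeping behind those VMA open() and close() operations amounts to a per-buffer reference count. The following is a simplified user-space model of that idea; the structure and function names are hypothetical, and a real driver would hang equivalent logic off its vm_operations_struct.

```c
#include <assert.h>

/* Hypothetical per-buffer mapping state. */
struct model_mapping {
    unsigned int map_count;  /* number of live mappings of this buffer */
    int mapped;              /* mirrors V4L2_BUF_FLAG_MAPPED */
};

/* Called when a mapping is created or duplicated (e.g. across fork()). */
void model_vma_open(struct model_mapping *m)
{
    m->map_count++;
    m->mapped = 1;
}

/*
 * Called when a mapping goes away. Returns 1 when the last mapping has
 * been removed, which is the point at which in-progress I/O should stop.
 */
int model_vma_close(struct model_mapping *m)
{
    assert(m->map_count > 0);
    if (--m->map_count == 0) {
        m->mapped = 0;
        return 1;
    }
    return 0;
}

/* Buffers may only be released once nobody has them mapped. */
int model_can_free(const struct model_mapping *m)
{
    return m->map_count == 0;
}
```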

 

Streaming I/O

So far we have looked at a lot of setup without the transfer of a single frame. We're getting closer, but there is one more step which must happen first. When the application obtains buffers with VIDIOC_REQBUFS, those buffers are all in the user-space state; if they are user-space buffers, they do not really even exist yet. Before the application can start streaming I/O, it must put at least one buffer into the driver's incoming queue; for an output device, of course, those buffers should also be filled with valid frame data.

To enqueue a buffer, the application will issue a VIDIOC_QBUF ioctl(), which V4L2 maps into a call to the driver's vidioc_qbuf() method:

 

    int (*vidioc_qbuf) (struct file *file, void *private_data,
                        struct v4l2_buffer *buf);

For memory-mapped buffers, once again, only the type and index fields of buf are valid. The driver can just perform the obvious checks (type and index make sense, the buffer is not already on one of the driver's queues, the buffer is mapped, etc.), put the buffer on its incoming queue (setting the V4L2_BUF_FLAG_QUEUED flag), and return.

User-space buffers can be more complicated at this point, because the driver will have never seen this buffer before. When using this method, applications are allowed to pass a different address every time they enqueue a buffer, so the driver can do no setup ahead of time. If your driver is bouncing frames through a kernel-space buffer, it need only make a note of the user-space address provided by the application. If you are trying to DMA the data directly into user-space, however, life is significantly more challenging.

To ship data directly into user space, the driver must first fault in all of the pages of the buffer and lock them into place; get_user_pages() is the tool to use for this job. Note that this function can perform significant amounts of memory allocation and disk I/O - it could block for a long time. You will need to take care to ensure that important driver functions do not stall while get_user_pages(), which can block for long enough for many video frames to go by, does its thing.

Then there is the matter of telling the device to transfer image data to (or from) the user-space buffer. This buffer will not be contiguous in physical memory - it will, instead, be broken up into a large number of separate 4096-byte pages (on most architectures). Clearly, the device will have to be able to do scatter/gather DMA operations. If the device transfers full video frames at once, it will need to accept a scatterlist which holds a great many pages; a VGA-resolution image in a 16-bit format requires 150 pages. As the image size grows, so will the size of the scatterlist. The V4L2 specification says:

 

If required by the hardware the driver swaps memory pages within physical memory to create a continuous area of memory. This happens transparently to the application in the virtual memory subsystem of the kernel.

Your editor, however, is unwilling to recommend that driver writers attempt this kind of deep virtual memory trickery. A more promising approach could be to require user-space buffers to be located in hugetlb pages, but no drivers do that now.
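The 150-page figure quoted above for a VGA frame is simple arithmetic: 640 x 480 pixels at two bytes each is 614,400 bytes, which is exactly 150 pages of 4096 bytes. A small helper (hypothetical name, assuming a page-aligned buffer) makes it easy to see how quickly the scatterlist grows with resolution:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of pages needed to hold one frame, rounding up to whole pages.
 * Assumes the buffer starts on a page boundary; an unaligned user-space
 * buffer can straddle one additional page.
 */
size_t pages_for_frame(size_t width, size_t height,
                       size_t bytes_per_pixel, size_t page_size)
{
    size_t bytes = width * height * bytes_per_pixel;

    assert(page_size != 0);
    return (bytes + page_size - 1) / page_size;
}
```

For comparison, pages_for_frame(1920, 1080, 2, 4096) comes to 1013 pages for a single 16-bit frame at that resolution.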

If your device transfers images in smaller pieces (a USB camera, for example), direct DMA to user space may be easier to set up. In any case, when faced with the challenges of supporting direct I/O to user-space buffers, the driver writer should (1) be sure that it is worth the trouble, given that applications tend to expect to use memory-mapped buffers anyway, and (2) make use of the video-buf layer, which can handle some of the pain for you.

Once streaming I/O starts, the driver will grab buffers from its incoming queue, have the device perform the requested transfer, then move the buffer to the outgoing queue. The buffer flags should be adjusted accordingly when this transition happens; fields like the sequence number and time stamp should also be filled in at this time. Eventually the application will want to claim buffers in the outgoing queue, returning them to the user-space state. That is the job of VIDIOC_DQBUF, which becomes a call to:

 

    int (*vidioc_dqbuf) (struct file *file, void *private_data,
                         struct v4l2_buffer *buf);

Here, the driver will remove the first buffer from the outgoing queue, storing the relevant information in *buf. Normally, if the outgoing queue is empty, this call should block until a buffer becomes available. V4L2 drivers are expected to handle non-blocking I/O, though, so if the video device has been opened with O_NONBLOCK, the driver should return -EAGAIN in the empty-queue case. Needless to say, this requirement also implies that the driver must support poll() for streaming I/O.
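The non-blocking side of that behavior can be modeled with a small FIFO, as sketched below. Everything here is hypothetical and runs in user space: the queue holds buffer indexes, model_try_dqbuf() mimics the O_NONBLOCK path, and model_poll_ready() mimics what poll() would report. The blocking path, where a real driver would sleep on a wait queue, is deliberately left out.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_QLEN 8u

/* Hypothetical outgoing queue holding buffer indexes in FIFO order. */
struct model_queue {
    int index[MODEL_QLEN];
    unsigned int head;   /* position of the oldest entry */
    unsigned int count;  /* number of entries in the queue */
};

/* The driver has finished with a buffer: append it to the outgoing queue. */
int model_queue_done(struct model_queue *q, int index)
{
    assert(q->count <= MODEL_QLEN);
    if (q->count == MODEL_QLEN)
        return -ENOSPC;
    q->index[(q->head + q->count) % MODEL_QLEN] = index;
    q->count++;
    return 0;
}

/* What poll() would report: non-zero when a buffer can be dequeued. */
int model_poll_ready(const struct model_queue *q)
{
    return q->count > 0;
}

/* Non-blocking dequeue: oldest buffer index, or -EAGAIN if the queue is empty. */
int model_try_dqbuf(struct model_queue *q)
{
    int index;

    if (q->count == 0)
        return -EAGAIN;
    index = q->index[q->head];
    q->head = (q->head + 1) % MODEL_QLEN;
    q->count--;
    return index;
}
```

An application using non-blocking I/O would typically wait in poll() until the device is readable and only then issue the dequeue, so the -EAGAIN case is the exception rather than the rule.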

The only remaining step is to actually tell the device to start performing streaming I/O. The Video4Linux2 driver methods for this task are:

 

    int (*vidioc_streamon) (struct file *file, void *private_data,
                            enum v4l2_buf_type type);
    int (*vidioc_streamoff)(struct file *file, void *private_data,
                            enum v4l2_buf_type type);

The call to vidioc_streamon() should start the device after checking that type makes sense. The driver can, if need be, require that a certain number of buffers be in the incoming queue before streaming can be started.

When the application is done it should generate a call to vidioc_streamoff(), which must stop the device. The driver should also remove all buffers from both the incoming and outgoing queues, leaving them all in the user-space state. Of course, the driver must be prepared for the application to simply close the device without stopping streaming first.
