Zero-copy architecture of my video processing application on TI 8168 using C6RunLib


1 My design using C6RunLib

The following is my design, based on the TI C6RunLib compiler. The core structure is a "working item line", which contains a queue of pointers to buffers allocated from the heap, and a lock that is taken whenever the queue needs to be modified. All access to the queue is confined to functions that implement the locking and dequeueing. The idea is to lock the queue rather than the buffers: when a buffer needs to be processed, a thread locks the structure, dequeues the buffer, and then unlocks the structure. The buffer itself is then processed outside the lock, which reduces the chance of contention between threads.
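A minimal sketch of such a working item line, assuming POSIX threads on the ARM side. The names `work_line_t`, `wil_init`, `wil_put_head`, `wil_take_head`, `wil_take_tail` and the depth `WIL_DEPTH` are hypothetical and chosen only for illustration. They are not part of C6RunLib.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define WIL_DEPTH 4                  /* assumed number of frame buffers */

/* "Working item line": a ring of buffer pointers guarded by one lock. */
typedef struct {
    void           *slot[WIL_DEPTH]; /* pointers to heap-allocated frames */
    int             first;           /* index of the oldest entry */
    int             count;           /* number of queued pointers */
    pthread_mutex_t lock;            /* protects first, count and slot[] */
} work_line_t;

void wil_init(work_line_t *q)
{
    q->first = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* Queue a buffer as the newest entry; returns -1 if the line is full. */
int wil_put_head(work_line_t *q, void *buf)
{
    int rc = -1;
    assert(buf != NULL);             /* NULL is reserved to mean "empty" */
    pthread_mutex_lock(&q->lock);
    if (q->count < WIL_DEPTH) {
        q->slot[(q->first + q->count) % WIL_DEPTH] = buf;
        q->count++;
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Take the newest buffer out of the line; NULL if the line is empty. */
void *wil_take_head(work_line_t *q)
{
    void *buf = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        q->count--;
        buf = q->slot[(q->first + q->count) % WIL_DEPTH];
    }
    pthread_mutex_unlock(&q->lock);
    return buf;                      /* caller works on buf without any lock */
}

/* Take the oldest buffer out of the line; NULL if the line is empty. */
void *wil_take_tail(work_line_t *q)
{
    void *buf = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        buf = q->slot[q->first];
        q->first = (q->first + 1) % WIL_DEPTH;
        q->count--;
    }
    pthread_mutex_unlock(&q->lock);
    return buf;
}
```

The lock is held only while a pointer is moved in or out of the ring. Once a thread has taken a buffer, no other thread can reach it through the queue, so the frame data itself needs no lock of its own.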

Three threads are running: the Capture thread runs on the ARM with the "working item line" resCap, the VideoProcess thread runs on the ARM and the DSP, and the PostProcess thread runs on the ARM with the "working item line" resPost.

graph 1: static model

At runtime, the queue in resCap works as a ring buffer: the Capture thread takes the oldest buffer, captures a video frame into it, and then puts it at the head of the list, so the head always holds the latest frame. The VideoProcess thread dequeues the latest frame from resCap and checks its timestamp to decide whether it needs to be processed. If it does, the thread processes the frame and then puts it on the resPost queue. The PostProcess thread pulls work items from resPost and does further work such as saving and sending. When that work is done, it puts the buffer back into resCap, which belongs to the Capture thread.

graph 2, runtime model
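The runtime model above can be walked through in a single-threaded sketch that performs one step of each thread in turn. Everything named here (`frame_t`, `line_t`, `NBUF`, the `*_step` functions) is hypothetical. Two rules are assumptions of mine rather than part of the design as described: a frame counts as "needing processing" only if its timestamp is newer than the last processed one, and a finished buffer re-enters resCap as the oldest entry. Locking is left out to keep the sketch short; in the real design every queue operation would hold the queue's lock.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 4   /* assumed number of frame buffers */

typedef struct {
    unsigned long  timestamp;  /* capture time, units are up to the application */
    unsigned char *pixels;     /* heap-allocated image data, never copied */
} frame_t;

/* Ring of frame pointers: `first` is the oldest entry, the newest is the head. */
typedef struct { frame_t *slot[NBUF]; int first, count; } line_t;

static void put_head(line_t *q, frame_t *f)
{
    assert(q->count < NBUF);
    q->slot[(q->first + q->count) % NBUF] = f;
    q->count++;
}

static void put_tail(line_t *q, frame_t *f)
{
    assert(q->count < NBUF);
    q->first = (q->first + NBUF - 1) % NBUF;
    q->slot[q->first] = f;
    q->count++;
}

static frame_t *take_head(line_t *q)
{
    if (q->count == 0) return NULL;
    q->count--;
    return q->slot[(q->first + q->count) % NBUF];
}

static frame_t *take_tail(line_t *q)
{
    frame_t *f;
    if (q->count == 0) return NULL;
    f = q->slot[q->first];
    q->first = (q->first + 1) % NBUF;
    q->count--;
    return f;
}

/* Capture: overwrite the oldest buffer, then make it the newest entry. */
void capture_step(line_t *resCap, unsigned long now)
{
    frame_t *f = take_tail(resCap);
    if (f == NULL) return;            /* every buffer is in flight: drop frame */
    f->timestamp = now;               /* a real driver would also fill pixels */
    put_head(resCap, f);
}

/* VideoProcess: take the newest frame; returns 1 if it was forwarded. */
int video_process_step(line_t *resCap, line_t *resPost, unsigned long *last_ts)
{
    frame_t *f = take_head(resCap);
    if (f == NULL) return 0;
    if (f->timestamp <= *last_ts) {   /* assumed rule: already seen, skip it */
        put_head(resCap, f);          /* back where it came from */
        return 0;
    }
    *last_ts = f->timestamp;          /* ARM/DSP processing would run here */
    put_head(resPost, f);
    return 1;
}

/* PostProcess: consume the oldest pending result, then recycle the buffer. */
int post_process_step(line_t *resPost, line_t *resCap)
{
    frame_t *f = take_tail(resPost);
    if (f == NULL) return 0;
    /* saving or sending would happen here */
    put_tail(resCap, f);              /* assumed: recycled as the oldest entry */
    return 1;
}
```

Only pointers move between resCap and resPost. The pixel data stays in the same heap buffer from capture through post-processing, which is where the zero-copy property comes from.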

The design rests on the following condition: the VideoProcess thread is much slower than the other threads, so it does not need input and output queues of its own. Overall throughput is limited by the slowest thread, VideoProcess, but this design avoids copying frames and avoids some locked queue accesses.