An ffmpeg and SDL Tutorial: Study Notes (5)


Original tutorial:

http://dranger.com/ffmpeg/tutorial05.html


Tutorial 05: Synching Video

Code: tutorial05.c

CAVEAT

When I first made this tutorial, all of my syncing code was pulled from ffplay.c. Today, it is a totally different program, and improvements in the ffmpeg libraries (and in ffplay.c itself) have caused some strategies to change. While this code still works, it doesn't look good, and there are many more improvements that this tutorial could use.

How Video Syncs

So this whole time, we've had an essentially useless movie player. It plays the video, yeah, and it plays the audio, yeah, but it's not quite yet what we would call a movie. So what do we do?

PTS and DTS

Fortunately, both the audio and video streams have the information about how fast and when you are supposed to play them inside of them. Audio streams have a sample rate, and the video streams have a frames per second value. However, if we simply synced the video by just counting frames and multiplying by frame rate, there is a chance that it will go out of sync with the audio. Instead, packets from the stream might have what is called a decoding time stamp (DTS) and a presentation time stamp (PTS). To understand these two values, you need to know about the way movies are stored. Some formats, like MPEG, use what they call "B" frames (B stands for "bidirectional"). The two other kinds of frames are called "I" frames and "P" frames ("I" for "intra" and "P" for "predicted"). I frames contain a full image. P frames depend upon previous I and P frames and are like diffs or deltas. B frames are the same as P frames, but depend upon information found in frames that are displayed both before and after them! This explains why we might not have a finished frame after we call avcodec_decode_video2.

So let's say we had a movie, and the frames were displayed like: I B B P. Now, we need to know the information in P before we can display either B frame. Because of this, the frames might be stored like this: I P B B. This is why we have a decoding timestamp and a presentation timestamp on each frame. The decoding timestamp tells us when we need to decode something, and the presentation time stamp tells us when we need to display something. So, in this case, our stream might look like this:

   PTS: 1 4 2 3
   DTS: 1 2 3 4
Stream: I P B B
Generally the PTS and DTS will only differ when the stream we are playing has B frames in it.

When we get a packet from av_read_frame(), it will contain the PTS and DTS values for the information inside that packet. But what we really want is the PTS of our newly decoded raw frame, so we know when to display it.

Fortunately, FFmpeg supplies us with a "best effort" timestamp, which you can get via av_frame_get_best_effort_timestamp().

Synching

Now, it's all well and good to know when we're supposed to show a particular video frame, but how do we actually do so? Here's the idea: after we show a frame, we figure out when the next frame should be shown. Then we simply set a new timeout to refresh the video again after that amount of time. As you might expect, we check the value of the PTS of the next frame against the system clock to see how long our timeout should be. This approach works, but there are two issues that need to be dealt with.
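As a reminder of how those timeouts are driven in this player, schedule_refresh() just arms an SDL timer whose callback pushes a custom FF_REFRESH_EVENT back to the main event loop. These two helpers appear in the full source at the end of this post:

/* Push FF_REFRESH_EVENT back to the event loop; returning 0 stops the timer. */
static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) {
  SDL_Event event;
  event.type = FF_REFRESH_EVENT;
  event.user.data1 = opaque;
  SDL_PushEvent(&event);
  return 0; /* 0 means stop timer */
}

/* schedule a video refresh in 'delay' ms */
static void schedule_refresh(VideoState *is, int delay) {
  SDL_AddTimer(delay, sdl_refresh_timer_cb, is);
}

The main loop calls video_refresh_timer() whenever it sees FF_REFRESH_EVENT, and that function is where the delay we compute below gets fed back into schedule_refresh().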

First is the issue of knowing when the next PTS will be. Now, you might think that we can just add the video rate to the current PTS — and you'd be mostly right. However, some kinds of video call for frames to be repeated. This means that we're supposed to repeat the current frame a certain number of times. This could cause the program to display the next frame too soon. So we need to account for that.

The second issue is that as the program stands now, the video and the audio are chugging away happily, not bothering to sync at all. We wouldn't have to worry about that if everything worked perfectly. But your computer isn't perfect, and a lot of video files aren't, either. So we have three choices: sync the audio to the video, sync the video to the audio, or sync both to an external clock (like your computer). For now, we're going to sync the video to the audio.

Coding it: getting the frame PTS

Now let's get into the code to do all this. We're going to need to add some more members to our big struct, but we'll do this as we need to. First let's look at our video thread. Remember, this is where we pick up the packets that were put on the queue by our decode thread. What we need to do in this part of the code is get the PTS of the frame given to us by avcodec_decode_video2. As mentioned above, we'll lean on the frame's best-effort timestamp, which is pretty easy to get:

  double pts;

  for(;;) {
    if(packet_queue_get(&is->videoq, packet, 1) < 0) {
      // means we quit getting packets
      break;
    }
    pts = 0;

    // Decode video frame
    len1 = avcodec_decode_video2(is->video_st->codec,
                                 pFrame, &frameFinished, packet);
    if(packet->dts != AV_NOPTS_VALUE) {
      pts = av_frame_get_best_effort_timestamp(pFrame);
    } else {
      pts = 0;
    }
    pts *= av_q2d(is->video_st->time_base);
We set the PTS to 0 if we can't figure out what it is.

Well, that was easy. A technical note: you may have noticed we're using an int64 for the raw PTS. This is because the PTS is stored as an integer. This value is a timestamp that corresponds to a measurement of time in that stream's time_base unit. For example, if a stream has 24 frames per second, a PTS of 42 is going to indicate that the frame should go where the 42nd frame would be if we had a frame every 1/24 of a second (certainly not necessarily true).

We can convert this value to seconds by dividing by the framerate. The time_base value of the stream is going to be 1/framerate (for fixed-fps content), so to get the PTS in seconds, we multiply by the time_base.
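As a quick worked example (the numbers are made up; the field names follow the code above), for fixed 25 fps content the stream's time_base is 1/25, so a raw PTS of 100 corresponds to the 4-second mark:

    /* Hypothetical: time_base = 1/25 (25 fps), raw pts = 100.
       100 * (1/25) = 4.0 seconds into the stream. */
    double pts_in_seconds = 100 * av_q2d(is->video_st->time_base);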

Coding: Synching and using the PTS

So now we've got our PTS all set. Now we've got to take care of the two synchronization problems we talked about above. We're going to define a function called synchronize_video that will update the PTS to be in sync with everything. This function will also finally deal with cases where we don't get a PTS value for our frame. At the same time we need to keep track of when the next frame is expected so we can set our refresh rate properly. We can accomplish this by using an internal video_clock value which keeps track of how much time has passed according to the video. We add this value to our big struct.

typedef struct VideoState {
  double          video_clock; // pts of last decoded frame / predicted pts of next decoded frame
Here's the synchronize_video function, which is pretty self-explanatory:
double synchronize_video(VideoState *is, AVFrame *src_frame, double pts) {

  double frame_delay;

  if(pts != 0) {
    /* if we have pts, set video clock to it */
    is->video_clock = pts;
  } else {
    /* if we aren't given a pts, set it to the clock */
    pts = is->video_clock;
  }
  /* update the video clock */
  frame_delay = av_q2d(is->video_st->codec->time_base);
  /* if we are repeating a frame, adjust clock accordingly */
  frame_delay += src_frame->repeat_pict * (frame_delay * 0.5);
  is->video_clock += frame_delay;
  return pts;
}
You'll notice we account for repeated frames in this function, too.
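For example, with typical 25 fps content the codec time_base is 1/25, so frame_delay starts out as 0.04 seconds; if the decoder sets repeat_pict to 1 for a frame, the clock advances by 0.04 + 1 * (0.04 * 0.5) = 0.06 seconds, i.e. that frame occupies one and a half normal frame periods.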

Now let's get our proper PTS and queue up the frame using queue_picture, adding a new pts argument:

    // Did we get a video frame?
    if(frameFinished) {
      pts = synchronize_video(is, pFrame, pts);
      if(queue_picture(is, pFrame, pts) < 0) {
        break;
      }
    }
The only thing that changes about queue_picture is that we save that pts value to the VideoPicture structure that we queue up. So we have to add a pts variable to the struct and add a line of code:
typedef struct VideoPicture {
  ...
  double pts;
}

int queue_picture(VideoState *is, AVFrame *pFrame, double pts) {

  ... stuff ...

  if(vp->bmp) {

    ... convert picture ...

    vp->pts = pts;

    ... alert queue ...
  }
So now we've got pictures lining up onto our picture queue with proper PTS values, so let's take a look at our video refreshing function. You may recall from last time that we just faked it and put a refresh of 80ms. Well, now we're going to find out how to actually figure it out.

Our strategy is going to be to predict the time of the next PTS by simply measuring the time between the previous pts and this one. At the same time, we need to sync the video to the audio. We're going to make an audio clock: an internal value that keeps track of what position the audio we're playing is at. It's like the digital readout on any mp3 player. Since we're synching the video to the audio, the video thread uses this value to figure out if it's too far ahead or too far behind.

We'll get to the implementation later; for now let's assume we have a get_audio_clock function that will give us the time on the audio clock. Once we have that value, though, what do we do if the video and audio are out of sync?

It would be silly to simply try and leap to the correct packet through seeking or something. Instead, we're just going to adjust the value we've calculated for the next refresh: if the PTS is too far behind the audio time, we refresh as quickly as possible; if the PTS is too far ahead of the audio time, we double our calculated delay.

Now that we have our adjusted refresh time, or delay, we're going to compare that with our computer's clock by keeping a running frame_timer. This frame timer will sum up all of our calculated delays while playing the movie. In other words, this frame_timer is what time it should be when we display the next frame.

// The most important point is this next sentence:

We simply add the new delay to the frame timer, compare it to the time on our computer's clock, and use that value to schedule the next refresh. This might be a bit confusing, so study the code carefully:

void video_refresh_timer(void *userdata) {

  VideoState *is = (VideoState *)userdata;
  VideoPicture *vp;
  double actual_delay, delay, sync_threshold, ref_clock, diff;

  if(is->video_st) {
    if(is->pictq_size == 0) {
      schedule_refresh(is, 1);
    } else {
      vp = &is->pictq[is->pictq_rindex];

      delay = vp->pts - is->frame_last_pts; /* the pts from last time */
      if(delay <= 0 || delay >= 1.0) {
        /* if incorrect delay, use previous one */
        delay = is->frame_last_delay;
      }
      /* save for next time */
      is->frame_last_delay = delay;
      is->frame_last_pts = vp->pts;

      /* update delay to sync to audio */
      ref_clock = get_audio_clock(is);
      diff = vp->pts - ref_clock;

      /* Skip or repeat the frame. Take delay into account
         FFPlay still doesn't "know if this is the best guess." */
      sync_threshold = (delay > AV_SYNC_THRESHOLD) ? delay : AV_SYNC_THRESHOLD;
      if(fabs(diff) < AV_NOSYNC_THRESHOLD) {
        if(diff <= -sync_threshold) {
          delay = 0;
        } else if(diff >= sync_threshold) {
          delay = 2 * delay;
        }
      }
      is->frame_timer += delay;
      /* compute the REAL delay */
      actual_delay = is->frame_timer - (av_gettime() / 1000000.0);
      if(actual_delay < 0.010) {
        /* Really it should skip the picture instead */
        actual_delay = 0.010;
      }
      schedule_refresh(is, (int)(actual_delay * 1000 + 0.5));

      /* show the picture! */
      video_display(is);

      /* update queue for next picture! */
      if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) {
        is->pictq_rindex = 0;
      }
      SDL_LockMutex(is->pictq_mutex);
      is->pictq_size--;
      SDL_CondSignal(is->pictq_cond);
      SDL_UnlockMutex(is->pictq_mutex);
    }
  } else {
    schedule_refresh(is, 100);
  }
}

There are a few checks we make:

1. First, we make sure that the delay between this PTS and the previous PTS makes sense. If it doesn't, we just guess and use the last delay. 

2. Next, we make sure we have a synch threshold because things are never going to be perfectly in synch. ffplay uses 0.01 for its value. 

3. We also make sure that the synch threshold is never smaller than the gaps in between PTS values.

4. Finally, we make the minimum refresh value 10 milliseconds*.


* Really here we should skip the frame, but we're not going to bother.
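If you did want to skip the frame instead of clamping the delay, one rough way to sketch it inside video_refresh_timer would be to advance the picture queue without displaying anything and ask for another refresh almost immediately. This is a hypothetical, untested fragment and not part of the tutorial code:

      if(actual_delay < 0.010) {
        /* Hypothetical frame-drop path: give up on this picture entirely,
           advance the read index without calling video_display(), and
           try the next queued picture almost immediately. */
        if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE)
          is->pictq_rindex = 0;
        SDL_LockMutex(is->pictq_mutex);
        is->pictq_size--;
        SDL_CondSignal(is->pictq_cond);
        SDL_UnlockMutex(is->pictq_mutex);
        schedule_refresh(is, 1);
        return;
      }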

// Note how the following two values are initialized:

We added a bunch of variables to the big struct so don't forget to check the code. Also, don't forget to initialize the frame timer and the initial previous frame delay in stream_component_open:

    is->frame_timer = (double)av_gettime() / 1000000.0;
    is->frame_last_delay = 40e-3;

Synching: The Audio Clock

Now it's time for us to implement the audio clock. We can update the clock time in our audio_decode_frame function, which is where we decode the audio. Now, remember that we don't always process a new packet every time we call this function, so there are two places where we have to update the clock.

The first place is where we get the new packet: we simply set the audio clock to the packet's PTS. 

Then, if a packet has multiple frames, we keep the audio clock running by counting the number of samples we've decoded and dividing by the sample rate to get the elapsed time. So once we have the packet:

    /* if update, update the audio clock w/pts */
    if(pkt->pts != AV_NOPTS_VALUE) {
      is->audio_clock = av_q2d(is->audio_st->time_base)*pkt->pts;
    }


And once we are processing the packet:

      /* Keep audio_clock up-to-date */
      pts = is->audio_clock;
      *pts_ptr = pts;
      n = 2 * is->audio_st->codec->channels;
      is->audio_clock += (double)data_size /
        (double)(n * is->audio_st->codec->sample_rate);
A few fine details: the prototype of the function has changed to include pts_ptr, so make sure you change that. pts_ptr is a pointer we use to inform audio_callback of the pts of the audio packet. This will be used next time for synchronizing the audio with the video.
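Concretely, the new signature and the matching call in audio_callback look like this (both as they appear in the full source at the end of this post):

    int audio_decode_frame(VideoState *is, uint8_t *audio_buf,
                           int buf_size, double *pts_ptr);

    /* inside audio_callback(): */
    double pts;
    audio_size = audio_decode_frame(is, is->audio_buf, sizeof(is->audio_buf), &pts);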

Now we can finally implement our get_audio_clock function. It's not as simple as getting the is->audio_clock value, though. Notice that we set the audio PTS every time we process it, but if you look at the audio_callback function, it takes time to move all the data from our audio packet into our output buffer. 

That means that the value in our audio clock could be too far ahead. So we have to check how much we have left to write. Here's the complete code:

double get_audio_clock(VideoState *is) {
  double pts;
  int hw_buf_size, bytes_per_sec, n;

  pts = is->audio_clock; /* maintained in the audio thread */
  hw_buf_size = is->audio_buf_size - is->audio_buf_index;
  bytes_per_sec = 0;
  n = is->audio_st->codec->channels * 2;
  if(is->audio_st) {
    bytes_per_sec = is->audio_st->codec->sample_rate * n;
  }
  if(bytes_per_sec) {
    pts -= (double)hw_buf_size / bytes_per_sec;
  }
  return pts;
}
You should be able to tell why this function works by now ;)
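For instance (with made-up numbers), suppose the audio clock reads 5.000 seconds but 4096 bytes are still sitting in the output buffer; with 2-channel, 16-bit audio at 44100 Hz that is 4096 / (2 * 2 * 44100) ≈ 0.023 seconds of sound that hasn't actually been played yet, so the function reports roughly 4.977 seconds.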


So that's it! Go ahead and compile it:

gcc -o tutorial05 tutorial05.c -lavutil -lavformat -lavcodec -lswscale -lz -lm `sdl-config --cflags --libs`
And finally! You can watch a movie on your own movie player. Next time we'll look at audio synching, and in the tutorial after that we'll talk about seeking.


This is probably the hardest installment of the series, so I've added some comments to the code below to make it easier to follow:

// tutorial05.c// A pedagogical video player that really works!//// Code based on FFplay, Copyright (c) 2003 Fabrice Bellard, // and a tutorial by Martin Bohme (boehme@inb.uni-luebeckREMOVETHIS.de)// Tested on Gentoo, CVS version 5/01/07 compiled with GCC 4.1.1// With updates from https://github.com/chelyaev/ffmpeg-tutorial// Updates tested on:// LAVC 54.59.100, LAVF 54.29.104, LSWS 2.1.101, SDL 1.2.15// on GCC 4.7.2 in Debian February 2015// Use//// gcc -o tutorial05 tutorial05.c -lavformat -lavcodec -lswscale -lz -lm `sdl-config --cflags --libs`// to build (assuming libavformat and libavcodec are correctly installed, // and assuming you have sdl-config. Please refer to SDL docs for your installation.)//// Run using// tutorial04 myvideofile.mpg//// to play the video stream on your screen.#include <libavcodec/avcodec.h>#include <libavformat/avformat.h>#include <libswscale/swscale.h>#include <SDL.h>#include <SDL_thread.h>#ifdef __MINGW32__#undef main /* Prevents SDL from overriding main() */#endif#include <stdio.h>#include <assert.h>#include <math.h>// compatibility with newer API#if LIBAVCODEC_VERSION_INT < AV_VERSION_INT(55,28,1)#define av_frame_alloc avcodec_alloc_frame#define av_frame_free avcodec_free_frame#endif#define SDL_AUDIO_BUFFER_SIZE 1024#define MAX_AUDIO_FRAME_SIZE 192000#define MAX_AUDIOQ_SIZE (5 * 16 * 1024)#define MAX_VIDEOQ_SIZE (5 * 256 * 1024)#define AV_SYNC_THRESHOLD 0.01#define AV_NOSYNC_THRESHOLD 10.0#define FF_REFRESH_EVENT (SDL_USEREVENT)#define FF_QUIT_EVENT (SDL_USEREVENT + 1)#define VIDEO_PICTURE_QUEUE_SIZE 1typedef struct PacketQueue {  AVPacketList *first_pkt, *last_pkt;  int nb_packets;  int size;  SDL_mutex *mutex;  SDL_cond *cond;} PacketQueue;typedef struct VideoPicture {  SDL_Overlay *bmp;  int width, height; /* source height & width */  int allocated;  double pts;} VideoPicture;typedef struct VideoState {  AVFormatContext *pFormatCtx;  int             videoStream, audioStream;  //audio的时钟,保存的是最新送入SDL的audio数据的pts,注意SDL内部其实还有缓存数据,  //所以当前正在播的audio pts值是audio_clock减去缓冲区的audio数据长度(duration)  double          audio_clock;  AVStream        *audio_st;  AVCodecContext  *audio_ctx;  PacketQueue     audioq;  uint8_t         audio_buf[(AVCODEC_MAX_AUDIO_FRAME_SIZE * 3) / 2];  unsigned int    audio_buf_size;  unsigned int    audio_buf_index;  AVFrame         audio_frame;  AVPacket        audio_pkt;  uint8_t         *audio_pkt_data;  int             audio_pkt_size;  int             audio_hw_buf_size;    double          frame_timer;  double          frame_last_pts;  double          frame_last_delay;  double          video_clock; ///<pts of last decoded frame / predicted pts of next decoded frame  AVStream        *video_st;  AVCodecContext  *video_ctx;  PacketQueue     videoq;  struct SwsContext *sws_ctx;  VideoPicture    pictq[VIDEO_PICTURE_QUEUE_SIZE];  int             pictq_size, pictq_rindex, pictq_windex;  SDL_mutex       *pictq_mutex;  SDL_cond        *pictq_cond;    SDL_Thread      *parse_tid;  SDL_Thread      *video_tid;  char            filename[1024];  int             quit;} VideoState;SDL_Surface     *screen;SDL_mutex       *screen_mutex;/* Since we only have one decoding thread, the Big Struct   can be global in case we need it. 
*/VideoState *global_video_state;void packet_queue_init(PacketQueue *q) {  memset(q, 0, sizeof(PacketQueue));  q->mutex = SDL_CreateMutex();  q->cond = SDL_CreateCond();}int packet_queue_put(PacketQueue *q, AVPacket *pkt) {  AVPacketList *pkt1;  if(av_dup_packet(pkt) < 0) {    return -1;  }  pkt1 = av_malloc(sizeof(AVPacketList));  if (!pkt1)    return -1;  pkt1->pkt = *pkt;  pkt1->next = NULL;    SDL_LockMutex(q->mutex);  if (!q->last_pkt)    q->first_pkt = pkt1;  else    q->last_pkt->next = pkt1;  q->last_pkt = pkt1;  q->nb_packets++;  q->size += pkt1->pkt.size;  SDL_CondSignal(q->cond);    SDL_UnlockMutex(q->mutex);  return 0;}static int packet_queue_get(PacketQueue *q, AVPacket *pkt, int block){  AVPacketList *pkt1;  int ret;  SDL_LockMutex(q->mutex);    for(;;) {        if(global_video_state->quit) {      ret = -1;      break;    }    pkt1 = q->first_pkt;    if (pkt1) {      q->first_pkt = pkt1->next;      if (!q->first_pkt)q->last_pkt = NULL;      q->nb_packets--;      q->size -= pkt1->pkt.size;      *pkt = pkt1->pkt;      av_free(pkt1);      ret = 1;      break;    } else if (!block) {      ret = 0;      break;    } else {      SDL_CondWait(q->cond, q->mutex);    }  }  SDL_UnlockMutex(q->mutex);  return ret;}double get_audio_clock(VideoState *is) {  double pts;  int hw_buf_size, bytes_per_sec, n;    pts = is->audio_clock; /* maintained in the audio thread */  hw_buf_size = is->audio_buf_size - is->audio_buf_index;  bytes_per_sec = 0;  n = is->audio_ctx->channels * 2;  if(is->audio_st) {    bytes_per_sec = is->audio_ctx->sample_rate * n;  }  if(bytes_per_sec) {    pts -= (double)hw_buf_size / bytes_per_sec;    //需要减去缓冲区的audio duration  }  return pts;}int audio_decode_frame(VideoState *is, uint8_t *audio_buf, int buf_size, double *pts_ptr) {  int len1, data_size = 0;  AVPacket *pkt = &is->audio_pkt;  double pts;  int n;  for(;;) {    while(is->audio_pkt_size > 0) {      int got_frame = 0;      len1 = avcodec_decode_audio4(is->audio_ctx, &is->audio_frame, &got_frame, pkt);      if(len1 < 0) {/* if error, skip frame */is->audio_pkt_size = 0;break;      }      data_size = 0;      if(got_frame) {data_size = av_samples_get_buffer_size(NULL,        is->audio_ctx->channels,       is->audio_frame.nb_samples,       is->audio_ctx->sample_fmt,       1);assert(data_size <= buf_size);memcpy(audio_buf, is->audio_frame.data[0], data_size);      }      is->audio_pkt_data += len1;      is->audio_pkt_size -= len1;      if(data_size <= 0) {/* No data yet, get more frames */continue;      }      pts = is->audio_clock;      *pts_ptr = pts;      n = 2 * is->audio_ctx->channels;      is->audio_clock += (double)data_size /(double)(n * is->audio_ctx->sample_rate); //更新audio_clock      /* We have data, return it and come back for more later */      return data_size;    }    if(pkt->data)      av_free_packet(pkt);    if(is->quit) {      return -1;    }    /* next packet */    if(packet_queue_get(&is->audioq, pkt, 1) < 0) {      return -1;    }    is->audio_pkt_data = pkt->data;    is->audio_pkt_size = pkt->size;    /* if update, update the audio clock w/pts */    if(pkt->pts != AV_NOPTS_VALUE) {      is->audio_clock = av_q2d(is->audio_st->time_base)*pkt->pts;      //更新audio_clock    }  }}void audio_callback(void *userdata, Uint8 *stream, int len) {  VideoState *is = (VideoState *)userdata;  int len1, audio_size;  double pts;  while(len > 0) {    if(is->audio_buf_index >= is->audio_buf_size) {      /* We have already sent all our data; get more */      audio_size = audio_decode_frame(is, is->audio_buf, 
sizeof(is->audio_buf), &pts);      if(audio_size < 0) {/* If error, output silence */is->audio_buf_size = 1024;memset(is->audio_buf, 0, is->audio_buf_size);      } else {is->audio_buf_size = audio_size;      }      is->audio_buf_index = 0;    }    len1 = is->audio_buf_size - is->audio_buf_index;    if(len1 > len)      len1 = len;    memcpy(stream, (uint8_t *)is->audio_buf + is->audio_buf_index, len1);    len -= len1;    stream += len1;    is->audio_buf_index += len1;  }}static Uint32 sdl_refresh_timer_cb(Uint32 interval, void *opaque) {  SDL_Event event;  event.type = FF_REFRESH_EVENT;  event.user.data1 = opaque;  SDL_PushEvent(&event);  return 0; /* 0 means stop timer */}/* schedule a video refresh in 'delay' ms */static void schedule_refresh(VideoState *is, int delay) {  SDL_AddTimer(delay, sdl_refresh_timer_cb, is);}void video_display(VideoState *is) {  SDL_Rect rect;  VideoPicture *vp;  float aspect_ratio;  int w, h, x, y;  int i;  vp = &is->pictq[is->pictq_rindex];  if(vp->bmp) {    if(is->video_ctx->sample_aspect_ratio.num == 0) {      aspect_ratio = 0;    } else {      aspect_ratio = av_q2d(is->video_ctx->sample_aspect_ratio) *is->video_ctx->width / is->video_ctx->height;    }    if(aspect_ratio <= 0.0) {      aspect_ratio = (float)is->video_ctx->width /(float)is->video_ctx->height;    }    h = screen->h;    w = ((int)rint(h * aspect_ratio)) & -3;    if(w > screen->w) {      w = screen->w;      h = ((int)rint(w / aspect_ratio)) & -3;    }    x = (screen->w - w) / 2;    y = (screen->h - h) / 2;        rect.x = x;    rect.y = y;    rect.w = w;    rect.h = h;    SDL_LockMutex(screen_mutex);    SDL_DisplayYUVOverlay(vp->bmp, &rect);    SDL_UnlockMutex(screen_mutex);  }}void video_refresh_timer(void *userdata) {  VideoState *is = (VideoState *)userdata;  VideoPicture *vp;  double actual_delay, delay, sync_threshold, ref_clock, diff;    if(is->video_st) {    if(is->pictq_size == 0) {      schedule_refresh(is, 1);    } else {      vp = &is->pictq[is->pictq_rindex];  //delay是当前帧和上一帧的帧间隔,同时也是上一帧的显示时长(duration)      delay = vp->pts - is->frame_last_pts; /* the pts from last time */      if(delay <= 0 || delay >= 1.0) {/* if incorrect delay, use previous one */delay = is->frame_last_delay;  //如果pts异常(跳变),过大(大于1秒)或者过小(小于前一帧pts),则需调整pts      }      /* save for next time */      is->frame_last_delay = delay;   //更新上一帧duration      is->frame_last_pts = vp->pts;   //更新上一帧pts      /* update delay to sync to audio */      ref_clock = get_audio_clock(is);    //拿到参考时钟,这里是audio master,所以拿到audio clock      diff = vp->pts - ref_clock;         //比较当前video pts和参考时钟audio clock,看看是video比audio慢,还是video比audio快,根据这个对delay作出调整      /* Skip or repeat the frame. Take delay into account FFPlay still doesn't "know if this is the best guess." */      sync_threshold = (delay > AV_SYNC_THRESHOLD) ? 
delay : AV_SYNC_THRESHOLD;      if(fabs(diff) < AV_NOSYNC_THRESHOLD) {if(diff <= -sync_threshold) {  delay = 0;                             //如果video比audio慢很多,则应该马上render video} else if(diff >= sync_threshold) {  delay = 2 * delay;                     //如果video比audio快很多,则让video等等,等待时间加倍}                                        //这里强调一点:delay是上一帧video的显示时间(duration),通过调整这个delay来调整video的播放速度(加快或放慢)      }      is->frame_timer += delay;              //frame_timer是下一帧video在现实中实际应该显示的时刻点,如果是25fps,正常情况是每隔40ms显示一帧,frame_timer=av_gettime()+40*nums      /* computer the REAL delay */      actual_delay = is->frame_timer - (av_gettime() / 1000000.0);   //actual_delay是我们SDL事件消息需要delay的实际时间,就是经过actual_delay毫秒后就将当前video帧送出去显示      if(actual_delay < 0.010) {/* Really it should skip the picture instead */actual_delay = 0.010;                                //如果实际的actual_delay等待时间过小,那么也需要强制赋值为10毫秒,因为上一帧至少要显示10毫秒      }      schedule_refresh(is, (int)(actual_delay * 1000 + 0.5));   //SDL发送消息,告诉显示线程,actual_delay毫秒后去显示下一video帧            /* show the picture! */      video_display(is);                                       //显示当前帧            /* update queue for next picture! */      if(++is->pictq_rindex == VIDEO_PICTURE_QUEUE_SIZE) {     //更新读指针is->pictq_rindex = 0;      }      SDL_LockMutex(is->pictq_mutex);      is->pictq_size--;      SDL_CondSignal(is->pictq_cond);      SDL_UnlockMutex(is->pictq_mutex);    }  } else {    schedule_refresh(is, 100);  }}      void alloc_picture(void *userdata) {  VideoState *is = (VideoState *)userdata;  VideoPicture *vp;  vp = &is->pictq[is->pictq_windex];  if(vp->bmp) {    // we already have one make another, bigger/smaller    SDL_FreeYUVOverlay(vp->bmp);  }  // Allocate a place to put our YUV image on that screen  SDL_LockMutex(screen_mutex);  vp->bmp = SDL_CreateYUVOverlay(is->video_ctx->width, is->video_ctx->height, SDL_YV12_OVERLAY, screen);  SDL_UnlockMutex(screen_mutex);  vp->width = is->video_ctx->width;  vp->height = is->video_ctx->height;  vp->allocated = 1;}int queue_picture(VideoState *is, AVFrame *pFrame, double pts) {  VideoPicture *vp;  int dst_pix_fmt;  AVPicture pict;  /* wait until we have space for a new pic */  SDL_LockMutex(is->pictq_mutex);  while(is->pictq_size >= VIDEO_PICTURE_QUEUE_SIZE &&!is->quit) {    SDL_CondWait(is->pictq_cond, is->pictq_mutex);  }  SDL_UnlockMutex(is->pictq_mutex);  if(is->quit)    return -1;  // windex is set to 0 initially  vp = &is->pictq[is->pictq_windex];  /* allocate or resize the buffer! 
*/  if(!vp->bmp ||     vp->width != is->video_ctx->width ||     vp->height != is->video_ctx->height) {    SDL_Event event;    vp->allocated = 0;    alloc_picture(is);    if(is->quit) {      return -1;    }  }  /* We have a place to put our picture on the queue */  if(vp->bmp) {    SDL_LockYUVOverlay(vp->bmp);    vp->pts = pts;        dst_pix_fmt = PIX_FMT_YUV420P;    /* point pict at the queue */    pict.data[0] = vp->bmp->pixels[0];    pict.data[1] = vp->bmp->pixels[2];    pict.data[2] = vp->bmp->pixels[1];        pict.linesize[0] = vp->bmp->pitches[0];    pict.linesize[1] = vp->bmp->pitches[2];    pict.linesize[2] = vp->bmp->pitches[1];        // Convert the image into YUV format that SDL uses    sws_scale(is->sws_ctx, (uint8_t const * const *)pFrame->data,      pFrame->linesize, 0, is->video_ctx->height,      pict.data, pict.linesize);        SDL_UnlockYUVOverlay(vp->bmp);    /* now we inform our display thread that we have a pic ready */    if(++is->pictq_windex == VIDEO_PICTURE_QUEUE_SIZE) {      is->pictq_windex = 0;    }    SDL_LockMutex(is->pictq_mutex);    is->pictq_size++;    SDL_UnlockMutex(is->pictq_mutex);  }  return 0;}//这个API和AVSYNC没什么关系,主要作用是调整video ptsdouble synchronize_video(VideoState *is, AVFrame *src_frame, double pts) {  double frame_delay;  if(pts != 0) {    /* if we have pts, set video clock to it */    is->video_clock = pts;              //更新video clock  } else {    /* if we aren't given a pts, set it to the clock */    pts = is->video_clock;             //如果没有pts,则用video clock赋值该帧的pts  }  /* update the video clock */  frame_delay = av_q2d(is->video_ctx->time_base);          /* if we are repeating a frame, adjust clock accordingly */  frame_delay += src_frame->repeat_pict * (frame_delay * 0.5);  is->video_clock += frame_delay;      //正常情况video clock每次增加40ms,如果有需要重复的帧,则需要多加上重复帧的delay时间  return pts;}int video_thread(void *arg) {  VideoState *is = (VideoState *)arg;  AVPacket pkt1, *packet = &pkt1;  int frameFinished;  AVFrame *pFrame;  double pts;  pFrame = av_frame_alloc();  for(;;) {    if(packet_queue_get(&is->videoq, packet, 1) < 0) {      // means we quit getting packets      break;    }    if(packet_queue_get(&is->videoq, packet, 1) < 0) {      // means we quit getting packets      break;    }    pts = 0;    // Decode video frame    avcodec_decode_video2(is->video_ctx, pFrame, &frameFinished, packet);    if((pts = av_frame_get_best_effort_timestamp(pFrame)) == AV_NOPTS_VALUE) {      pts = 0;    }    pts *= av_q2d(is->video_st->time_base);    // Did we get a video frame?    
if(frameFinished) {      pts = synchronize_video(is, pFrame, pts);      if(queue_picture(is, pFrame, pts) < 0) {break;      }    }    av_free_packet(packet);  }  av_frame_free(&pFrame);  return 0;}int stream_component_open(VideoState *is, int stream_index) {  AVFormatContext *pFormatCtx = is->pFormatCtx;  AVCodecContext *codecCtx = NULL;  AVCodec *codec = NULL;  SDL_AudioSpec wanted_spec, spec;  if(stream_index < 0 || stream_index >= pFormatCtx->nb_streams) {    return -1;  }  codec = avcodec_find_decoder(pFormatCtx->streams[stream_index]->codec->codec_id);  if(!codec) {    fprintf(stderr, "Unsupported codec!\n");    return -1;  }  codecCtx = avcodec_alloc_context3(codec);  if(avcodec_copy_context(codecCtx, pFormatCtx->streams[stream_index]->codec) != 0) {    fprintf(stderr, "Couldn't copy codec context");    return -1; // Error copying codec context  }  if(codecCtx->codec_type == AVMEDIA_TYPE_AUDIO) {    // Set audio settings from codec info    wanted_spec.freq = codecCtx->sample_rate;    wanted_spec.format = AUDIO_S16SYS;    wanted_spec.channels = codecCtx->channels;    wanted_spec.silence = 0;    wanted_spec.samples = SDL_AUDIO_BUFFER_SIZE;    wanted_spec.callback = audio_callback;    wanted_spec.userdata = is;        if(SDL_OpenAudio(&wanted_spec, &spec) < 0) {      fprintf(stderr, "SDL_OpenAudio: %s\n", SDL_GetError());      return -1;    }    is->audio_hw_buf_size = spec.size;  }  if(avcodec_open2(codecCtx, codec, NULL) < 0) {    fprintf(stderr, "Unsupported codec!\n");    return -1;  }  switch(codecCtx->codec_type) {  case AVMEDIA_TYPE_AUDIO:    is->audioStream = stream_index;    is->audio_st = pFormatCtx->streams[stream_index];    is->audio_ctx = codecCtx;    is->audio_buf_size = 0;    is->audio_buf_index = 0;    memset(&is->audio_pkt, 0, sizeof(is->audio_pkt));    packet_queue_init(&is->audioq);    SDL_PauseAudio(0);    break;  case AVMEDIA_TYPE_VIDEO:    is->videoStream = stream_index;    is->video_st = pFormatCtx->streams[stream_index];    is->video_ctx = codecCtx;    is->frame_timer = (double)av_gettime() / 1000000.0;    is->frame_last_delay = 40e-3;        packet_queue_init(&is->videoq);    is->video_tid = SDL_CreateThread(video_thread, is);    is->sws_ctx = sws_getContext(is->video_ctx->width, is->video_ctx->height, is->video_ctx->pix_fmt, is->video_ctx->width, is->video_ctx->height, PIX_FMT_YUV420P, SWS_BILINEAR, NULL, NULL, NULL );    break;  default:    break;  }}int decode_thread(void *arg) {  VideoState *is = (VideoState *)arg;  AVFormatContext *pFormatCtx;  AVPacket pkt1, *packet = &pkt1;  int video_index = -1;  int audio_index = -1;  int i;  is->videoStream=-1;  is->audioStream=-1;  global_video_state = is;  // Open video file  if(avformat_open_input(&pFormatCtx, is->filename, NULL, NULL)!=0)    return -1; // Couldn't open file  is->pFormatCtx = pFormatCtx;    // Retrieve stream information  if(avformat_find_stream_info(pFormatCtx, NULL)<0)    return -1; // Couldn't find stream information    // Dump information about file onto standard error  av_dump_format(pFormatCtx, 0, is->filename, 0);    // Find the first video stream  for(i=0; i<pFormatCtx->nb_streams; i++) {    if(pFormatCtx->streams[i]->codec->codec_type==AVMEDIA_TYPE_VIDEO &&       video_index < 0) {      video_index=i;    }    if(pFormatCtx->streams[i]->codec->codec_type==AVMEDIA_TYPE_AUDIO &&       audio_index < 0) {      audio_index=i;    }  }  if(audio_index >= 0) {    stream_component_open(is, audio_index);  }  if(video_index >= 0) {    stream_component_open(is, video_index);  }     if(is->videoStream < 
0 || is->audioStream < 0) {    fprintf(stderr, "%s: could not open codecs\n", is->filename);    goto fail;  }  // main decode loop  for(;;) {    if(is->quit) {      break;    }    // seek stuff goes here    if(is->audioq.size > MAX_AUDIOQ_SIZE ||       is->videoq.size > MAX_VIDEOQ_SIZE) {      SDL_Delay(10);      continue;    }    if(av_read_frame(is->pFormatCtx, packet) < 0) {      if(is->pFormatCtx->pb->error == 0) {SDL_Delay(100); /* no error; wait for user input */continue;      } else {break;      }    }    // Is this a packet from the video stream?    if(packet->stream_index == is->videoStream) {      packet_queue_put(&is->videoq, packet);    } else if(packet->stream_index == is->audioStream) {      packet_queue_put(&is->audioq, packet);    } else {      av_free_packet(packet);    }  }  /* all done - wait for it */  while(!is->quit) {    SDL_Delay(100);  } fail:  if(1){    SDL_Event event;    event.type = FF_QUIT_EVENT;    event.user.data1 = is;    SDL_PushEvent(&event);  }  return 0;}int main(int argc, char *argv[]) {  SDL_Event       event;  VideoState      *is;  is = av_mallocz(sizeof(VideoState));  if(argc < 2) {    fprintf(stderr, "Usage: test <file>\n");    exit(1);  }  // Register all formats and codecs  av_register_all();    if(SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) {    fprintf(stderr, "Could not initialize SDL - %s\n", SDL_GetError());    exit(1);  }  // Make a screen to put our video#ifndef __DARWIN__        screen = SDL_SetVideoMode(640, 480, 0, 0);#else        screen = SDL_SetVideoMode(640, 480, 24, 0);#endif  if(!screen) {    fprintf(stderr, "SDL: could not set video mode - exiting\n");    exit(1);  }  screen_mutex = SDL_CreateMutex();  av_strlcpy(is->filename, argv[1], sizeof(is->filename));  is->pictq_mutex = SDL_CreateMutex();  is->pictq_cond = SDL_CreateCond();  schedule_refresh(is, 40);  is->parse_tid = SDL_CreateThread(decode_thread, is);  if(!is->parse_tid) {    av_free(is);    return -1;  }  for(;;) {    SDL_WaitEvent(&event);    switch(event.type) {    case FF_QUIT_EVENT:    case SDL_QUIT:      is->quit = 1;      SDL_Quit();      return 0;      break;    case FF_REFRESH_EVENT:      video_refresh_timer(event.user.data1);      break;    default:      break;    }  }  return 0;}


