In-Depth Understanding of Android A/V Sync (Part 5): Writing an A/V-Synced Player from Scratch

Source: Internet · Editor: 程序博客网 · 2024/05/21 17:49
  • In-Depth Understanding of Android A/V Sync (Part 1): Overview
  • In-Depth Understanding of Android A/V Sync (Part 2): ExoPlayer's avsync logic
  • In-Depth Understanding of Android A/V Sync (Part 3): NuPlayer's avsync logic
  • In-Depth Understanding of Android A/V Sync (Part 4): MediaSync usage and internals
  • In-Depth Understanding of Android A/V Sync (Part 5): Writing an A/V-synced player from scratch

In the previous posts we analyzed the av sync logic of three players and saw that each takes a different approach. So which approach actually gives the best avsync result? Which parts of the logic are truly necessary? And if we wanted to write an A/V-synced player from scratch, what work would be involved?

First, we measured the A/V sync performance of several players, using the 1080p 24fps H264/AAC test clip from the syncOne website, testing only over the speaker. The results are as follows:
(figure: avsync test results for several players)
Next, following the MediaCodec usage examples in CTS, we try to write a player whose avsync result comes close to the numbers above.

MediaCodecPlayer Demo

The demo lives at:
https://github.com/zhanghuicuc/simplest_android_avplayer

First, the overall flow, shown in the diagram below:
(figure: overall player flow)
The overall flow closely follows ExoPlayer. The doSomeWork in the diagram is the core loop: it drives the MediaCodec API synchronously, and the key avsync logic lives in the drainOutputBuffer method.

The simplest possible sync logic

First an overview of the sync logic, then the code in detail.

Video

1. Computing the actual render time

private boolean drainOutputBuffer()
   // The input is the pts; the realTimeUs returned is in fact just the pts, unadjusted
   long realTimeUs =
       mMediaTimeProvider.getRealTimeUsForMediaTime(info.presentationTimeUs);
   // nowUs here is the audio playback position
   long nowUs = mMediaTimeProvider.getNowUs(); // audio play time
   // nowUs - realTimeUs is how far we are from this frame's due time
   long lateUs = nowUs - realTimeUs;
   ...
   // the releaseOutputBuffer overload without a timestamp parameter is used
   mCodec.releaseOutputBuffer(index, render);

Audio

1. Computing the current play time

public long getAudioTimeUs() {
   // Returns the audio playback position, derived from getPlaybackHeadPosition
   int numFramesPlayed = mAudioTrack.getPlaybackHeadPosition();
   return (numFramesPlayed * 1000000L) / mSampleRate;
}

Summary

In short: compare the video pts against the playback position returned by audioTrack.getPlaybackHeadPosition, then display the frame via the releaseOutputBuffer overload that takes no timestamp parameter.
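The decision itself is pure arithmetic, so it can be distilled into a tiny standalone function. This is a hypothetical sketch: NaiveAvSync and its Action enum are names invented here, while the -45 ms / +30 ms thresholds are the ones the demo uses.

```java
// Hypothetical sketch (NaiveAvSync and Action are invented names) of the
// naive decision: compare the audio clock with the video frame's pts.
class NaiveAvSync {
    enum Action { WAIT, DROP, RENDER }

    // lateUs = audio play time - video pts: positive means the frame is late
    static Action decide(long audioPlayTimeUs, long videoPtsUs) {
        long lateUs = audioPlayTimeUs - videoPtsUs;
        if (lateUs < -45000) return Action.WAIT;  // >45 ms early: wait for the next loop
        if (lateUs > 30000)  return Action.DROP;  // >30 ms late: drop the frame
        return Action.RENDER;                     // inside the window: render now
    }

    public static void main(String[] args) {
        System.out.println(decide(100000, 200000)); // WAIT   (100 ms early)
        System.out.println(decide(200000, 100000)); // DROP   (100 ms late)
        System.out.println(decide(100000, 110000)); // RENDER (10 ms early)
    }
}
```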

The full code:

private boolean drainOutputBuffer() {
   ...
   int index = mAvailableOutputBufferIndices.peekFirst().intValue();
   MediaCodec.BufferInfo info = mAvailableOutputBufferInfos.peekFirst();
   ...
   // The input is the pts; the returned realTimeUs is just the pts, unadjusted (see 1.1)
   long realTimeUs =
       mMediaTimeProvider.getRealTimeUsForMediaTime(info.presentationTimeUs);
   // nowUs is the audio playback position (see 1.1)
   long nowUs = mMediaTimeProvider.getNowUs(); // audio play time
   // nowUs - realTimeUs is how far we are from this frame's due time
   long lateUs = nowUs - realTimeUs;
   if (mAudioTrack != null) {
       ...
   } else {
       // video
       boolean render;
       if (lateUs < -45000) {
           // too early: the frame is more than 45 ms early, wait for the next loop
           return false;
       } else if (lateUs > 30000) {
           // the default frame-drop threshold is 30 ms
           Log.d(TAG, "video late by " + lateUs + " us.");
           render = false;
       } else {
           render = true;
           mPresentationTimeUs = info.presentationTimeUs;
       }
       // the releaseOutputBuffer overload without a timestamp parameter
       mCodec.releaseOutputBuffer(index, render);
       mAvailableOutputBufferIndices.removeFirst();
       mAvailableOutputBufferInfos.removeFirst();
       return true;
   }
}

public long getRealTimeUsForMediaTime(long mediaTimeUs) {
   if (mDeltaTimeUs == -1) {
       // Taken only on the first call; mDeltaTimeUs is the initial pts offset, normally 0
       long nowUs = getNowUs();
       mDeltaTimeUs = nowUs - mediaTimeUs;
   }
   // So getRealTimeUsForMediaTime simply returns the pts
   return mDeltaTimeUs + mediaTimeUs;
}

public long getNowUs() {
   // For a video-only stream return the system time, otherwise the audio playback position
   if (mAudioTrackState == null) {
       return System.currentTimeMillis() * 1000;
   }
   return mAudioTrackState.getAudioTimeUs();
}

public long getAudioTimeUs() {
   // Returns the audio playback position, via getPlaybackHeadPosition
   int numFramesPlayed = mAudioTrack.getPlaybackHeadPosition();
   return (numFramesPlayed * 1000000L) / mSampleRate;
}

Testing this simplest avsync logic gives -173 ms. Dreadful, as expected.

Step 1

Improve the audio time calculation by accounting for latency

public long getAudioTimeUs() {
  int numFramesPlayed = mAudioTrack.getPlaybackHeadPosition();
  if (getLatencyMethod != null) {
      try {
          latencyUs = (Integer) getLatencyMethod.invoke(mAudioTrack, (Object[]) null) * 1000L / 2;
          latencyUs = Math.max(latencyUs, 0);
      } catch (Exception e) {
          getLatencyMethod = null;
      }
  }
  return (numFramesPlayed * 1000000L) / mSampleRate - latencyUs;
}

The result is now -128 ms. Better, but not enough.
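The correction is simple arithmetic: convert the frame counter to microseconds and subtract the (clamped) latency. A hypothetical standalone sketch, with AudioClock a name invented here; the demo halves the value returned by the hidden getLatency method before using it, as shown above.

```java
// Hypothetical sketch (AudioClock is an invented name) of the
// latency-corrected audio clock from step 1.
class AudioClock {
    // framesPlayed: from getPlaybackHeadPosition; latencyUs: sink latency estimate
    static long positionUs(long framesPlayed, int sampleRate, long latencyUs) {
        long rawUs = framesPlayed * 1_000_000L / sampleRate; // frames -> microseconds
        return rawUs - Math.max(latencyUs, 0);               // never add negative latency
    }

    public static void main(String[] args) {
        // 48000 frames at 48 kHz = 1 s played, minus 128 ms of latency
        System.out.println(positionUs(48000, 48000, 128_000)); // 872000
    }
}
```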

Step 2

Since getLatency is an interface that even Google itself grumbles about, we take the next step and switch to the getTimestamp method:

public long getAudioTimeUs() {
   long systemClockUs = System.nanoTime() / 1000;
   int numFramesPlayed = mAudioTrack.getPlaybackHeadPosition();
   if (systemClockUs - lastTimestampSampleTimeUs >= MIN_TIMESTAMP_SAMPLE_INTERVAL_US) {
       audioTimestampSet = mAudioTrack.getTimestamp(audioTimestamp);
       if (getLatencyMethod != null) {
           try {
               latencyUs = (Integer) getLatencyMethod.invoke(mAudioTrack, (Object[]) null) * 1000L / 2;
               latencyUs = Math.max(latencyUs, 0);
           } catch (Exception e) {
               getLatencyMethod = null;
           }
       }
       lastTimestampSampleTimeUs = systemClockUs;
   }
   if (audioTimestampSet) {
       // Calculate the speed-adjusted position using the timestamp (which may be in the future).
       long elapsedSinceTimestampUs = System.nanoTime() / 1000 - (audioTimestamp.nanoTime / 1000);
       long elapsedSinceTimestampFrames = elapsedSinceTimestampUs * mSampleRate / 1000000L;
       long elapsedFrames = audioTimestamp.framePosition + elapsedSinceTimestampFrames;
       long durationUs = (elapsedFrames * 1000000L) / mSampleRate;
       return durationUs;
   } else {
       long durationUs = (numFramesPlayed * 1000000L) / mSampleRate - latencyUs;
       //durationUs = Math.max(durationUs, 0);
       return durationUs;
   }
}

Comparing what getTimestamp and getPosition return makes the precision gap obvious.

1. getPosition — the trailing nowUs is the audio playback duration (excerpt; the remaining lines follow the same pattern, with nowUs advancing only in ~40 ms steps):

12-06 16:11:47.695 D/avsync: video: presentationUs is 166667, realTimeUs is 46667, nowUs is 40000
12-06 16:11:47.696 D/avsync: video: presentationUs is 208333, realTimeUs is 88333, nowUs is 40000
12-06 16:11:47.700 D/avsync: audio: presentationUs is 362666, realTimeUs is 242666, nowUs is 40000
12-06 16:11:47.706 D/avsync: video: presentationUs is 208333, realTimeUs is 88333, nowUs is 40000
12-06 16:11:47.707 D/avsync: audio: presentationUs is 384000, realTimeUs is 264000, nowUs is 40000
12-06 16:11:47.714 D/avsync: video: presentationUs is 208333, realTimeUs is 88333, nowUs is 80000
12-06 16:11:47.716 D/avsync: video: presentationUs is 250000, realTimeUs is 130000, nowUs is 80000
12-06 16:11:47.720 D/avsync: audio: presentationUs is 405333, realTimeUs is 285333, nowUs is 80000
12-06 16:11:47.736 D/avsync: audio: presentationUs is 448000, realTimeUs is 328000, nowUs is 120000
12-06 16:11:47.806 D/avsync: audio: presentationUs is 597333, realTimeUs is 477333, nowUs is 200000
…

2. getTimestamp — again the trailing nowUs is the audio playback duration; a large part of the gain here is that the system clock enters the calculation (excerpt):

12-06 16:17:22.122 D/avsync: video: presentationUs is 333333, realTimeUs is 213333, nowUs is 161312
12-06 16:17:22.124 D/avsync: audio: presentationUs is 554666, realTimeUs is 434666, nowUs is 163250
12-06 16:17:22.131 D/avsync: video: presentationUs is 333333, realTimeUs is 213333, nowUs is 170125
12-06 16:17:22.131 D/avsync: video: presentationUs is 375000, realTimeUs is 255000, nowUs is 170562
12-06 16:17:22.133 D/avsync: audio: presentationUs is 576000, realTimeUs is 456000, nowUs is 172666
12-06 16:17:22.141 D/avsync: video: presentationUs is 375000, realTimeUs is 255000, nowUs is 180604
12-06 16:17:22.142 D/avsync: audio: presentationUs is 597333, realTimeUs is 477333, nowUs is 181666
12-06 16:17:22.176 D/avsync: audio: presentationUs is 682666, realTimeUs is 562666, nowUs is 215125
12-06 16:17:22.182 D/avsync: video: presentationUs is 416667, realTimeUs is 296667, nowUs is 221479
…
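The fine-grained nowUs values in the getTimestamp log come from extrapolating the last AudioTimestamp sample (framePosition at nanoTime) with the system clock. A hypothetical standalone sketch of that arithmetic (TimestampClock is a name invented here):

```java
// Hypothetical sketch (TimestampClock is an invented name) of the
// AudioTimestamp extrapolation used in step 2.
class TimestampClock {
    // tsFramePosition/tsNanoTime: the sampled AudioTimestamp fields;
    // nowNanoTime: current System.nanoTime()
    static long positionUs(long tsFramePosition, long tsNanoTime,
                           long nowNanoTime, int sampleRate) {
        long elapsedUs = (nowNanoTime - tsNanoTime) / 1000;       // time since the sample
        long elapsedFrames = elapsedUs * sampleRate / 1_000_000L; // frames played since then
        return (tsFramePosition + elapsedFrames) * 1_000_000L / sampleRate;
    }

    public static void main(String[] args) {
        // timestamp: frame 48000 sampled at t=0; queried 10 ms later, 48 kHz
        System.out.println(positionUs(48000, 0L, 10_000_000L, 48000)); // 1010000
    }
}
```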

The result is now -87.5 ms. Better again, roughly on par with ijkPlayer. Can we do better still?

Step 3

Adjust the render time based on vsync

There seems to be little left to improve on the audio side, so let's see what a vsync-based adjustment on the video side does. In the naive logic we used the releaseOutputBuffer overload without a timestamp parameter; in that case the framework passes the pts along as the timestamp, as shown here:

status_t MediaCodec::onReleaseOutputBuffer(const sp<AMessage> &msg) {
...
    BufferInfo *info = &mPortBuffers[kPortIndexOutput].editItemAt(index);
...
    if (render && info->mData != NULL && info->mData->size() != 0) {
        info->mNotify->setInt32("render", true);

        int64_t mediaTimeUs = -1;
        info->mData->meta()->findInt64("timeUs", &mediaTimeUs);

        int64_t renderTimeNs = 0;
        if (!msg->findInt64("timestampNs", &renderTimeNs)) {
            // use media timestamp if client did not request a specific render timestamp
            ALOGV("using buffer PTS of %lld", (long long)mediaTimeUs);
            renderTimeNs = mediaTimeUs * 1000;
        }
...

For simplicity, we reuse ExoPlayer's vsync-based adjustment logic directly, with the following changes:

This method used to return the pts directly; now we make it return a release time adjusted against vsync:

public long getRealTimeUsForMediaTime(long mediaTimeUs) {
   if (mDeltaTimeUs == -1) {
       long nowUs = getNowUs();
       mDeltaTimeUs = nowUs - mediaTimeUs; // -32000
   }
   long earlyUs = mDeltaTimeUs + mediaTimeUs - getNowUs();
   long unadjustedFrameReleaseTimeNs = System.nanoTime() + (earlyUs * 1000);
   long adjustedReleaseTimeNs = frameReleaseTimeHelper.adjustReleaseTime(
           mediaTimeUs, unadjustedFrameReleaseTimeNs);
   return adjustedReleaseTimeNs / 1000;
}

Correspondingly, drainOutputBuffer changes as well:

private boolean drainOutputBuffer() {
   ...
   long realTimeUs =
       mMediaTimeProvider.getRealTimeUsForMediaTime(info.presentationTimeUs); // the adjusted release time
   long nowUs = mMediaTimeProvider.getNowUs(); // audio play time
   long lateUs = System.nanoTime() / 1000 - realTimeUs;
   if (mAudioTrack != null) {
       ...
       return true;
   } else {
       // video
       ...
       //mCodec.releaseOutputBuffer(index, render);
       mCodec.releaseOutputBuffer(index, realTimeUs * 1000);
       mAvailableOutputBufferIndices.removeFirst();
       mAvailableOutputBufferInfos.removeFirst();
       return true;
   }
}
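The essential idea of the vsync adjustment is to snap the unadjusted release time to the nearest vsync boundary, so frames are handed over at a consistent point in the display cycle. A hypothetical sketch of just that snapping step (VsyncSnapper is a name invented here; ExoPlayer's real FrameReleaseTimeHelper also smooths the incoming frame timestamps before snapping):

```java
// Hypothetical sketch (VsyncSnapper is an invented name) of snapping a
// release time to the nearest vsync boundary.
class VsyncSnapper {
    // Assumes releaseTimeNs is at or after sampledVsyncTimeNs.
    static long snapToVsync(long releaseTimeNs, long sampledVsyncTimeNs,
                            long vsyncDurationNs) {
        // Round the offset to a whole number of vsync intervals
        long vsyncCount = (releaseTimeNs - sampledVsyncTimeNs + vsyncDurationNs / 2)
                / vsyncDurationNs;
        return sampledVsyncTimeNs + vsyncCount * vsyncDurationNs;
    }

    public static void main(String[] args) {
        long vsyncNs = 16_666_666L; // ~60 Hz
        // 20 ms after the sampled vsync snaps to the 1st boundary (~16.67 ms)
        System.out.println(snapToVsync(20_000_000L, 0L, vsyncNs)); // 16666666
        // 26 ms snaps to the 2nd boundary (~33.33 ms)
        System.out.println(snapToVsync(26_000_000L, 0L, vsyncNs)); // 33333332
    }
}
```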

The result is now -76 ms. Another improvement, though a diminishing one. Can we push further?

Step 4

Call releaseOutputBuffer two vsync durations ahead of time

Recall that both NuPlayer and MediaSync call releaseOutputBuffer two vsync durations ahead of the render time. Let's try adding that:

private boolean drainOutputBuffer() {
   ...
   long lateUs = System.nanoTime() / 1000 - realTimeUs;
   if (mAudioTrack != null) {
       ...
   } else {
       // video
       boolean render;
       // Now: if the frame is more than 2 * vsyncDuration ahead of its release time,
       // skip it and go around the loop again; otherwise hand it over for display
       long twiceVsyncDurationUs = 2 * mMediaTimeProvider.getVsyncDurationNs() / 1000;
       if (lateUs < -twiceVsyncDurationUs) {
           // too early
           return false;
       } else if (lateUs > 30000) {
           Log.d(TAG, "video late by " + lateUs + " us.");
           render = false;
       } else {
           render = true;
           mPresentationTimeUs = info.presentationTimeUs;
       }
       //mCodec.releaseOutputBuffer(index, render);
       mCodec.releaseOutputBuffer(index, realTimeUs * 1000);
       ...
   }
}

The result is now -66 ms: slightly better again, and again a diminishing return.
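The changed early-out can again be isolated as pure arithmetic: the fixed 45 ms "too early" threshold becomes two vsync durations (a hypothetical sketch; EarlyReleaseGate is a name invented here):

```java
// Hypothetical sketch (EarlyReleaseGate is an invented name) of the
// step-4 early-out: queue the frame once it is within 2 vsyncs of its release time.
class EarlyReleaseGate {
    // lateUs = now - releaseTime: negative means the frame is early
    static boolean shouldQueueNow(long lateUs, long vsyncDurationUs) {
        return lateUs >= -2 * vsyncDurationUs; // not more than 2 vsyncs early
    }

    public static void main(String[] args) {
        long vsyncUs = 16_667L; // ~60 Hz
        System.out.println(shouldQueueNow(-50_000L, vsyncUs)); // false: >33 ms early
        System.out.println(shouldQueueNow(-20_000L, vsyncUs)); // true: within 2 vsyncs
    }
}
```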

Step 5

A few final tweaks. For instance, a vsync adjustment round was also being run when draining audio, which is clearly redundant, so we remove it:

private boolean drainOutputBuffer() {
   ...
   if (mAudioTrack != null) {
       ...
   } else {
       // video
       boolean render;
       long twiceVsyncDurationUs = 2 * mMediaTimeProvider.getVsyncDurationNs() / 1000;
       // The vsync-adjusted release time is now computed for video only
       long realTimeUs =
           mMediaTimeProvider.getRealTimeUsForMediaTime(info.presentationTimeUs);
       long nowUs = mMediaTimeProvider.getNowUs(); // audio play time
       long lateUs = System.nanoTime() / 1000 - realTimeUs;
       ...
       mCodec.releaseOutputBuffer(index, realTimeUs * 1000);
       mAvailableOutputBufferIndices.removeFirst();
       mAvailableOutputBufferInfos.removeFirst();
       return true;
   }
}

The result is now -56 ms. A tweak that looked unrelated to avsync still produced a measurable gain, which also suggests that adjusting the release time against vsync is a fairly expensive operation.

To wrap up, here is what each step did and how much it improved avsync. With this, we can better judge which parts of an avsync pipeline are necessary and what impact each one has.

(figure: summary of the avsync result after each step)

