Digging out the addresses of segmented videos on the video site Youku (优酷) -- 001


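I don't know whether you have noticed this when watching videos: with a fairly long video (say, over 20 minutes), if you pause right after it starts, the progress bar stops loading at some point partway through. When you move the playback position to that point, loading resumes, and if a lot of video remains it stalls like this several more times. So why does this happen? Is the site optimizing automatically, stopping the download when we have not watched for a while? Or is it something else?

In fact, when a "long" video is delivered to the browser, it is split into several large chunks so that it can be sent faster, and each chunk is then generally transferred with resumable downloading. Why split it into chunks instead of sending it all at once? Personally I think it is like running a marathon: rather than running to the finish in one go, it is better to set a few intermediate goals and reach them one at a time. Just kidding.

During the analysis I found that the different chunks are served from servers with different IP addresses. Splitting the video is a bit like splitting a TCP transfer into packets: if one route is congested, a chunk can try another network path, so the chunks can be sent concurrently and arrive sooner overall. I also occasionally saw that although the chunk order is 0, 1, 2, 3, the order in which they were actually sent was sometimes 1, 3, 2, 1. The real mechanism probably runs deeper than this, and I would welcome corrections from anyone who understands it.

Now for my findings (some earlier notes are here: http://blog.csdn.net/duhaomin/article/details/17578489):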

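The video used for testing: http://v.youku.com/v_show/id_XMjQwOTQ1NjQ4.html. Before loading it, press F12 and open the Network tab. At the same time, start Wireshark with the filter set to:

src 123.126.99.52 or dst 123.126.99.52

This is capture-filter syntax; as a display filter it would be written ip.src == 123.126.99.52 or ip.dst == 123.126.99.52. The IP address belongs to the video-info-list server. I obtained it in an earlier attempt by capturing the packets for the request http://api.youku.com/player/getPlayList/VideoIDS/XMjQwOTQ1NjQ4/timezone/+08/version/5/source/video?password=&ran=9657&n=3. Now load the video.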
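As a small illustration of how the page URL relates to the info-list request, here is a minimal sketch that pulls the encoded video ID out of the page address and rebuilds the getPlayList URL captured above. The function names are made up for this example, and the meaning of the ran and n parameters is not known from the capture, so the observed values are simply reused as defaults.

```python
import re


def extract_video_id(page_url: str) -> str:
    """Pull the encoded video ID out of a v_show page URL."""
    match = re.search(r"/id_([A-Za-z0-9=]+)\.html", page_url)
    if match is None:
        raise ValueError(f"no video ID found in {page_url!r}")
    return match.group(1)


def build_playlist_url(video_id: str, ran: int = 9657, n: int = 3) -> str:
    # The path layout is copied from the one request captured in this post.
    # What `ran` and `n` mean is an open question; the defaults are just the
    # values seen in that capture.
    return (
        "http://api.youku.com/player/getPlayList/VideoIDS/"
        f"{video_id}/timezone/+08/version/5/source/video"
        f"?password=&ran={ran}&n={n}"
    )


page = "http://v.youku.com/v_show/id_XMjQwOTQ1NjQ4.html"
video_id = extract_video_id(page)
playlist_url = build_playlist_url(video_id)
print(video_id)
print(playlist_url)
```

The sketch only builds the URL; it does not send a request, since the endpoint may no longer behave as it did when the capture was made.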

 

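Since the client connects to several different video delivery servers, and it is certainly our client that initiates the handshake with each of them, how do we obtain those server addresses (not the video-info-list server above)? When I first looked into this I jumped straight into packet capture and took a detour. It is better to reason about the flow first, and then everything becomes clear: the client must ask the server for the video address, and the server then sends the address back. For a video in several segments, the client simply sends several such HTTP requests. With that in mind, look at the Network results after loading this video: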

The client first sends the server a request for the playback info list. The response, which I will call data_table, is:

{"data":[{"ct":"h","cs":"2212|2224|2205","logo":"http:\/\/g1.ykimg.com\/11270F1F46512411F090D7000000004D027D9E-DE6A-BAC7-FDF3-
D24DB6235436","seed":5763,"tags":["\u52a8\u6f2b"],"categories":"100","videoid":"60236412","vidEncoded":"XMjQwOTQ1NjQ4","list":
[{"seq":"1","vid":"60234612","vidEncoded":"XMjQwOTM4NDQ4","title":"\u7b2c01\u8bdd \u5149\u7684\u7ee7
\u627f\u8005","vv":"32374247"},{"seq":"2","vid":"60235319","vidEncoded":"XMjQwOTQxMjc2","title":"\u7b2c02\u8bdd \u77f3\u5934
\u795e\u8bdd","vv":"12040353"},{"seq":"3","vid":"60235868","vidEncoded":"XMjQwOTQzNDcy","title":"\u7b2c03\u8bdd \u6076\u9b54
\u7684\u9884\u8a00","vv":"7296666"},{"seq":"4","vid":"60236412","vidEncoded":"XMjQwOTQ1NjQ4","title":"\u7b2c04\u8bdd
\u518d\u89c1\u4e86,\u5730\u7403","vv":"5235483"},{"seq":"5","vid":"60236932","vidEncoded":"XMjQwOTQ3NzI4","title":"\u7b2c05
\u8bdd \u602a\u517d\u51fa\u6ca1\u7684\u65e5\u5b50","vv":"4102278"},
{"seq":"6","vid":"60237107","vidEncoded":"XMjQwOTQ4NDI4","title":"\u7b2c06\u8bdd \u7b2c\u4e8c\u6b21\u63a5
\u89e6","vv":"3059223"},{"seq":"7","vid":"60237150","vidEncoded":"XMjQwOTQ4NjAw","title":"\u7b2c07\u8bdd \u964d\u4e34\u5230
\u5730\u7403\u7684\u5916\u661f\u4eba","vv":"2696353"},
{"seq":"8","vid":"60237193","vidEncoded":"XMjQwOTQ4Nzcy","title":"\u7b2c08\u8bdd \u4e07\u5723\u8282\u7684
\u591c\u665a","vv":"2442596"},{"seq":"9","vid":"60237241","vidEncoded":"XMjQwOTQ4OTY0","title":"\u7b2c09\u8bdd \u7b49\u5f85
\u602a\u517d\u7684\u5c11\u5973","vv":"2110075"},{"seq":"10","vid":"60234671","vidEncoded":"XMjQwOTM4Njg0","title":"\u7b2c10
\u8bdd \u5c01\u95ed\u7684\u6e38\u4e50\u56ed","vv":"2406034"},
{"seq":"11","vid":"60234745","vidEncoded":"XMjQwOTM4OTgw","title":"\u7b2c11\u8bdd \u5b89\u9b42\u66f2","vv":"2167581"},
{"seq":"12","vid":"60234807","vidEncoded":"XMjQwOTM5MjI4","title":"\u7b2c12\u8bdd \u6df1\u6d77\u6765
\u7684SOS","vv":"1797161"},{"seq":"13","vid":"60234872","vidEncoded":"XMjQwOTM5NDg4","title":"\u7b2c13\u8bdd
\u4e0d\u505a\u5974\u96b6 \u52d2\u6bd4\u514b\u661f\u4eba","vv":"2040850"},
{"seq":"14","vid":"60234936","vidEncoded":"XMjQwOTM5NzQ0","title":"\u7b2c14\u8bdd \u88ab\u6d41\u653e\u7684
\u76ee\u6807","vv":"2133945"},{"seq":"15","vid":"60235001","vidEncoded":"XMjQwOTQwMDA0","title":"\u7b2c15\u8bdd \u68a6
\u5e7b\u75be\u8d70","vv":"1872661"},{"seq":"16","vid":"60235073","vidEncoded":"XMjQwOTQwMjky","title":"\u7b2c16\u8bdd
\u9b3c\u795e\u9192\u6765","vv":"1802269"},{"seq":"17","vid":"60235140","vidEncoded":"XMjQwOTQwNTYw","title":"\u7b2c17\u8bdd
\u7ea2\u4e0e\u84dd\u7684\u51b3\u6218","vv":"1545555"},
{"seq":"18","vid":"60235207","vidEncoded":"XMjQwOTQwODI4","title":"\u7b2c18\u8bdd \u54e5\u5c14\u8d5e\u7684
\u53cd\u88ad","vv":"1481889"},{"seq":"19","vid":"60235263","vidEncoded":"XMjQwOTQxMDUy","title":"\u7b2c19\u8bdd GUTS\u5954
\u5411\u5b87\u5b99(\u4e0a)","vv":"1242648"},{"seq":"20","vid":"60235384","vidEncoded":"XMjQwOTQxNTM2","title":"\u7b2c20\u8bdd
GUTS\u5954\u5411\u5b87\u5b99(\u4e0b)","vv":"1360124"}],"list_pre":
{"seq":"3","vid":"60235868","vidEncoded":"XMjQwOTQzNDcy","title":"\u7b2c03\u8bdd \u6076\u9b54\u7684\u9884
\u8a00","vv":"7296666"},"list_next":{"seq":"5","vid":"60236932","vidEncoded":"XMjQwOTQ3NzI4","title":"\u7b2c05\u8bdd
\u602a\u517d\u51fa\u6ca1\u7684\u65e5\u5b50","vv":"4102278"},"username":"\u4e16
\u7eaa\u534e\u521bSCLA","userid":"82663132","title":"\u7b2c04\u8bdd \u518d\u89c1\u4e86,\u5730
\u7403","up":0,"down":0,"ts":"rJsMKjf1OJiqCcEBGT6rmg","tsup":"rJt-
MjT1OJiqCcECAU2rmg","key1":"bd7394d5","key2":"5168cf59e76949ee","tt":"0","show":
{"showid":"104924","showid_encode":"70cd4334278211e097c0","showname":"\u8fea\u8fe6\u5965\u7279
\u66fc","paid":0,"paid_type":"","show_paid":0,"paid_url":"","copyright":1,"show_videotype":1,"theaterid":0,"stage":"4"},"dvd":
{"notsharing":"0"},"seconds":"1253.55",
"streamfileids":
{
"flv":
"34*5*34*34*34*36*34*25*34*34*3*32*36*25*34*42*35*66*12*25*43*5*34*25*50*48*3*60*48*42*42*67*50*60*3*50*42*60*4*43*43*60
*34*4*66*34*42*12*4*60*34*48*30*4*60*67*50*12*36*30*67*43*67*60*43*3*",
"mp4":
"34*5*34*34*34*67*34*25*34*34*3*32*36*25*32*32*50*
48*12*25*43*5*34*25*50*48*3*60*48*42*42*67*50*60*3*50*42*60*4*43*43*60*34*4*66*34*42*12*4*60*34*48*30*4*60*67*50*12*36*30*67*43
*67*60*43*3*"},
"segs":{
"flv":
[{"no":"0","size":"12313892","seconds":"364","k":"21c0a8a565522278261d6c75","k2":"1ed11c5897a7bf9b2"},
{"no":"1","size":"12337387","seconds":"397","k":"94bcaa30d905aacf28293170","k2":"10213d3eab95adc42"},
{"no":"2","size":"9352058","seconds":"219","k":"b3a6814602923d7628293170","k2":"1f3284a17ab896d02"},
{"no":"3","size":"11025133","seconds":"273","k":"93c439930535cd0c28293170","k2":"15dabacdb010e73f8"}],
"mp4":
[{"no":"0","size":"26357984","seconds":"418","k":"30940234c47bbd6028293170","k2":"14a3934c18bfa3eec"},
{"no":"1","size":"24504929","seconds":"392","k":"c95a133006aba4232411a779","k2":"158160e741fcfb08e"},
{"no":"2","size":"16795391","seconds":"198","k":"de8b035f07278c3f261d6c75","k2":"187e035fb63584d24"},
{"no":"3","size":"18700517","seconds":"246","k":"f4d95c475e847462261d6c75","k2":"135a347eb5a764cc0"}]},
"streamsizes":
{"flv":"45028470","mp4":"86358821"},
"stream_ids":{
"flv":"31436147",
"mp4":"31438597"
},"streamlogos":
{"flv":0,"mp4":0},"streamtypes":["flv","mp4"],"streamtypes_o":["flvhd","flv","mp4"]}],"user":{"id":0},"controller":
{"search_count":true,"mp4_restrict":1,"stream_mode":2,"video_capture":true,"hd3_enabled":false,"area_code":110000,"dma_code":48
08,"continuous":1,"playmode":"show","circle":false,"tsflag":false,"other_disable":false,"share_disabled":false,"download_disabl
ed":true,"pc_disabled":false,"pad_disabled":false,"mobile_disabled":false,"tv_disabled":false,"comment_disabled":false}}
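To make the segment information in this response easier to see, here is a minimal sketch that parses an abridged copy of it and summarizes each stream. Only the segs, streamsizes and seconds fields are kept, with values copied from the response above; everything else is dropped for brevity. The variable names are my own.

```python
import json

# Abridged copy of the response above: unrelated fields (and k2) are omitted.
RAW = """
{"data": [{
  "seconds": "1253.55",
  "segs": {
    "flv": [
      {"no": "0", "size": "12313892", "seconds": "364", "k": "21c0a8a565522278261d6c75"},
      {"no": "1", "size": "12337387", "seconds": "397", "k": "94bcaa30d905aacf28293170"},
      {"no": "2", "size": "9352058", "seconds": "219", "k": "b3a6814602923d7628293170"},
      {"no": "3", "size": "11025133", "seconds": "273", "k": "93c439930535cd0c28293170"}
    ],
    "mp4": [
      {"no": "0", "size": "26357984", "seconds": "418", "k": "30940234c47bbd6028293170"},
      {"no": "1", "size": "24504929", "seconds": "392", "k": "c95a133006aba4232411a779"},
      {"no": "2", "size": "16795391", "seconds": "198", "k": "de8b035f07278c3f261d6c75"},
      {"no": "3", "size": "18700517", "seconds": "246", "k": "f4d95c475e847462261d6c75"}
    ]
  },
  "streamsizes": {"flv": "45028470", "mp4": "86358821"}
}]}
"""

video = json.loads(RAW)["data"][0]
summary = {}
for stream, segments in video["segs"].items():
    # All numbers arrive as strings, so convert before summing.
    size = sum(int(seg["size"]) for seg in segments)
    seconds = sum(int(seg["seconds"]) for seg in segments)
    summary[stream] = (len(segments), size, seconds)
    matches = size == int(video["streamsizes"][stream])
    print(f"{stream}: {len(segments)} segments, {size} bytes "
          f"(equals streamsizes: {matches}), {seconds} s in total")
```

For both streams the segment sizes add up exactly to the streamsizes value, and the segment durations add up to roughly the overall seconds value, which supports reading segs as the list of chunks the video was split into.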

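Continuing down the Network panel in chronological order, we come to the getFlvPath requests.

When a getFlvPath request is submitted to the server, it returns the real video address. So once we know how those requests are constructed, we can obtain the real video addresses.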

In the same way, copy out the request URLs for the other segments and compare them:

http://f.youku.com/player/getFlvPath/sid/138838280403193127397_00/st/mp4/fileid/0300080400512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=30940234c47bbd6028293170&hd=1&myp=0&ts=418&ymovie=1&ypp=0
http://f.youku.com/player/getFlvPath/sid/138838280403193127397_01/st/mp4/fileid/0300080401512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=c95a133006aba4232411a779&hd=1&myp=0&ts=392&ymovie=1&ypp=0
http://f.youku.com/player/getFlvPath/sid/138838280403193127397_02/st/mp4/fileid/0300080402512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=de8b035f07278c3f261d6c75&hd=1&myp=0&ts=198&ymovie=1&ypp=0
http://f.youku.com/player/getFlvPath/sid/138838280403193127397_03/st/mp4/fileid/0300080403512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=f4d95c475e847462261d6c75&hd=1&myp=0&ts=246&ymovie=1&ypp=0
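The comparison can also be done mechanically. The sketch below splits each of the four captured URLs into its path fields and query parameters and reports which fields vary between segments, plus the character positions at which the first two fileids differ. The helper name fields is my own; the URLs are the four listed above.

```python
from urllib.parse import parse_qs, urlsplit

PREFIX = "http://f.youku.com/player/getFlvPath/sid/138838280403193127397_"
URLS = [
    PREFIX + "00/st/mp4/fileid/0300080400512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=30940234c47bbd6028293170&hd=1&myp=0&ts=418&ymovie=1&ypp=0",
    PREFIX + "01/st/mp4/fileid/0300080401512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=c95a133006aba4232411a779&hd=1&myp=0&ts=392&ymovie=1&ypp=0",
    PREFIX + "02/st/mp4/fileid/0300080402512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=de8b035f07278c3f261d6c75&hd=1&myp=0&ts=198&ymovie=1&ypp=0",
    PREFIX + "03/st/mp4/fileid/0300080403512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=f4d95c475e847462261d6c75&hd=1&myp=0&ts=246&ymovie=1&ypp=0",
]


def fields(url: str) -> dict:
    """Return the name/value pairs carried in the path and the query string."""
    parts = urlsplit(url)
    # Path: player/getFlvPath/sid/<sid>/st/<type>/fileid/<fileid>
    pieces = parts.path.strip("/").split("/")
    path_fields = dict(zip(pieces[2::2], pieces[3::2]))
    query_fields = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {**path_fields, **query_fields}


rows = [fields(url) for url in URLS]

# Fields whose value is not identical across all four requests.
varying = sorted(key for key in rows[0] if len({row[key] for row in rows}) > 1)
print("fields that vary:", varying)

# 1-based character positions where the first two fileids differ.
first, second = rows[0]["fileid"], rows[1]["fileid"]
diff_positions = [i + 1 for i, (a, b) in enumerate(zip(first, second)) if a != b]
print("fileid differs at position(s):", diff_positions)
```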
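The differences are easy to spot. The sid suffix (_00 to _03) and the segment digit inside the fileid follow an obvious pattern (marked in red in the original post), but the K and ts values cannot be guessed. Where do they come from? Does the client compute them somehow, say from a time cursor? That was the wrong guess. Looking back up, both values are sent by the server in data_table: K is the k field and ts is the seconds field of each entry under segs (the parts I had marked in bold). That finally gives us a direction, but only a direction, because in this path: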

http://f.youku.com/player/getFlvPath/sid/138838280403193127397_00/st/mp4/fileid/0300080400512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695?start=0&K=30940234c47bbd6028293170&hd=1&myp=0&ts=418&ymovie=1&ypp=0

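we still do not know where the part in bold, the fileid, comes from, and nothing else in the server response seems to help. One field name in the response caught my attention, though: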

"streamfileids":{"flv":"34*5*34*34*34*36*34*25*34*34*3*32*36*25*34*42*35*66*12*25*43*5*34*25*50*48*3*60*48*42*42*67*50*60*3*50*42*60*4*43*43*60*34*4*66*34*42*12*4*60*34*48*30*4*60*67*50*12*36*30*67*43*67*60*43*3*","mp4":"34*5*34*34*34*67*34*25*34*34*3*32*36*25*32*32*50*48*12*25*43*5*34*25*50*48*3*60*48*42*42*67*50*60*3*50*42*60*4*43*43*60*34*4*66*34*42*12*4*60*34*48*30*4*60*67*50*12*36*30*67*43*67*60*43*3*"},
What is this? A file ID? Could it be the value after /st/mp4/fileid in the path?

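A careful comparison shows that it really is the fileid, protected by nothing more than the simplest kind of encryption, a one-to-one substitution: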

Encrypted:  34  5   67  25  3   32  36  50  48  12  43  60  42  4   66  30
Plain:      0   3   8   4   5   1   2   E   D   B   9   6   C   -   F   A
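The comparison can be reproduced in a few lines. The sketch below aligns the mp4 entry of streamfileids with the fileid seen in the first getFlvPath request, builds the substitution table from that alignment, and checks that no number maps to two different characters. The function names are made up for this example, and the table it produces is only valid for this one response.

```python
# mp4 entry of "streamfileids", copied from the response above.
MP4_TOKENS = (
    "34*5*34*34*34*67*34*25*34*34*3*32*36*25*32*32*50*48*12*25*43*5*34*25*"
    "50*48*3*60*48*42*42*67*50*60*3*50*42*60*4*43*43*60*34*4*66*34*42*12*4*"
    "60*34*48*30*4*60*67*50*12*36*30*67*43*67*60*43*3*"
)
# fileid taken from the getFlvPath request for segment 0.
KNOWN_FILEID = "0300080400512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695"


def build_table(encoded: str, plain: str) -> dict:
    """Derive the number -> character table by aligning the two strings."""
    tokens = [token for token in encoded.split("*") if token]
    if len(tokens) != len(plain):
        raise ValueError("token count and fileid length differ")
    table = {}
    for token, char in zip(tokens, plain):
        # setdefault returns the earlier entry if this token was seen before.
        if table.setdefault(token, char) != char:
            raise ValueError(f"token {token} maps to two different characters")
    return table


def decode(encoded: str, table: dict) -> str:
    """Apply the table; unknown numbers are shown as '?'."""
    return "".join(table.get(token, "?") for token in encoded.split("*") if token)


table = build_table(MP4_TOKENS, KNOWN_FILEID)
for token, char in sorted(table.items(), key=lambda item: int(item[0])):
    print(token, "->", char)

decoded = decode(MP4_TOKENS, table)
print(decoded)
```

Because the table is built from a single known pair, it covers only the 16 characters that occur in this fileid. Numbers that never appear in the mp4 string, such as some of those in the flv entry, come out as "?".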

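That settles the fileid. Which position the 0, 1, 2, 3 of the four fileids occupies still needs further investigation. (What I decoded here is the mp4 stream; for flv, the other mapping still has to be worked out.) Two things remain before the request URL can be constructed: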

Where the sid comes from, and whether the position of the 0, 1, 2, 3 is fixed or set according to the returned data.

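Everything we can find still comes from data_table. Look at the part marked in red there (in the original post): the K value discussed above plays an important role in that table, so might K2 be just as closely tied to the encryption? And since we still need the sid, stream_ids looks like the direct source of the answer. I cannot find anything more helpful in the response, so the next step is to dig into the player client: download the site's Flash player, flash.swf, and read its code. I will stop here for now and post the results once the analysis is done.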

 

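I'm back. I was just about to read the Flash player code carefully when I remembered that there are websites that do this, so I looked at the output of 硕鼠 (a site dedicated to resolving the real addresses of videos on video sites so that users can download them quickly and easily: http://www.flvcd.com/). The sid in every address it resolved was: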

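00_00. Can that really work? I replaced the sid in my own URLs with 00_00, and the download did work. Is this field just a decoy to mislead people studying the protocol? I will leave it at that for now and read the code later.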

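If we stopped here, it would feel as though we had everything we need, but that is not the case. After several tests I found that the substitution table above changes dynamically. The client is not updated very often, so how does it recognize the changed table? It must compute the table from some of the returned fields, and because the server and the client do the same computation, they stay in sync.

I do not have to go that route, though. However the fileid is encrypted, the name of the first segment differs from the later ones only in the 10th character, so once I have the first address I can derive the rest directly. Of course, if the site changes the small regularities that have been discovered, the scheme may change and I would have to adjust what I have learned. What matters is mastering the method and the process; there is no need to actually crack the scheme, unless, like 硕鼠, you have to.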
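The shortcut can be sketched as follows. It rests on several assumptions drawn only from the four requests captured in this post: the 10th character of the fileid is the segment number (observed only for 0 to 3, so the sketch refuses anything above 9), the sid can be replaced by 00_00, and the remaining query parameters can be copied unchanged. The function names are hypothetical.

```python
FIRST_FILEID = "0300080400512411EDB49304ED56DCC8E65EC6-9960-F0CB-60DA-68EB2A898695"

# k and seconds of each mp4 segment, as returned in data_table.
MP4_SEGS = [
    {"no": "0", "seconds": "418", "k": "30940234c47bbd6028293170"},
    {"no": "1", "seconds": "392", "k": "c95a133006aba4232411a779"},
    {"no": "2", "seconds": "198", "k": "de8b035f07278c3f261d6c75"},
    {"no": "3", "seconds": "246", "k": "f4d95c475e847462261d6c75"},
]


def segment_fileid(first_fileid: str, no: int) -> str:
    """Swap the 10th character of the first segment's fileid for `no`."""
    if not 0 <= no <= 9:
        # How larger segment numbers are written was not observed.
        raise ValueError("only single-digit segment numbers are covered")
    return first_fileid[:9] + str(no) + first_fileid[10:]


def segment_urls(first_fileid: str, segs: list, stream: str = "mp4",
                 sid: str = "00_00") -> list:
    """Build one getFlvPath URL per segment, following the captured layout."""
    urls = []
    for seg in segs:
        fileid = segment_fileid(first_fileid, int(seg["no"]))
        urls.append(
            f"http://f.youku.com/player/getFlvPath/sid/{sid}/st/{stream}"
            f"/fileid/{fileid}?start=0&K={seg['k']}&hd=1&myp=0"
            f"&ts={seg['seconds']}&ymovie=1&ypp=0"
        )
    return urls


urls = segment_urls(FIRST_FILEID, MP4_SEGS)
for url in urls:
    print(url)
```

Apart from the sid, the generated URLs are identical to the four captured ones. Whether the server still accepts them is a separate question that this sketch does not test.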

[Note: this post is for study and discussion only. Please do not use it to harm the interests of others.]
