Python FTP Multithreaded Download: Downloading a File in Blocks with Multiple Threads

Source: Internet  Editor: 程序博客网  Date: 2024/06/07 23:10

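Python's ftplib module handles FTP operations such as downloading and uploading. Downloading a large file from an FTP server with Python is often slow, so how can it be sped up? This is where multithreading comes in. In this post I share my multithreaded download code; the main Python modules it needs are ftplib and threading.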

Let's first go over the download approach, outlined below:

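1. Split the file into blocks. For example, to download one file with 20 threads, we query the size of the remote file, divide it into 20 equal blocks (the last block also takes any remainder), and start one thread per block, each transferring its block in binary mode: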

    # module-level import needed by this method
    import threading

    def setupThreads(self, filePath, localFilePath, threadNumber=20):
        """
        Set up the threads that will download the file.
        Returns the list of threads on success, otherwise None.
        """
        try:
            # The SIZE reply looks like "213 <bytes>"; take the second field.
            temp = self.ftp.sendcmd('SIZE ' + filePath)
            remoteFileSize = int(temp.split()[1])
            # Integer division: every block except the last has this size.
            blockSize = remoteFileSize // threadNumber
            rest = None
            threads = []
            for i in range(0, threadNumber - 1):
                beginPoint = blockSize * i
                subThread = threading.Thread(
                    target=self.downloadFileMultiThreads,
                    args=(i, filePath, localFilePath, beginPoint, blockSize, rest))
                threads.append(subThread)

            # The last block also takes the bytes left over by the division.
            assigned = blockSize * threadNumber
            unassigned = remoteFileSize - assigned
            lastBlockSize = blockSize + unassigned
            beginPoint = blockSize * (threadNumber - 1)
            subThread = threading.Thread(
                target=self.downloadFileMultiThreads,
                args=(threadNumber - 1, filePath, localFilePath, beginPoint, lastBlockSize, rest))
            threads.append(subThread)
            return threads
        except Exception as diag:
            self.recordLog(str(diag), 'error')
            return None
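To make the block arithmetic easier to check in isolation, here is a minimal sketch of the same split as a standalone function. The name split_into_chunks and its return shape are my own choices for illustration and are not part of the original class; the sketch only mirrors the rule described above (equal blocks, with the last one absorbing the remainder).

```python
def split_into_chunks(total_size, thread_count):
    """Return a list of (begin_offset, chunk_size) pairs covering total_size bytes.

    Hypothetical helper for illustration; it mirrors the block arithmetic
    of setupThreads without touching FTP or threads.
    """
    if thread_count < 1:
        raise ValueError("thread_count must be at least 1")
    base_size = total_size // thread_count
    # Every chunk except the last has exactly base_size bytes.
    chunks = [(base_size * i, base_size) for i in range(thread_count - 1)]
    # The last chunk absorbs the remainder left over by integer division.
    last_begin = base_size * (thread_count - 1)
    chunks.append((last_begin, total_size - last_begin))
    return chunks


if __name__ == "__main__":
    for begin, size in split_into_chunks(103, 4):
        print(begin, size)
```

For a 103-byte file and 4 threads, the first three chunks are 25 bytes each and the last one is 28 bytes, starting at offset 75. Note that when the file is smaller than the thread count, the base size becomes 0 and only the last chunk carries data, so it may be worth lowering the thread count for small files.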

The downloadFileMultiThreads function is as follows:

    # module-level imports needed by this method
    import os
    import threading
    from ftplib import FTP, all_errors

    def downloadFileMultiThreads(self, threadIndex, remoteFilePath, localFilePath,
                                 beginPoint, blockSize, rest=None):
        """
        Worker run in a sub-thread: download one block of the remote file.
        The rest argument is kept for compatibility with callers; the offset
        actually used is beginPoint.
        """
        try:
            threadName = threading.current_thread().name
            # temporary local file for this block
            fp = open(localFilePath + '.part.' + str(threadIndex), 'wb')
            callback = fp.write

            # a separate connection to the FTP server: change directory, set binary mode
            myFtp = FTP(self.host, self.user, self.passwd)
            myFtp.cwd(os.path.dirname(remoteFilePath))
            myFtp.voidcmd('TYPE I')

            finishedSize = 0
            # Begin downloading at beginPoint. Passing rest= lets ftplib send REST
            # directly before RETR, after the data connection has been set up.
            beginToDownload = 'RETR ' + os.path.basename(remoteFilePath)
            connection = myFtp.transfercmd(beginToDownload, rest=beginPoint)
            readSize = self.fixBlockSize
            while True:
                if blockSize > 0:
                    # never read past the end of this thread's block
                    remainedSize = blockSize - finishedSize
                    readSize = min(self.fixBlockSize, remainedSize)
                data = connection.recv(readSize)
                if not data:
                    break
                callback(data)
                finishedSize = finishedSize + len(data)
                # stop as soon as this thread's block is complete
                if finishedSize == blockSize:
                    break
            connection.close()
            fp.close()
            try:
                myFtp.quit()
            except all_errors:
                # Closing the data connection before the end of the file can make
                # the server send an error reply; the block is already on disk.
                myFtp.close()
            return True
        except Exception:
            return False
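The core of the worker is a bounded copy loop: read in fixed-size pieces, but never more than what is left of the block. The sketch below isolates that loop so it can be tried without an FTP server. The function name copy_limited and the use of io.BytesIO as a stand-in for the data socket are assumptions made for illustration. Unlike the method above, this simplified version copies nothing when the limit is 0.

```python
import io


def copy_limited(recv, write, limit, buffer_size=8192):
    """Copy at most `limit` bytes from recv() to write(); return bytes copied.

    recv takes a byte count and returns bytes (like socket.recv);
    write accepts bytes (like a file object's write method).
    """
    copied = 0
    while copied < limit:
        # Ask for a full buffer, or only the bytes still missing from the block.
        data = recv(min(buffer_size, limit - copied))
        if not data:  # the stream ended before the block was complete
            break
        write(data)
        copied += len(data)
    return copied


if __name__ == "__main__":
    source = io.BytesIO(b"0123456789" * 10)  # stands in for the data socket
    source.seek(30)                          # plays the role of REST 30
    target = io.BytesIO()
    print(copy_limited(source.read, target.write, 25, buffer_size=8))
    print(target.getvalue())
```

Taking the reader and writer as plain callables keeps the loop independent of sockets and files, which also makes it straightforward to test.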

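2. Once all downloads have finished, the file blocks need to be merged. The merge step is covered in part two of this series: Python之FTP多线程下载文件之分块多线程文件合并 (merging the blocks of a multithreaded FTP download).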
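Until you get to part two, the following is a rough sketch of what "wait, then merge" could look like. It is not the code from part two. The helper names merge_parts and run_and_merge are hypothetical, and the only thing taken from this article is the naming of the temporary files (localFilePath + '.part.' + index).

```python
import os
import shutil


def merge_parts(local_path, part_count, remove_parts=True):
    """Concatenate local_path.part.0 .. local_path.part.(N-1) into local_path."""
    part_paths = [local_path + ".part." + str(i) for i in range(part_count)]
    missing = [p for p in part_paths if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError("missing part files: " + ", ".join(missing))
    with open(local_path, "wb") as merged:
        # Append the parts in index order so the bytes keep their original order.
        for part_path in part_paths:
            with open(part_path, "rb") as part:
                shutil.copyfileobj(part, merged)
    if remove_parts:
        for part_path in part_paths:
            os.remove(part_path)
    return local_path


def run_and_merge(threads, local_path):
    """Start every thread, wait for all of them, then merge the part files."""
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return merge_parts(local_path, len(threads))
```

A caller would pass the list returned by setupThreads together with the same localFilePath. Because threading.Thread discards the return value of its target, this sketch can only check that every part file exists; verifying that each part has the expected size would be a sensible extra safeguard.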

 

Thanks for reading; I hope this helps!
