The Python os.walk Function

Source: Internet  Editor: 程序博客网  Time: 2024/06/03 06:46

Overview:
    os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up. It works on both Unix and Windows.
Syntax:
os.walk(top[, topdown=True[, onerror=None[, followlinks=False]]])
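Parameters:
  • top – The root directory of the walk. For every directory in the tree rooted at top (including top itself), os.walk yields a 3-tuple (dirpath, dirnames, filenames): the path of the directory, the names of its subdirectories, and the names of its non-directory files.

  • topdown – Optional. If True or not specified, the 3-tuple for a directory is generated before the 3-tuples for any of its subdirectories (top-down). If topdown is False, the 3-tuple for a directory is generated after the 3-tuples for all of its subdirectories (bottom-up).

  • onerror – Optional. By default, errors raised while listing a directory are ignored. If given, onerror must be a function; it is called with one argument, an OSError instance. It can report the error and let the walk continue, or raise the exception to abort the walk.

  • followlinks – Optional. By default, os.walk does not descend into symbolic links that resolve to directories. Set it to True to visit directories reached through symbolic links.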
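The effect of topdown and onerror is easier to see on a small tree. The following is a minimal sketch, not taken from the original article: the helper names visit_order and collect_errors, the skip argument, and the a/b directory layout are all hypothetical and chosen only for illustration. It also shows that in top-down mode the dirnames list can be edited in place to prune the walk.

```python
import os
import tempfile

def visit_order(root, topdown=True, skip=()):
    """Return directories, relative to root, in the order os.walk yields them."""
    order = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=topdown):
        if topdown:
            # Editing dirnames in place prunes the walk; this only has an
            # effect in top-down mode, because bottom-up mode has already
            # visited the subdirectories by the time the tuple is yielded.
            dirnames[:] = [d for d in dirnames if d not in skip]
        order.append(os.path.relpath(dirpath, root))
    return order

def collect_errors(top):
    """Walk top and return the OSError instances passed to onerror."""
    errors = []
    for _ in os.walk(top, onerror=errors.append):
        pass
    return errors

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))  # hypothetical layout: root/a/b
    top_down = visit_order(root, topdown=True)
    bottom_up = visit_order(root, topdown=False)
    pruned = visit_order(root, topdown=True, skip=("b",))
    # Listing a directory that does not exist triggers the onerror callback.
    missing = collect_errors(os.path.join(root, "no_such_dir"))

print(top_down)    # parent directories come first
print(bottom_up)   # parent directories come last
print(pruned)      # "b" is never visited
print(type(missing[0]).__name__)
```

Without an onerror callback the same walk over the missing directory would simply yield nothing, which is why a callback is worth passing when silent failures would be misleading.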

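Return value:
os.walk() returns a generator. Each iteration yields one 3-tuple (dirpath, dirnames, filenames) for one directory in the tree.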
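A short sketch of what the call hands back, assuming CPython, where os.walk is implemented as a generator function. The file name note.txt and the directory name sub are hypothetical. Nothing is scanned until the first tuple is requested, and each tuple can be unpacked directly.

```python
import os
import tempfile
import types

with tempfile.TemporaryDirectory() as top:
    # Hypothetical contents: one empty file and one empty subdirectory.
    open(os.path.join(top, "note.txt"), "w").close()
    os.mkdir(os.path.join(top, "sub"))

    walker = os.walk(top)  # no directory has been read yet
    is_generator = isinstance(walker, types.GeneratorType)

    # The first tuple always describes top itself.
    dirpath, dirnames, filenames = next(walker)

    # Exhaust the generator to collect the tuples for the remaining directories.
    remaining = list(walker)

print(is_generator)
print(dirnames, filenames)
print(len(remaining))
```

Because the result is lazy, a loop can stop early (for example after finding one file) without scanning the rest of the tree.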
Example (Python 3 syntax):
import os
from os.path import join, getsize

def getdirsize(path):
    """Return the total size in bytes of all files under path."""
    size = 0
    for root, dirs, files in os.walk(path):
        print('file = ', files)  # files in the current directory
        for name in files:
            full_name = join(root, name)
            print(full_name)
            size += getsize(full_name)
    return size

if __name__ == '__main__':
    filesize = getdirsize(r'/home/data/crawler_data/cs_CZK')
    # Floor division keeps whole megabytes, as the Python 2 original did.
    print('There are %.3f' % (filesize // 1024 // 1024), 'Mbytes in /home/data/crawler_data')
Output (abridged):
file =  []
file =  ['2017051104_content', '2017051010_content', '2017051008_content', '2017051113_content', '2017051012_content', '2017051118_content', '2017051117_content', '2017051103_content', '2017051015_content', '2017051007_content', '2017051014_content', '2017051022_content', '2017051019_content', '2017051102_content', '2017051023_content', '2017051021_content', '2017051020_content', '2017051009_content', '2017051106_content', '2017051202_content', '2017051013_content', '2017051112_content', '2017051119_content', '2017051111_content', '2017051108_content', '2017051017_content', '2017051121_content', '2017051101_content', '2017051201_content', '2017051018_content', '2017051006_content', '2017051110_content', '2017051105_content', '2017051116_content', '2017051123_content', '2017051011_content', '2017051200_content', '2017051120_content', '2017051100_content', '2017051016_content', '2017051107_content', '2017051115_content', '2017051109_content', '2017051122_content', '2017051114_content']
/home/data/crawler_data/cs_CZK/74IwnIzFSn0THho0KHkTKLAHsfu-5tZy/2017051104_content
There are 1837.000 Mbytes in /home/data/crawler_data
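The example above depends on a directory that exists only on the author's machine. The following self-contained variant is a sketch under stated assumptions: it builds a small temporary tree (the names a.bin, sub and b.bin are hypothetical), skips symbolic links so that only real files are counted (a design choice, not something the original does), and uses true division so the fractional part of the size is kept.

```python
import os
import tempfile

def get_dir_size(path):
    """Sum the sizes in bytes of all regular files under path."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            full_name = os.path.join(root, name)
            # Assumption: symlinks are skipped so a link to a large file
            # elsewhere is not counted as part of this tree.
            if not os.path.islink(full_name):
                total += os.path.getsize(full_name)
    return total

with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    with open(os.path.join(top, "a.bin"), "wb") as f:
        f.write(b"x" * 1000)   # 1000 bytes at the top level
    with open(os.path.join(top, "sub", "b.bin"), "wb") as f:
        f.write(b"y" * 2500)   # 2500 bytes one level down
    size = get_dir_size(top)

print("%d bytes, %.3f MiB" % (size, size / 1024 / 1024))
# prints: 3500 bytes, 0.003 MiB
```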