A Quick Use of BeautifulSoup
Source: Internet. Editor: 程序博客网. Date: 2024/04/29 14:37
I used BeautifulSoup to scrape the lecture slides for UC Berkeley's Programming Languages and Compilers course.
import re

import requests
from bs4 import BeautifulSoup

BASE_URL = "http://inst.eecs.berkeley.edu/~cs164/fa11/lectures/"

r = requests.get(BASE_URL + "index.html")
soup = BeautifulSoup(r.text, "html.parser")

# Match links such as "lecture12.pdf"; the dot is escaped so it matches literally.
for elem in soup.find_all(name="a", attrs={"href": re.compile(r"lecture[0-9]*\.pdf")}):
    # The lecture title is the text of the sibling that precedes the link's parent.
    # Colons are replaced with spaces so the title can be used in a file name.
    title = " ".join(elem.find_parent().find_previous_sibling().get_text().split(":"))
    file_name = elem["href"][:-4] + "-" + title + ".pdf"
    file_url = BASE_URL + elem["href"]
    file_get = requests.get(file_url, stream=True)
    # Stream the PDF to disk in 1 KiB chunks instead of loading it all into memory.
    with open(file_name, "wb") as f:
        for chunk in file_get.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
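The file-name step is the densest part of the script, so here is a minimal sketch of it in isolation. The helper name `build_file_name` and the sample title "Lecture 1: Introduction" are hypothetical and used only for illustration; the actual title text on the course page may look different.

```python
def build_file_name(href, title_text):
    """Combine a PDF link and its lecture title into a local file name.

    Mirrors the logic of the script: drop the ".pdf" suffix from the link,
    replace each ":" in the title with a space, and re-append ".pdf".
    """
    stem = href[:-4]                        # "lecture01.pdf" -> "lecture01"
    title = " ".join(title_text.split(":")) # ":" is awkward in file names on some systems
    return stem + "-" + title + ".pdf"


# Hypothetical input; the real page's title format is an assumption.
name = build_file_name("lecture01.pdf", "Lecture 1: Introduction")
print(name)
```

Note that the text after the colon keeps its leading space, so the result contains two consecutive spaces where the colon was. If that is unwanted, each part could be stripped before joining.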