[python] Fetching the content of a web page

Source: Internet | Editor: 程序博客网 | Time: 2024/04/30 01:31

# -*- coding: UTF-8 -*-
import urllib2
import BeautifulSoup


# @param url: the complete URL
# @param usr, pwd: used only when the page requires an account
#        (HTTP basic authentication)
# @return: the content of the URL as a string formatted by BeautifulSoup
def getWebPage(url, usr=None, pwd=None):
    if not usr and not pwd:
        # No credentials given: fetch the page directly.
        page = urllib2.urlopen(url).read()
    else:
        # Register the credentials for this URL and fetch through an
        # opener that answers HTTP basic authentication challenges.
        pwdMgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
        pwdMgr.add_password(None, url, usr, pwd)
        handler = urllib2.HTTPBasicAuthHandler(pwdMgr)
        opener = urllib2.build_opener(handler)
        page = opener.open(url).read()
    # Format the fetched page with BeautifulSoup in both cases.
    return BeautifulSoup.BeautifulSoup(page).prettify()


url = 'http://www.csdn.net/'
print getWebPage(url)
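
The listing above targets Python 2 (urllib2 and the old BeautifulSoup module). For readers on Python 3, the same idea might look roughly like the sketch below. It is not the original author's code: the name get_web_page, the timeout parameter, and the charset handling are assumptions added for illustration. The third-party bs4 package is treated as optional, so when it is not installed the decoded page is returned without formatting.

```python
import urllib.request

try:
    from bs4 import BeautifulSoup  # third-party and optional (assumption)
except ImportError:
    BeautifulSoup = None


def get_web_page(url, usr=None, pwd=None, timeout=10):
    """Fetch url and return its content as text.

    Hypothetical Python 3 counterpart of getWebPage; `timeout` is an
    added parameter that the original listing does not have.
    """
    if usr or pwd:
        # Register the credentials for this URL, then build an opener
        # that answers HTTP basic authentication challenges.
        pwd_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        pwd_mgr.add_password(None, url, usr, pwd)
        handler = urllib.request.HTTPBasicAuthHandler(pwd_mgr)
        opener = urllib.request.build_opener(handler)
    else:
        opener = urllib.request.build_opener()

    with opener.open(url, timeout=timeout) as response:
        # Use the charset declared by the response; fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        page = response.read().decode(charset, errors="replace")

    if BeautifulSoup is None:
        # bs4 is not installed: return the decoded page unformatted.
        return page
    return BeautifulSoup(page, "html.parser").prettify()


# Example call (requires network access):
# print(get_web_page("http://www.csdn.net/"))
```

Because urllib.request also understands data: URLs, the function can be tried without a network connection, for example get_web_page("data:text/html;charset=utf-8,%3Cp%3Ehello%3C/p%3E").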
