Getting a web page's source code with PHP

Source: Internet  Editor: 程序博客网  Date: 2024/06/04 19:10

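I happen to be working on tag parsing as well, so let's compare notes.

To get a page's source, file_get_contents($url) is enough. If the URL is a web address to be parsed, remember to include the http:// prefix, for example by typing http://www.baidu.com into the input box.

Save the following code as catch.php:

    <?php
    if (isset($_POST["url"])) {
        $url = $_POST["url"];
        // file_get_contents() returns false when the page cannot be fetched
        $fcont = file_get_contents($url);
        // eregi() was removed in PHP 7; preg_match() with the i flag replaces it,
        // and the s flag lets "." match across line breaks
        if ($fcont !== false && preg_match('/<table(.*)table>/is', $fcont, $re)) {
            echo "Yes";
            // Only print the match when there is one
            echo $re[0];
        } else {
            echo "No";
        }
    } else {
    ?>
    <form action="catch.php" method="post">
    url: <input type="text" size="30" name="url">
    <input type="submit" name="submit" value="Catch">
    </form>
    <?php
    }
    ?>

There is also a ready-made parsing toolkit, simplehtmldom; see the reference below.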
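The regular expression in the answer is greedy, so on a page with several tables it matches from the first opening tag to the last closing one. As a rough alternative sketch (not the answerer's method), PHP's bundled DOM extension can locate the first table element instead. The function name extractFirstTable is hypothetical, and the sketch assumes the dom extension is enabled.

```php
<?php
declare(strict_types=1);

/**
 * Return the markup of the first <table> element in $html,
 * or null when the document contains no table.
 * Assumes the bundled dom extension is available.
 */
function extractFirstTable(string $html): ?string
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early
    if (trim($html) === '') {
        return null;
    }

    $doc = new DOMDocument();

    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tables = $doc->getElementsByTagName('table');
    if ($tables->length === 0) {
        return null;
    }

    // saveHTML() with a node argument serializes only that subtree
    $markup = $doc->saveHTML($tables->item(0));
    return $markup === false ? null : $markup;
}
```

In catch.php this could stand in for the preg_match() call: pass $fcont to the function and print "Yes" when the result is not null.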

Asker's comment

Very detailed, thank you.

Reference: http://simplehtmldom.sourceforge.net/

The following are several common ways to fetch the content of a web page with PHP.

1. file_get_contents

    <?php
    $url = "http://www.jb51.net";
    // Returns the page body as a string, or false on failure
    $contents = file_get_contents($url);
    // If Chinese text comes out garbled, convert the encoding with the line below
    // $contents = iconv("gb2312", "utf-8", $contents);
    echo $contents;
    ?>

2. curl

    <?php
    $url = "http://www.jb51.net";
    $ch = curl_init();
    $timeout = 5; // connection timeout in seconds
    curl_setopt($ch, CURLOPT_URL, $url);
    // Return the response as a string instead of printing it directly
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    // For pages that require user authentication, add the two lines below
    // curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
    // curl_setopt($ch, CURLOPT_USERPWD, US_NAME.":".US_PWD);
    $contents = curl_exec($ch);
    curl_close($ch);
    echo $contents;
    ?>

3. fopen -> fread -> fclose

    <?php
    $handle = fopen("http://www.jb51.net", "rb");
    $contents = "";
    // fopen() returns false when the URL cannot be opened
    if ($handle !== false) {
        // Read in 1024-byte chunks until the end of the stream
        while (!feof($handle)) {
            $data = fread($handle, 1024);
            if ($data === false) {
                break;
            }
            $contents .= $data;
        }
        fclose($handle);
    }
    echo $contents;
    ?>

Notes:

1. file_get_contents and fopen only work on remote URLs if the host has allow_url_fopen enabled. To enable it, edit php.ini and set allow_url_fopen = On. When allow_url_fopen is off, neither fopen nor file_get_contents can open remote files.

2. curl only works if the host has the curl extension enabled. On Windows, edit php.ini and remove the semicolon in front of extension=php_curl.dll; you also need to copy ssleay32.dll and libeay32.dll to C:\WINDOWS\system32. On Linux, install the curl extension.

As for which method is best, that is hard to say; it depends on how demanding your task is. I mostly use cURL. Plenty of blogs have tutorials if you search for them, and it does not take long to pick up. Good luck.
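The two notes above can also be checked at runtime, so a script can pick whichever method the host actually supports. The following is a minimal sketch under that assumption; pickFetchMethod and fetchPage are hypothetical helper names, not part of PHP, and preferring cURL when both options exist simply mirrors the author's stated habit.

```php
<?php
declare(strict_types=1);

/**
 * Decide which transport to use from what the PHP installation offers.
 * Kept separate from the network code so the decision is easy to test.
 */
function pickFetchMethod(bool $hasCurl, bool $allowUrlFopen): ?string
{
    if ($hasCurl) {
        return 'curl';
    }
    if ($allowUrlFopen) {
        return 'file_get_contents';
    }
    return null; // neither method is available on this host
}

/**
 * Fetch $url with the first available method.
 * Returns the response body, or null on failure.
 */
function fetchPage(string $url, int $timeout = 5): ?string
{
    $method = pickFetchMethod(
        function_exists('curl_init'),
        filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)
    );

    if ($method === 'curl') {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,     // return the body as a string
            CURLOPT_CONNECTTIMEOUT => $timeout, // seconds allowed for connecting
            CURLOPT_TIMEOUT        => $timeout * 2, // cap on the whole transfer
        ]);
        $body = curl_exec($ch);
        // The handle is released when $ch goes out of scope
        return $body === false ? null : $body;
    }

    if ($method === 'file_get_contents') {
        // The http context option "timeout" is the read timeout in seconds
        $context = stream_context_create(['http' => ['timeout' => $timeout]]);
        $body = @file_get_contents($url, false, $context);
        return $body === false ? null : $body;
    }

    return null;
}
```

A call such as fetchPage("http://www.jb51.net") then returns the page source or null, and the caller does not need to know which extension did the work.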

0 0