Getting all HTML files under a directory

Source: Internet | Published by: mysql limit offset | Editor: 程序博客网 | Time: 2024/06/05 18:53

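I need to parse some HTML, so I have to walk through every directory and pick up all the HTML files in it, including those in subdirectories.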

Method:

import java.io.File;
import java.io.IOException;

private static void GetFile(String path) {
    File[] tempList = new File(path).listFiles();
    // listFiles() returns null when path is not a directory or cannot be read
    if (tempList == null) {
        return;
    }
    for (File f : tempList) {
        if (f.isDirectory()) {
            // Recurse into subdirectories
            GetFile(f.toString());
        } else if (f.isFile()) {
            String name = f.getName().toLowerCase();
            // Match both .htm and .html (the original check was endsWith("htm"))
            if (name.endsWith(".htm") || name.endsWith(".html")) {
                System.out.println("Entering file: " + f);
                try {
                    GetHtml(f.toString());
                } catch (IOException e) {
                    e.printStackTrace();
                }
                System.out.println("Leaving file: " + f);
            }
        }
    }
}
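For comparison, the same traversal can probably be written without hand-rolled recursion by using `Files.walk` from `java.nio.file`. This is only a sketch and not the approach used in this post: the class name `HtmlFileFinder` and the method `findHtmlFiles` are made up for illustration, and it assumes that both `.htm` and `.html` should match, ignoring case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class HtmlFileFinder {

    // Hypothetical helper (not from the original post): returns every regular
    // file under root whose name ends with .htm or .html, sorted by path.
    static List<Path> findHtmlFiles(Path root) throws IOException {
        // Files.walk visits root and everything below it; the stream must be
        // closed, hence try-with-resources.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> {
                        String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                        return name.endsWith(".htm") || name.endsWith(".html");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory to scan is the first argument, else "."
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findHtmlFiles(root)) {
            System.out.println(p);
        }
    }
}
```

Returning a list instead of parsing inside the loop keeps the traversal separate from the parsing, so each path can then be handed to whatever parsing method is used.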

The next step is to parse each file with jsoup:

private static void GetHtml(String filename) throws IOException {

File input = new File(filename);
Document doc = Jsoup.parse(input, "ISO-8859-1", "");

。。。。

。。。。。
