Heritrix 1.14.4 startup error: thread-10 org.archive.util.ArchiveUtils.<clinit>() TLD list unavailable
Source: Internet  Editor: 程序博客网  Time: 2024/06/09 18:59
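I recently needed Heritrix for a project, so I downloaded it and started experimenting.
Following articles I found online, I launched it from Eclipse and got the error below: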
10:02:59.968 EVENT Starting Jetty/4.2.23
10:03:00.765 EVENT Started WebApplicationContext[/,Heritrix Console]
10:03:00.859 EVENT The scratchDir you specified: F:/project3.5/heritrix/target/jsp-compiled-development is unusable.
10:03:01.000 EVENT Started SocketListener on 127.0.0.1:8088
10:03:01.000 EVENT Started org.mortbay.jetty.Server@1f6ba0f
2010-07-10 10:03:01.250 SEVERE thread-10 org.archive.util.ArchiveUtils.<clinit>() TLD list unavailable
java.lang.NullPointerException
at java.io.Reader.<init>(Unknown Source)
at java.io.InputStreamReader.<init>(Unknown Source)
at org.archive.util.ArchiveUtils.<clinit>(ArchiveUtils.java:759)
at org.archive.crawler.settings.CrawlSettingsSAXHandler$DateHandler.endElement(CrawlSettingsSAXHandler.java:385)
at org.archive.crawler.settings.CrawlSettingsSAXHandler.endElement(CrawlSettingsSAXHandler.java:248)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at org.archive.crawler.settings.XMLSettingsHandler.readSettingsObject(XMLSettingsHandler.java:298)
at org.archive.crawler.settings.XMLSettingsHandler.readSettingsObject(XMLSettingsHandler.java:339)
at org.archive.crawler.settings.SettingsHandler.initialize(SettingsHandler.java:130)
at org.archive.crawler.settings.XMLSettingsHandler.initialize(XMLSettingsHandler.java:124)
at org.archive.crawler.admin.CrawlJobHandler.loadProfile(CrawlJobHandler.java:385)
at org.archive.crawler.admin.CrawlJobHandler.loadProfiles(CrawlJobHandler.java:348)
at org.archive.crawler.admin.CrawlJobHandler.<init>(CrawlJobHandler.java:217)
at org.archive.crawler.admin.CrawlJobHandler.<init>(CrawlJobHandler.java:186)
at org.archive.crawler.Heritrix.<init>(Heritrix.java:405)
at org.archive.crawler.Heritrix.<init>(Heritrix.java:393)
at org.archive.crawler.Heritrix.doCmdLineArgs(Heritrix.java:718)
at org.archive.crawler.Heritrix.main(Heritrix.java:556)
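Despite the error, the login page is reachable and the web UI starts normally.
I had never used this tool before. The day before, I had just gotten it running from the command line (cmd); today, setting it up as an Eclipse project, I ran into a new problem.
One obstacle at every step.
Yesterday's run did not log this error in the console, but earlier today in Eclipse I had also seen a NullPointerException when a configuration file was in the wrong place.
So my guess was that some file was still missing.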
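After several hours of debugging, I found that the missing file is named tlds-alpha-by-domain.txt.
The release package does contain this file in the corresponding location, org/archive/util; once I added the file under that path in my project, the error went away.
I still don't know what the file is used for; if anyone more experienced can explain, please do.
The file can be found under src/resources in the source package.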
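The stack trace is consistent with a classpath resource that could not be found: Class.getResourceAsStream returns null when the resource is missing, and passing that null to the InputStreamReader constructor throws NullPointerException from java.io.Reader.<init>. Below is a minimal sketch for checking this from inside the Eclipse project. The class name TldResourceCheck and its helper methods are made up for illustration, and I am only assuming that ArchiveUtils loads the file in roughly this way; this is not Heritrix code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical helper, not part of Heritrix.
class TldResourceCheck {
    // Resource location as described in the post (org/archive/util).
    static final String TLD_RESOURCE = "/org/archive/util/tlds-alpha-by-domain.txt";

    /** Returns true if the named resource is visible on the classpath. */
    static boolean isOnClasspath(String resourcePath) {
        // getResourceAsStream returns null (it does not throw) when the resource is missing.
        try (InputStream in = TldResourceCheck.class.getResourceAsStream(resourcePath)) {
            return in != null;
        } catch (IOException e) {
            return false;
        }
    }

    /** Shows the failure mode: wrapping a null stream throws NullPointerException. */
    static boolean nullStreamThrowsNpe() {
        try {
            new InputStreamReader((InputStream) null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("TLD list on classpath: " + isOnClasspath(TLD_RESOURCE));
        System.out.println("null stream -> NPE: " + nullStreamThrowsNpe());
    }
}
```

If the first line prints false when run with the same classpath as Heritrix, the file has not been copied to the build output, which would match the error above.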