Flume-ng throws "HDFS IO error" and "Callable timed out" exceptions

Source: Internet  Editor: 程序博客网  Time: 2024/04/30 19:03

Both of these flume-ng nodes threw exceptions between 9 PM and 11 PM:

25 Mar 2014 22:18:25,189 ERROR [hdfs-thrift_hdfsSink-roll-timer-0] (org.apache.flume.sink.hdfs.BucketWriter$2.call:257)  - Unexpected error
java.io.IOException: Callable timed out after 10000 ms on file: /logdata/2014/03/25/software_log/197a.1395757003934.tmp
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:550)
    at org.apache.flume.sink.hdfs.BucketWriter.doFlush(BucketWriter.java:353)
    at org.apache.flume.sink.hdfs.BucketWriter.flush(BucketWriter.java:319)
    at org.apache.flume.sink.hdfs.BucketWriter.close(BucketWriter.java:277)
    at org.apache.flume.sink.hdfs.BucketWriter$2.call(BucketWriter.java:255)
    at org.apache.flume.sink.hdfs.BucketWriter$2.call(BucketWriter.java:250)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:201)
    at org.apache.flume.sink.hdfs.BucketWriter.callWithTimeout(BucketWriter.java:543)
    ... 11 more

25 Mar 2014 22:34:17,639 WARN  [ResponseProcessor for block BP-928773537-10.31.246.10-1392969615809:blk_-6973900543529394933_175021] (org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run:747)  - DFSOutputStream ResponseProcessor exception  for block BP-928773537-10.31.246.10-1392969615809:blk_-6973900543529394933_175021
java.io.EOFException: Premature EOF: no length prefix available
    at org.apache.hadoop.hdfs.protocol.HdfsProtoUtil.vintPrefixed(HdfsProtoUtil.java:171)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:114)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:694)

Checking when log traffic peaks:


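The peak clearly runs from 18:00 to 23:00. The Hadoop cluster has only 100 Mbps of bandwidth, and during the log-writing peak it hits that limit. We have not yet deployed any monitoring tools on the Hadoop side (-。。-)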

Current solution:

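  • Following http://blog.csdn.net/yangbutao/article/details/8845025, increase the timeout of the Flume HDFS client.
  • Change hdfs.callTimeout from the default of 10 seconds to 40 seconds. The property is specified in milliseconds, so the default 10000 (the "10000 ms" in the error above) becomes 40000.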
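
The hdfs.callTimeout change might look like the following fragment of the Flume agent configuration file. This is only a sketch: the agent name "agent1" is hypothetical, and the sink name "thrift_hdfsSink" is a guess based on the thread name "hdfs-thrift_hdfsSink-roll-timer-0" in the log above, so substitute the names from your own configuration.

```properties
# Assumption: "agent1" is a placeholder agent name; "thrift_hdfsSink" is
# inferred from the log thread name and may differ in the real config.
agent1.sinks.thrift_hdfsSink.type = hdfs

# Timeout for HDFS operations (open, write, flush, close), in milliseconds.
# The default is 10000; raised to 40000 (40 seconds) here.
agent1.sinks.thrift_hdfsSink.hdfs.callTimeout = 40000
```

A longer timeout only gives slow HDFS calls more time to finish during the peak; it does not remove the bandwidth bottleneck itself.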
When reposting, please credit the source: http://blog.csdn.net/wsscy2004