Tez fails to find the compression codec class

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 04:25

To improve processing efficiency, the following settings were added:
set mapreduce.map.output.compress=true;
set mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
With these settings the Tez job fails. The full error is:

TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:63)
at org.apache.hadoop.io.compress.SnappyCodec.getCompressorType(SnappyCodec.java:133)
at org.apache.tez.runtime.library.common.sort.impl.ExternalSorter.<init>(ExternalSorter.java:191)
at org.apache.tez.runtime.library.common.sort.impl.dflt.DefaultSorter.<init>(DefaultSorter.java:119)
at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput.start(OrderedPartitionedKVOutput.java:115)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:148)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:162)
... 13 more
]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex vertex_1463493135662_436567_1_00 [Map 1] killed/failed due to:null]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1463493135662_436567_1_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as other vertex failed. failedTasks:0, Vertex vertex_1463493135662_436567_1_01 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Inspecting the corresponding launch_container.sh script shows the line export LD_LIBRARY_PATH="$PWD:/opt/hadoop-client/lib/native:$PWD", which points to the client's native library path.
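
To see which directories a container will actually search for native libraries, the export line can be pulled out of launch_container.sh and each directory checked for the expected files. The following is only a sketch: the function names are made up for illustration, and the file names libhadoop.so and libsnappy.so are an assumption about a typical Linux install, not something stated above.

```python
import os
import re


def ld_library_dirs(script_text, pwd="."):
    """Return the directories named in the script's LD_LIBRARY_PATH export.

    `pwd` stands in for the container working directory ($PWD).
    Duplicate entries are dropped while keeping the original order.
    """
    match = re.search(
        r'^\s*export\s+LD_LIBRARY_PATH="?([^"\n]*)"?',
        script_text,
        re.MULTILINE,
    )
    if match is None:
        return []
    dirs = []
    for entry in match.group(1).split(":"):
        entry = entry.replace("$PWD", pwd)
        if entry and entry not in dirs:
            dirs.append(entry)
    return dirs


def find_native_libs(dirs, names=("libhadoop.so", "libsnappy.so")):
    """Map each library file name (assumed names) to the dirs that contain it."""
    return {
        name: [d for d in dirs if os.path.exists(os.path.join(d, name))]
        for name in names
    }
```

Run on the node where the task failed, an empty list for a library would suggest that the path exported by the script does not hold that native library on that machine.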

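According to the documentation, LD_LIBRARY_PATH can be set through tez.task.launch.env. In testing, however, this parameter had no effect, so I looked at the code: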
 

The code uses the LD_LIBRARY_PATH value from tez.task.launch.env only when LD_LIBRARY_PATH has not already been set.
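
The precedence rule just described can be modelled in a few lines. This is an illustration of the behaviour only, not the Tez source code: the function names are hypothetical, and the comma-separated KEY=VAL format of tez.task.launch.env is assumed here.

```python
def parse_launch_env(value):
    """Parse a 'KEY=VAL,KEY2=VAL2' string into a dict (format is an assumption)."""
    env = {}
    for pair in value.split(","):
        key, sep, val = pair.partition("=")
        if sep and key.strip():
            env[key.strip()] = val.strip()
    return env


def resolve_ld_library_path(already_set, launch_env_value):
    """Return the LD_LIBRARY_PATH a task would see under the rule above.

    A value that is already set wins; the entry from tez.task.launch.env
    is only a fallback for the case where nothing was set.
    """
    if already_set:
        return already_set
    return parse_launch_env(launch_env_value).get("LD_LIBRARY_PATH")
```

For example, resolve_ld_library_path("/opt/hadoop-client/lib/native", "LD_LIBRARY_PATH=/some/other/native") returns the first argument, which matches the observation that setting tez.task.launch.env changed nothing. The path /some/other/native is a placeholder.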
The next step is to find where this environment variable is actually added:

 
So the parameter that actually needs to be configured is the following:
 
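The default mapred-site.xml does not set this parameter. To avoid affecting other, non-Tez jobs, the parameter was added to tez-site.xml, which is stored in the /opt/hive-client/conf directory.
After this change, the test run no longer reports the error above.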
