Pig Replicated Join Fails
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 13:31
While using one of Pig's specialized joins, I got an error like the one below. It turns out to be a Pig bug, tracked at https://issues.apache.org/jira/browse/PIG-3725
Error message
Affected tests: Join_6, MergeJoin_5, Join_8, Join_7, MergeJoin_2, MergeJoin_3, MergeJoin_8, MergeJoin_1, MultiQuery_14, MergeJoin_4, MergeJoin_9, MergeJoin_6, MergeJoin_7.
In these tests, Pig needs to read a local file distributed through the distributed cache. However, Pig tries to read from HDFS instead. Here is the stack trace:
org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the load function.
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:289)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:263)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.setUpHashMap(POFRJoin.java:398)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.getNextTuple(POFRJoin.java:231)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the load function.
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:127)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:281)
... 14 more
Caused by: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://hor14n23.gq1.ygridcore.net:8020/user/hrt_qa/pigrepl_scope-75_831941592_1390506968802_1
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigFileInputFormat.listStatus(PigFileInputFormat.java:37)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:95)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:123)
... 15 more
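The `hdfs://.../user/hrt_qa/pigrepl_scope-...` path in the trace is what one would expect if the scheme-less name of the replicated file were qualified against the cluster's default file system instead of the local one. The following is only a minimal sketch of that effect using plain `java.net.URI`, not Hadoop's own path handling; the class name, the host `namenode`, and the directory `/tmp/task/` are made up for illustration.

```java
import java.net.URI;

public class DefaultFsResolution {
    // Qualify a scheme-less file name against a default file system URI
    // and a working directory (hypothetical helper, not a Hadoop API).
    static URI qualify(String defaultFs, String workingDir, String relativeName) {
        URI base = URI.create(defaultFs).resolve(workingDir);
        return base.resolve(relativeName);
    }

    public static void main(String[] args) {
        String name = "pigrepl_scope-75_831941592_1390506968802_1";
        // Default file system points at HDFS: the name lands under the HDFS home directory.
        System.out.println(qualify("hdfs://namenode:8020", "/user/hrt_qa/", name));
        // Default file system points at the local disk: the name stays local.
        System.out.println(qualify("file:///", "/tmp/task/", name));
    }
}
```

With the HDFS default, the result has the same shape as the path in the `InvalidInputException`; with `file:///` it stays on the task's local disk, which is where the distributed cache puts the file.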
Solution
Index: src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java
===================================================================
--- src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java (revision 1561195)
+++ src/org/apache/pig/backend/hadoop/datastorage/ConfigurationUtil.java (working copy)
@@ -28,6 +28,7 @@
 
 import org.apache.hadoop.conf.Configuration;
 import org.apache.pig.ExecType;
+import org.apache.pig.backend.hadoop.executionengine.HExecutionEngine;
 import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce;
 import org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil;
 
@@ -94,7 +95,8 @@
             }
         }
         Properties props = ConfigurationUtil.toProperties(localConf);
-        props.setProperty(MapRedUtil.FILE_SYSTEM_NAME, "file:///");
+        props.setProperty(HExecutionEngine.FILE_SYSTEM_LOCATION, "file:///");
+        props.setProperty(HExecutionEngine.ALTERNATIVE_FILE_SYSTEM_LOCATION, "file:///");
         return props;
     }
 }
Index: src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
===================================================================
--- src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java (revision 1561195)
+++ src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java (working copy)
@@ -68,8 +68,8 @@
 public class HExecutionEngine {
 
     public static final String JOB_TRACKER_LOCATION = "mapred.job.tracker";
-    private static final String FILE_SYSTEM_LOCATION = "fs.default.name";
-    private static final String ALTERNATIVE_FILE_SYSTEM_LOCATION = "fs.defaultFS";
+    public static final String FILE_SYSTEM_LOCATION = "fs.default.name";
+    public static final String ALTERNATIVE_FILE_SYSTEM_LOCATION = "fs.defaultFS";
 
     private static final String HADOOP_SITE = "hadoop-site.xml";
     private static final String CORE_SITE = "core-site.xml";
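The patch above replaces a single property override with overrides of both `fs.default.name` and `fs.defaultFS`, so that whichever key the running Hadoop version consults points at the local file system. The sketch below illustrates that idea with plain `java.util.Properties` only; the class and method names are hypothetical, the two key strings are copied from the patch, and the cluster values are invented.

```java
import java.util.Properties;

public class LocalFsProperties {
    // Key names as they appear in the patch to HExecutionEngine.
    static final String FILE_SYSTEM_LOCATION = "fs.default.name";
    static final String ALTERNATIVE_FILE_SYSTEM_LOCATION = "fs.defaultFS";

    // Copy the given properties and point both default-file-system keys
    // at the local file system (hypothetical helper, not Pig's own method).
    static Properties forceLocal(Properties clusterProps) {
        Properties props = new Properties();
        props.putAll(clusterProps);
        props.setProperty(FILE_SYSTEM_LOCATION, "file:///");
        props.setProperty(ALTERNATIVE_FILE_SYSTEM_LOCATION, "file:///");
        return props;
    }

    public static void main(String[] args) {
        Properties cluster = new Properties();
        // Invented cluster settings for illustration.
        cluster.setProperty(FILE_SYSTEM_LOCATION, "hdfs://namenode:8020");
        cluster.setProperty(ALTERNATIVE_FILE_SYSTEM_LOCATION, "hdfs://namenode:8020");

        Properties local = forceLocal(cluster);
        System.out.println(local.getProperty(FILE_SYSTEM_LOCATION));
        System.out.println(local.getProperty(ALTERNATIVE_FILE_SYSTEM_LOCATION));
    }
}
```

If only one of the two keys were overridden, the other would still carry the `hdfs://` value, which is consistent with the HDFS lookup seen in the stack trace.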