Trafodion Troubleshooting-java.io.IOException: createTable exception


Symptom

Today a user ran UPDATE STATISTICS on a table in a Trafodion database and got the following errors:

*** ERROR[9214] Object TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508207122_572837 could not be created. [2017-10-17 10:26:29]
*** ERROR[8448] Unable to access Hbase interface. Call to ExpHbaseInterface::create() returned error HBASE_CREATE_ERROR(701). Cause: java.io.IOException: createTable exception. Unable to create table TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508207122_572837
org.apache.hadoop.hbase.client.transactional.RMInterface.createTable(RMInterface.java:594)
org.trafodion.sql.HBaseClient.createk(HBaseClient.java:503). [2017-10-17 10:26:29]
*** ERROR[9200] UPDATE STATISTICS for table TRAFODION.BIGDATA_REPORT_TEST.ST_CONTENTVIEW_EVENTS encountered an error (8609) from statement Process_Query. [2017-10-17 10:26:29]
*** ERROR[8609] Waited rollback performed without starting a transaction. [2017-10-17 10:26:29]
*** ERROR[9201] Unable to DROP object TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508207122_572837. [2017-10-17 10:26:29]
*** ERROR[1389] Object TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508207122_572837 does not exist in Trafodion. [2017-10-17 10:26:29]
*** ERROR[9200] UPDATE STATISTICS for table TRAFODION.BIGDATA_REPORT_TEST.ST_CONTENTVIEW_EVENTS encountered an error (8609) from statement Process_Query. [2017-10-17 10:26:29]
*** ERROR[8609] Waited rollback performed without starting a transaction. [2017-10-17 10:26:29]

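Analysis

The errors show that the failure is in creating the SAMPLE table that UPDATE STATISTICS needs. sqcheck reported the database as fully healthy, and the HBase checks were also normal. For table-creation failures like this, an earlier post in this blog pointed out one possible cause: the clocks on the cluster nodes are out of sync.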

Resolution

With that in mind, we checked the tm_xxx.log file and, sure enough, found the following error:

2017-10-17 10:43:48,960, ERROR, TM, Node: 0 Pid: 61878 Name: $TM0 TransId: 10450048 Event: 103005311 Message: Error at CHbaseTM::createTable() caused by exception java.io.IOException: createTable call error
org.trafodion.dtm.HBaseTxClient.callCreateTable(HBaseTxClient.java:1814)
Caused by java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: java.io.IOException: pushOnlineEpoch -- Error: current onlineEpoch 1508208219670 is less than new onlineEpoch 1508208221684, transId: 10450048 in region: TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508208194_356302,,1508208218191.28ffe67ad3fa9e36660693043136e719.
org.apache.hadoop.hbase.client.transactional.TransactionManager.pushRegionEpoch(TransactionManager.java:2131)
org.apache.hadoop.hbase.client.transactional.TransactionManager.createTable(TransactionManager.java:2785)
org.trafodion.dtm.HBaseTxClient.callCreateTable(HBaseTxClient.java:1809)
Caused by java.util.concurrent.ExecutionException: java.io.IOException: java.io.IOException: pushOnlineEpoch -- Error: current onlineEpoch 1508208219670 is less than new onlineEpoch 1508208221684, transId: 10450048 in region: TRAFODION.BIGDATA_REPORT_TEST.TRAF_SAMPLE_339036475483133524_1508208194_356302,,1508208218191.28ffe67ad3fa9e36660693043136e719.
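The two onlineEpoch values in the pushOnlineEpoch message look like Unix timestamps in milliseconds. That is an assumption based on their magnitude; the log itself does not say so. Under that assumption, a small Python sketch can pull them out of a log line and show how far apart they are. The function names and the regular expression here are illustrative, not part of Trafodion.

```python
import re
from datetime import datetime, timedelta, timezone

# Pattern copied from the wording of the pushOnlineEpoch message in tm_xxx.log.
EPOCH_RE = re.compile(
    r"current onlineEpoch (\d+) is less than new onlineEpoch (\d+)"
)

def epoch_gap_ms(log_text):
    """Return (new - current) in milliseconds, or None if the message is absent."""
    match = EPOCH_RE.search(log_text)
    if match is None:
        return None
    current, new = int(match.group(1)), int(match.group(2))
    return new - current

def epoch_ms_to_cst(epoch_ms):
    """Interpret a value as Unix milliseconds (assumption) and convert to UTC+8."""
    cst = timezone(timedelta(hours=8))
    return datetime.fromtimestamp(epoch_ms / 1000, tz=cst)

sample = (
    "pushOnlineEpoch -- Error: current onlineEpoch 1508208219670 "
    "is less than new onlineEpoch 1508208221684, transId: 10450048"
)

if __name__ == "__main__":
    print("gap in ms:", epoch_gap_ms(sample))
    print("current epoch as CST:", epoch_ms_to_cst(1508208219670))
```

If the interpretation holds, the two values differ by about two seconds and fall within seconds of the log entry's own timestamp, which is consistent with a clock disagreement between nodes rather than a logic error.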

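We then checked the ntp service on every node and found it running normally everywhere. Checking the current time on each node, however, gave the result below, which shows that the clocks across the nodes had not actually been synchronized: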

[trafodion@tc2 logs]$ pdsh $MY_NODES date
tc3: Tue Oct 17 10:44:28 CST 2017
tc4: Tue Oct 17 10:44:44 CST 2017
tc2: Tue Oct 17 10:44:49 CST 2017
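To put a number on the drift, the pdsh output can be parsed and the spread between the earliest and latest node computed. The following Python sketch assumes the "node: date" line format shown above and an English locale for the day and month names; the helper names are hypothetical. Because pdsh does not run the command at exactly the same instant on every node, the result is only approximate, but differences of many seconds clearly indicate unsynchronized clocks.

```python
from datetime import datetime

def parse_pdsh_dates(output):
    """Map node name -> datetime from lines like 'tc3: Tue Oct 17 10:44:28 CST 2017'."""
    times = {}
    for line in output.splitlines():
        node, sep, stamp = line.partition(": ")
        if not sep:
            continue  # not a 'node: value' line
        parts = stamp.split()
        if len(parts) != 6:
            continue  # unexpected date layout, skip it
        # Drop the timezone abbreviation (e.g. CST); all nodes are assumed
        # to report the same zone, so it does not affect the difference.
        cleaned = " ".join(parts[:4] + parts[5:])
        times[node.strip()] = datetime.strptime(cleaned, "%a %b %d %H:%M:%S %Y")
    return times

def max_skew_seconds(times):
    """Return the gap in seconds between the earliest and latest reported time."""
    values = list(times.values())
    return (max(values) - min(values)).total_seconds()

sample_output = """tc3: Tue Oct 17 10:44:28 CST 2017
tc4: Tue Oct 17 10:44:44 CST 2017
tc2: Tue Oct 17 10:44:49 CST 2017"""

if __name__ == "__main__":
    node_times = parse_pdsh_dates(sample_output)
    print("max skew (s):", max_skew_seconds(node_times))
```

For the three nodes in this case, the spread between tc3 and tc2 is 21 seconds.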

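This confirms that the problem was indeed caused by unsynchronized node clocks. For the fix, see the earlier post: http://blog.csdn.net/post_yuan/article/details/74199704
The fix described in that post is only temporary. After a while the clocks may drift apart again, most likely because of the network path between the cluster and the remote time server. To remove that dependency, you can set up a local time server: make one node of the cluster the local time server and have the remaining nodes synchronize against it. For the detailed steps, see: http://blog.csdn.net/post_yuan/article/details/76906986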
