ZooKeeper Troubleshooting


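Symptoms

ZooKeeper version 3.4.3, HBase version 0.94.7.

By ZooKeeper's design, the service should keep working after a single machine goes down, but in practice that was not what happened.

One ZooKeeper machine was removed from the production environment, and from then on the client could not connect to HBase at all.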
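The expectation that losing one machine is harmless comes from ZooKeeper's majority rule: the ensemble stays available as long as more than half of its servers are up. The post does not say how many servers this ensemble had, so the sizes below are illustrative only, and `QuorumMath` is a hypothetical helper name, not part of ZooKeeper.

```java
/** Illustrative arithmetic for a majority-quorum ensemble (hypothetical helper, not ZooKeeper code). */
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize <= 0) {
            throw new IllegalArgumentException("ensembleSize must be positive");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still alive. */
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + quorumSize(n)
                    + ", tolerates " + toleratedFailures(n) + " failure(s)");
        }
    }
}
```

This rule only describes the server side. As the investigation that follows shows, the failure here happened in the client before any connection to the ensemble was attempted.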

Investigation

The following exception was thrown:

    Caused by: java.net.UnknownHostException: ops-new-launch-7237.iad7.amazon.com
        at java.net.InetAddress.getAllByName0(InetAddress.java:1259)
        at java.net.InetAddress.getAllByName(InetAddress.java:1171)
        at java.net.InetAddress.getAllByName(InetAddress.java:1105)
        at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:440)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:375)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:98)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:127)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:153)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:127)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1395)

Looking at the source: http://grepcode.com/file/repo1.maven.org/maven2/org.apache.zookeeper/zookeeper/3.4.3/org/apache/zookeeper/ZooKeeper.java#440

    public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher,
            boolean canBeReadOnly)
        throws IOException
    {
        LOG.info("Initiating client connection, connectString=" + connectString
                + " sessionTimeout=" + sessionTimeout + " watcher=" + watcher);

        watchManager.defaultWatcher = watcher;

        ConnectStringParser connectStringParser = new ConnectStringParser(
                connectString);
        HostProvider hostProvider = new StaticHostProvider(
                connectStringParser.getServerAddresses());
        cnxn = new ClientCnxn(connectStringParser.getChrootPath(),
                hostProvider, sessionTimeout, this, watchManager,
                getClientCnxnSocket(), canBeReadOnly);
        cnxn.start();
    }

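The constructor builds a StaticHostProvider, which, per the stack trace, resolves each hostname to its IP addresses through InetAddress.getAllByName. The UnknownHostException thrown during that resolution is neither caught nor retried, so a single unresolvable host makes the whole client construction fail.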
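The failure pattern can be illustrated without ZooKeeper. The sketch below is not the StaticHostProvider source; it only mimics what the stack trace suggests: every host is resolved up front, and the first failure aborts the whole operation. `EagerResolveDemo` and its `Resolver` interface are hypothetical names, introduced so the behavior can be shown with a fake resolver instead of real DNS.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Mimics eager, all-or-nothing host resolution (illustration only, not ZooKeeper code). */
final class EagerResolveDemo {
    /** Hypothetical stand-in for DNS; InetAddress::getAllByName fits this shape. */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /** Resolves every host in order; the first unknown host aborts everything. */
    static List<InetAddress> resolveAll(List<String> hosts, Resolver resolver)
            throws UnknownHostException {
        List<InetAddress> resolved = new ArrayList<>();
        for (String host : hosts) {
            // No try/catch here: one bad name discards the good ones as well.
            for (InetAddress address : resolver.resolve(host)) {
                resolved.add(address);
            }
        }
        return resolved;
    }
}
```

Calling `EagerResolveDemo.resolveAll(hosts, InetAddress::getAllByName)` with a list in which one name is no longer in DNS would throw, even if every other host in the list is healthy and reachable.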


Conclusion

ZooKeeper does not tolerate a host going down unconditionally: if the host has also been removed from DNS, the client cannot handle it.
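One possible client-side mitigation, not something the original investigation tried, is to drop unresolvable hosts from the connect string before handing it to the ZooKeeper constructor. The following is a minimal sketch under stated assumptions: `ConnectStringFilter` is a hypothetical helper, and it assumes a plain comma-separated `host:port` list with no chroot suffix and no IPv6 literals.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper that removes unresolvable hosts from a connect string. */
final class ConnectStringFilter {
    /** True if the JVM's resolver can currently resolve the host name. */
    static boolean resolves(String host) {
        try {
            InetAddress.getAllByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    /**
     * Keeps only the "host[:port]" entries whose host passes the check.
     * Assumes no chroot suffix and no IPv6 literals in the connect string.
     */
    static String filter(String connectString, Predicate<String> hostOk) {
        List<String> kept = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            int colon = trimmed.lastIndexOf(':');
            String host = colon >= 0 ? trimmed.substring(0, colon) : trimmed;
            if (hostOk.test(host)) {
                kept.add(trimmed);
            }
        }
        return String.join(",", kept);
    }
}
```

A caller would use it as `ConnectStringFilter.filter(connectString, ConnectStringFilter::resolves)` and pass the result on. This only narrows the window: a host can still disappear from DNS between the check and the constructor call, and an empty result means no server is resolvable at all, which the caller has to handle separately.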



