OpsCenter dashboard troubleshooting


System environment

OpsCenter 5.2
CentOS 6.6
Cassandra 2.0.x

Problem

After the OpsCenter dashboard has been monitoring the Cassandra cluster for a while (about one day), it always stops displaying data.

However, on the Cassandra nodes the datastax-agent process is still running normally.

Checking the datastax-agent log then shows:

WARN [Thread-10] .... operations dropped so far.
WARN [Thread-10] .... Cassandra operation queue is full, discarding cassandra operation
Error when proccessing cassandra call
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.47.222:9042 (com.datastax.driver.core.TransportException: [/192.168.47.222:9042] Connection has been closed))
ERROR [Reconnection-0] 2015-08-05 16:06:39,841 Unknown error during reconnection to /192.168.47.222:9042, scheduling retry in 8000 milliseconds
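To check how often these symptoms occur, the agent log can be scanned for the messages shown above. The following is a minimal Python sketch, not an official tool. The substrings come from the log excerpt; the category names and the functions `summarize` and `summarize_file` are made up for illustration.

```python
import re
from collections import Counter

# Substrings taken from the log excerpt above. The category names on the
# left are hypothetical labels chosen for this sketch.
PATTERNS = {
    "dropped": re.compile(r"operations dropped so far"),
    "queue_full": re.compile(r"Cassandra operation queue is full"),
    "no_host": re.compile(r"NoHostAvailableException"),
    "reconnect_error": re.compile(r"Unknown error during reconnection"),
}


def summarize(lines):
    """Count how many log lines match each symptom."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def summarize_file(path):
    """Run summarize() over a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return summarize(fh)
```

For example, `summarize_file("/var/log/datastax-agent/agent.log")` would count the symptoms in the agent log. That path is an assumption based on a package install; adjust it to wherever the agent writes its log.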

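The preliminary diagnosis is that too many Cassandra requests are being issued: the agent's operation queue fills up and further operations are discarded.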

Solution

Add the following parameters to /var/lib/datastax-agent/conf/address.yaml:

stomp_interface: opscenterIP
use_ssl: 0
async_pool_size: 200
thrift_max_conns: 200
async_queue_size: 20000
hosts: ["host1","host2"]  # IPs of the cluster nodes
local_interface: localhost
cassandra_conf: /xxx/apache-cassandra-2.0.15/conf/cassandra.yaml

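In $CASSANDRA_HOME/conf/clusters/cluster_name.conf, change the [stomp] section as follows. (This is the OpsCenter cluster configuration file; it normally sits under the OpsCenter install directory's conf/clusters/ for a tarball install, or under /etc/opscenter/clusters/ for a package install.)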

[stomp]
batch_size = 10000
push_interval = 10
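The cluster configuration file uses an INI-style layout, so the change can also be scripted. The sketch below uses Python's standard `configparser` to set the two `[stomp]` options on the file's text. The function name `set_stomp_options` is hypothetical, and note that `configparser` does not preserve comments when it writes the file back.

```python
import configparser
import io


def set_stomp_options(conf_text, batch_size, push_interval):
    """Return conf_text with [stomp] batch_size and push_interval set."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("stomp"):
        parser.add_section("stomp")
    parser.set("stomp", "batch_size", str(batch_size))
    parser.set("stomp", "push_interval", str(push_interval))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

A typical use would be to read the existing cluster_name.conf, pass its contents through `set_stomp_options(text, 10000, 10)`, write the result back, and restart OpsCenter so that it rereads the file.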

Parameter notes

# address.yaml parameters
thrift_max_conns - the max number of concurrent connections to make to the local node
async_pool_size - the size of the thread pool pulling from a queue of inserts and inserting into Cassandra
async_queue_size - the size of the queue of inserts to send to Cassandra; if the queue fills up, additional operations will be dropped

# stomp parameters
batch_size - the number of request updates OpsCenter will push out at once. The default value is 100. This is used to avoid overloading the browser.
push_interval - how often OpsCenter will push out updates to requests. The default value is 3 seconds. This is used to avoid overloading the browser.
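To illustrate how async_queue_size and async_pool_size interact, here is a toy model in Python. It is not the agent's implementation, only a sketch of a bounded queue that discards new operations when full, as the descriptions above state. The function `simulate` and its parameters are hypothetical: `drained_per_tick` stands in for the throughput of the worker pool.

```python
import queue


def simulate(queue_size, produced_per_tick, drained_per_tick, ticks):
    """Toy model of a bounded insert queue; returns operations dropped."""
    q = queue.Queue(maxsize=queue_size)
    dropped = 0
    for _ in range(ticks):
        # Producers enqueue operations; a full queue discards the new one.
        for _ in range(produced_per_tick):
            try:
                q.put_nowait(object())
            except queue.Full:
                dropped += 1
        # Workers remove up to drained_per_tick operations.
        for _ in range(drained_per_tick):
            try:
                q.get_nowait()
            except queue.Empty:
                break
    return dropped
```

With a queue of 10, 5 operations arriving and 3 drained per tick, `simulate(10, 5, 3, 10)` drops 13 operations over 10 ticks. Draining 5 per tick, as in `simulate(10, 3, 5, 10)`, drops none. A larger queue, as in `simulate(100, 5, 3, 10)`, also drops none in that window, but only delays the problem if production keeps outpacing the workers.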

done.

Official OpsCenter configuration documentation
