openfalcon-mongodb monitoring plugin


Download address

https://github.com/ZhuoRoger/mongomon

Modified version download

  • http://pan.baidu.com/s/1pLgwOob
  • Changed uptime to COUNTER type, which makes it easier to alert on
  • Fixed a replica-set bug so that metrics can be collected correctly from a MongoDB cluster

Testing the MongoDB connection yourself

./mongo -u admin2 -p "123" --authenticationDatabase admin
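If you prefer to check from Python instead of the mongo shell, a minimal sketch along the lines of what the plugin itself does (see mongodb_server.py below) looks like this; the host, port and credentials are example values:

# Quick connectivity check, mirroring the plugin's connect logic (pymongo < 4.0).
from pymongo import MongoClient

conn = MongoClient("127.0.0.1", 27017, serverSelectionTimeoutMS=1000)
conn["admin"].authenticate("admin2", "123")   # same auth call the plugin uses
print conn.server_info()["version"]           # raises if the server is unreachable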

Configure conf/mongomon.conf

items:
  - {port: 27017, user: "admin2", password: "123"}

Adjustments to bin/mongodb_monitor.py

  • Adjust the hostname; I usually prefer to set it to the machine's IP address (a sketch follows this list)
  • (screenshot of the change omitted)
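For reference, a hedged sketch of the kind of change I made; the helper below (get_local_ip is my own name, not part of the plugin) is one way to report an IP address instead of the hostname:

# Hypothetical helper, not part of the original plugin, showing one way to
# report the endpoint as an IP address instead of the hostname.
import socket

def get_local_ip():
    # "Connect" a UDP socket to an external address just to learn which local
    # IP the kernel would route through; no packet is actually sent.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 53))
        return s.getsockname()[0]
    finally:
        s.close()

endpoint = get_local_ip()   # use this instead of socket.gethostname() when pushing to falcon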

mongodb monitor cron

# Use this one under root (system crontab format, with a user field):
* * * * * user (cd /opt/abc/agent/falcon-plugins/mongomon/bin/ && python mongodb_monitor.py > /dev/null)

# Or, in that user's own crontab:
* * * * * cd /opt/abc/agent/falcon-plugins/mongomon/bin/ && python mongodb_monitor.py >> /dev/null

Create the admin account in MongoDB with the root role; otherwise the plugin will not be authorized.

use admin
db.createUser(
  {
    user: "admin2",
    pwd: "123",
    roles: [ { role: "root", db: "admin" } ]
  }
);
exit;

A pitfall

  • My manager asked us to monitor uptime: if it shows no change within 300 minutes, MongoDB is considered down and an alert should fire. Zabbix has a plugin for this, so can Falcon do the same?
  • Before the change:
    (screenshot omitted)

  • After the change:
    (screenshot omitted)
    After some digging I found that in Falcon the uptime metric is a GAUGE. That means if MongoDB has been running for a minute the value is 60, after an hour it is 3600, and after a few days the number becomes huge; worse, you cannot alert on it because it is not a relative value. So I slightly modified the author's code and added uptime to the counter list. Now we always get a stable value of about 1: assuming MongoDB kept running for the whole minute, the difference between two samples is 60, which divided by the 60-second interval gives 1. In other words, the reported value should hover around 1; anything else means MongoDB is not working. (A small worked example of the arithmetic follows this list.)

Another pitfall

(screenshot of the error omitted)

  • The author's plugin already supports monitoring both standalone nodes and replica sets, but in the replica-set path it uses MongoDB's Timestamp type from bson, which requires an extra Python package, and the JSON handling is missing a ["ts"] lookup in two places; together these break metric collection. The essence of the fix is sketched right after this list, and the full modified code follows it; simply overwrite mongodb_server.py with it.

  • Feel free to keep testing it; please leave a comment if you run into problems.

  • Don't forget to pip install mongo-connector. Reference: http://nullege.com/codes/search/bson.timestamp.Timestamp
  • To sum up the troubleshooting approach: start from the error message, then add type(something) to see what the object actually is.
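The core of the fix is just the optime lookup: in newer replica-set status output, optime is a sub-document, so the Timestamp has to be read from its "ts" field before .time is taken. A tiny self-contained demo of that shape (simplified, with a made-up Timestamp value):

from bson.timestamp import Timestamp

# Simplified shape of a member entry returned by replSetGetStatus.
repl_member = {"optime": {"ts": Timestamp(1500000000, 1), "t": 5}}

# broken:  repl_member["optime"].time   -> AttributeError: 'dict' object has no attribute 'time'
# fixed:
optime_seconds = repl_member["optime"]["ts"].time
print optime_seconds    # -> 1500000000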
#! /bin/env python
# -*- coding:utf8 -*-
# Filename: mongodb_server.py
import sys
import os
from bson.timestamp import Timestamp
import pymongo
from pymongo import MongoClient


class mongodbMonitor(object):

    def mongodb_connect(self, host=None, port=None, user=None, password=None):
#       try:
            conn = MongoClient(host, port, serverSelectionTimeoutMS=1000)  # connection timeout 1 sec.
            if user and password:
                db_admin = conn["admin"]
                if not db_admin.authenticate(user, password):
                    pass
            conn.server_info()
            print conn
#       except:
#           e = sys.exc_info()[0]
#           return e, None
            return 0, conn

    # data node(1): standalone, replset primary, replset secondary. mongos(2), mongoConfigSrv(3)
    def get_mongo_role(self, conn):
        mongo_role = 1
        conn.server_info()
        if (conn.is_mongos):
            mongo_role = 2
        elif ("chunks" in conn.get_database("config").collection_names()):  # not mongos and has config.chunks collection: it's a config server.
            mongo_role = 3
        return mongo_role

    def get_mongo_monitor_data(self, conn):
        mongo_monitor_dict = {}
        mongo_monitor_dict["mongo_local_alive"] = 1  # mongo local alive metric for all nodes.

        mongo_role = self.get_mongo_role(conn)

        if (mongo_role == 1):
            mongodb_role, serverStatus_dict = self.serverStatus(conn)
            mongo_monitor_dict.update(serverStatus_dict)

            repl_status_dict = {}
            if (mongodb_role == "master" or mongodb_role == "secondary"):
                repl_status_dict = self.repl_status(conn)
                mongo_monitor_dict.update(repl_status_dict)
            else:
                print "this is standalone node"

        elif (mongo_role == 2):  # mongos
            shards_dict = self.shard_status(conn)
            mongo_monitor_dict.update(shards_dict)

        return mongo_monitor_dict

    def serverStatus(self, connection):
        serverStatus = connection.admin.command(pymongo.son_manipulator.SON([('serverStatus', 1)]))
        mongodb_server_dict = {}  # mongodb server status metrics for upload to falcon
        mongo_version = serverStatus["version"]

        # uptime metric
        mongodb_server_dict["uptime"] = int(serverStatus["uptime"])

        # asserts section metrics
        mongo_asserts = serverStatus["asserts"]
        for asserts_key in mongo_asserts.keys():
            asserts_key_name = "asserts_" + asserts_key
            mongodb_server_dict[asserts_key_name] = mongo_asserts[asserts_key]

        ### "extra_info" section metrics: page_faults. falcon COUNTER type.
        if serverStatus.has_key("extra_info"):
            mongodb_server_dict["page_faults"] = serverStatus["extra_info"]["page_faults"]

        ### "connections" section metrics
        current_conn = serverStatus["connections"]["current"]
        available_conn = serverStatus["connections"]["available"]
        mongodb_server_dict["connections_current"] = current_conn
        mongodb_server_dict["connections_available"] = available_conn
        # mongodb connection used percent (float math, so integer division does not truncate to 0)
        mongodb_server_dict["connections_used_percent"] = int(current_conn * 100.0 / (current_conn + available_conn))
        # total created since mongodb started. COUNTER metric
        mongodb_server_dict["connections_totalCreated"] = serverStatus["connections"]["totalCreated"]

        # "globalLock" currentQueue
        mongodb_server_dict["globalLock_currentQueue_total"] = serverStatus["globalLock"]["currentQueue"]["total"]
        mongodb_server_dict["globalLock_currentQueue_readers"] = serverStatus["globalLock"]["currentQueue"]["readers"]
        mongodb_server_dict["globalLock_currentQueue_writers"] = serverStatus["globalLock"]["currentQueue"]["writers"]

        # "locks" section, changed in version 3.0
        if serverStatus.has_key("locks") and mongo_version > "3.0":
            locks_dict_keys = serverStatus["locks"].keys()
            for lock_scope in locks_dict_keys:  # Global, Database, Collection, Oplog
                for lock_metric in serverStatus["locks"][lock_scope]:
                    for lock_type in serverStatus["locks"][lock_scope][lock_metric]:
                        if lock_type == "R":
                            lock_name = "Slock"
                        elif lock_type == "W":
                            lock_name = "Xlock"
                        elif lock_type == "r":
                            lock_name = "ISlock"
                        elif lock_type == "w":
                            lock_name = "IXlock"
                        lock_metric_key = "locks_" + lock_scope + "_" + lock_metric + "_" + lock_name
                        mongodb_server_dict[lock_metric_key] = serverStatus["locks"][lock_scope][lock_metric][lock_type]

        # "network" section metrics: bytesIn, bytesOut, numRequests; COUNTER type
        if serverStatus.has_key("network"):
            for network_metric in serverStatus["network"].keys():
                network_metric_key = "network_" + network_metric  # network metric key for upload
                mongodb_server_dict[network_metric_key] = serverStatus["network"][network_metric]

        ### "opcounters" section metrics: insert, query, update, delete, getmore, command. COUNTER type
        if serverStatus.has_key("opcounters"):
            for opcounters_metric in serverStatus["opcounters"].keys():
                opcounters_metric_key = "opcounters_" + opcounters_metric
                mongodb_server_dict[opcounters_metric_key] = serverStatus["opcounters"][opcounters_metric]

        ### "opcountersRepl" section metrics: insert, query, update, delete, getmore, command. COUNTER type
        if serverStatus.has_key("opcountersRepl"):
            for opcountersRepl_metric in serverStatus["opcountersRepl"].keys():
                opcountersRepl_metric_key = "opcountersRepl_" + opcountersRepl_metric
                # bug fix: read from "opcountersRepl", not "opcounters"
                mongodb_server_dict[opcountersRepl_metric_key] = serverStatus["opcountersRepl"][opcountersRepl_metric]

        ### "mem" section metrics:
        if serverStatus.has_key("mem"):
            for mem_metric in serverStatus["mem"].keys():
                mem_metric_key = "mem_" + mem_metric
                if (mem_metric in ["bits", "supported"]):
                    mongodb_server_dict[mem_metric_key] = serverStatus["mem"][mem_metric]
                else:
                    mongodb_server_dict[mem_metric_key] = serverStatus["mem"][mem_metric] * 1024 * 1024

        ### "dur" section metrics:
        if serverStatus.has_key("dur"):
            mongodb_server_dict["dur_journaledBytes"] = serverStatus["dur"]["journaledMB"] * 1024 * 1024
            mongodb_server_dict["dur_writeToDataFilesBytes"] = serverStatus["dur"]["writeToDataFilesMB"] * 1024 * 1024
            mongodb_server_dict["dur_commitsInWriteLock"] = serverStatus["dur"]["commitsInWriteLock"]

        ### "repl" section
        mongodb_role = ""
        if (serverStatus.has_key("repl") and serverStatus["repl"].has_key("secondary")):
            if serverStatus["repl"]["ismaster"]:
                mongodb_role = "master"
            if serverStatus["repl"]["secondary"]:
                mongodb_role = "secondary"
        else:  # not replica set mode
            mongodb_role = "standalone"

        ### "backgroundFlushing" section metrics, only for MMAPv1
        if serverStatus.has_key("backgroundFlushing"):
            for bgFlush_metric in serverStatus["backgroundFlushing"].keys():
                if bgFlush_metric != "last_finished":  # discard last_finished metric
                    bgFlush_metric_key = "backgroundFlushing_" + bgFlush_metric
                    mongodb_server_dict[bgFlush_metric_key] = serverStatus["backgroundFlushing"][bgFlush_metric]

        ### cursor from "metrics" section
        if serverStatus.has_key("metrics") and serverStatus["metrics"].has_key("cursor"):
            cursor_status = serverStatus["metrics"]["cursor"]
            mongodb_server_dict["cursor_timedOut"] = cursor_status["timedOut"]
            mongodb_server_dict["cursor_open_noTimeout"] = cursor_status["open"]["noTimeout"]
            mongodb_server_dict["cursor_open_pinned"] = cursor_status["open"]["pinned"]
            mongodb_server_dict["cursor_open_total"] = cursor_status["open"]["total"]

        ### "wiredTiger" section
        if serverStatus.has_key("wiredTiger"):
            serverStatus_wt = serverStatus["wiredTiger"]
            # cache
            wt_cache = serverStatus_wt["cache"]
            mongodb_server_dict["wt_cache_used_total_bytes"] = wt_cache["bytes currently in the cache"]
            mongodb_server_dict["wt_cache_dirty_bytes"] = wt_cache["tracked dirty bytes in the cache"]
            mongodb_server_dict["wt_cache_readinto_bytes"] = wt_cache["bytes read into cache"]
            mongodb_server_dict["wt_cache_writtenfrom_bytes"] = wt_cache["bytes written from cache"]
            # concurrentTransactions
            wt_concurrentTransactions = serverStatus_wt["concurrentTransactions"]
            mongodb_server_dict["wt_concurrentTransactions_write"] = wt_concurrentTransactions["write"]["available"]
            mongodb_server_dict["wt_concurrentTransactions_read"] = wt_concurrentTransactions["read"]["available"]
            # "block-manager" section
            wt_block_manager = serverStatus_wt["block-manager"]
            mongodb_server_dict["wt_bm_bytes_read"] = wt_block_manager["bytes read"]
            mongodb_server_dict["wt_bm_bytes_written"] = wt_block_manager["bytes written"]
            mongodb_server_dict["wt_bm_blocks_read"] = wt_block_manager["blocks read"]
            mongodb_server_dict["wt_bm_blocks_written"] = wt_block_manager["blocks written"]

        ### "rocksdb" engine
        if serverStatus.has_key("rocksdb"):
            serverStatus_rocksdb = serverStatus["rocksdb"]
            mongodb_server_dict["rocksdb_num_immutable_mem_table"] = serverStatus_rocksdb["num-immutable-mem-table"]
            mongodb_server_dict["rocksdb_mem_table_flush_pending"] = serverStatus_rocksdb["mem-table-flush-pending"]
            mongodb_server_dict["rocksdb_compaction_pending"] = serverStatus_rocksdb["compaction-pending"]
            mongodb_server_dict["rocksdb_background_errors"] = serverStatus_rocksdb["background-errors"]
            mongodb_server_dict["rocksdb_num_entries_active_mem_table"] = serverStatus_rocksdb["num-entries-active-mem-table"]
            mongodb_server_dict["rocksdb_num_entries_imm_mem_tables"] = serverStatus_rocksdb["num-entries-imm-mem-tables"]
            mongodb_server_dict["rocksdb_num_snapshots"] = serverStatus_rocksdb["num-snapshots"]
            mongodb_server_dict["rocksdb_oldest_snapshot_time"] = serverStatus_rocksdb["oldest-snapshot-time"]
            mongodb_server_dict["rocksdb_num_live_versions"] = serverStatus_rocksdb["num-live-versions"]
            mongodb_server_dict["rocksdb_total_live_recovery_units"] = serverStatus_rocksdb["total-live-recovery-units"]

        ### "PerconaFT" engine
        if serverStatus.has_key("PerconaFT"):
            serverStatus_PerconaFT = serverStatus["PerconaFT"]
            mongodb_server_dict["PerconaFT_log_count"] = serverStatus_PerconaFT["log"]["count"]
            mongodb_server_dict["PerconaFT_log_time"] = serverStatus_PerconaFT["log"]["time"]
            mongodb_server_dict["PerconaFT_log_bytes"] = serverStatus_PerconaFT["log"]["bytes"]
            mongodb_server_dict["PerconaFT_fsync_count"] = serverStatus_PerconaFT["fsync"]["count"]
            mongodb_server_dict["PerconaFT_fsync_time"] = serverStatus_PerconaFT["fsync"]["time"]

            ### cachetable
            PerconaFT_cachetable = serverStatus_PerconaFT["cachetable"]
            mongodb_server_dict["PerconaFT_cachetable_size_current"] = PerconaFT_cachetable["size"]["current"]
            mongodb_server_dict["PerconaFT_cachetable_size_writing"] = PerconaFT_cachetable["size"]["writing"]
            mongodb_server_dict["PerconaFT_cachetable_size_limit"] = PerconaFT_cachetable["size"]["limit"]

            ### PerconaFT checkpoint
            PerconaFT_checkpoint = serverStatus_PerconaFT["checkpoint"]
            mongodb_server_dict["PerconaFT_checkpoint_count"] = PerconaFT_checkpoint["count"]
            mongodb_server_dict["PerconaFT_checkpoint_time"] = PerconaFT_checkpoint["time"]
            mongodb_server_dict["PerconaFT_checkpoint_write_nonleaf_count"] = PerconaFT_checkpoint["write"]["nonleaf"]["count"]
            mongodb_server_dict["PerconaFT_checkpoint_write_nonleaf_time"] = PerconaFT_checkpoint["write"]["nonleaf"]["time"]
            mongodb_server_dict["PerconaFT_checkpoint_write_nonleaf_bytes_compressed"] = PerconaFT_checkpoint["write"]["nonleaf"]["bytes"]["compressed"]
            mongodb_server_dict["PerconaFT_checkpoint_write_nonleaf_bytes_uncompressed"] = PerconaFT_checkpoint["write"]["nonleaf"]["bytes"]["uncompressed"]
            mongodb_server_dict["PerconaFT_checkpoint_write_leaf_count"] = PerconaFT_checkpoint["write"]["leaf"]["count"]
            mongodb_server_dict["PerconaFT_checkpoint_write_leaf_time"] = PerconaFT_checkpoint["write"]["leaf"]["time"]
            mongodb_server_dict["PerconaFT_checkpoint_write_leaf_bytes_compressed"] = PerconaFT_checkpoint["write"]["leaf"]["bytes"]["compressed"]
            mongodb_server_dict["PerconaFT_checkpoint_write_leaf_bytes_uncompressed"] = PerconaFT_checkpoint["write"]["leaf"]["bytes"]["uncompressed"]

            ### serializeTime
            for serializeTime_item in serverStatus_PerconaFT["serializeTime"]:
                prefix = "PerconaFT_serializeTime_" + serializeTime_item
                for serializeTime_key in serverStatus_PerconaFT["serializeTime"][serializeTime_item]:
                    key_name = prefix + "_" + serializeTime_key
                    mongodb_server_dict[key_name] = serverStatus_PerconaFT["serializeTime"][serializeTime_item][serializeTime_key]

            ### PerconaFT compressionRatio
            for compressionRatio_item in serverStatus_PerconaFT["compressionRatio"]:
                key_name = "PerconaFT_compressionRatio_" + compressionRatio_item
                mongodb_server_dict[key_name] = serverStatus_PerconaFT["compressionRatio"][compressionRatio_item]

        return (mongodb_role, mongodb_server_dict)

    def repl_status(self, connection):
        replStatus = connection.admin.command("replSetGetStatus")
        print replStatus

        repl_status_dict = {}  # repl set metric dict
        # myState "1" for PRIMARY , "2" for SECONDARY, "3":
        repl_status_dict["repl_myState"] = replStatus["myState"]

        repl_status_members = replStatus["members"]
        master_optime = 0  # master oplog ops time
        myself_optime = 0  # SECONDARY oplog ops time

        print "start printing repl_status_members"
        print repl_status_members
        print "finished printing repl_status_members"

        for repl_member in repl_status_members:
            if repl_member.has_key("self") and repl_member["self"]:
                repl_status_dict["repl_health"] = repl_member["health"]
                # repl_status_dict["repl_optime"] = repl_member["optime"].time
                # print "value of optime ts is:"
                # print type(repl_member["optime"])
                # print type(repl_member["optime"]["ts"])
                print repl_member["optime"]["ts"].time
                repl_status_dict["repl_optime"] = repl_member["optime"]["ts"].time

                if repl_member.has_key("electionTime"):
                    repl_status_dict["repl_electionTime"] = repl_member["electionTime"].time
                if repl_member.has_key("configVersion"):
                    repl_status_dict["repl_configVersion"] = repl_member["configVersion"]

                # myself_optime = repl_member["optime"].time
                myself_optime = repl_member["optime"]["ts"].time

            if (replStatus["myState"] == 2 and repl_member["state"] == 1):  # on a SECONDARY, get repl lag
                master_optime = repl_member["optime"]["ts"].time

        if replStatus["myState"] == 2:
            repl_status_dict["repl_lag"] = master_optime - myself_optime

        ### oplog window hours
        oplog_collection = connection["local"]["oplog.rs"]
        oplog_tFirst = oplog_collection.find({}, {"ts": 1}).sort('$natural', pymongo.ASCENDING).limit(1).next()
        oplog_tLast = oplog_collection.find({}, {"ts": 1}).sort('$natural', pymongo.DESCENDING).limit(1).next()
        oplogrs_collstats = connection["local"].command("collstats", "oplog.rs")

        window_multiple = 1  # oplog.rs collection is not full
        if oplogrs_collstats.has_key("maxSize"):
            window_multiple = oplogrs_collstats["maxSize"] / (oplogrs_collstats["count"] * oplogrs_collstats["avgObjSize"])
        else:
            window_multiple = oplogrs_collstats["storageSize"] / (oplogrs_collstats["count"] * oplogrs_collstats["avgObjSize"])

        # oplog_window  .xx hours
        oplog_window = round((oplog_tLast["ts"].time - oplog_tFirst["ts"].time) / 3600.0, 2) * window_multiple  # full
        repl_status_dict["repl_oplog_window"] = oplog_window

        return repl_status_dict

    # only for mongos node
    def shard_status(self, conn):
        config_db = conn["config"]
        settings_col = config_db["settings"]
        balancer_doc = settings_col.find_one({'_id': 'balancer'})

        shards_dict = {}
        if balancer_doc is None:
            shards_dict["shards_BalancerState"] = 1
        elif balancer_doc["stopped"]:
            shards_dict["shards_BalancerState"] = 0
        else:
            shards_dict["shards_BalancerState"] = 1

        # shards_activeWindow metric, 0: without setting, 1: setting
        # shards_activeWindow_start metric,  { "start" : "23:30", "stop" : "6:00" } :  23.30 for 23:30
        # shards_activeWindow_stop metric
        if balancer_doc is None:
            shards_dict["shards_activeWindow"] = 0
        elif balancer_doc.has_key("activeWindow"):
            shards_dict["shards_activeWindow"] = 1
            if balancer_doc["activeWindow"].has_key("start"):
                window_start = balancer_doc["activeWindow"]["start"]
                shards_dict["shards_activeWindow_start"] = window_start.replace(":", ".")
            if balancer_doc["activeWindow"].has_key("stop"):
                window_stop = balancer_doc["activeWindow"]["stop"]
                shards_dict["shards_activeWindow_stop"] = window_stop.replace(":", ".")

        # shards_chunkSize metric
        chunksize_doc = settings_col.find_one({"_id": "chunksize"})
        if chunksize_doc is not None:
            shards_dict["shards_chunkSize"] = chunksize_doc["value"]

        # shards_isBalancerRunning metric
        locks_col = config_db["locks"]
        balancer_lock_doc = locks_col.find_one({'_id': 'balancer'})
        if balancer_lock_doc is None:
            print "config.locks collection empty or missing. be sure you are connected to a mongos"
            shards_dict["shards_isBalancerRunning"] = 0
        elif balancer_lock_doc["state"] > 0:
            shards_dict["shards_isBalancerRunning"] = 1
        else:
            shards_dict["shards_isBalancerRunning"] = 0

        # shards_size metric
        shards_col = config_db["shards"]
        shards_dict["shards_size"] = shards_col.count()

        # shards_mongosSize metric
        mongos_col = config_db["mongos"]
        shards_dict["shards_mongosSize"] = mongos_col.count()

        return shards_dict
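For completeness, a hedged sketch of how this class is typically driven; this roughly mirrors what bin/mongodb_monitor.py does, but the real script's push logic may differ:

# Hypothetical driver, only to illustrate how mongodbMonitor is used.
from mongodb_server import mongodbMonitor

monitor = mongodbMonitor()
err, conn = monitor.mongodb_connect(host="127.0.0.1", port=27017, user="admin2", password="123")
if conn is not None:
    metrics = monitor.get_mongo_monitor_data(conn)   # dict of metric_name -> value
    for name, value in metrics.items():
        print name, value   # the real script pushes these to the local falcon-agent instead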

Useful metrics

uptime

uptime/mongo=27017

opcounters

opcounters_insert/mongo=27017
opcounters_query/mongo=27017
opcounters_update/mongo=27017
opcounters_delete/mongo=27017
opcounters_getmore/mongo=27017
opcounters_command/mongo=27017

connections

connections_available/mongo=27017
connections_current/mongo=27017
connections_totalCreated/mongo=27017
connections_used_percent/mongo=27017

Appendix

  • The official Open-Falcon documentation on its data types:
    https://github.com/open-falcon/book/blob/master/zh/usage/data-push.md

  • counterType: must be either COUNTER or GAUGE; the former means the metric is a counter, the latter means the raw value is stored as-is (note the capitalization)

  • GAUGE: whatever value the user pushes is stored unchanged
  • COUNTER: when stored and displayed, the metric is converted into a rate, i.e. (current value - previous value) / time interval (see the push example after this list)
  • Meaning of the MongoDB metrics: http://blog.itpub.net/15498/viewspace-1978198/
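To tie this together, here is a minimal sketch of pushing the uptime metric to a local falcon-agent with counterType set to COUNTER, following the payload format described in the data-push document above; the endpoint, value and tag are made-up examples:

# Minimal push sketch -- field names follow the Open-Falcon data-push doc.
import json
import time
import urllib2

payload = [{
    "endpoint": "10.0.0.12",        # we report the IP instead of the hostname
    "metric": "uptime",
    "timestamp": int(time.time()),
    "step": 60,                     # collection interval in seconds
    "value": 86460,                 # raw uptime; falcon turns it into a rate because of COUNTER
    "counterType": "COUNTER",
    "tags": "mongo=27017",
}]

req = urllib2.Request("http://127.0.0.1:1988/v1/push", json.dumps(payload),
                      {"Content-Type": "application/json"})
print urllib2.urlopen(req).read()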