mysql performance schema

来源:互联网 发布:linux服务开机自启动 编辑:程序博客网 时间:2024/05/29 09:30
Mysql Performance SchemaDBA/李思亮我们使用mysql 5.5 也有一段时间了,对于performance schema 引擎的使用问题,一直没有开展起来,主要原因是资料太少。这段时间花了一些时间专门的关注了一下,形成本文档:Mysql 5.5 版本是第一个版本,本文提到的部分数据可能不适用,其中包括了mysql 5.6 的一些情况。从数据库内部讲这些表关注了数据库的4 个模块分别是:1. Mutex (互斥体,锁相关)2. RWLOCKs (读写锁)3. File I/O (文件io)4. Thread (线程)包括了全部的mutex 42 个, rwlocks 10 个,以及6 种类型的线程(thread),文件io 在分为数据(data) , 日志(log),临时文件(temp file i/o) 。Performance schema 记录的信息,比较底层,如果想深入的研究,需要对mysql的internal 有一定的了解, 本篇,我们不关注源代码级的信息,仅从一个使用者的角度来看,如何使用performance schema 为数据库优化提供一些建议。Performance schema 下面的表:mysql.cnt_it.sock@performance_schema> show tables ;+----------------------------------------------+| Tables_in_performance_schema |+----------------------------------------------+| cond_instances || events_waits_current || events_waits_history || events_waits_history_long || events_waits_summary_by_instance || events_waits_summary_by_thread_by_event_name || events_waits_summary_global_by_event_name || file_instances || file_summary_by_event_name || file_summary_by_instance || mutex_instances || performance_timers || rwlock_instances || setup_consumers || setup_instruments || setup_timers || threads |+----------------------------------------------+17 rows in set (0.00 sec)这些表基本可以分为4部分一类是instance 表,Instance 表记录的是那种类型的对象被探测或者记录。instance 表:mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES-> WHERE TABLE_SCHEMA = 'performance_schema'-> AND TABLE_NAME LIKE '%instances';+------------------+| TABLE_NAME |+------------------+| cond_instances || file_instances || mutex_instances || rwlock_instances |+------------------+记录各种等待事件涉及到的实例: 主要是3 类: cond (容器? )mutex(互斥锁) ,rwlock (读写锁)这些表是只读的。一类是Setup 表Setup 表记录了配置信息,以及监控的参数等等。setup 表:mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES-> WHERE TABLE_SCHEMA = 'performance_schema'-> AND TABLE_NAME LIKE 'setup%';+-------------------+| TABLE_NAME |+-------------------+| setup_consumers || setup_instruments || setup_timers |+-------------------+setup_consumers 描述各种事件setup_instruments 描述这个数据库下的表名以及是否开启监控。setup_timers 描述监控选项已经采样频率的时间间隔这个要多说一点目前performance-schema 只支持'wait' 时间的监控,代码树上wait/ 下的函数都可以监控到。文档上说了只有'wait' 事件的检测,有没有其他的选项呢?看看源代码:static row_setup_timers all_setup_timers_data[COUNT_SETUP_TIMERS]={{{ C_STRING_WITH_LEN("wait") },&wait_timer}};THR_LOCK table_setup_timers::m_table_lock;int table_setup_timers::update_row_values(TABLE *table,const unsigned char *,unsigned char *,Field **fields){Field *f;longlong value;DBUG_ASSERT(m_row);for (; (f= *fields) ; fields++){if (bitmap_is_set(table->write_set, f->field_index)){switch(f->field_index){case 0: /* NAME */my_error(ER_WRONG_PERFSCHEMA_USAGE, MYF(0));return HA_ERR_WRONG_COMMAND;case 1: /* TIMER_NAME */value= get_field_enum(f);if ((value >= FIRST_TIMER_NAME) && (value <=LAST_TIMER_NAME))*(m_row->m_timer_name_ptr)= (enum_timer_name) value;elsereturn HA_ERR_WRONG_COMMAND;break;default:DBUG_ASSERT(false);}}}return 0;}代码里写死了,只有'wait' 一个值,不排除以后的版本会增加新的关键字,但至少目前就只有一个啦。并且这个表的name 字段是不允许修改的的。下面的修改的方法里没有做任何处理,涉及到name 字段的修改,直接报错。mysql> SELECT * FROM setup_timers;+------+------------+| NAME | TIMER_NAME |+------+------------+| wait | CYCLE |+------+------------+只有timer_name 可以update 这是一个enum 字段。一类是event 表Event 表记录了各种事件,提供了当前事件(current) ,历史事件(history) 以及history_long 事件,history 表记录了每个线程的最近的10 个事件,history_long记录了10000 个事件。这两个历史表都是先进先出(FIFO)的规则。性能事件表:mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES-> WHERE TABLE_SCHEMA = 'performance_schema'-> AND TABLE_NAME LIKE '%current';+----------------------+| TABLE_NAME |+----------------------+| events_waits_current |+----------------------+记录当前正在发生的等待事件,这个表是只读的表,不能update ,delete ,但是可以truncate具体字段是什么意思就自己去查doc 了,这里不说了。性能历史表:mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES-> WHERE TABLE_SCHEMA = 'performance_schema'-> AND (TABLE_NAME LIKE '%history' OR TABLE_NAME LIKE'%history_long');+---------------------------+| TABLE_NAME |+---------------------------+| events_waits_history || events_waits_history_long |+---------------------------+这些表与前面的性能表的结构是一致的, history 表只保留每个线程(thread) 的最近的10 个事件, history_long 记录最近的10000 个事件。新事件入表,如果旧表满了,就会丢弃旧的数据,标准的先进先出(FIFO) 这俩表也是只读表,只能truncate还有一类是summary 表Summary 表是对event 的统计数据。事件汇总表:mysql> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES-> WHERE TABLE_SCHEMA = 'performance_schema'-> AND TABLE_NAME LIKE '%summary%';+----------------------------------------------+| TABLE_NAME |+----------------------------------------------+| events_waits_summary_by_instance || events_waits_summary_by_thread_by_event_name || events_waits_summary_global_by_event_name || file_summary_by_event_name || file_summary_by_instance |+----------------------------------------------+按照相关的标准对进行的事件统计表,events_waits_summary_global_by_event_name 在mysql5.5.7 以前叫:EVENTS_WAITS_SUMMARY_BY_EVENT_NAME表也是只读的,只能turcate对于performance schema 下的每个表的详细定义,以及字段的解读,请参考手册:http://dev.mysql.com/doc/refman/5.5/en/performance-schema-table-descriptions.html大小写问题另外一个问题,performance schema 库下的表,或字段名,在查询的时候,建议全部小写, 原因是5.5.5 以前是大写,导致在某些平台上lower_case_table_names =1 出现问题,5.5.8 后全部改为小写字母。启动设置问题Performance schama 默认是不启用的,启动的参数为静态参数,需要在my.cnf 里设置performance_schema=1,或者设定命令行参数。当服务器启动起来的时候,会在err 日志里记录一条信息:“100407 16:13:02 [Note] Buffered information: Performance schema enabled.”如果是关闭的则会显示另外一个信息:“100407 16:13:02 [Note] Buffered information: Performance schema disabled (reason:start parameters)”还有中情况,设置了参数但是performance schema 启动失败,这种情况极少发生,日志里的信息:“Performance schema disabled (reason: init failed)”, 启动失败情况下,performacneschema 会关闭自己,并且mysql 数据库会继续启动,起结果就跟关闭启动参数是一样的。相应的系统参数:mysql.cnt_it.sock@performance_schema> show global variables like'%performance_schema%' ;+---------------------------------------------------+---------+| Variable_name | Value |+---------------------------------------------------+---------+| performance_schema | ON || performance_schema_events_waits_history_long_size | 10000 || performance_schema_events_waits_history_size | 10 || performance_schema_max_cond_classes | 80 || performance_schema_max_cond_instances | 1000 || performance_schema_max_file_classes | 50 || performance_schema_max_file_handles | 32768 || performance_schema_max_file_instances | 10000 || performance_schema_max_mutex_classes | 200 || performance_schema_max_mutex_instances | 1000000 || performance_schema_max_rwlock_classes | 30 || performance_schema_max_rwlock_instances | 1000000 || performance_schema_max_table_handles | 100000 || performance_schema_max_table_instances | 50000 || performance_schema_max_thread_classes | 50 || performance_schema_max_thread_instances | 1000 |+---------------------------------------------------+---------+详解这些参数:http://dev.mysql.com/doc/refman/5.5/en/performance-schema-system-variables.html参考mysql 的官方文档。对应的状态参数:mysql.cnt_house.sock@performance_schema> show global status like'%performance_schema%' ;+------------------------------------------+-------+| Variable_name | Value |+------------------------------------------+-------+| Performance_schema_cond_classes_lost | 0 || Performance_schema_cond_instances_lost | 0 || Performance_schema_file_classes_lost | 0 || Performance_schema_file_handles_lost | 0 || Performance_schema_file_instances_lost | 0 || Performance_schema_locker_lost | 0 || Performance_schema_mutex_classes_lost | 0 || Performance_schema_mutex_instances_lost | 0 || Performance_schema_rwlock_classes_lost | 0 || Performance_schema_rwlock_instances_lost | 0 || Performance_schema_table_handles_lost | 0 || Performance_schema_table_instances_lost | 0 || Performance_schema_thread_classes_lost | 0 || Performance_schema_thread_instances_lost | 0 |+------------------------------------------+-------+14 rows in set (0.00 sec)参数详解:http://dev.mysql.com/doc/refman/5.5/en/performance-schema-status-variables.html这些状态值,显示的是因为内存吃紧,或者系统太忙,导致的一些没有performanceschema 记录下来的事件。对于performance schema 运行状态如何?可以使用命令show engine status 来看:mysql.cnt_house.sock@performance_schema> show engine performance_schemastatus \G*************************** 1. row ***************************Type: performance_schemaName: events_waits_current.row_sizeStatus: 136*************************** 2. row ***************************Type: performance_schemaName: events_waits_current.row_countStatus: 3000*************************** 3. row ***************************Type: performance_schemaName: events_waits_history.row_sizeStatus: 120*************************** 4. row ***************************Type: performance_schemaName: events_waits_history.row_countStatus: 10000*************************** 5. row ***************************Type: performance_schemaName: events_waits_history.memoryStatus: 1200000*************************** 6. row ***************************Type: performance_schemaName: events_waits_history_long.row_sizeStatus: 120*************************** 7. row ***************************Type: performance_schemaName: events_waits_history_long.row_countStatus: 10000*************************** 8. row ***************************Type: performance_schemaName: events_waits_history_long.memoryStatus: 1200000*************************** 9. row ***************************Type: performance_schemaName: (pfs_mutex_class).row_sizeStatus: 248*************************** 10. row ***************************Type: performance_schemaName: (pfs_mutex_class).row_countStatus: 200*************************** 11. row ***************************Type: performance_schemaName: (pfs_mutex_class).memoryStatus: 49600*************************** 12. row ***************************Type: performance_schemaName: (pfs_rwlock_class).row_sizeStatus: 296*************************** 13. row ***************************Type: performance_schemaName: (pfs_rwlock_class).row_countStatus: 30*************************** 14. row ***************************Type: performance_schemaName: (pfs_rwlock_class).memoryStatus: 8880*************************** 15. row ***************************Type: performance_schemaName: (pfs_cond_class).row_sizeStatus: 216*************************** 16. row ***************************Type: performance_schemaName: (pfs_cond_class).row_countStatus: 80*************************** 17. row ***************************Type: performance_schemaName: (pfs_cond_class).memoryStatus: 17280*************************** 18. row ***************************Type: performance_schemaName: (pfs_thread_class).row_sizeStatus: 136*************************** 19. row ***************************Type: performance_schemaName: (pfs_thread_class).row_countStatus: 50*************************** 20. row ***************************Type: performance_schemaName: (pfs_thread_class).memoryStatus: 6800*************************** 21. row ***************************Type: performance_schemaName: (pfs_file_class).row_sizeStatus: 240*************************** 22. row ***************************Type: performance_schemaName: (pfs_file_class).row_countStatus: 50*************************** 23. row ***************************Type: performance_schemaName: (pfs_file_class).memoryStatus: 12000*************************** 24. row ***************************Type: performance_schemaName: mutex_instances.row_sizeStatus: 136*************************** 25. row ***************************Type: performance_schemaName: mutex_instances.row_countStatus: 1000000*************************** 26. row ***************************Type: performance_schemaName: mutex_instances.memoryStatus: 136000000*************************** 27. row ***************************Type: performance_schemaName: rwlock_instances.row_sizeStatus: 200*************************** 28. row ***************************Type: performance_schemaName: rwlock_instances.row_countStatus: 1000000*************************** 29. row ***************************Type: performance_schemaName: rwlock_instances.memoryStatus: 200000000*************************** 30. row ***************************Type: performance_schemaName: cond_instances.row_sizeStatus: 88*************************** 31. row ***************************Type: performance_schemaName: cond_instances.row_countStatus: 1000*************************** 32. row ***************************Type: performance_schemaName: cond_instances.memoryStatus: 88000*************************** 33. row ***************************Type: performance_schemaName: threads.row_sizeStatus: 504*************************** 34. row ***************************Type: performance_schemaName: threads.row_countStatus: 1000*************************** 35. row ***************************Type: performance_schemaName: threads.memoryStatus: 504000*************************** 36. row ***************************Type: performance_schemaName: file_instances.row_sizeStatus: 624*************************** 37. row ***************************Type: performance_schemaName: file_instances.row_countStatus: 10000*************************** 38. row ***************************Type: performance_schemaName: file_instances.memoryStatus: 6240000*************************** 39. row ***************************Type: performance_schemaName: (pfs_file_handle).row_sizeStatus: 8*************************** 40. row ***************************Type: performance_schemaName: (pfs_file_handle).row_countStatus: 32768*************************** 41. row ***************************Type: performance_schemaName: (pfs_file_handle).memoryStatus: 262144*************************** 42. row ***************************Type: performance_schemaName: events_waits_summary_by_thread_by_event_name.row_sizeStatus: 48*************************** 43. row ***************************Type: performance_schemaName: events_waits_summary_by_thread_by_event_name.row_countStatus: 360000*************************** 44. row ***************************Type: performance_schemaName: events_waits_summary_by_thread_by_event_name.memoryStatus: 17280000*************************** 45. row ***************************Type: performance_schemaName: (pfs_table_share).row_sizeStatus: 488*************************** 46. row ***************************Type: performance_schemaName: (pfs_table_share).row_countStatus: 50000*************************** 47. row ***************************Type: performance_schemaName: (pfs_table_share).memoryStatus: 24400000*************************** 48. row ***************************Type: performance_schemaName: (pfs_table).row_sizeStatus: 72*************************** 49. row ***************************Type: performance_schemaName: (pfs_table).row_countStatus: 100000*************************** 50. row ***************************Type: performance_schemaName: (pfs_table).memoryStatus: 7200000*************************** 51. row ***************************Type: performance_schemaName: performance_schema.memoryStatus: 39446870451 rows in set (0.00 sec)三个命令:Show engine innodb statusShow engine innodb mutexShow engine performacne_schema status统计事件/对象可以分为几类:1)wait/io io 等待事件2)wait/io/file 文件io等待,一般是指等待文件操作完成。3)wait/sync 等待同步对象,内容比较广泛,他的time_wait 时间包括了,请求对一个对象加锁,因为阻塞而导致的等待时间。4)wait/sync/cond 主要用于线程间同步,引起的等待事件。5)wait/sync/mutex 对资源访问的的互斥体的请求等待,mutex 对资源的独占式访问,的一种锁结构。6)wait/sync/rwlock 对变量访问、修改的锁等待。主意是对变量,主要用于线程安全的变量访问。这些表的上的操作限制:对于setup 表上,有些字段是可以update 的,用于控制监控的开启关闭。对于event_wait_* 这样的表式可以truncate 的,但是不能delete,update。可以在特定的诊断任务开始前, 清空原来的数据。对于summary 的表,也可以执行truncate 操作,但是只是把summary的统计字段置0或者null ,而不会删除里面的行记录。用performance schema 诊断性能问题的一般步骤:1.运行测试case2.打开performance schema 收集信息3.排除没有干扰的因素,并关闭对应的收集器,例如确定file i/o 不是问题的根本原因则可以关闭file io 收集,truncate event 表,然后从新收集性能数据。4.重复1-3, 越来越接近根本原因,也可以从events_waits_history_long表的统计分析,问题部分会越来越集中。5.一旦确定了问题的原因,就可以着手解决问题,优化性能。6.mutex_instances.LOCKED_BY_THREAD_ID 和rwlock_instances.WRITE_LOCKED_BY_THREAD_ID这两个字段对于发现性能瓶颈,或则死锁是非常重要的。1) 假设thread1 在等待某个mutex2) 那么可以看看thread1 在等待什么资源。SELECT * FROM events_waits_current WHERE THREAD_ID =thread_1;根据其中的OBJECT_INSTANCE_BEGIN 去关联mutex_instances 表确定是什么当前资源被那个线程占有,mysql.cnt_it.sock@performance_schema> SELECT * FROMmutex_instances where OBJECT_INSTANCE_BEGIN = 15806424;+------------------------------------+-----------------------+---------------------+| NAME | OBJECT_INSTANCE_BEGIN |LOCKED_BY_THREAD_ID |+------------------------------------+-----------------------+---------------------+| wait/synch/mutex/sql/LOG::LOCK_log |15805160 | 1305458 |+------------------------------------+-----------------------+---------------------+1 row in set (0.02 sec)3)。我们就可以去看看这线程当前正在做什么。mysql.cnt_it.sock@performance_schema> select * fromevents_waits_current where thread_id =1305458 \G*************************** 1. row***************************THREAD_ID: 1305458EVENT_ID: 76094EVENT_NAME:wait/synch/cond/sql/MYSQL_BIN_LOG::update_condSOURCE: log.cc:5535TIMER_START: 8958895120369531348TIMER_END: NULLTIMER_WAIT: NULLSPINS: NULLOBJECT_SCHEMA: NULLOBJECT_NAME: NULLOBJECT_TYPE: NULLOBJECT_INSTANCE_BEGIN: 15806424NESTING_EVENT_ID: NULLOPERATION: timed_waitNUMBER_OF_BYTES: NULLFLAGS: 01 row in set (0.00 sec)这是一个示例,我们测试环境,捕捉不到,我就拿了一个bin log 的写线程。Sql 示例:1等待事件的top Nmysql> SELECT EVENT_NAME, SUM(TIMER_WAIT), COUNT(*), SOURCE-> FROM EVENTS_WAITS_HISTORY_LONG-> WHERE EVENT_NAME LIKE “%innodb%”-> GROUP BY SOURCE-> ORDER BY SUM(TIMER_WAIT) DESC;2平均等待时间top N :mysql> SELECT EVENT_NAME, SUM(TIMER_WAIT)/count(*), source-> FROM EVENTS_WAITS_HISTORY_LONG-> WHERE EVENT_NAME LIKE “%innodb%”-> GROUP BY source-> ORDER BY SUM(TIMER_WAIT) / COUNT(*) DESC;EVENTS_WAITS_HISTORY 每个线程10 行记录,EVENTS_WAITS_HISTORY_LONG 表默认是10000 行记录,如果认为尺寸不够大,可以通过修改参数:“performance_schema_events_waits_history_size”“performance_schema_events_waits_history_long_size”这两个变量来调整, history_long 表的最大尺寸是100W 行记录。3最热的mutex :这里的时间单位是微秒(pico-second,或者说是cpu 频率周期)mysql> SELECT EVENT_NAME, COUNT_STAR,SUM_TIMER_WAIT,AVG_TIMER_WAITFROM events_waits_summary_global_by_event_nameWHERE EVENT_NAME LIKE '%innodb%'order BY COUNT_STAR DESC;从下面的结果,我们可以大体了解下,我们电脑网计数器的情况:Buff pool 和日志和undo 是比较忙的,需要优化的。*************************** 1. row***************************EVENT_NAME: wait/synch/mutex/innodb/kernel_mutexCOUNT_STAR: 183768448476SUM_TIMER_WAIT: 4527315253253694AVG_TIMER_WAIT: 24635*************************** 2. row***************************EVENT_NAME: wait/synch/mutex/innodb/buf_pool_mutexCOUNT_STAR: 15121853525SUM_TIMER_WAIT: 3285302101091527AVG_TIMER_WAIT: 217255*************************** 3. row***************************EVENT_NAME: wait/synch/mutex/innodb/log_sys_mutexCOUNT_STAR: 7142949915SUM_TIMER_WAIT: 1559525178971936AVG_TIMER_WAIT: 218330*************************** 4. row***************************EVENT_NAME:wait/synch/mutex/innodb/log_flush_order_mutexCOUNT_STAR: 6649384948SUM_TIMER_WAIT: 691589270730534AVG_TIMER_WAIT: 104008*************************** 5. row***************************EVENT_NAME: wait/synch/mutex/innodb/trx_undo_mutexCOUNT_STAR: 3615217180SUM_TIMER_WAIT: 112854228995684AVG_TIMER_WAIT: 31216PERFORMANCE SCHEMA 的性能问题:这部分没有做过测试,数据来自网上全部默认开启performance schema 大约会有8% 左右的性能消耗,如果开启了performance schema ,但是在setup 表中关闭事件记录,大约会有4% 左右的性能损失。我们的建议:安装mysql 的时候,开启performance schema 参数,并在setup 表中关闭事件记录,这样便于在遇到严重问题,需要打开性能日志的时候,需要重启mysql
0 0