MySQL: Query Parsing (MySQL query解析)
来源:互联网 发布:亲戚借身份证开淘宝店 编辑:程序博客网 时间:2024/05/22 06:56
Lexical analysis and parsing are prerequisite for any language. There are many tools available in the industry which can help in achieving this goal.
To give you an idea, following is a listing of some of the available lexer and parser generator.
Lexical analyzer generators:-
- ANTLR – Can generate lexical analyzers and parsers.
- Flex – Alternative variant of the classic “lex” (C/C++).
- JFlex – A rewrite of JLex.
- Ragel – A state machine and lexical scanner generator with output support for C, C++, C#, Objective-C, D, Java, Go and Ruby source code.
Parser generators:-
- ANTLR
- Happy for Haskell
- CUP for Java
- Bison for C and C++
- (YACC = Yet Another Compiler Compiler – the classic for C)
In this post, I will be discussing a brief about the major functions, classes and concepts used during MySQL query parsing.
Lexer used in MySQL
On the contrary, MySQL uses a hand-written lexer for lexical analysis. The benefit being most of the code can be optimized keeping SQL syntax in mind.
(Please refer to the files sql_lex.h and sql_lex.cc)
Some key observations:-
- MYSQLlex() is the main lex function, called by the parser (yyparse()).
- lex_one_token() is the key function used for tokenizing.
- lex_start() and lex_end() are the functions which are called before and after every input query to be parsed.
- All lexer’s terminals (MySQL tokens) are defined in the file lex.h, mainly divided into two groups:-
- symbols (these are SQL language keywords)
- sql_functions (these are SQL language functions)
- Digital searching algorithm is used for searching the terminals. (“The Art of Computer Programming” by Donald E. Knuth Volume 3 “Sorting and searching” chapter 6.3). The code for this can be found in the file gen_lex_hash.cc and lex_hash.h (generated from gen_lex_hash file). This calls for further reading!!
The following classes gives an outline of the lex implementation in MySQL. (along with short description, pulled from the comments in the source code.)
- class st_select_lex_node
This is the base class for st_select_lex and st_select_lex_unit.
- class st_select_lex_unit: public st_select_lex_node
st_select_lex_unit is container of either
– One SELECT
– UNION of selects
- class st_select_lex: public st_select_lex_node
Store information of parsed SELECT statement (without union)
st_select_lex and st_select_lex_unit are both inherited form select_lex_node
- class Lex_input_stream
This class represents the character input stream consumed during lexical analysis.
In addition to consuming the input stream, this class performs some comment pre processing, by filtering out out-of-bound special text from the query input stream.
Two buffers, with pointers inside each buffers, are maintained in parallel. The ‘raw’ buffer is the original query text, which may contain out-of-bound comments. The ‘cpp’ (for comments pre processor) is the pre-processed buffer that contains only the query text that should be seen once out-of-bound data is removed.
- class Query_tables_list
Class representing list of all tables used by statement and other information which is necessary for opening and locking its tables, like SQL command for this statement.
Also contains information about stored functions used by statement since during its execution we may have to add all tables used by its stored functions/triggers to this list in order to pre-open and lock them.
Also used by LEX::reset_n_backup/restore_backup_query_tables_list() methods to save and restore this informatio.
- struct LEX: public Query_tables_list
The state of the lex parsing. This is saved in the THD struct
- class Yacc_state
The internal state of the syntax parser. This object is only available during parsing, and is private to the syntax parser implementation (sql_yacc.yy).
- class Parser_state
Internal state of the parser.
The complete state consist of:
– state data used during lexical parsing,
– state data used during syntactic parsing.
Parser used in MySQL
MySQL uses “bison” tool as it’s parser generator.
The parser contains the standard set of functions generated by bison tool.
All the grammar rules being defined in the file sql_yacc.yy
sql_yacc.cc and sql_yacc.h are the generated output file from bison.
#define yyerror MYSQLerror
#define yylval MYSQLlval
#define yychar MYSQLchar
#define yydebug MYSQLdebug
#define yynerrs MYSQLnerrs
The same idea is used to merge the custom lex code with the bison parser. Since the bison parser only knows yylex() as the lexer function. But this is overridden by the macro to match the hand written lexer: MYSQLlex().
References:
- mysql -5.6.10 source code
- http://en.wikipedia.org/wiki/Lexical_analysis
- http://en.wikipedia.org/wiki/Parser
#define yyerror MYSQLerror
#define yylval MYSQLlval
#define yychar MYSQLchar
#define yydebug MYSQLdebug
#define yynerrs MYSQLnerrs
The same idea is used to merge the custom lex code with the bison parser. Since the bison parser only knows yylex() as the lexer function. But this is overridden by the macro to match the hand written lexer: MYSQLlex().
References:
- mysql -5.6.10 source code
- http://en.wikipedia.org/wiki/Lexical_analysis
- http://en.wikipedia.org/wiki/Parser
- MySQL: Query Parsing (MySQL query解析)
- 基于query语句解析mysql工作原理
- mysql 协议的query包及解析
- MySQL Query Profiler
- 使用mysql query profiler
- MySQL Query Cache
- MySQL Query Log
- MySQL Query to Json
- MySQL 缓存 Query Cache
- mysql query cache
- mysql advance query
- mysql slow query
- MySQL sql query
- mysql 的query cache
- MySQL Query Cache 小结
- MySQL Query Cache 小结
- MySQL Query Cache 小结
- mysql的query优化
- Windows服务器通过FileZilla配置FTP的方法
- Android高效加载大图、多图解决方案,有效避免程序OOM
- java:String使用equals和==比较的区别
- 如何提高Android后台进程存活率
- Web前端知识体系精简
- MySQL: Query Parsing (MySQL query解析)
- win10怎么整理磁盘碎片,win10系统磁盘碎片整理
- 抽象类和接口联系与区别
- 电脑很慢很卡怎么办
- DPN(Dual Path Network)算法详解
- 欢迎使用CSDN-markdown编辑器
- 车辆路径问题之jsprit(二):jsprit特性
- How To add Bidvertiser Ad code on Blogger Blog Follow these easy steps to add bidvertiser ad code
- 解决“请等待当前程序完成卸载或更改”的问题