MySQL Proxy Introduction

Source: Internet  Published by: 淘宝模特tim  Editor: 程序博客网  Date: 2024/06/08 07:44

MySQL Proxy Overview

The Proxy ships with an embedded Lua interpreter. Using Lua, you can define what to do with a query or a result set before the Proxy passes them along.

Figure 1. MySQL Proxy can modify queries and results

The power of the Proxy is all in its flexibility, as allowed by the Lua engine. You can intercept the query before it goes to the server, and do everything conceivable with it:

  • Pass it along unchanged (default) 
  • Fix spelling mistakes (ever written CRATE DATAABSE?)
  • Filter it out, i.e., remove it altogether 
  • Rewrite the query according to some policy (enforcing strong passwords, forbidding empty ones)
  • Add forgotten statements (autocommit is enabled and the user sent a BEGIN WORK? You can inject a SET AUTOCOMMIT = 0 before that)
  • Much more: if you can think of it, it's probably already possible; if it isn't, blog about it: chances are that someone will make it happen

In the same way, you can intercept the result set. Thus you can:

  • Remove, modify, or add records to the result. Want to mask passwords, or hide information from unauthorized prying eyes?
  • Make your own result sets, including column names. For example, if you allow the user to enter a new SQL command, you can build the result set to show what was requested.
  • Ignore result sets, i.e., don't send them back to the client.
  • Want to do more? It could be possible. Look at the examples and start experimenting!

Key Concepts

MySQL Proxy is built with an object-oriented infrastructure. The main class exposes three member functions to the public. You can override them in a Lua script to modify the Proxy's behavior.

  • connect_server(): Called at connection time, you can work inside this function to change connection parameters. It can be used to provide load balancing. (Reader's note: it can also be used to implement read/write splitting, though whether read/write splitting is necessary is an open question.)
  • read_query(packet): This function is called before sending the query to the server. You can intervene here to change the original query or to inject more queries into the queue. You can also decide to skip the backend server altogether and send back to the client the result you want (e.g., given a SELECT * FROM big_table you may answer back "big_table has 20 million records. Did you forget the WHERE clause?")
  • read_query_result(injection_packet): This function is called before sending back the result in answer for an injected query. You can do something here to decide what to do with the result set (e.g., ignore, modify, or send it unchanged).
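
As a rough orientation, the three hooks can be written as an empty, pass-through script. This is only a minimal sketch, not a file from the distribution: the file name and the is_query helper are hypothetical, and the proxy table is assumed to be supplied by MySQL Proxy when the script is loaded.

```lua
-- skeleton.lua (hypothetical name): defines all three hooks, changes nothing.
-- The global `proxy` table is assumed to be provided by MySQL Proxy at run time.

-- Hypothetical helper: the first byte of a packet is the command type,
-- the remaining bytes are the payload (for COM_QUERY, the SQL text).
local function is_query(packet)
  return string.byte(packet) == proxy.COM_QUERY
end

function connect_server()
  -- Called when a client connects. Returning nothing keeps the defaults.
end

function read_query(packet)
  if is_query(packet) then
    print("query: " .. string.sub(packet, 2))
  end
  -- Returning nothing forwards the packet to the server unchanged.
end

function read_query_result(inj)
  -- Called for queries that were placed in the injection queue.
  -- Returning nothing sends the result set to the client unchanged.
end
```

A script only needs to define the hooks it actually uses; the examples in the following sections define just read_query, or read_query together with read_query_result.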

Installation

Installing the Proxy is quite easy. The distribution package contains just one binary (and as of 0.5.1, also some sample Lua scripts). You can unpack that and copy it where you like. For some operating systems it's even easier, because there are RPM packages that will take care of everything.

If your operating system is not included in the distribution, or if you want to try the bleeding-edge features as soon as they leave the factory, you may get the source from the public Subversion tree and then build the proxy yourself. It should need just a few basic actions.

 ./autogen.sh
 ./configure && make
 sudo make install  # will copy the executable to /usr/local/sbin

Simple Query Interception


  1. Create a Lua file, named first_example.lua, containing the lines listed below.
  2. Assuming that your database server is on the same box, launch the proxy server.
  3. From a separate console, connect to the proxy server, which is like connecting to the normal server, with the difference that you will use port 4040 instead of 3306.

 -- first_example.lua
 function read_query(packet)
   if string.byte(packet) == proxy.COM_QUERY then
     print("Hello world! Seen the query: " .. string.sub(packet, 2))
   end
 end

 # starting the proxy
 $ mysql-proxy --proxy-lua-script=first_example.lua -D

 # from another console, accessing the proxy
 # (-e executes the statement and exits)
 $ mysql -u USERNAME -pPASSWORD -h 127.0.0.1 -P 4040 -e 'SHOW TABLES FROM test'

If you come back to the previous terminal window, you will see that the Proxy has intercepted something for you.

 Hello world! Seen the query: select @@version_comment limit 1
 Hello world! Seen the query: SHOW TABLES FROM test

The first line is the query that the client sends by default right after connecting.


Note on Usage

If you do not want to use a Lua script, you should now specify --proxy-skip-profiling. In particular, if you are using the proxy only for load balancing, you should specify --proxy-skip-profiling.

Query Rewriting

We want to catch queries with a common typing error and replace the error with the correct keyword. We will look for my most frequent finger twists, SLECT and CRATE.

Here is second_example.lua:

 function read_query( packet )
   if string.byte(packet) == proxy.COM_QUERY then
     local query = string.sub(packet, 2)
     print ("received " .. query)
     local replacing = false
     -- matches "CRATE" as first word of the query
     if string.match(string.upper(query), '^%s*CRATE') then
         query = string.gsub(query,'^%s*%w+', 'CREATE')
         replacing = true
     -- matches "SLECT" as first word of the query
     elseif string.match(string.upper(query), '^%s*SLECT') then
         query = string.gsub(query,'^%s*%w+', 'SELECT')
         replacing = true
     end
     if (replacing) then
         print("replaced with " .. query )
         proxy.queries:append(1, string.char(proxy.COM_QUERY) .. query )
         return proxy.PROXY_SEND_QUERY
     end
   end
 end

As before, start the server with the option --proxy-lua-script=second_example.lua and connect to it from a MySQL client.

 $ mysql -u USERNAME -pPASSWORD -h 127.0.0.1 -P 4040
 Welcome to the MySQL monitor.  Commands end with ; or \g.
 Your MySQL connection id is 48
 Server version: 5.0.37-log MySQL Community Server (GPL)

 Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

 mysql> use test
 Database changed
 mysql> CRATE TABLE t1 (id int);         # Notice: TYPO!
 Query OK, 0 rows affected (0.04 sec)

 mysql> INSERT INTO t1 VALUES (1), (2);
 Query OK, 2 rows affected (0.01 sec)
 Records: 2  Duplicates: 0  Warnings: 0

 mysql> SLECT * FROM t1;                 # Notice: TYPO!
 +------+
 | id   |
 +------+
 |    1 |
 |    2 |
 +------+
 2 rows in set (0.00 sec)

Isn't it sweet? I made my usual mistakes, but the Proxy was kind enough to fix them for me. Let's look at what was reported.

 received select @@version_comment limit 1
 received SELECT DATABASE()
 received CRATE TABLE t1 (id int)
 replaced with CREATE TABLE t1 (id int)
 received INSERT INTO t1 VALUES (1), (2)
 received SLECT * FROM t1
 replaced with SELECT * FROM t1

The first two queries are stuff the client needs for its own purposes. Then came my first mistake, CRATE, which was graciously changed to CREATE, and in the end the Proxy received SLECT and turned it into SELECT.

This script is quite crude, but it gives you an idea of the possibilities.
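
One way to make such a script less crude is to keep the corrections in a table, so that adding a new typo does not require a new elseif branch. The following is a minimal sketch under that assumption, not part of the original example; the corrections table and the fix_first_word helper are hypothetical names. Unlike second_example.lua, it only rewrites the first word when it matches a table entry exactly, and it preserves leading whitespace.

```lua
-- typo_table.lua (hypothetical name): table-driven variant of second_example.lua.
-- The entries below are illustrative; extend the table with your own typos.
local corrections = {
  CRATE = "CREATE",
  SLECT = "SELECT",
}

-- Returns the corrected query, or nil when the first word needs no fix.
local function fix_first_word(query)
  local spaces, word, rest = string.match(query, "^(%s*)(%a+)(.*)$")
  if not word then
    return nil
  end
  local fixed = corrections[string.upper(word)]
  if not fixed then
    return nil
  end
  return spaces .. fixed .. rest
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local fixed = fix_first_word(string.sub(packet, 2))
  if fixed then
    -- Queue the corrected query in place of the original one.
    proxy.queries:append(1, string.char(proxy.COM_QUERY) .. fixed)
    return proxy.PROXY_SEND_QUERY
  end
  -- Anything else is forwarded unchanged.
end
```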

Query Injection

When required, the Proxy can create a queue of queries and send them to the server, after assigning an ID code to each query.



Figure 2. Query injection

When an injection has taken place, the result set gets processed by another function, read_query_result, where you can deal with the result sets according to their ID. In the example, for ID 2 and 3 you just get something from SHOW STATUS, and by comparing their values you can measure the impact of the main query on the server. Since you use the SHOW STATUS values only for internal calculation, you don't send that result set to the client (which is just as well, since the client is not expecting it), but you discard it.

Figure 3. Processing the injected queries

The result set of the query sent by the client is duly returned. It's transparent for the client, but in between you managed to collect statistical results, which are displayed on the proxy console.

For a full example, see the query injection tutorial in the Forge.
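
The flow described in this section can be sketched as follows. This is not the Forge tutorial, only a minimal illustration under stated assumptions: ID 1 is assumed for the client's query, IDs 2 and 3 for the SHOW STATUS calls sent before and after it, and status_to_table and before are hypothetical names. Later Proxy releases may also expect an extra argument to proxy.queries:append before they make the result set available, so treat the calls as a sketch for the version discussed here.

```lua
-- injection_sketch.lua (hypothetical name): wrap each client query
-- between two SHOW STATUS calls and report counters that changed.
local before = {}

-- Copies a SHOW STATUS result set into a name -> numeric value table.
-- Rows whose value is not numeric are skipped.
local function status_to_table(resultset)
  local values = {}
  for row in resultset.rows do
    values[row[1]] = tonumber(row[2])
  end
  return values
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local show_status = string.char(proxy.COM_QUERY) .. "SHOW STATUS"
  proxy.queries:append(2, show_status)  -- snapshot before the query
  proxy.queries:append(1, packet)       -- the client's own query
  proxy.queries:append(3, show_status)  -- snapshot after the query
  return proxy.PROXY_SEND_QUERY
end

function read_query_result(inj)
  if inj.id == 1 then
    return  -- the client's result set goes back unchanged
  end
  local values = status_to_table(inj.resultset)
  if inj.id == 2 then
    before = values
  else
    for name, value in pairs(values) do
      local delta = before[name] and (value - before[name])
      if delta and delta ~= 0 then
        print(name .. " changed by " .. delta)
      end
    end
  end
  -- The client never asked for SHOW STATUS, so discard these results.
  return proxy.PROXY_IGNORE_RESULT
end
```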

Macros

Macros are just another way of using the query rewriting facility. It's one of the most striking usages of the Proxy. You can rewrite the SQL language, or make it closer to your tastes. For instance, many people who use the MySQL command-line client type cd and ls instead of use and show tables. With MySQL Proxy, they can use cd and ls and get the expected result. This juicy example of macro creation and usage is available in an early blog post. Rather than repeating all of it here, I invite you to look at your first macros with MySQL Proxy.
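
To give an idea of what such a macro looks like, here is a minimal sketch of the cd and ls case. It is not the code from the blog post; the expand_macro helper and the exact patterns are assumptions made for illustration.

```lua
-- macro_sketch.lua (hypothetical name): expand "cd <db>" and "ls" into SQL.

-- Returns the SQL statement for a known macro, or nil for anything else.
local function expand_macro(query)
  local db = string.match(query, "^%s*cd%s+([%w_]+)%s*$")
  if db then
    return "USE " .. db
  end
  if string.match(query, "^%s*ls%s*$") then
    return "SHOW TABLES"
  end
  return nil
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local sql = expand_macro(string.sub(packet, 2))
  if sql then
    -- Send the expanded statement to the server instead of the macro.
    proxy.queries:append(1, string.char(proxy.COM_QUERY) .. sql)
    return proxy.PROXY_SEND_QUERY
  end
end
```

The mechanism is the same as in the typo-fixing script: the macro never reaches the server, only its expansion does.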

Creating Result Sets: Shell Commands from MySQL

The Proxy receives a request from a client, and then it has to give back a result set. Most of the time, this is straightforward: the Proxy passes the query to the server, gets the result set, and passes the result set to the client. But what happens if we need to return something that the server is not able to provide? Then we need to build a result set, which is composed of a set of column names and a two-dimensional array with the data.

Dataset Creation Basics

For example, if I wanted to return a warning about a deprecated feature, I could create a result set like this:

 proxy.response.resultset = {
     fields = {
         {
             type = proxy.MYSQL_TYPE_STRING,
             name = "deprecated feature",
         },
         {
             type = proxy.MYSQL_TYPE_STRING,
             name = "suggested replacement",
         },
     },
     rows = {
         {
             "SHOW DATABASES",
             "SHOW SCHEMAS"
         }
     }
 }
 -- and then, send it to the client
 return proxy.PROXY_SEND_RESULT

The above structure, when received by the client, would be shown as:

 +---------------------+-----------------------+
 | deprecated feature  | suggested replacement |
 +---------------------+-----------------------+
 | SHOW DATABASES      | SHOW SCHEMAS          |
 +---------------------+-----------------------+

That's to say that you can fabricate every result set that meets your needs. For more details, see Jan Kneschke's example.
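
Since the fields part of the structure is repetitive, a small helper can build it from a plain list of column names. The sketch below is an illustration only: make_resultset is a hypothetical helper, triggering on SHOW DATABASES is an arbitrary choice that reuses the sample data above, and setting proxy.response.type to proxy.MYSQL_PACKET_OK is an assumption about what the Proxy expects alongside the result set.

```lua
-- resultset_sketch.lua (hypothetical name): build a string-only result set
-- from a list of column names and a list of rows.
local function make_resultset(names, rows)
  local fields = {}
  for i, name in ipairs(names) do
    fields[i] = { type = proxy.MYSQL_TYPE_STRING, name = name }
  end
  return { fields = fields, rows = rows }
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local query = string.sub(packet, 2)
  -- Illustrative trigger: answer SHOW DATABASES without asking the server.
  if string.match(string.upper(query), "^%s*SHOW%s+DATABASES") then
    proxy.response.type = proxy.MYSQL_PACKET_OK  -- assumption, see lead-in
    proxy.response.resultset = make_resultset(
      { "deprecated feature", "suggested replacement" },
      { { "SHOW DATABASES", "SHOW SCHEMAS" } })
    return proxy.PROXY_SEND_RESULT
  end
end
```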

Shell Commands from MySQL Client

And now for something completely different, let's see how to use our freshly acquired knowledge to execute shell commands through the Proxy. We said already that the Proxy's behavior can be altered with Lua scripts. And Lua is a complete language, meaning that you can do almost everything with it, including executing shell commands. Combine this knowledge with the ability to create data sets, and we come up with the idea of asking for shell commands from a MySQL client and having the Proxy return their results as if they were normal database records.

Figure 4. Running shell commands through the Proxy

Let's go through it, using the tutorial from MySQL Forge.



The shell tutorial script implements a simple syntax to ask for shell commands:

 SHELL command

For example:

 SHELL ls -lh /usr/local/mysql/data
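
The Forge script itself is not reproduced here. As a rough idea of how such a syntax could be handled, the following minimal sketch recognizes the SHELL prefix, runs the command with Lua's io.popen, and returns each output line as one row in a single column named after the command. The helper names are hypothetical, and setting proxy.response.type is an assumption about what the Proxy expects.

```lua
-- shell_sketch.lua (hypothetical name): NOT the Forge tutorial script.

-- Returns the command that follows the SHELL keyword, or nil.
local function parse_shell_command(query)
  return string.match(query, "^%s*[Ss][Hh][Ee][Ll][Ll]%s+(.+)$")
end

-- Runs a command and returns one single-column row per output line.
local function run_command(command)
  local rows = {}
  local pipe = io.popen(command)
  if pipe then
    for line in pipe:lines() do
      rows[#rows + 1] = { line }
    end
    pipe:close()
  end
  return rows
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local command = parse_shell_command(string.sub(packet, 2))
  if not command then
    return  -- ordinary queries go to the server unchanged
  end
  proxy.response.type = proxy.MYSQL_PACKET_OK  -- assumption, see lead-in
  proxy.response.resultset = {
    fields = { { type = proxy.MYSQL_TYPE_STRING, name = command } },
    rows = run_command(command),
  }
  return proxy.PROXY_SEND_RESULT
end
```

Note that the command runs with the privileges of the Proxy process, on the host where the Proxy is running.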
  1. Get the shell tutorial script. Save it as shell.lua.
  2. Launch the proxy.
  3. Connect to the proxy.

 $ /usr/local/sbin/mysql-proxy --proxy-lua-script=shell.lua -D

 # from a different console
 $ mysql -u USERNAME -pPASSWORD -h 127.0.0.1 -P 4040

Make sure that it works as a normal proxy to the database server.

 Welcome to the MySQL monitor.  Commands end with ; or \g.
 Your MySQL connection id is 49
 Server version: 5.0.37-log MySQL Community Server (GPL)

 Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

 mysql> use test
 Database changed
 mysql> show tables;
 +----------------+
 | Tables_in_test |
 +----------------+
 | t1             |
 +----------------+
 1 row in set (0.00 sec)

 mysql> select * from t1;
 +------+
 | id   |
 +------+
 |    1 |
 |    2 |
 +------+
 2 rows in set (0.00 sec)

Good. The normal operations work as expected. Now we test the enhanced features.

 mysql> shell df -h;
 +--------------------------------------------------------+
 | df -h                                                  |
 +--------------------------------------------------------+
 | Filesystem            Size  Used Avail Use% Mounted on |
 | /dev/md1               15G  3.9G  9.7G  29% /          |
 | /dev/md4              452G  116G  313G  27% /app       |
 | tmpfs                 1.7G     0  1.7G   0% /dev/shm   |
 | /dev/md3              253G  159G   82G  67% /home      |
 | /dev/md0               15G  710M   13G   6% /var       |
 +--------------------------------------------------------+
 6 rows in set (0.00 sec)

Hello shell! This is really a treat for advanced users. Once you have a way of accessing external commands, you can become quite creative.

 mysql> shell grep key_buffer /usr/local/mysql/my.cnf;
 +-----------------------------------------+
 | grep key_buffer /usr/local/mysql/my.cnf |
 +-----------------------------------------+
 | key_buffer=2000M                        |
 +-----------------------------------------+
 1 row in set (0.00 sec)

I know that I could check the same with SHOW VARIABLES, but since this is a value that can be set online, I just wanted to make sure that it was also in the configuration file. And how is our memory situation?

 mysql> shell free -m;
 +---------------------------------------------------------------------------+
 | free -m                                                                   |
 +---------------------------------------------------------------------------+
 |              total       used       free     shared    buffers     cached |
 | Mem:          3280       1720       1560          0          9       1006 |
 | -/+ buffers/cache:        704       2575                                  |
 | Swap:         8189          2       8186                                  |
 +---------------------------------------------------------------------------+
 4 rows in set (0.08 sec)

That's not bad. Now that we are content with the status of the server, what about some fun? We could, for example, check the last entries on Planet MySQL. Do you think I am babbling? Not at all. The command is quite long, but it works.

 wget -q -O - http://www.planetmysql.org/rss20.xml \
   | perl -nle 'print $1 if m{<title>(.*)</title>}' \
   | head -n 21 | tail -n 20;

However, because the listing is so large, and nobody will remember that anyway, you should paste it into a shell script, and call it, for instance, last_planet.sh. And, here you are!

 mysql> shell last_planet.sh;
 +-------------------------------------------------------------------------------------+
 | last_planet.sh                                                                      |
 +-------------------------------------------------------------------------------------+
 | Top 5 Wishes for MySQL                                                              |
 | Open Source ETL tools.                                                              |
 | MySQL Congratulates FSF on GPLv3                                                    |
 | Query cache is slow to the point of being unusable - what is being done about that. |
 | About 'semi-unicode' And 'quasi Moon Stone'                                         |
 | My top 5 MySQL wishes                                                               |
 | Four more open source startups to watch                                             |
 | More on queue... Possible Solution...                                               |
 | MySQL as universal server                                                           |
 | MySQL Proxy. Playing with the tutorials                                             |
 | Open source @ Oracle: Mike Olson speaks                                             |
 | Quick musing on the &quot;Queue&quot; engine.                                       |
 | Distributed business organization                                                   |
 | Ideas for a MySQL queuing storage engine                                            |
 | MySQL Test Creation Tool Design Change                                              |
 | Queue Engine, and why this won' likely happen...                                    |
 | What?s your disk I/O thoughtput?                                                    |
 | My 5+ Wish List?                                                                    |
 | Top 5 best MySql practices                                                          |
 | Packaging and Installing the MySQL Proxy with RPM                                   |
 +-------------------------------------------------------------------------------------+
 20 rows in set (1.48 sec)

Shell access, and web content from MySQL client! Wow!

A Word of Caution

Having shown that you can access the shell from a MySQL connection does not automatically imply that you should always do it. Shell access is a security vulnerability, and if you want to use this feature in your server, do it for internal purposes only. Do not allow shell access to applications open to normal users. That would be asking for trouble (and finding it really fast).

You can use the shell to view things, but you could also use it to erase items.

 mysql> shell ls *.lua*;
 +---------------------+
 | ls *.lua*           |
 +---------------------+
 | first_example.lua   |
 | first_example.lua~  |
 | second_example.lua  |
 | second_example.lua~ |
 +---------------------+
 4 rows in set (0.03 sec)

 mysql> shell rm *~;
 Empty set (0.00 sec)

 mysql> shell ls *.lua*;
 +--------------------+
 | ls *.lua*          |
 +--------------------+
 | first_example.lua  |
 | second_example.lua |
 +--------------------+
 2 rows in set (0.01 sec)

Be very careful with shell access!

Be aware that the shell access you get through the Proxy refers to the host where the Proxy is running. If you install the Proxy on the same host as the database server, the two will coincide, but don't take it for granted.
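
If you do experiment with shell access, one way to reduce the risk is to accept only a short list of read-only programs. The following is a minimal sketch of that idea, not a complete security measure; the allowed table, the is_allowed helper, and the error message are hypothetical, and the sketch shows only the rejection path.

```lua
-- shell_allowlist.lua (hypothetical name): reject SHELL commands that are
-- not on a short allow-list. The listed programs are only examples.
local allowed = { df = true, free = true, ls = true }

-- Returns true only when the program name is listed and the command holds
-- no shell metacharacters that could chain or redirect further commands.
local function is_allowed(command)
  if string.find(command, "[;&|`$<>\n]") then
    return false
  end
  local program = string.match(command, "^%s*([%w_%-%.]+)")
  return program ~= nil and allowed[program] == true
end

function read_query(packet)
  if string.byte(packet) ~= proxy.COM_QUERY then
    return
  end
  local command = string.match(string.sub(packet, 2),
                               "^%s*[Ss][Hh][Ee][Ll][Ll]%s+(.+)$")
  if command and not is_allowed(command) then
    proxy.response.type = proxy.MYSQL_PACKET_ERR
    proxy.response.errmsg = "shell command not allowed: " .. command
    return proxy.PROXY_SEND_RESULT
  end
  -- An allowed command would be executed here by the SHELL handler.
end
```

With this guard in place, the rm command shown above would be answered with an error instead of being executed.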

Customized Logging

I left this example for the end because, in my experience, this is the most interesting one and it has a practical, immediate use. Logs on demand are available in MySQL 5.1. But if you are stuck with MySQL 5.0, then the Proxy can give you a hand.

Simple Logging

To enable logging of queries into something that looks like a general log, the task is easy. Write this small portion of code into a simple_logs.lua file (or download the snippet from MySQL Forge).

 local log_file = 'mysql.log'
 local fh = io.open(log_file, "a+")

 function read_query( packet )
   if string.byte(packet) == proxy.COM_QUERY then
     local query = string.sub(packet, 2)
     fh:write( string.format("%s %6d -- %s \n",
         os.date('%Y-%m-%d %H:%M:%S'),
         proxy.connection["thread_id"],
         query))
     fh:flush()
   end
 end

Then start the Proxy with it, and connect to the Proxy from some concurrent sessions. This script will log all queries to a text file named mysql.log. After a few sessions, the logfile would look like this:

 2007-06-29 11:04:28     50 -- select @@version_comment limit 1
 2007-06-29 11:04:31     50 -- SELECT DATABASE()
 2007-06-29 11:04:35     51 -- select @@version_comment limit 1
 2007-06-29 11:04:42     51 -- select USER()
 2007-06-29 11:05:03     51 -- SELECT DATABASE()
 2007-06-29 11:05:08     50 -- show tables
 2007-06-29 11:05:22     50 -- select * from t1
 2007-06-29 11:05:30     51 -- show databases
 2007-06-29 11:05:30     51 -- show tables
 2007-06-29 11:05:33     52 -- select count(*) from user
 2007-06-29 11:05:39     51 -- select count(*) from columns

The log contains date, time, connection ID, and query. Simple and effective for such a short script. Notice that there are three sessions, and their commands are not sorted by session, but by the time they were executed.
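
Because the entries of different sessions are interleaved, it can be handy to regroup them by connection ID when reading the log. The following standalone Lua sketch (run with a plain Lua interpreter, not inside the Proxy) assumes the exact line format written by the script above and one query per line; parse_line and group_by_session are hypothetical names.

```lua
-- group_log.lua (hypothetical name): regroup log lines by connection ID.

-- Splits one log line into timestamp, connection ID, and query text.
-- Returns nil for lines that do not match the expected format.
local function parse_line(line)
  local stamp, id, query = string.match(line,
    "^%s*(%d+%-%d+%-%d+ %d+:%d+:%d+)%s+(%d+) %-%- (.-)%s*$")
  if not stamp then
    return nil
  end
  return { time = stamp, thread_id = tonumber(id), query = query }
end

-- Takes a list of log lines and returns a table that maps each
-- connection ID to the list of its queries, in their original order.
function group_by_session(lines)
  local sessions = {}
  for _, line in ipairs(lines) do
    local entry = parse_line(line)
    if entry then
      local list = sessions[entry.thread_id] or {}
      list[#list + 1] = entry.query
      sessions[entry.thread_id] = list
    end
  end
  return sessions
end
```

To use it on a real file, collect the lines with io.lines("mysql.log") into a table and pass that table to group_by_session. Queries that contain line breaks span several log lines and would need extra handling.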

The pleasant aspect is that you don't need to restart the server to activate the general log. All you need to do is to point your applications to the port 4040 instead of 3306, and you have enabled a simple but functional logging. Come to think of it, you don't need to modify or restart your applications either. You can achieve the same result without touching server or applications. Simply start the Proxy on the same box where the server is located, and activate an iptables rule to redirect traffic from port 3306 to 4040 (courtesy of Patrizio Tassone).

Redirecting client connections at the OS level:

 sudo iptables -t nat -I PREROUTING \
   -s ! 127.0.0.1 -p tcp \
   --dport 3306 -j \
   REDIRECT --to-ports 4040

Figure 5. Redirecting traffic from port 3306 to 4040

Now you have logging enabled, and you don't have to restart the server or touch your applications! When you are done, and you don't need logs anymore, remove the rule (-D instead of -I) and kill the proxy.

 sudo iptables -t nat -D PREROUTING \
   -s ! 127.0.0.1 -p tcp \
   --dport 3306 -j \
   REDIRECT --to-ports 4040

More Customized Logging

The simple and effective logging script from the previous section is tempting, but it's really basic. We have had a glimpse of the Proxy internals, and we have seen that we can get better information, and these logs can be much more interesting than a bare list of queries. For example, we would like to report if a query was successful or rejected as a syntax error, how many rows were retrieved, how many rows were affected.



We know all the elements to reach this goal. The script will be a bit longer, but not much.

 -- logs.lua
 assert(proxy.PROXY_VERSION >= 0x00600,
   "you need at least mysql-proxy 0.6.0 to run this module")

 local log_file = os.getenv("PROXY_LOG_FILE")
 if (log_file == nil) then
   log_file = "mysql.log"
 end

 local fh = io.open(log_file, "a+")
 local query = "";

In the global part of the script, we check that we're using an appropriate version of the Proxy, since we are using features that are not available in version 0.5.0. Then we set the filename, taking it from an environment variable or, if the variable is not set, assigning the default value.

 function read_query( packet )
   if string.byte(packet) == proxy.COM_QUERY then
     query = string.sub(packet, 2)
     proxy.queries:append(1, packet )
     return proxy.PROXY_SEND_QUERY
   else
     query = ""
   end
 end

The first function does little work. It appends the query to the proxy queue, so that the next function will be triggered when the result is ready.

 function read_query_result (inj)
   local row_count = 0
   local res = assert(inj.resultset)
   local num_cols = string.byte(res.raw, 1)
   if num_cols > 0 and num_cols < 255 then
     for row in inj.resultset.rows do
       row_count = row_count + 1
     end
   end
   local error_status = ""
   if res.query_status and (res.query_status < 0 ) then
     error_status = "[ERR]"
   end
   if (res.affected_rows) then
     row_count = res.affected_rows
   end
   --
   -- write the query, adding the number of retrieved rows
   --
   fh:write( string.format("%s %6d -- %s {%d} %s\n",
     os.date('%Y-%m-%d %H:%M:%S'),
     proxy.connection["thread_id"],
     query,
     row_count,
     error_status))
   fh:flush()
 end

In this function we can check if we are dealing with a data manipulation query or a select query. If there are rows, the function counts them, and the result is printed in braces to the logfile. If there are affected rows, then this is the number that is reported. We also check if there was an error, in which case the information is returned in brackets, and, finally, everything gets written to the logfile. Here is an example:

 2007-06-29 16:41:10     33 -- show databases {5}
 2007-06-29 16:41:10     33 -- show tables {2}
 2007-06-29 16:41:12     33 -- Xhow tables {0} [ERR]
 2007-06-29 16:44:27     34 -- select * from t1 {6}
 2007-06-29 16:44:50     34 -- update t1 set id = id * 100 where c = 'a' {2}
 2007-06-29 16:45:53     34 -- insert into t1 values (10,'aa') {1}
 2007-06-29 16:46:07     34 -- insert into t1 values (20,'aa'),(30,'bb') {2}
 2007-06-29 16:46:22     34 -- delete from t1 {9}

The first, second, and fourth lines say that the queries returned respectively five, two, and six rows. The third one says that the query returned an error. The fifth row reports that two rows were affected by the UPDATE command. The following lines all report the number of affected rows for INSERT and DELETE statements.

Note on the Examples

The examples provided with this article have been tested with a few different operating systems. The code is still in alpha stage, though, so it may happen that data structures, options, and interfaces change, until the feature set is stabilized.