mycat ShareJoin后少了2条记录的问题剖析详细过程

来源:互联网 发布:docker web mysql 编辑:程序博客网 时间:2024/05/29 21:28


--<>问题描述

         Mycat 在进行join的时候,本来是应该join出4条记录,但是结果却只出来2条记录,问题在哪里?


mycat版本是1.5  mysql版本是5.6.12-log


 

1,准备环境,先去原始mysql库上执行建库录入数据操作

# db1库和db2库分别建2个测试表

mysql> use db1

Database changed

mysql> CREATE TABLE `t1` (

    ->   `id` bigint(20) NOT NULL AUTO_INCREMENT,

    ->   `name` varchar(200) DEFAULT NULL,

    ->   PRIMARY KEY (`id`)

    -> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;                                                                                                                               

Query OK, 0 rows affected (0.00 sec)

 

mysql> CREATE TABLE `t2` (

    ->   `id` bigint(20) NOT NULL AUTO_INCREMENT,

    ->   `name` varchar(200) DEFAULT NULL,

    ->   `parent_id` bigint(20) DEFAULT NULL,

    ->   PRIMARY KEY (`id`)

    -> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;

Query OK, 0 rows affected (0.00 sec)

 

mysql> use db2

Database changed

mysql> CREATE TABLE `t1` (

    ->   `id` bigint(20) NOT NULL AUTO_INCREMENT,

    ->   `name` varchar(200) DEFAULT NULL,

    ->   PRIMARY KEY (`id`)

    -> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;

r(200) DEFAULT NULL,

  `parent_id` bigint(20) DEFAULT NULL,

  PRIMARY KEY (`id`)

) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;

Query OK, 0 rows affected (0.01 sec)

 

mysql> CREATE TABLE `t2` (

    ->   `id` bigint(20) NOT NULL AUTO_INCREMENT,

    ->   `name` varchar(200) DEFAULT NULL,

    ->   `parent_id` bigint(20) DEFAULT NULL,

    ->   PRIMARY KEY (`id`)

    -> ) ENGINE=InnoDB AUTO_INCREMENT=2001 DEFAULT CHARSET=utf8;

Query OK, 0 rows affected (0.01 sec)

 

mysql>

# 也是在mysql中执行,开始录入数据

mysql> Insert into db2.t1(id,name) values(1,'db2');

Query OK, 1 row affected (0.01 sec)

 

mysql> Insert into db1.t1(id,name) values(2,'db1');

db1.2',1);

Insert into db2.t2(id,name,parent_id) values(3,'db2.2',2);Query OK, 1 row affected (0.00 sec)

 

mysql> Insert into db1.t2(id,name,parent_id) values(2,'db1.1',1);

Query OK, 1 row affected (0.00 sec)

 

mysql> Insert into db1.t2(id,name,parent_id) values(4,'db2.2',2);

Query OK, 1 row affected (0.00 sec)

 

mysql> Insert into db2.t2(id,name,parent_id) values(1,'db1.2',1);

Query OK, 1 row affected (0.00 sec)

 

mysql> Insert into db2.t2(id,name,parent_id) values(3,'db2.2',2);

Query OK, 1 row affected (0.01 sec)

 

mysql>

 

请注意,重点来了,按照正常的理解,2个节点的t1表有2条记录,2个节点的t2表有4条记录,那么表关联后应该返回4条记录,对不对?

 

那么mycat到底给我们返回了几条记录呢是不是4条呢?

 

答案是否定的!!!!!!!接下来配置mycat吧,实际验证一下。

2mycatschema.xml配置

Schema.xml配置:

# table表配置            

   <table name="t1" dataNode="dn1,dn2" rule="mod-long" primaryKey="id" />

                <table name="t2" dataNode="dn2,dn1" rule="mod-long" primaryKey="id" />

 

# datanode配置

        <dataNode name="dn1" dataHost="dh1" database="db1" />

        <dataNode name="dn2" dataHost="dh1" database="db2" />

 

# datahost配置

        <dataHost name="dh1" maxCon="1000" minCon="10" balance="0"

                        writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">                                                                                                               

                                        <heartbeat>select user()</heartbeat>

                <writeHost host="hostM1" url="192.168.121.61:3306" user="root"  password=""/>

        </dataHost>

 

3,重启mycat

[root@oracle_standby ~]# mycat restart                                                                                                                                                                                                                                 

Stopping Mycat-server...

Stopped Mycat-server.

Starting Mycat-server...

[root@oracle_standby ~]#

 

文章源地址:http://blog.csdn.net/mchdba/article/details/51317608,谢绝转载

 

4,查看mycat执行结果

# 单表查询t12条记录,正常

mysql> select * from t1;

+----+------+

| id | name |

+----+------+

|  2 | db1  |

|  1 | db2  |

+----+------+

2 rows in set (0.01 sec)

 

# 单表查询t24条记录,正常

mysql> select * from t2;

+----+-------+-----------+

| id | name  | parent_id |

+----+-------+-----------+

|  1 | db1.2 |         1 |

|  3 | db2.2 |         2 |

|  2 | db1.1 |         1 |

|  4 | db2.2 |         2 |

+----+-------+-----------+

4 rows in set (0.01 sec)

 

# 那么t1 通过parent_id关联t2表,应该有4条记录,对不对?但是只有2条,而且加了ShareJoin前后的查询出来的数据还不一样,如下所示:                                                         

mysql> select * from t1,t2 where t1.id = t2.parent_id;

+----+------+----+-------+-----------+

| id | name | id | name  | parent_id |

+----+------+----+-------+-----------+

|  2 | db1  |  4 | db2.2 |         2 |

|  1 | db2  |  1 | db1.2 |         1 |

+----+------+----+-------+-----------+

2 rows in set (0.01 sec)

 

mysql> /*!mycat:catlet=demo.catlets.ShareJoin */ select t1.*,t2.* from t1,t2 where t1.id = t2.parent_id;

+----+------+-------+-----------+

| id | name | name  | parent_id |

+----+------+-------+-----------+

|  1 | db2  | db1.1 |         1 |

|  2 | db1  | db2.2 |         2 |

+----+------+-------+-----------+

2 rows in set (0.02 sec)

 

mysql>

# 看到加了注解走了mycat后,结果集不一样了,问题在哪里呢?

 

5Mycat.log分析

05/04 17:21:11.443  DEBUG [$_NIOREACTOR-0-RW] (ServerQueryHandler.java:56) -ServerConnection [id=1, schema=TESTDB, host=127.0.0.1, user=test,txIsolation=3, autocommit=true, schema=TESTDB]/*!mycat:catlet=demo.catlets.ShareJoin */ select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id

 

05/04 17:21:11.447  DEBUG [$_NIOREACTOR-0-RW] (EnchachePool.java:76) -SQLRouteCache  miss cache ,key:TESTDB/*!mycat:catlet=demo.catlets.ShareJoin */ select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id

05/04 17:21:11.469  DEBUG [$_NIOREACTOR-0-RW] (HintCatletHandler.java:46) -load catelet class:demo.catlets.ShareJoin to run sql select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id

05/04 17:21:11.573   INFO [$_NIOREACTOR-0-RW] (JoinParser.java:70) -SQL: select t1.id,t1.name,t2.id id2,t2.name,t2.parent_id from t1,t2 where t1.id = t2.parent_id

# 这里路由正确,catlet路由到了dn1dn2了。

05/04 17:21:11.677   INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:150) -Catlet exec:dn1,dn2, sql:select id , name from t1

05/04 17:21:11.677  DEBUG [BusinessExecutor2] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1

05/04 17:21:11.681  DEBUG [BusinessExecutor2] (SQLJob.java:85) -con query sql:select id , name from t1 to con:MySQLConnection [id=2, lastTime=1462353671681, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=dn1{select id , name from t1}, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.685  DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=2, lastTime=1462353671666, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.685   INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=1,dataNodeOrDatabase=dn1,sql=select id , name from t1,  jobHandler=demo.catlets.ShareDBJoinHandler@6fb40e2d]

05/04 17:21:11.685  DEBUG [BusinessExecutor0] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1

05/04 17:21:11.686  DEBUG [BusinessExecutor0] (SQLJob.java:85) -con query sql:select id , name from t1 to con:MySQLConnection [id=8, lastTime=1462353671686, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=dn2{select id , name from t1}, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.687  DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=8, lastTime=1462353671666, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.724  DEBUG [BusinessExecutor1] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1

# 这里面有用到SQLJob.java,通过将t1表的id传给t2表以where in(1,2)的方式来获取t2表的数据

05/04 17:21:11.724  DEBUG [BusinessExecutor1] (SQLJob.java:85) -con query sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2)  to con:MySQLConnection [id=2, lastTime=1462353671724, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=dn1{select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) }, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

 

而在mycat中,执行这个sql语句是能获取到4条记录的,如下所示:

mysql> select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2);

+-----------+-----+-------+-----------+

| parent_id | id2 | name  | parent_id |

+-----------+-----+-------+-----------+

|         1 |   1 | db1.2 |         1 |

|         2 |   3 | db2.2 |         2 |

|         1 |   2 | db1.1 |         1 |

|         2 |   4 | db2.2 |         2 |

+-----------+-----+-------+-----------+

4 rows in set (0.03 sec)

 

mysql>

那为什么最后在mycat窗口中显示为2条呢?继续看下面日志:

05/04 17:21:11.724  DEBUG [BusinessExecutor3] (PhysicalDBPool.java:452) -select read source hostM1 for dataHost:dh1

05/04 17:21:11.725  DEBUG [BusinessExecutor3] (SQLJob.java:85) -con query sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2)  to con:MySQLConnection [id=8, lastTime=1462353671725, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=dn2{select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) }, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.725   INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:229) -SQLParallJob:dn1,dn2, sql:select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2)

05/04 17:21:11.726   INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=2,dataNodeOrDatabase=dn2,sql=select id , name from t1,  jobHandler=demo.catlets.ShareDBJoinHandler@6fb40e2d]

05/04 17:21:11.726  DEBUG [$_NIOREACTOR-0-RW] (EngineCtx.java:82) -packge id 1

05/04 17:21:11.727  DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=2, lastTime=1462353671707, user=root, schema=db1, old shema=db1, borrowed=true, fromSlaveDB=false, threadId=103614, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.728   INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=3,dataNodeOrDatabase=dn1,sql=select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) ,  jobHandler=demo.catlets.ShareRowOutPutDataHandler@13037844]

05/04 17:21:11.728  DEBUG [$_NIOREACTOR-0-RW] (PhysicalDatasource.java:403) -release channel MySQLConnection [id=8, lastTime=1462353671707, user=root, schema=db2, old shema=db2, borrowed=true, fromSlaveDB=false, threadId=103617, charset=utf8, txIsolation=0, autocommit=true, attachment=null, respHandler=null, host=192.168.121.61, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]

05/04 17:21:11.728   INFO [$_NIOREACTOR-0-RW] (BatchSQLJob.java:41) -job finished SQLJob [ id=4,dataNodeOrDatabase=dn2,sql=select parent_id, id as id2 , name , parent_id from t2 where parent_id in (1,2) ,  jobHandler=demo.catlets.ShareRowOutPutDataHandler@13037844]

05/04 17:21:11.729   INFO [$_NIOREACTOR-0-RW] (EngineCtx.java:171) -all job finished  for front connection: ServerConnection [id=1, schema=TESTDB, host=127.0.0.1, user=test,txIsolation=3, autocommit=true, schema=TESTDB]

05/04 17:21:11.729   INFO [$_NIOREACTOR-0-RW] (EngineCtx.java:159) -write  eof ,packgId:10

05/04 17:21:11.729   INFO [$_NIOREACTOR-0-RW] (ShareJoin.java:159) -发送数据OK

 

PS:看到最后,一切流程正常了,没有看到问题所在,看来只有去看源代码了。

 

 

6,源码分析join

01.png:

 

02.png:

 

如果说。两次都获取到了,就正常了:

03.png:

 

那为啥2次获取不到呢?看着啊,我输出了点东西,04.png:

 

这就是原因:

private byte[] getRow(String value,int index){
for(Map.Entry<String,byte[]> e: arows.entrySet() ){
        String key=e.getKey();
        RowDataPacket rowDataPkg = ResultSetUtil.parseRowData(e.getValue(), afields);
        String id = ByteUtil.getString(rowDataPkg.fieldValues.get(index));
        if (id.equals(value)){
         System.out.println("getRow方法,结果集将A表join列值为【"+value+"】的集合remove掉了,再相查就查不到了");
         return arows.remove(key);
        }
     }
return null;
}
这里面的逻辑,处理比较麻烦,如果想改,还没想好怎么改……

 

再次描述下,不知道是否能描述清楚,大家都懂的,it人在讲述“专业高深技能”方面经常是一剑西来天外飞仙,场面恢宏绚丽无比,口如悬河滔滔不绝,就是太高大上了,普通人叹为观止一时无法领悟其中精髓……

这个还是接地气点,说的是他第一查询后,把对应key的值remove掉了,所以第二次查不到了,他查B表是查到4条,根据B表查询结果,取原先A表查询结果去组合返回结果,一条一条返回,可以第一次查的时候把A表对应值的记录去掉了,第二条(关联列值相同)查不到A表记录,他认为不符合join结果,就去掉了。个人理解应该i是判断是否符合join条件,和removekey那个值,分开处理才是的。个人建议。

 

--<>--这个sharejoin处理的很复杂,如果不稳定的话,建议抛弃,ER也建议抛弃

 

 

Bty:此处问题有akey提出,我做了数据准备工作日志分析,帅哥做了源码分析工作,victor大神积极参与,静哥等密切关注推进,最后我整理交流记录而成本篇笔记,准备提交git,谢谢大家的帮助。

3 0
原创粉丝点击