A First Look at SQL Efficiency

Source: Internet  Editor: 程序博客网  Date: 2024/05/29 14:26

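I ran into a problem during recent development work. I'm sharing it here, and I'd also welcome a better solution.

Background: a monthly table tbl_month (code, month, type, state, fee), where (code, type, state, month) is the composite primary key. The table has other columns that I won't list here, and it holds tens of millions of rows.

Task: given the three fields (code, type, state) submitted from the page plus a month range (startMonth, endMonth), update the fee column of the matching rows.

Additional detail: the page lets the user tick multiple rows and act on them at once, so the operation has to handle rows in batches. That requirement is what led to this post.

Suppose part of the data looks like this (only a few rows, enough to show the problem; a real operation handles on the order of tens of thousands of rows):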

code, type, state, fee,    month,  row no.
001,  t1,   s1,    100.00, 201201, 1
001,  t1,   s2,    100.00, 201202, 2
001,  t2,   s1,    100.00, 201201, 3
001,  t2,   s2,    100.00, 201202, 4
002,  t1,   s1,    100.00, 201201, 5
002,  t1,   s2,    100.00, 201202, 6
002,  t2,   s1,    100.00, 201201, 7
002,  t2,   s2,    100.00, 201202, 8

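Now suppose we tick the four rows numbered (1, 4, 6, 7). Look closely at what makes these four rows special: they cannot be selected with independent lists such as code in(...), type in(...), and state in(...), because that combination would match all eight rows.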

The data sent from the page has this format: (001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1\t201201\t201202)
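As a minimal sketch of how that payload could be taken apart, assuming the `\t` in the format stands for a real tab character: the class name `PagePayload` and its fields are hypothetical and not taken from the original code.

```java
public class PagePayload {
    final String keys;        // "code,type,state;code,type,state;..."
    final String startMonth;  // e.g. "201201"
    final String endMonth;    // e.g. "201202"

    // Assumption: the three parts are separated by tab characters.
    PagePayload(String raw) {
        String[] parts = raw.split("\t");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected keys, startMonth, endMonth");
        }
        keys = parts[0];
        startMonth = parts[1];
        endMonth = parts[2];
    }

    public static void main(String[] args) {
        PagePayload p = new PagePayload(
            "001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1\t201201\t201202");
        System.out.println(p.keys);
        System.out.println(p.startMonth + " to " + p.endMonth);
    }
}
```

The key list is kept as one string here because the later steps split it on ";" and "," themselves.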

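The backend first has to query these four rows and then update them. You may wonder why the query is needed at all; please set that question aside and take it as given that the query is required.

A very inefficient way to write it: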

select code,type,state,month
from tbl_month
where month>='201201' and month<='201202'
and (
(code='001' and type='t1' and state='s1')
or (code='001' and type='t2' and state='s2')
or (code='002' and type='t1' and state='s2')
or (code='002' and type='t2' and state='s1')
)

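I wrote it this way because it needs very little Java code: split the string "001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1" on ";", split each piece on ",", loop over the result, and the SQL statement is assembled. It is easy to implement and the first thing that comes to mind, but testing showed that it runs far too slowly.

Please don't laugh: I don't have much SQL experience yet and am still at the stage of just being able to use it.

Then I thought of grouping the data by type and state and storing it in a map, with key type;state and value code (the codes of that group, joined so they can go straight into an in(...) list).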
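The split-and-loop assembly described above might look roughly like the following. This is a sketch only; the class and method names (`NaiveQueryBuilder`, `build`) are made up for illustration, and the values are concatenated directly into the SQL text the way the article describes.

```java
public class NaiveQueryBuilder {
    // Builds the single OR-based query from "code,type,state;code,type,state;...".
    static String build(String keys, String startMonth, String endMonth) {
        StringBuilder sql = new StringBuilder(
            "select code,type,state,month from tbl_month"
            + " where month>='" + startMonth + "' and month<='" + endMonth + "' and (");
        String[] rows = keys.split(";");
        for (int i = 0; i < rows.length; i++) {
            String[] f = rows[i].split(","); // f[0]=code, f[1]=type, f[2]=state
            if (i > 0) {
                sql.append(" or ");
            }
            sql.append("(code='").append(f[0])
               .append("' and type='").append(f[1])
               .append("' and state='").append(f[2]).append("')");
        }
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(build(
            "001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1", "201201", "201202"));
    }
}
```

Every selected row adds one more OR branch, so with tens of thousands of selected rows the statement carries tens of thousands of branches.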

Code:

import java.util.HashMap;
import java.util.Map;

// str holds the selected keys, e.g. "001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1"
String[] strs = str.split(";");
Map<String, String> map = new HashMap<>();
for (String row : strs) {
    String[] s = row.split(","); // s[0]=code, s[1]=type, s[2]=state
    String key = s[1] + ";" + s[2];
    // Codes sharing a (type, state) pair are joined with "','" for the in(...) list
    map.merge(key, s[0], (a, b) -> a + "','" + b);
}

After grouping, assemble the SQL:

StringBuilder sql = new StringBuilder();
for (Map.Entry<String, String> entry : map.entrySet()) {
    if (sql.length() > 0) {
        sql.append(" union all ");
    }
    String[] key = entry.getKey().split(";");    // key[0]=type, key[1]=state
    String value = "'" + entry.getValue() + "'"; // e.g. '001','002'
    sql.append("(select code,type,state,month from tbl_month")
       .append(" where month>='201201' and month<='201202'")
       .append(" and code in(").append(value).append(")")
       .append(" and type='").append(key[0])
       .append("' and state='").append(key[1]).append("')");
}

The SQL assembled this way is:

(select code,type,state,month from tbl_month
where month>='201201' and month<='201202'
and code in('001') and type='t1' and state='s1')
union all
(...)
union all
(...)
union all
(...)

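The SQL built this way is much longer, but it runs much faster. The number of union all clauses depends on how "scattered" the selected rows are, that is, on how many distinct (type, state) combinations they contain. Because I don't know SQL well enough, I'm not sure whether the number of union all clauses affects the overall execution efficiency of the statement.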
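To make the "scattered" point concrete, here is a small sketch that counts how many branches the grouped query would contain for a given selection. The class and method names are hypothetical; the count is simply the number of distinct type;state keys, and the number of union all separators is one less than that.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class UnionBranchCount {
    // One branch per distinct (type, state) pair in "code,type,state;..." input.
    static int branches(String keys) {
        Set<String> groups = new LinkedHashSet<>();
        for (String row : keys.split(";")) {
            String[] f = row.split(","); // f[0]=code, f[1]=type, f[2]=state
            groups.add(f[1] + ";" + f[2]);
        }
        return groups.size();
    }

    public static void main(String[] args) {
        // Rows 1, 4, 6, 7 from the example: every row has its own (type, state) pair.
        System.out.println(branches("001,t1,s1;001,t2,s2;002,t1,s2;002,t2,s1")); // prints 4
        // Rows 1 and 5: same (type, state), so they collapse into one branch.
        System.out.println(branches("001,t1,s1;002,t1,s1")); // prints 1
    }
}
```

The four example rows are a worst case for grouping: no two share a (type, state) pair, so nothing collapses.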

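The update follows the same grouping idea. The only difference is that the update statements cannot be combined into one and have to be executed in a loop, so what used to be a single update statement becomes several. Even so, testing showed that the multiple statements are still much faster than the original single statement.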
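The article does not show the update statements themselves, so the following is only a guess at their shape: one update per (type, state) group, mirroring the grouped select. The `newFee` parameter and the `GroupedUpdateBuilder` class are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupedUpdateBuilder {
    // Returns one update statement per distinct (type, state) pair.
    // Assumption: newFee is a numeric literal supplied by the caller.
    static List<String> build(String keys, String startMonth, String endMonth, String newFee) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String row : keys.split(";")) {
            String[] f = row.split(","); // f[0]=code, f[1]=type, f[2]=state
            groups.computeIfAbsent(f[1] + ";" + f[2], k -> new ArrayList<>()).add(f[0]);
        }
        List<String> statements = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            String[] ts = e.getKey().split(";"); // ts[0]=type, ts[1]=state
            statements.add("update tbl_month set fee=" + newFee
                + " where month>='" + startMonth + "' and month<='" + endMonth + "'"
                + " and code in('" + String.join("','", e.getValue()) + "')"
                + " and type='" + ts[0] + "' and state='" + ts[1] + "'");
        }
        return statements;
    }

    public static void main(String[] args) {
        for (String sql : build("001,t1,s1;002,t1,s1;001,t2,s2", "201201", "201202", "120.00")) {
            System.out.println(sql);
        }
    }
}
```

Each returned string would then be executed in turn, which matches the loop of several update statements described above.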