Mybatis之批量操作-Oracle

来源：互联网发布：易语言安装包源码编辑：程序博客网时间：2024/05/17 05:52

原文链接：http://www.cnblogs.com/robinjava77/p/5530681.html，作者名称：robin>

—————————————————

前言：做一个数据同步项目，要求：同步数据不丢失的情况下，提高插入性能。

项目DB框架：Mybatis。DataBase:Oracle。

—————————————————————————-

批量插入数据方式：

一、Mybatis 全局设置批处理；

二、Mybatis 局部设置批处理；

三、Mybatis foreach批量插入：

①SELECT UNION ALL；

②BEGIN INSERT INTO …;INSERT INTO…;…;END;

四、java自带的批处理插入；

五、其他方式

—————————————————————————–

先说结论：Mybatis(全局/局部)批处理和java自带的批处理性能上差不多，属于最优处理办法，我这边各种测试后，最后采用Mybatis局部批处理方式。

一、Mybatis 全局设置批处理

先上Spring-Mybatis.xml 配置信息

 1 <?xml version="1.0" encoding="UTF-8"?> 2 <beans xmlns="http://www.springframework.org/schema/beans" 3        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 4        xmlns:context="http://www.springframework.org/schema/context" xmlns:tx="http://www.springframework.org/schema/tx" 5        xsi:schemaLocation="http://www.springframework.org/schema/beans 6                            http://www.springframework.org/schema/beans/spring-beans.xsd 7                            http://www.springframework.org/schema/context 8                            http://www.springframework.org/schema/context/spring-context.xsd http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd"> 9 10     <!-- 自动扫描(自动注入) -->11     <context:annotation-config/>12     <context:component-scan base-package="com.company.dao"/>13 14     <!-- 动态数据源 -->15     <bean id="dataSource" class="com.company.dao.datasource.DataSource">16         <property name="myConfigFile" value="mySource.xml"/>17     </bean>18 19     <!-- mybatis配置 -->20     <bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean">21         <property name="dataSource" ref="dataSource"/>22         <property name="mapperLocations" value="classpath*:mapper/*/*/*.xml"/>23         <property name="configLocation" value="classpath:/mybatisConfig.xml"/>24     </bean>25 26     <!-- 自动创建映射器，不用单独为每个 mapper映射-->27     <bean class="org.mybatis.spring.mapper.MapperScannerConfigurer">28         <property name="basePackage" value="com.company.dao.mapper"/>29         <property name="sqlSessionFactoryBeanName" value="sqlSessionFactory"/>30     </bean>31 32     <!-- 事务管理器配置,单数据源事务 -->33     <bean id="transactionManager"34           class="org.springframework.jdbc.datasource.DataSourceTransactionManager">35         <property name="dataSource" ref="dataSource"/>36     </bean>37 38     <tx:annotation-driven transaction-manager="transactionManager"/>39 40 </beans>

Spring-Mybatis.xml

再上mybatisConfig.xml（在本项目中，我没有设置setting。最终采用的局部批处理，因此未设置全局批处理，具体原因后面再说。）

 1 <?mapper.xml version="1.0" encoding="UTF-8" ?> 2 <!DOCTYPE configuration PUBLIC "-//mybatis.org//DTD Config 3.0//EN" "http://mybatis.org/dtd/mybatis-3-config.dtd"> 3 <configuration> 4      5     <settings> 6         <!-- 配置默认的执行器。SIMPLE 就是普通的执行器；REUSE 执行器会重用预处理语句（prepared statements）； BATCH 执行器将重用语句并执行批量更新。--> 7         <setting name="defaultExecutorType" value="BATCH"/> 8         <!--详见：http://www.mybatis.org/mybatis-3/zh/configuration.html--> 9     </settings>10     11     <!-- 别名列表 -->12     <typeAliases>13        <!-- typeAliases 中的配置都是配置别名，在此就不贴出来了 -->14     </typeAliases>15 16 </configuration>

mybatisConfig.xml

这样子设置好后，在BaseService开放saveBatch(List<T> list)方法

 1 @Override 2     public void save(List<T> list) { 3         for (int i = 0;i < list.size();i++){ 4             mapper.insert(list.get(i)); 5         } 6     } 7  8     @Override 9     public void saveBatch(List<T> list) {10         int size = list.size();11         int unitNum = 500;12         int startIndex = 0;13         int endIndex = 0;14         while (size > 0){15             if(size > unitNum){16                 endIndex = startIndex+unitNum;17             }else {18                 endIndex = startIndex+size;19             }20             List<T> insertData = list.subList(startIndex,endIndex);21             save(insertData);22             size = size - unitNum;23             startIndex = endIndex;24         }25     }

BaseService.saveBatch(List list)

虽然看上去是500条记录，一次次INSERT INTO，但由于在全局已经设置Mybatis是批处理执行器，所以这500条INSERT INTO只会与Oracle数据库通信一次。

全局设置批处理的局限性在哪里呢？

先附上mybatis官方的讨论列表中最很关键的一句：“If the BATCH executor is in use, the update counts are being lost. ”

设置全局批处理后，DB里的insert、Update和delete方法，都无法返回进行DML影响DB_TABLE的行数。

1.insert 无法返回影响的行数，这个好解决，一个批处理放在一个事务里，记录批处理失败次数，总数-批处理失败次数*单位批处理数据量，就能得到insert 影响DB_TABLE的行数；

2.但是update和delete就无法很简单的去统计影响行数了，如果做反复查询，反而降低了效率，得不偿失。

虽现在的项目尚未有需要反馈影响DB_TABLE行数的需求，但是为了更灵活，我们放弃了全局批处理的方式。

!这里提个疑问：为什么Mybatis官方，不将批处理的选择方式下沉到方法级别？方便开发者根据实际情况，灵活选择。我觉得这是个可以改进的地方，如有机会，可看源码去进行改进。

———————————————————————————————————

二、Mybatis局部批处理方式

由于领导说全局批处理方式，不够灵活，不适宜项目所需，要另想办法支持。但是java自带的批处理，因为项目代码管理的要求，也不能采用。因此，在仔细阅读官方文档后，设想自己能否获取SQLSession后openSession，将这个会话设置为批处理呢？

先看MyBatis官方网站(须FanQiang)：http://www.mybatis.org/mybatis-3/zh/getting-started.html

1 SqlSession session = sqlSessionFactory.openSession();2 try {3   BlogMapper mapper = session.getMapper(BlogMapper.class);4   // do work5 } finally {6   session.close();7 }

官方建议的写法

后查阅Mybatis java API(须FanQiang): http://www.mybatis.org/mybatis-3/zh/java-api.html

现在你有一个 SqlSessionFactory,可以用来创建 SqlSession 实例。

SqlSessionFactory

SqlSessionFactory 有六个方法可以用来创建 SqlSession 实例。通常来说,如何决定是你选择下面这些方法时:

Transaction (事务): 你想为 session 使用事务或者使用自动提交(通常意味着很多数据库和/或 JDBC 驱动没有事务)?
Connection (连接): 你想 MyBatis 获得来自配置的数据源的连接还是提供你自己
Execution (执行): 你想 MyBatis 复用预处理语句和/或批量更新语句(包括插入和删除)

重载的 openSession()方法签名设置允许你选择这些可选中的任何一个组合。

1 SqlSession openSession()2 SqlSession openSession(boolean autoCommit)3 SqlSession openSession(Connection connection)4 SqlSession openSession(TransactionIsolationLevel level)5 SqlSession openSession(ExecutorType execType,TransactionIsolationLevel level)6 SqlSession openSession(ExecutorType execType)7 SqlSession openSession(ExecutorType execType, boolean autoCommit)8 SqlSession openSession(ExecutorType execType, Connection connection)9 Configuration getConfiguration();

官方提供的openSession方法

因此出来了局部批处理第一套代码实现方式：

 1 public static void sqlSession(List<Student> data) throws IOException { 2         String resource = "mybatis-dataSource.xml"; 3         InputStream inputStream = null; 4         SqlSession batchSqlSession = null; 5         try{ 6             inputStream = Resources.getResourceAsStream(resource); 7             SqlSessionFactory sqlSessionFactory = new SqlSessionFactoryBuilder().build(inputStream); 8             batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false); 9             int batchCount = 500;//每批commit的个数10             for(int index = 0; index < data.size();index++){11                 Student stu = data.get(index);12                 batchSqlSession.getMapper(Student.class).insert(stu);13                 if(index !=0 && index%batchCount == 0){14                     batchSqlSession.commit();15                 }16             }17             batchSqlSession.commit();18         }catch (Exception e){19             e.printStackTrace();20         }finally {21             if(batchSqlSession != null){22                 batchSqlSession.close();23             }24             if(inputStream != null){25                 inputStream.close();26             }27         }28     }

sqlSession(List data)

 1 <?xml version="1.0" encoding="UTF-8" ?> 2 <!DOCTYPE configuration 3   PUBLIC "-//mybatis.org//DTD Config 3.0//EN" 4   "http://mybatis.org/dtd/mybatis-3-config.dtd"> 5 <configuration> 6   <environments default="development"> 7     <environment id="development"> 8       <transactionManager type="JDBC"/> 9       <dataSource type="POOLED">10         <property name="driver" value="${driver}"/>11         <property name="url" value="${url}"/>12         <property name="username" value="${username}"/>13         <property name="password" value="${password}"/>14       </dataSource>15     </environment>16   </environments>17   <mappers>18     <mapper resource="org/mybatis/example/Student.xml"/>19   </mappers>20 </configuration>

mybatis-dataSource.xml

已经在Spring-Mybatis.xml 中配置了SQLSessionFactory，那我为何还要自己去创建SQLSessionFactory呢？因此继续改良代码

 1 public static void mybatisSqlSession(List<Student> data){ 2         DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory"); 3         SqlSession batchSqlSession = null; 4         try{ 5             batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false); 6             int batchCount = 500;//每批commit的个数 7             for(int index = 0; index < data.size();index++){ 8                 Student stu = data.get(index); 9                 batchSqlSession.getMapper(StudentMapper.class).insert(stu);10                 if(index !=0 && index%batchCount == 0){11                     batchSqlSession.commit();12                 }13             }14             batchSqlSession.commit();15         }catch (Exception e){16             e.printStackTrace();17         }finally {18             if(batchSqlSession != null){19                 batchSqlSession.close();20             }21         }22     }

mybatisSqlSession(List data)

这个版本的局部批处理插入是比较满意的，最终采用的方式也是这个版本。

下面放出在IService接口定义和Service的具体实现代码：

IService接口定义

1 /**2      * 批处理插入数据(方法内部定义500条为一个批次进行提交)3      * 使用注意事项：必须在XxxMappper.xml中实现<insert id="insert" ...>....<insert/>的sql4      * @param data 批量插入的数据5      * @param mClass 调用的XxxMaperr.class6      * @auth robin7      * Created on 2016/3/148      */9     void saveBatch(List<T> data,Class mClass);

saveBatch(List data,Class mClass)

Service实现

 1 @Override 2     public void saveBatch(List<T> data,Class mClass) { 3         DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory"); 4         SqlSession batchSqlSession = null; 5         try{ 6             batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false); 7             int batchCount = 500;//每批commit的个数 8             for(int index = 0; index < data.size();index++){ 9                 T t = data.get(index);10                 ((BaseMapper)batchSqlSession.getMapper(mClass)).insert(t);11                 if(index !=0 && index%batchCount == 0){12                     batchSqlSession.commit();13                 }14             }15             batchSqlSession.commit();16         }catch (Exception e){17             e.printStackTrace();18         }finally {19             if(batchSqlSession != null){20                 batchSqlSession.close();21             }22         }23     }

saveBatch(List data,Class mClass)

局部和全局批处理插入对比：局部批处理，可以对特定一类的方法，进行数据批处理，不会影响其他DML语句，其他DML语句，可以正常返回影响DB_TABLE的行数。

!这样既能针对特殊需求(批处理)支持，也能支持未来需要返回影响数据行的要求。

注意：使用批处理方式进行DML操作，是无法反馈影响DB_TABLE行数的数据。无论是局部批处理还是java自带的批处理方式，皆无法反馈DB_TABLE count。

补充完善：

在我的Service实现中，通过注入的方式，获取mapper的实例

 1 public class BaseService<MAPPER extends BaseMapper, T, PK extends Serializable> implements IBaseService<T, PK> { 2  3     protected T tt; 4     /** 5      * 实体操作的自动注入Mapper(随初始化一同注入，必须用set方法) 6      */ 7     protected MAPPER mapper; 8  9     public MAPPER getMapper() {10         return mapper;11     }12 13     @Autowired14     public void setMapper(MAPPER mapper) {15         this.mapper = mapper;16     }17    //后续代码略18 }

Service

前面的Service saveBatch方法中，还需要传入指定的Mapper.class.对本项目其他开发者来说，与之前的环境相比，多传一个参数感觉别扭。

那么为何我不继续封装，外部无需传入Mapper.class，而是通过内部注入的mapper实例获取Mapper.class.
改良后的代码：

 1 @Override 2     public T saveBatch(List<T> data) { 3         T back = null; 4         DefaultSqlSessionFactory sqlSessionFactory = (DefaultSqlSessionFactory) ServiceBeanConstant.CTX.getBean("sqlSessionFactory"); 5         SqlSession batchSqlSession = null; 6         try{ 7             batchSqlSession = sqlSessionFactory.openSession(ExecutorType.BATCH, false); 8             int batchCount = 500;//每批commit的个数 9             for(int index = 0; index < data.size();index++){10                 T t = data.get(index);11                 back = t;12                 Class<?>[] interfaces=mapper.getClass().getInterfaces();13                 Class clazz = null;14                 for (int i=0;i<interfaces.length;i++){15                     if(BaseMapper.class.isAssignableFrom(interfaces[i])){16                         clazz = interfaces[i];17                     }18                 }19                 if(clazz == null){20                     throw new Exception("user-defined exception:mapper not implements interfaces com.company.dao.mapper.BaseMapper");21                 }22                 BaseMapper baseMapper = (BaseMapper) batchSqlSession.getMapper(clazz);23                 baseMapper.insert(t);24                 if(index !=0 && index%batchCount == 0){25                     batchSqlSession.commit();26                 }27             }28             batchSqlSession.commit();29         }catch (Exception e){30             e.printStackTrace();31         }finally {32             if(batchSqlSession != null){33                 batchSqlSession.close();34             }35             return back;36         }37     }

saveBatch(List data)

这里对mapper实例进行一个简短的说明：

1.mapper实例是通过java动态代理来实例化的；

2.mapper的SQLSession是使用mybatis统一的配置实例的；

3.mapper的默认执行器是SIMPLE（普通的执行器）;

————————————————————————————-

三、Mybatis foreach批量插入

Mybatis foreach 批量插入，如果批量插入的数据量大，不得不说这真是一个非常糟糕的做法。

无论是SELECT ** UNION ALL 还是BEGIN …;END; ，相对而言后者比前者稍微好点。

放出DB和我测试的结果：

耗时占当时整个数据库CPU百分比说明15.598.33union all方式拼接插入16.497.75begin end方式插入块1.5464.81java 自带的batch方式插入

①foreach union all的批量插入，现已有大量的博客资源可供参考，我就不贴出自己的实现方式了。

如果有兴趣可以参阅：http://blog.csdn.net/sanyuesan0000/article/details/19998727 (打开浏览器，复制url)

这篇博客。BEGIN END的方式，也是从这篇博客中得到启发。只不过他是把BEGIN END用在update中。

②foreach begin end 语句块

我的实现：

1 <insert id="insertBatch" parameterType="java.util.List">2            BEGIN3            <foreach collection="list" item="item" index="index" separator=";" >4                INSERT INTO TABLE.STUDENT (ID,AGE,NAME,STU_ID) VALUES5                ( DEMO.SEQ_EID.NEXTVAL,#{item.age},#{item.name},#{item.stuId} )6            </foreach>7            ;END ;8        </insert>

insertBatch

调用方式：

 1 @Override 2     public void saveBatch(List<T> list) { 3         int size = list.size(); 4         int unitNum = 500; 5         int startIndex = 0; 6         int endIndex = 0; 7         while (size > 0){ 8             if(size > unitNum){ 9                 endIndex = startIndex+unitNum;10             }else {11                 endIndex = startIndex+size;12             }13             List<T> insertData = list.subList(startIndex,endIndex);14             mapper.insertBatch(insertData);15             size = size - unitNum;16             startIndex = endIndex;17         }

saveBatch(List list)

———————————————————————

四、java自带的批处理方式

废话不多说，直接上代码

 1 package DB; 2  3 import base.Student; 4  5 import java.sql.Connection; 6 import java.sql.DriverManager; 7 import java.sql.PreparedStatement; 8 import java.sql.SQLException; 9 import java.util.ArrayList;10 import java.util.List;11 12 /**13  * Created by robin on 2016/5/23.14  *15  * @author robin16  */17 public class InsertTableDemo {18 19     public static void main(String args[]) throws SQLException {20         Connection connection = null;21         List<Student> dataList = getDataList(100000);22         long startTime = 0;23         try{24             connection = getConn();25             startTime=System.currentTimeMillis();26             connection.setAutoCommit(false);27             PreparedStatement statement = connection.prepareStatement("INSERT INTO STUDENT (ID,AGE,NAME,STU_ID) VALUES ( DEMO.SEQ_EID.NEXTVAL, ?,?,? ) ");28             int num = 0;29             for (int i = 0;i< dataList.size();i++){30                 Student s = dataList.get(i);31                 statement.setInt(1, s.getAge());32                 statement.setString(2, s.getName());33                 statement.setString(3, s.getStuId());34                 statement.addBatch();35                 num++;36                 if(num !=0 && num%500 == 0){37                     statement.executeBatch();38                     connection.commit();39                     num = 0;40                 }41             }42             statement.executeBatch();43             connection.commit();44         }catch (Exception e){45             e.printStackTrace();46             connection.rollback();47         }finally {48             if(connection != null){49                 connection.close();50             }51             long endTime=System.currentTimeMillis();52             System.out.println("方法执行时间："+(endTime-startTime)+"ms");53         }54 55     }56 57     public static Connection getConn(){58         String driver = "oracle.jdbc.driver.OracleDriver";59         String url = "jdbc:oracle:thin:@//ip:port/DMEO"; //DMEO为数据库名60         String user = "user";61         String password = "pwd";62         try{63             Class.forName(driver);64             Connection conn = DriverManager.getConnection(url, user, password);65             return conn;66         } catch (ClassNotFoundException e) {67             e.printStackTrace();68         } catch (SQLException e) {69             e.printStackTrace();70         }71         return null;72     }73     public static List<Student> getDataList(int f){74         List<Student> data = new ArrayList<>();75         for (int i =0;i<f;i++){76             Student s = new Student(""+i,"小明" + i,i);77             data.add(s);78         }79         return data;80     }81 82 83 }

JDBC BATCH

这种批量插入大量数据的方式，性能上最好。但是因为我们小组代码管理所限制，因此这种方式不使用。

————————————————————————

五、其他方式

现在已经忘了，其他方式到底使用过哪些，但总归是比以上四种效果都更差，所以没什么印象了。

如果各位，还有什么其他更好的批量插入数据的方式，欢迎加入讨论，集思广益。

以上就是这两天，对在原项目基础上不进行大变动的基础上，提供批处理插入数据的所思所行。

——————————————————————————-

后记：

这里吐槽一句：希望大家不要把未经过自己验证的东西，言之凿凿地写到博客中去。

在我做批处理这件事的时候，领导也在参阅网上的博客。其中有一篇博客，说自己在oracle中批量插入数据，采用foreach insert into (field1,field2,…) values (v11,v12,…),(v21,v22,…) ,(v31,v32,…),…也可以。

虽然我明知不行，但是无可奈何还是要去演示给领导看，在oracle中，这种写法确实不适用。

领导问我为何他说可以，我想我也只能回答：他抄别人的博客呗，抄来抄去都不自己实践验证，就想当然地写到博客里。

所以，如果你看完了我这篇分享，希望您也能写个demo验证下，起码可以加深自己的理解。

感谢领导和DB同事，在此过程中的帮助。

以上内容，都经过本人实践验证过。若转发，请在标题上标记[转]，并注明原文链接：http://www.cnblogs.com/robinjava77/p/5530681.html，作者名称：robin。并在文章最后一行附上本句话。否则，作者保留追究的权利。

阅读全文

1 0