Hibernate Batch Processing and the second-level cache: a small dilemma

Source: Internet. Edited by: 程序博客网. Time: 2024/06/03 17:01

Let's start with an example from the official documentation.


Example 4.1. Naive way to insert 100000 lines with Hibernate

Transaction tx = session.beginTransaction();
for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
}
tx.commit();
session.close();


This fails with exception OutOfMemoryException after around 50000 rows on most systems. The reason is that Hibernate caches all the newly inserted Customer instances in the session-level cache. There are several ways to avoid this problem.

 

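This is a batch-processing example. Running it throws an exception because the session (first-level) cache keeps every saved instance until memory runs out.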

 

Here are some of the solutions the official documentation offers:


Example 4.2. Flushing and clearing the Session

Transaction tx = session.beginTransaction();
for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
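The flush condition in this loop deserves a second look. With `i % 20 == 0` the first flush fires at i = 0, after a single save, so every later flush is shifted by one relative to the batch size. The sketch below is plain Java with no Hibernate in it: it only counts how many saved entities would be pending at each flush. `FlushCadence`, `flushSizes` and the `offset` parameter are hypothetical names made up for this illustration, and it assumes the leftover entities are flushed when the transaction commits.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the save/flush loop; no Hibernate involved.
// FlushCadence, flushSizes and offset are hypothetical names for illustration.
class FlushCadence {

    // Returns the number of pending entities at each flush.
    // offset = 0 models the condition from the example: i % batchSize == 0.
    // offset = 1 models a flush after every batchSize-th save: (i + 1) % batchSize == 0.
    static List<Integer> flushSizes(int total, int batchSize, int offset) {
        List<Integer> sizes = new ArrayList<>();
        int pending = 0;                          // entities saved since the last flush
        for (int i = 0; i < total; i++) {
            pending++;                            // stands in for session.save(customer)
            if ((i + offset) % batchSize == 0) {
                sizes.add(pending);               // stands in for flush() + clear()
                pending = 0;
            }
        }
        if (pending > 0) {
            sizes.add(pending);                   // remainder, assumed flushed at commit
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(flushSizes(50, 20, 0)); // [1, 20, 20, 9]
        System.out.println(flushSizes(50, 20, 1)); // [20, 20, 10]
    }
}
```

Either way every entity gets written, so the example from the documentation is not wrong; the second variant just keeps each flush aligned with a full batch.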

 

The documentation also recommends enabling JDBC batching:

If you are undertaking batch processing you will need to enable the use of JDBC batching. This is absolutely essential if you want to achieve optimal performance. Set the JDBC batch size to a reasonable number (10-50, for example):

  hibernate.jdbc.batch_size 20   

Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.

 

But here is where the dilemma starts:

You can also do this kind of work in a process where interaction with the second-level cache is completely disabled:

  hibernate.cache.use_second_level_cache false

However, this is not absolutely necessary, since we can explicitly set the CacheMode to disable interaction with the second-level cache.

 session.setCacheMode(CacheMode.IGNORE);
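For the process-wide route, the two properties quoted so far can be collected in code. This is only a minimal sketch using `java.util.Properties`: the property names come from the documentation quoted above, while `BatchSettings` and `batchProperties` are hypothetical names, and how the resulting object reaches Hibernate is left open here.

```java
import java.util.Properties;

// Collects the two settings quoted above into a java.util.Properties object.
// BatchSettings and batchProperties are hypothetical names for illustration.
class BatchSettings {

    static Properties batchProperties(int batchSize) {
        Properties props = new Properties();
        // JDBC batch size; the documentation suggests a value between 10 and 50.
        props.setProperty("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Switch off the second-level cache for the whole process.
        props.setProperty("hibernate.cache.use_second_level_cache", "false");
        return props;
    }

    public static void main(String[] args) {
        // Print each setting in the "name value" form used in the text.
        batchProperties(20).forEach((name, value) -> System.out.println(name + " " + value));
    }
}
```

I am assuming these would then be passed to whatever bootstraps the SessionFactory, or simply written into hibernate.properties. The per-session alternative is the `session.setCacheMode(CacheMode.IGNORE)` call shown above.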

 

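Clearly, Hibernate recommends turning off the second-level cache when doing batch processing. But then the cache never gets updated, does it? That means later reads may return stale data unless the cache is cleared by hand. So why would the documentation recommend this?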

 

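After a long conversation with the great gods Google and Baidu, plus reading some of the code, I reached the following conclusions:

*I mostly use Hibernate through objects and rarely write HQL or SQL, so the conclusions below may only apply to object-based usage.*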

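1. Batch inserts

Here the cache should definitely be turned off. The rows are new, and with the cache on you pile up entries that perhaps nobody will ever read. Caches today are usually configured to drop entries after some idle time anyway, so it works well to skip the cache, go straight to the database, and let entries be cached when they are actually accessed.

Of course, if your system works on the data right after the batch insert, you can keep the second-level cache on, but watch out for running out of memory.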

 

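2. Batch updates

This one is a real dilemma. On the one hand I want the batch update to update the cache as well; on the other hand I don't want to fill the cache with useless entries. You can't have both, so weigh the trade-offs against your actual situation.

If you choose to turn the cache off, you need to evict the affected cached entries first. There is usually no need to clear the whole cache.

Cache management example: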

Example 6.5. Second-level cache eviction

//You can evict the cached state of an instance, entire class, collection instance or entire collection role, using methods of SessionFactory.
sessionFactory.getCache().containsEntity(Cat.class, catId); // is this particular Cat currently in the cache
sessionFactory.getCache().evictEntity(Cat.class, catId); // evict a particular Cat
sessionFactory.getCache().evictEntityRegion(Cat.class); // evict all Cats
sessionFactory.getCache().evictEntityRegions(); // evict all entity data
sessionFactory.getCache().containsCollection("Cat.kittens", catId); // is this particular collection currently in the cache
sessionFactory.getCache().evictCollection("Cat.kittens", catId); // evict a particular collection of kittens
sessionFactory.getCache().evictCollectionRegion("Cat.kittens"); // evict all kitten collections
sessionFactory.getCache().evictCollectionRegions(); // evict all collection data
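
To make the stale-read worry concrete, here is a toy model of a read-through cache sitting in front of a "database". It is plain Java maps, not Hibernate, and every name in it (`ToyCache`, `read`, `writeBypassingCache`, `evict`) is hypothetical. Whether a given Hibernate write really skips the cache depends on the CacheMode and on how the write is issued; the model only covers the case where it does.

```java
import java.util.HashMap;
import java.util.Map;

// Toy read-through cache in front of a "database"; plain maps, not Hibernate.
// All names here are hypothetical and exist only for this illustration.
class ToyCache {
    private final Map<Integer, String> database = new HashMap<>();
    private final Map<Integer, String> cache = new HashMap<>();

    // Read-through: serve from the cache if present, otherwise load and cache.
    String read(int id) {
        return cache.computeIfAbsent(id, database::get);
    }

    // A write that reaches the database without touching the cache.
    void writeBypassingCache(int id, String value) {
        database.put(id, value);
    }

    // Rough counterpart of evicting one entity from the second-level cache.
    void evict(int id) {
        cache.remove(id);
    }

    public static void main(String[] args) {
        ToyCache c = new ToyCache();
        c.writeBypassingCache(1, "old");
        System.out.println(c.read(1)); // old (loaded from the database, now cached)
        c.writeBypassingCache(1, "new");
        System.out.println(c.read(1)); // old (stale entry served from the cache)
        c.evict(1);
        System.out.println(c.read(1)); // new (reloaded after eviction)
    }
}
```

Evicting only the ids touched by the batch is enough in this model, which matches the point above that clearing the whole cache is usually unnecessary.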

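3. Batch deletes

Personally I think the cache should stay on here. Hibernate first locks the cache and removes the corresponding entries, and only then synchronizes with the database. That should make the synchronization more efficient, since changing in-memory data is fast, and later operations go to the cache first.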

 

If anything above is wrong or poorly put, corrections are welcome.
