工作总结6 JXL Excel性能优化

来源:互联网 发布:nginx隐藏真实ip 编辑:程序博客网 时间:2024/05/17 09:22
JXL的性能问题分析

 对于韩国人写得jxl的问题

    最近用jxl做得模板导出数据出现了严重的性能问题:

    (1)在main方法中运行此业务逻辑时间大约1.662秒,打印时间如图:


    虚拟机运行情况如图:


    (2)在Tomcat中运行此业务逻辑时间大约62.182秒,打印时间如图:


    虚拟机运行情况如图:


    (3)分析情况

    通过数据和图表认真分析,main方法之所以运行快,因为JVM里面没有大量的垃圾回收,占用的内存又不是很大,jxl进行垃圾回收时,占用的消耗时间不大。

    在tomcat运行中,可以看到JVM里面程序占用的内存比较大,有垃圾存在,每次进行垃圾回收,都会占用大量的时间。

通过数据分析,main方法的垃圾回收时间很多安基本上忽略不计,而在tomcat里,由于模板总共有47个模板工作表,每个工作表获取数据,要调用3次垃圾回事,每次大约用时0.4秒多,每个工作表用时1.3秒以上,通过计算,大约是60秒以上,可见对整个数据导出占用了大量的时间。

    (4)源码分析

    代码调用处:

                       Workbook wb = Workbook.getWorkbook(file);

                       WritableWorkbook wwb = Workbook.createWorkbook(new File(dstFile),wb);

    此程序相当于打开一个文件的副本,进行修改数据,然后回写到源文件

     产生慢的原因:

     产生一个workbook,读取文件

/**
* A factory method which takes in an excel file and reads in the contents.
*
* @exception IOException
* @exception BiffException
* @param file the excel 97 spreadsheet to parse
* @return a workbook instance
*/
public static Workbook getWorkbook(java.io.File file)
throws IOException, BiffException
{
return getWorkbook(file, new WorkbookSettings());
}

下面这个类的一些参数就是在读取或写入工作簿时用到的(根据一些参数处理)

/**
* This is a bean which client applications may use to set various advanced
* workbook properties. Use of this bean is not mandatory, and its absence
* will merely result in workbooks being read/written using the default
* settings
*/
public final class WorkbookSettings

看一下它的构造器:

/**
* Default constructor
*/
public WorkbookSettings()
{
initialFileSize = DEFAULT_INITIAL_FILE_SIZE;
arrayGrowSize = DEFAULT_ARRAY_GROW_SIZE;
localeFunctionNames = new HashMap();
excelDisplayLanguage = CountryCode.USA.getCode();
excelRegionalSettings = CountryCode.UK.getCode();
refreshAll = false;
template = false;
excel9file = false;
windowProtected = false;
hideobj = HIDEOBJ_SHOW_ALL;

// Initialize other properties from the system properties
try
{
boolean suppressWarnings = Boolean.getBoolean("jxl.nowarnings");
setSuppressWarnings(suppressWarnings);
drawingsDisabled = Boolean.getBoolean("jxl.nodrawings");
namesDisabled = Boolean.getBoolean("jxl.nonames");
gcDisabled = Boolean.getBoolean("jxl.nogc");
rationalizationDisabled = Boolean.getBoolean("jxl.norat");
mergedCellCheckingDisabled =
Boolean.getBoolean("jxl.nomergedcellchecks");
formulaReferenceAdjustDisabled =
Boolean.getBoolean("jxl.noformulaadjust");
propertySetsDisabled = Boolean.getBoolean("jxl.nopropertysets");
ignoreBlankCells = Boolean.getBoolean("jxl.ignoreblanks");
cellValidationDisabled = Boolean.getBoolean("jxl.nocellvalidation");
autoFilterDisabled = !Boolean.getBoolean("jxl.autofilter");
// autofilter currently disabled by default
useTemporaryFileDuringWrite =
Boolean.getBoolean("jxl.usetemporaryfileduringwrite");
String tempdir =
System.getProperty("jxl.temporaryfileduringwritedirectory");

if (tempdir != null)
{
temporaryFileDuringWriteDirectory = new File(tempdir);
}

encoding = System.getProperty("file.encoding");
}
catch (SecurityException e)
{
logger.warn("Error accessing system properties.", e);
}

// Initialize the locale to the system locale
try
{
if (System.getProperty("jxl.lang") == null ||
System.getProperty("jxl.country") == null)
{
locale = Locale.getDefault();
}
else
{
locale = new Locale(System.getProperty("jxl.lang"),
System.getProperty("jxl.country"));
}

if (System.getProperty("jxl.encoding") != null)
{
encoding = System.getProperty("jxl.encoding");
}
}
catch (SecurityException e)
{
logger.warn("Error accessing system properties.", e);
locale = Locale.getDefault();
}
}

其中, gcDisabled = Boolean.getBoolean("jxl.nogc");这个是读取这个是否垃圾回收的参数,默认是false

就是这个这个参数导致读取sheet对象非常的慢,下面会继续进行分析。

Workbook.createWorkbook(new File(dstFile),wb); 这个方法调用了WorkBook

咱们来进行一些简单的分析:

/**
* Creates a writable workbook with the given filename as a copy of
* the workbook passed in. Once created, the contents of the writable
* workbook may be modified
*
* @param file the output file for the copy
* @param in the workbook to copy
* @return a writable workbook
* @exception IOException
*/
public static WritableWorkbook createWorkbook(java.io.File file,
Workbook in)
throws IOException
{
return createWorkbook(file, in, new WorkbookSettings());
}

这个类相当于产生对Excel修改对象,产生的是一个原文件副本的修改回写

/**
* Creates a writable workbook with the given filename as a copy of
* the workbook passed in. Once created, the contents of the writable
* workbook may be modified
*
* @param file the output file for the copy
* @param in the workbook to copy
* @param ws the configuration for this workbook
* @return a writable workbook
*/
public static WritableWorkbook createWorkbook(java.io.File file,
Workbook in,
WorkbookSettings ws)
throws IOException
{
FileOutputStream fos = new FileOutputStream(file);
WritableWorkbook w = new WritableWorkbookImpl(fos, in, true, ws);
return w;
}

此处分析下:WritableWorkbook w = new WritableWorkbookImpl(fos, in, true, ws);中的WritableWorkbookImpl类。

/**
* A pseudo copy constructor. Takes the handles to the font and formatting
* records
*
* @exception IOException
* @param w the workbook to copy
* @param os the output stream to write the data to
* @param cs TRUE if the workbook should close the output stream, FALSE
* @param ws the configuration for this workbook
*/
public WritableWorkbookImpl(OutputStream os,
Workbook w,
boolean cs,
WorkbookSettings ws) throws IOException
{。。。。。。。。。。。。。。。。。。。}

这个构造方法中调用了copyWorkbook(w);,实现了对原Excel文件数据对象的拷贝,相当于一个副本

/**
* Produces a writable copy of the workbook passed in by
* creating copies of each sheet in the specified workbook and adding
* them to its own record
*
* @param w the workbook to copy
*/
private void copyWorkbook(Workbook w)
{
int numSheets = w.getNumberOfSheets();
wbProtected = w.isProtected();
Sheet s = null;
WritableSheetImpl ws = null;
long start= System.currentTimeMillis();
for (int i = 0 ; i < numSheets; i++)
{
long starts= System.currentTimeMillis();
s = w.getSheet(i);
System.out.println("+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++"+( System.currentTimeMillis() - starts)/1000.0+" 秒");
long start3= System.currentTimeMillis();
ws = (WritableSheetImpl) createSheet(s.getName(),i, false);
System.out.println("~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"+s.getName());
System.out.println("+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++"+( System.currentTimeMillis() - start3)/1000.0+" 秒");
long start2= System.currentTimeMillis();
ws.copy(s);
System.out.println("@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"+( System.currentTimeMillis() - start2)/1000.0+" 秒");
}
long start1= System.currentTimeMillis();
System.out.println("********************************************************************"+(start1 - start)/1000.0+" 秒");
}

分析下这个影响性能的地方:s = w.getSheet(i);

sheet的实现类是:SheetImpl (public class SheetImpl implements Sheet)

这个类的构造器:

SheetImpl(File f,
SSTRecord sst,
FormattingRecords fr,
BOFRecord sb,
BOFRecord wb,
boolean nf,
WorkbookParser wp)
throws BiffException
{
excelFile = f;
sharedStrings = sst;
formattingRecords = fr;
sheetBof = sb;
workbookBof = wb;
columnInfosArray = new ArrayList();
sharedFormulas = new ArrayList();
hyperlinks = new ArrayList();
rowProperties = new ArrayList(10);
columnInfosInitialized = false;
rowRecordsInitialized = false;
nineteenFour = nf;
workbook = wp;
workbookSettings = workbook.getSettings();

}中的workbookSettings = workbook.getSettings();就是WorkBook的参数的传递,里面有垃圾回收的控制参数

看一下WorkBook具体实现类WorkbookParser获取sheet的方法:

/**
* Gets the specified sheet within this workbook
*
* @param index the zero based index of the required sheet
* @return The sheet specified by the index
*/
public Sheet getSheet(int index)
{
// First see if the last sheet index is the same as this sheet index.
// If so, then the same sheet is being re-requested, so simply
// return it instead of rereading it
if ((lastSheet != null) && lastSheetIndex == index)
{
return lastSheet;
}

// Flush out all of the cached data in the last sheet
if (lastSheet != null)
{
lastSheet.clear();
if (!settings.getGCDisabled())
{
System.gc();
}
}

lastSheet = (SheetImpl) sheets.get(index);
lastSheetIndex = index;
lastSheet.readSheet();

return lastSheet;
}

从此方法中可以看出调用了垃圾回收 if (!settings.getGCDisabled()){
System.gc();
}

lastSheet.clear();中看下clear方法也调用垃圾回收方法

final void clear()
{
cells = null;
mergedCells = null;
columnInfosArray.clear();
sharedFormulas.clear();
hyperlinks.clear();
columnInfosInitialized = false;

if (!workbookSettings.getGCDisabled())
{
System.gc();
}
}

在readSheet()方法中也调用了垃圾回收方法

/**
* Reads in the contents of this sheet
*/
final void readSheet()

if (!workbookSettings.getGCDisabled())
{
System.gc();
}

可以源码分析到复制副本,一个sheet总共用了3次垃圾回收,导致时间消耗很大。

为了提高性能,做了如下程序的修改:

jxl在读取excel的时候,关闭了手工垃圾回收

System.setProperty("jxl.nogc","true");

或者

fis = new FileInputStream(file);
WorkbookSettings wbs = new WorkbookSettings();
wbs.setGCDisabled(true);
wb = Workbook.getWorkbook(fis,wbs);
WorkbookSettings wbss = new WorkbookSettings();
wbss.setGCDisabled(true);
wwb = Workbook.createWorkbook(new File(dstFile),wb,wbss);

    垃圾回收器"也许并不像许多人想象的一样会立即执行(当堆中的资源需要释放时),而是在引用类型的引用被删除和它在"堆"中的对象实例被删除中间有个间隔,为什么呢? 因为"垃圾回收器"的调用是比较消耗系统资源的,因此不可能经常被调用! (当然,用户代码可以用方法System.GC.Collect()来强制执行"垃圾回收器")


0 0