A Problem in the BioHDF bioh5g_import_reads tool

Source: Internet  Editor: 程序博客网  Time: 2024/04/29 21:28
     Recently I have been modifying the I/O module of BWA, the bioinformatics alignment software, to use HDF5, so first I had to convert my fasta/fastq files to .h5 files. I chose the BioHDF bioh5g_import_reads tool for this. My data were drosophila.fa and two fastq files (paired-end). The conversion from fastq to .h5 succeeded, but the conversion from fasta to .h5 always failed. In the failing case data->identifier and data->sequence were correct, but data->quality_values was filled with garbage, even though a fasta file contains no quality data. When the input was large enough, a segmentation fault occurred.

     After some debugging, I found that the problem was the definition of BIOHDF_MAX_STRING_SIZE in */include/biohdf_api.h. The default size is too small for my data. The most common cause of a segmentation fault is illegal memory access, which fits this case. I changed BIOHDF_MAX_STRING_SIZE to an appropriate size, rebuilt, and the problem disappeared.
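
     I did not copy the relevant BioHDF source here, so the following is only a minimal sketch of the kind of bug I suspect, not the actual BioHDF code. It assumes the record keeps its strings in fixed-size buffers bounded by a compile-time limit. DEMO_MAX_STRING_SIZE, demo_read_t, demo_set_field and demo_fill_fasta are all hypothetical names made up for illustration. The sketch shows two defensive habits: zero the whole record so that quality_values is empty when the input is fasta, and refuse a string that does not fit rather than writing past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BIOHDF_MAX_STRING_SIZE; the value is illustrative only. */
#define DEMO_MAX_STRING_SIZE 16

/* Hypothetical record, loosely modeled on the three fields named in this post. */
typedef struct {
    char identifier[DEMO_MAX_STRING_SIZE];
    char sequence[DEMO_MAX_STRING_SIZE];
    char quality_values[DEMO_MAX_STRING_SIZE];
} demo_read_t;

/* Copy src into a fixed-size field.
 * Returns 0 on success, -1 if src (plus its terminator) does not fit. */
static int demo_set_field(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (len >= dst_size) {
        return -1;              /* too long: report it instead of overflowing */
    }
    memcpy(dst, src, len + 1);  /* +1 copies the terminating '\0' */
    return 0;
}

/* Fill a record from a fasta entry, which has an identifier and a sequence
 * but no quality line. Returns 0 on success, -1 if a field is too long. */
static int demo_fill_fasta(demo_read_t *r, const char *id, const char *seq)
{
    assert(r != NULL && id != NULL && seq != NULL);

    /* Zero everything first so quality_values is an empty string,
     * not whatever happened to be in memory. */
    memset(r, 0, sizeof *r);

    if (demo_set_field(r->identifier, sizeof r->identifier, id) != 0) {
        return -1;
    }
    if (demo_set_field(r->sequence, sizeof r->sequence, seq) != 0) {
        return -1;
    }
    return 0;
}
```

     With a limit of 16, calling demo_fill_fasta with an 8-base sequence succeeds and leaves quality_values empty, while a 20-base sequence is rejected with -1. Raising the limit, as I did with BIOHDF_MAX_STRING_SIZE, lets longer sequences through, but the length check is what keeps an oversized input from corrupting memory.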

 

    Suggestion: BioHDF is still in its testing period, and the HDF Group does not yet provide official technical support for it. When I reported this problem to the HDF Group, they replied that users should implement the BioHDF tools they need on their own, and that some bugs in the BioHDF alpha version have not been fixed yet.
