Supporting Blob fields in Solr 6.1

Source: Internet  Published: java打包  Editor: 程序博客网  Time: 2024/06/05 23:43

The simplest approach is to convert the Blob to character data on the database side:

select PROPOSAL_ID,TITLE,UTL_RAW.CAST_TO_VARCHAR2(CONTENT) as CONTENT,UTL_RAW.CAST_TO_VARCHAR2(ATTACHMENT) as ATTACHMENT from bcc_proposal

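However, this approach can easily cause memory problems on the database server. The alternative is to do the conversion at import time, using a BlobTransformer to turn the Blob into a String as the data is loaded into Solr. The rest of this article describes that approach.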
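The core of the import-time approach, reading a java.sql.Blob as a stream and decoding it with an explicit charset, can be sketched independently of Solr. This is only a minimal sketch, not Solr code: the class name BlobDecodeSketch and the helpers decode and blobOf are made up for illustration, UTF-8 is assumed to be the encoding of the stored text, and javax.sql.rowset.serial.SerialBlob stands in for a Blob that a JDBC driver would return.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.sql.Blob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialBlob;

class BlobDecodeSketch {

    // Hypothetical helper: stream the Blob and decode it as UTF-8 (assumed encoding).
    static String decode(Blob blob) {
        StringBuilder sb = new StringBuilder();
        try (InputStream in = blob.getBinaryStream();
             Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            // Reading in chunks keeps line breaks and avoids one huge intermediate byte[].
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (SQLException e) {
            throw new IllegalStateException("could not read Blob", e);
        }
        return sb.toString();
    }

    // Hypothetical helper: an in-memory Blob standing in for a database value.
    static Blob blobOf(String text) {
        try {
            return new SerialBlob(text.getBytes(StandardCharsets.UTF_8));
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(blobOf("proposal text\nsecond line")));
    }
}
```

Doing the decoding in the importing JVM moves the conversion work off the database server, which is the motivation for the transformer approach.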

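The previous article, 《在Idea下编译solr 6.1源码》 (Compiling the Solr 6.1 source in IDEA), showed how to browse the Solr source code in IDEA. This article describes how to modify that source so that Solr can import Blob fields.
Step 1: Add BlobTransformer.java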

package org.apache.solr.handler.dataimport;

import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.sql.Blob;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Converts columns marked blob="true" in the data config from Blob (or byte[]) to String. */
public class BlobTransformer extends Transformer {
    public static final String BLOB = "blob";

    @Override
    public Object transformRow(Map<String, Object> aRow, Context context) {
        for (Map<String, String> map : context.getAllEntityFields()) {
            // Only touch fields that carry the blob attribute.
            String fmt = map.get(BLOB);
            if (fmt == null) continue;
            String column = map.get(DataImporter.COLUMN);
            String srcCol = map.get(RegexTransformer.SRC_COL_NAME);
            if (srcCol == null) srcCol = column;
            Object o = aRow.get(srcCol);
            if (o instanceof List) {
                // Multi-valued column: convert each element.
                List<String> results = new ArrayList<String>();
                for (Object input : (List<?>) o)
                    results.add(process(input));
                aRow.put(column, results);
            } else if (o instanceof Blob) {
                aRow.put(column, readFromBlob((Blob) o));
            }
        }
        return aRow;
    }

    // Converts one list element; Blob and byte[] are decoded, anything else falls back to toString().
    private String process(Object value) {
        if (value == null) return null;
        if (value instanceof Blob) return readFromBlob((Blob) value);
        if (value instanceof byte[]) return new String((byte[]) value, StandardCharsets.UTF_8);
        return value.toString();
    }

    // Reads the whole Blob as UTF-8 text, keeping line breaks; returns "" if reading fails.
    private String readFromBlob(Blob blob) {
        StringBuilder res = new StringBuilder();
        try (InputStream is = blob.getBinaryStream();
             Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                res.append(buf, 0, n);
            }
            return res.toString();
        } catch (Exception e) {
            e.printStackTrace();
            return "";
        }
    }
}
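The byte[] case handled by process is worth a closer look: a raw byte[] that is stringified with toString() yields an identity string of the form [B@1f23c5 rather than its content, so it has to be decoded explicitly. The following is a minimal sketch of that decoding outside Solr. The class ByteArrayDecodeSketch and the method decodeAll are illustrative names, and UTF-8 is again an assumption about how the text was stored.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ByteArrayDecodeSketch {

    // Hypothetical helper mirroring the list branch: decode each byte[] element as UTF-8.
    static List<String> decodeAll(List<?> values) {
        List<String> out = new ArrayList<>();
        for (Object v : values) {
            out.add(v == null ? null : new String((byte[]) v, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = "attachment text".getBytes(StandardCharsets.UTF_8);
        // toString() on an array gives a type-and-hash string, not the bytes' content.
        System.out.println(raw.toString().startsWith("[B@"));
        // Explicit decoding recovers the text; null elements are passed through.
        System.out.println(decodeAll(Arrays.asList(raw, null)));
    }
}
```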

Step 2:
Compile the source code by running ant server in the solr-6.1.0\solr directory.

Step 3:

Configure the db_data_config.xml file

<dataConfig>
    <dataSource name="jdbcds" type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver"
                url="jdbc:oracle:thin:@//192.168.60.144:1521/NPCBJ" user="user" password="password" />
    <document>
        <entity dataSource="jdbcds" name="proposal"
                query="select * from BCC_PROPOSAL_INTELLIGENT_VIEW"
                deltaQuery="select PROPOSAL_ID from BCC_PROPOSAL_INTELLIGENT_VIEW where to_char(modify_time,'yyyy-mm-dd hh24:mi:ss') > '${dataimporter.last_index_time}'"
                deltaImportQuery="select * from BCC_PROPOSAL_INTELLIGENT_VIEW where PROPOSAL_ID='${dataimporter.delta.PROPOSAL_ID}'"
                convertType="true" transformer="BlobTransformer">
            <field column="PROPOSAL_ID" name="id" />
            <field column="TITLE" name="title" />
            <field column="CONTENT" name="content" blob="true" />
        </entity>
    </document>
</dataConfig>
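The deltaQuery above compares timestamps as strings, which only gives the right answer when both sides use the same fixed-width layout with the most significant field first. Here is a minimal sketch of that reasoning, assuming ${dataimporter.last_index_time} is rendered in the yyyy-MM-dd HH:mm:ss form that the data import handler uses by default. The class and method names are illustrative only.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class DeltaTimestampSketch {

    // Java pattern corresponding to Oracle's 'yyyy-mm-dd hh24:mi:ss' format mask.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Mirrors the SQL predicate: to_char(modify_time, ...) > '<last index time>'.
    static boolean modifiedAfter(LocalDateTime modifyTime, String lastIndexTime) {
        return modifyTime.format(FMT).compareTo(lastIndexTime) > 0;
    }

    public static void main(String[] args) {
        String lastIndexTime = "2016-07-01 09:30:00";
        // Later the same day: lexicographic order agrees with chronological order.
        System.out.println(modifiedAfter(LocalDateTime.of(2016, 7, 1, 14, 5, 0), lastIndexTime));
        // The day before: the row would not be picked up by the delta import.
        System.out.println(modifiedAfter(LocalDateTime.of(2016, 6, 30, 23, 59, 59), lastIndexTime));
    }
}
```

If the two formats ever differ, for example a 12-hour clock on one side, the string comparison can silently select the wrong rows, so the format mask in the query should be kept in step with the importer's date format.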

With these changes in place, Solr can import Blob fields. Getting there took some effort.

