A Simple Way to Implement Cross-Database Joins in Java

Source: Internet | Posted under: spark java programming examples | Editor: 程序博客网 | Time: 2024/04/30 12:54

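Java programs sometimes need to join tables that live in different databases. This article walks through one way to do that in Java, using an example. In the example, the sales table is in a DB2 database and the employee table is in a MySQL database. The goal is to join sales and employee on sales.sellerid = employee.eid and keep only the joined sales and employee rows where state = "California".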

The structure and data of the sales table are as follows:

The structure and data of the employee table are as follows:

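Because the two tables come from different databases, the join cannot be written as a single SQL statement. Here we use RowSet, Java's library for working with tabular data (package javax.sql.rowset). It provides JoinRowSet and FilteredRowSet, which can operate on data that was read from different databases.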

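The Java program is organized as follows:

1. Read the sales table from DB2 and the employee table from MySQL, and store each in a CachedRowSet object.

2. Use JoinRowSet to perform an inner join of the two tables.

3. Use FilteredRowSet to apply the filter condition.

4. Print the resulting data.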
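To make the intended result concrete, here is a minimal in-memory sketch of what steps 2 and 3 compute, written with plain collections rather than RowSet. The class name `JoinFilterSketch`, the helper `row`, and the sample rows are invented for illustration (the article's actual table data is not reproduced here), and the sketch assumes EID is unique in the employee data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: inner join on SELLERID = EID, then filter on STATE.
class JoinFilterSketch {

    // Builds one row from alternating column-name / value arguments.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            r.put((String) kv[i], kv[i + 1]);
        }
        return r;
    }

    static List<Map<String, Object>> joinAndFilter(List<Map<String, Object>> sales,
                                                   List<Map<String, Object>> employees,
                                                   String state) {
        // Index employees by EID so each sales row needs only one lookup.
        Map<Object, Map<String, Object>> byEid = new HashMap<>();
        for (Map<String, Object> e : employees) {
            byEid.put(e.get("EID"), e);
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> s : sales) {
            Map<String, Object> e = byEid.get(s.get("SELLERID"));
            if (e == null) {
                continue; // inner join: sales rows without a matching employee are dropped
            }
            if (!state.equals(e.get("STATE"))) {
                continue; // filter: keep only the requested state
            }
            Map<String, Object> joined = new LinkedHashMap<>(s);
            joined.putAll(e);
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        // Invented sample rows, not the article's data.
        List<Map<String, Object>> sales = new ArrayList<>();
        sales.add(row("ORDERID", 1, "SELLERID", 10));
        sales.add(row("ORDERID", 2, "SELLERID", 11));
        sales.add(row("ORDERID", 3, "SELLERID", 99)); // no matching employee
        List<Map<String, Object>> employees = new ArrayList<>();
        employees.add(row("EID", 10, "NAME", "Ann", "STATE", "California"));
        employees.add(row("EID", 11, "NAME", "Bob", "STATE", "Texas"));
        System.out.println(joinAndFilter(sales, employees, "California"));
    }
}
```

Only the order sold by the California employee survives both the join and the filter. The RowSet-based program aims at the same result while reading its input from two live databases.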

The following two functions read the data from DB2 and MySQL respectively:

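import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.RowSet;
import javax.sql.rowset.CachedRowSet;
import com.sun.rowset.CachedRowSetImpl;

public static RowSet db2() throws Exception {
    String drive = "com.ibm.db2.jcc.DB2Driver";
    String url = "jdbc:db2://127.0.0.1:50000/demo";
    String DBUSER = "db2admin";
    String password = "db2admin";
    Connection conn = null;
    Statement stmt = null;
    ResultSet result = null;
    Class.forName(drive); // load the DB2 JDBC driver
    conn = DriverManager.getConnection(url, DBUSER, password);
    stmt = conn.createStatement();
    result = stmt.executeQuery("SELECT * FROM sales");
    // Copy the rows into memory so they remain usable after the connection is closed
    CachedRowSet cachedRS = new CachedRowSetImpl();
    cachedRS.populate(result);
    result.close();
    stmt.close();
    conn.close();
    return cachedRS;
}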

public static RowSet mysql() throws Exception {
    String drive = "com.mysql.jdbc.Driver";
    String url = "jdbc:mysql://127.0.0.1:3306/test";
    String DBUSER = "root";
    String password = "root";
    Connection conn = null;
    Statement stmt = null;
    ResultSet result1 = null;
    Class.forName(drive); // load the MySQL JDBC driver
    conn = DriverManager.getConnection(url, DBUSER, password);
    stmt = conn.createStatement();
    result1 = stmt.executeQuery("SELECT * FROM employee");
    // Copy the rows into memory so they remain usable after the connection is closed
    CachedRowSet cachedRS = new CachedRowSetImpl();
    cachedRS.populate(result1);
    result1.close();
    stmt.close();
    conn.close();
    return cachedRS;
}
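The two functions differ only in driver, URL, credentials, and query, so they could be folded into one helper. The sketch below is one possible variant, not the article's code. The class name `RowSetLoader` and the method `load` are hypothetical. It obtains the CachedRowSet from the standard `RowSetProvider` factory (Java 7 and later) instead of instantiating `com.sun.rowset.CachedRowSetImpl` directly, and it uses try-with-resources so that the result set, statement, and connection are closed even when the query fails. It assumes the JDBC drivers are on the classpath and register themselves (JDBC 4), which is why `Class.forName` is omitted.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.rowset.CachedRowSet;
import javax.sql.rowset.RowSetProvider;

// Hypothetical helper: runs one query and returns its rows as a disconnected CachedRowSet.
class RowSetLoader {

    static CachedRowSet load(String url, String user, String password, String sql)
            throws SQLException {
        // Resources are closed in reverse order when the try block exits, normally or not.
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            CachedRowSet cached = RowSetProvider.newFactory().createCachedRowSet();
            cached.populate(rs); // copies all rows into memory
            return cached;
        }
    }
}
```

Under these assumptions, reading the sales table would become `RowSetLoader.load("jdbc:db2://127.0.0.1:50000/demo", "db2admin", "db2admin", "SELECT * FROM sales")`, and the employee table would be read the same way with the MySQL URL and credentials.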

The following function joins the two tables and filters the result.

// Additional imports: javax.sql.rowset.JoinRowSet, javax.sql.rowset.FilteredRowSet,
// com.sun.rowset.JoinRowSetImpl, com.sun.rowset.FilteredRowSetImpl
public static void myJoin() throws Exception {
    // Fetch the data from the two databases
    RowSet mysqlRS = mysql();
    RowSet db2RS = db2();
    // Join the two tables
    JoinRowSet joinRS = new JoinRowSetImpl();
    joinRS.addRowSet(db2RS, "SELLERID");
    joinRS.addRowSet(mysqlRS, "EID");
    // Apply the filter condition
    FilteredRowSet filterRS = new FilteredRowSetImpl();
    filterRS.populate(joinRS);
    StateRange range = new StateRange(); // filter condition; implementation shown below
    filterRS.setFilter(range);
    while (filterRS.next()) { // print the results
        int ORDERID = filterRS.getInt("ORDERID");
        int SELLERID = filterRS.getInt("SELLERID");
        String NAME = filterRS.getString("NAME");
        String STATE = filterRS.getString("STATE");
        System.out.print("ORDERID=" + ORDERID + ";");
        System.out.print("SELLERID=" + SELLERID + ";");
        System.out.print("NAME=" + NAME + ";");
        System.out.println("STATE=" + STATE + ";"); // end each row with a newline
    }
}

The StateRange class has to be written by hand, for example as the following nested class:

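// Imports: java.sql.SQLException, javax.sql.RowSet, javax.sql.rowset.Predicate
public static class StateRange implements Predicate {
    public StateRange() {}

    // Called for each row while iterating; a row is kept only when this returns true
    public boolean evaluate(RowSet rs) {
        try {
            // Constant-first comparison avoids a NullPointerException when STATE is NULL
            if ("California".equals(rs.getString("STATE")))
                return true; // keep the row if state equals California
        } catch (SQLException e) {
            // do nothing
        }
        return false;
    }

    // The two overloads below check values written to the row set; not needed for this read-only example
    public boolean evaluate(Object value, int column) throws SQLException {
        return false;
    }

    public boolean evaluate(Object value, String columnName)
            throws SQLException {
        return false;
    }
}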
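StateRange hard-codes both the column and the value. As a hedged sketch of how the same idea could be made reusable, the class below takes them as constructor arguments. The name `ColumnEquals` and its constructor parameters are hypothetical. According to the `Predicate` Javadoc, `evaluate(RowSet)` is consulted while navigating rows, and the two value overloads are consulted when values are inserted or updated. Here those overloads accept any value for other columns and require the expected value for the filtered column, which is an assumption about the desired behavior rather than something the article specifies.

```java
import java.sql.SQLException;
import javax.sql.RowSet;
import javax.sql.rowset.Predicate;

// Hypothetical generalisation of StateRange: keeps rows whose column equals a given string.
class ColumnEquals implements Predicate {
    private final String columnName;
    private final int columnIndex; // 1-based position of the same column (supplied by the caller)
    private final String expected;

    ColumnEquals(String columnName, int columnIndex, String expected) {
        this.columnName = columnName;
        this.columnIndex = columnIndex;
        this.expected = expected;
    }

    // Row-level check used while iterating over the filtered row set.
    @Override
    public boolean evaluate(RowSet rs) {
        try {
            return expected.equals(rs.getString(columnName));
        } catch (SQLException e) {
            return false; // rows that cannot be read are treated as not matching
        }
    }

    // Value-level check by column index: only the filtered column is constrained.
    @Override
    public boolean evaluate(Object value, int column) {
        return column != columnIndex || expected.equals(value);
    }

    // Value-level check by column name: only the filtered column is constrained.
    @Override
    public boolean evaluate(Object value, String name) {
        return !columnName.equalsIgnoreCase(name) || expected.equals(value);
    }
}
```

With such a class, the filter in myJoin() could be set with `filterRS.setFilter(new ColumnEquals("STATE", 4, "California"))`, where the index 4 is only a placeholder for wherever STATE ends up in the joined row set.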

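The code above performs a cross-database join and filter over DB2 and MySQL, but it has several limitations. First, JoinRowSet supports only inner joins, not outer joins. Second, in testing, JoinRowSet worked with DB2, MySQL, and HSQLDB, but when Oracle 11g was joined with another database, no error was reported and yet the result set was empty. Joining the tables of two users within Oracle 11g did return the correct result with JoinRowSet. In other words, the JDBC implementation supplied by each database vendor may affect the outcome of this approach. Third, the code is still somewhat cumbersome to write.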
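One way around the inner-join limitation is to copy the rows out of the row sets and perform the outer join by hand. The sketch below shows a left outer join over rows held as maps. It is not part of the article's method, and the names `LeftOuterJoinSketch`, `leftOuterJoin`, and `normalizeKey` are hypothetical. The key normalisation reflects one possible, unverified explanation for the empty result seen with Oracle: JDBC drivers can return different Java types for a numeric key (for example `Integer` from one driver and `BigDecimal` from another), and such values are not equal under `equals()`.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical workaround: a hand-written left outer join over in-memory rows.
class LeftOuterJoinSketch {

    // Converts numeric keys to a canonical BigDecimal so Integer 10 and BigDecimal 10.0 match.
    // Assumes keys are finite numbers or non-numeric objects such as strings.
    static Object normalizeKey(Object key) {
        if (key instanceof Number) {
            return new BigDecimal(key.toString()).stripTrailingZeros();
        }
        return key;
    }

    static List<Map<String, Object>> leftOuterJoin(List<Map<String, Object>> left, String leftKey,
                                                   List<Map<String, Object>> right, String rightKey) {
        // Index the right-hand rows by normalised key; several rows may share a key.
        Map<Object, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> r : right) {
            Object k = normalizeKey(r.get(rightKey));
            if (k != null) { // as in SQL, NULL keys never match
                index.computeIfAbsent(k, x -> new ArrayList<>()).add(r);
            }
        }
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            Object k = normalizeKey(l.get(leftKey));
            List<Map<String, Object>> matches = (k == null) ? null : index.get(k);
            if (matches == null) {
                out.add(new LinkedHashMap<>(l)); // unmatched left row is kept without right columns
                continue;
            }
            for (Map<String, Object> r : matches) {
                Map<String, Object> joined = new LinkedHashMap<>(l);
                joined.putAll(r);
                out.add(joined);
            }
        }
        return out;
    }
}
```

Calling `leftOuterJoin(salesRows, "SELLERID", employeeRows, "EID")` would keep every sales row and attach employee columns where a match exists. The row lists themselves would have to be built by iterating over the CachedRowSet objects, which is not shown here.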

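Using esProc (集算器) as a helper is an easier option. esProc is a development language designed specifically for processing structured and semi-structured data. It makes cross-database joins straightforward and integrates seamlessly with Java programs, so a Java program can perform cross-database computation as flexibly as it would with SQL. esProc supports a range of databases, including Oracle, DB2, MySQL, SQL Server, Sybase, and PostgreSQL, and can perform inner joins, outer joins, and other cross-database join operations across all of them.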