Performance test: accessing Oracle from Greenplum (gp) through a DBLINK-like mechanism

0. Preparing the Oracle test data:
[oracle@db1 ~]$ sqlplus system/000000


SQL*Plus: Release 11.2.0.3.0 Production on Tue Mar 25 10:26:06 2014


Copyright (c) 1982, 2011, Oracle.  All rights reserved.




Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options


SQL>  drop table test ;


Table dropped.


SQL> create table test(id int,name varchar2(20),age int,msg varchar2(20));


Table created.


SQL> insert into test values(1,'aaaaa',1,'aaaaa');


1 row created.


SQL> insert into test values(2,'bbbbb',2,'bbbbb');


1 row created.


SQL> insert into test values(3,'ccccc',3,'ccccc');


1 row created.


SQL>  insert into test values(4,'ddddd',4,'ddddd');


1 row created.


SQL> insert into test values(5,'eeeee',5,'eeeee');


1 row created.


SQL> 
SQL> commit;


Commit complete.


SQL>  select count(*) from test;


  COUNT(*)
----------
         5


SQL> INSERT INTO TEST SELECT * FROM TEST;


5 rows created.


SQL> commit;


Commit complete.


SQL> select count(*) from test;


  COUNT(*)
----------
    100000


SQL> select segment_name,bytes/1024/1024 from dba_segments where segment_name='TEST';


SEGMENT_NAME
--------------------------------------------------------------------------------
BYTES/1024/1024
---------------
TEST
              3
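This simulates a table of 100,000 rows. The transcript omits the intermediate inserts between the first doubling (5 rows created) and the final count of 100,000.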
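To reproduce the Greenplum side of the test without an Oracle instance, a file of the same shape as the export can be generated directly. The following is a minimal sketch, not the author's method: the function name `gen_test_csv` and the header line `id,name,age,msg` are assumptions, and the rows simply cycle through the five seed rows shown above.

```shell
# Emit a CSV header plus N rows shaped like the TEST table (id,name,age,msg).
# Hypothetical helper: rows cycle through the five seed rows from the transcript.
gen_test_csv() {
    rows=${1:-100000}
    awk -v n="$rows" 'BEGIN {
        split("aaaaa bbbbb ccccc ddddd eeeee", v, " ")
        print "id,name,age,msg"          # assumed header line
        for (i = 1; i <= n; i++) {
            k = (i - 1) % 5 + 1          # 1..5, repeating
            print k "," v[k] "," k "," v[k]
        }
    }'
}
```

Running `gen_test_csv 100000 > test.sql` would produce a file that gpfdist can serve in place of the exported one.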


1. Loading through an external table backed by a landed file
First, export the data on the Oracle server and start the gpfdist service process:
[oracle@db1 ~]$ gtlions.ora2text.bin user=system/000000 query='select * from test' text=csv file=test.sql fast=true
           0 rows exported at 2014-03-25 10:30:19, size 0 MB.
      100000 rows exported at 2014-03-25 10:30:20, size 2 MB.
         output file test.sql closed at 100000 rows, size 2 MB.
[oracle@db1 ~]$ nohup gpfdist -d . -p 9999 &
[1] 15147
[oracle@db1 ~]$ nohup: ignoring input and appending output to "nohup.out"


[1]+  Exit 1                  nohup gpfdist -d . -p 9999
[oracle@db1 ~]$ ps -ef | grep gpfdist
oracle   15149 15068  0 10:30 pts/8    00:00:00 grep gpfdist
oracle   62994 62778  0 Mar24 ?        00:00:04 gpfdist -d . -p 9999
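The export phase took about 1 s. Note that the newly started gpfdist job exits with status 1; the ps output shows an instance started on Mar 24 still running with -p 9999, and that existing process presumably serves the file.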
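The `Exit 1` in the transcript above is most likely a port conflict with the gpfdist instance that was already running. A small guard can avoid starting a second copy. This is a sketch under assumptions: `gpfdist_on_port` is a hypothetical helper, and it expects Linux-style `ps -ef` output in which the command starts at the eighth field.

```shell
# Return 0 if the `ps -ef` listing on stdin shows a gpfdist process
# started with "-p <port>", and 1 otherwise.
# Assumes the command name is field 8, as in the ps output above.
gpfdist_on_port() {
    port=$1
    awk -v port="$port" '
        $8 ~ /(^|\/)gpfdist$/ {
            # Scan the arguments for "-p" followed by the requested port.
            for (i = 9; i < NF; i++)
                if ($i == "-p" && $(i + 1) == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

A start script could then run `ps -ef | gpfdist_on_port 9999` first and launch `nohup gpfdist -d . -p 9999 &` only when the check fails.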


Next, create the external table in gp and load the data:
[gpadmin@bdb ~]$ psql postgres
Timing is on.
psql (8.2.15)
Type "help" for help.


postgres=# \timing on
Timing is on.
postgres=# drop table if exists gt_test;
DROP TABLE
Time: 17.802 ms
postgres=# create table gt_test(id int,name character varying(20),age int,msg character varying(20)) distributed randomly;
CREATE TABLE
Time: 17.386 ms
postgres=# drop external table if exists gt_test_ext;
DROP EXTERNAL TABLE
Time: 6.734 ms
postgres=# create external table gt_test_ext(like gt_test) location ('gpfdist://192.168.1.2:9999/test.sql') format 'csv' (header);
NOTICE:  HEADER means that each one of the data files has a header row.
CREATE EXTERNAL TABLE
Time: 13.562 ms
postgres=# insert into gt_test select * from gt_test_ext;
INSERT 0 100000
Time: 469.955 ms
postgres=# select count(*) from gt_test;
 count  
--------
 100000
(1 row)


Time: 9.455 ms


The insert phase took 0.470 s (469.955 ms).
Total time: 1 s + 0.470 s = 1.470 s.


2. Loading through an external web table without landing a file
Create the external web table in gp and load the data:
postgres=# \timing on
Timing is on.
postgres=# drop table if exists gt_test;
DROP TABLE
Time: 17.969 ms
postgres=# create table gt_test(id int,name character varying(20),age int,msg character varying(20)) distributed randomly;
CREATE TABLE
Time: 16.960 ms
postgres=# drop external table if exists gt_test_webext;
DROP EXTERNAL TABLE
Time: 6.944 ms
postgres=# create external web table gt_test_webext(like gt_test) execute 'sh /home/gtlions/oracle.sh' on master format 'text' (delimiter  ',');
CREATE EXTERNAL TABLE
Time: 9.508 ms
postgres=# insert into gt_test select * from gt_test_webext;
INSERT 0 100000
Time: 4253.003 ms
postgres=# select count(*) from gt_test;
 count  
--------
 100000
(1 row)


Time: 9.867 ms
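The contents of /home/gtlions/oracle.sh are not shown. The web table only needs a command that writes comma-delimited rows to standard output, so one possible shape is sketched below. Everything here is an assumption rather than the author's script: the function names, the `ORA_CONN` variable (a connect string such as user/password@alias), and the use of `sqlplus -s` on the master host, which would need an Oracle client installed.

```shell
#!/bin/sh
# Hypothetical stand-in for oracle.sh; the original script is not shown.

# Print a SQL*Plus script that emits each TEST row as comma-delimited text.
oracle_export_sql() {
    cat <<'EOF'
set pagesize 0
set feedback off
set heading off
set linesize 32767
set trimout on
select id || ',' || name || ',' || age || ',' || msg from test;
exit
EOF
}

# Stream the rows to stdout, which is what an EXECUTE web table reads.
# ORA_CONN is an assumed variable holding the Oracle connect string.
oracle_export_rows() {
    oracle_export_sql | sqlplus -s "$ORA_CONN"
}
```

A real script would set `ORA_CONN` and end with a call to `oracle_export_rows`. Because every row passes through a single sqlplus process on the master, this path cannot use the parallel segment loading that gpfdist provides.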
The insert phase took 4.253 s (4253.003 ms).
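The two `Time:` lines for the insert statements (469.955 ms with the landed file, 4253.003 ms with the web table) can be compared directly. The helper below is a hypothetical convenience for that arithmetic. On the insert alone the web table is about 9 times slower; once the roughly 1 s export is added to the landed-file path, the end-to-end gap is just under 3 times.

```shell
# Print how many times larger the first timing is than the second.
# Both arguments are in the same unit (milliseconds here).
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

slowdown 4253.003 469.955     # insert only            -> 9.05
slowdown 4253.003 1469.955    # export (1 s) + insert  -> 2.89
```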


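3. Testing with larger data volumes
Row counts tested: 500,000; 1 million; 2 million; 4 million; 8 million; 16 million; 20 million.
Stepwise testing shows that below 500,000 rows the two methods differ only modestly, by roughly 2 to 3 times;
once the row count grows, the gap between them widens steadily:
500,000 rows: 7 times;
1 million rows: 12.9 times;
2 million rows: 12.9 times;
5 million rows: 13.5 times;
10 million rows: 15.2 times;
......
However, between 1 million and 100 million rows the time gap does not grow linearly.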


4. Summary
For small tables the non-landed approach is acceptable; for tables larger than 25 MB the speed gap becomes too large.
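The rule of thumb above can be written down as a small decision helper. This is only a sketch: `choose_load_path` is a hypothetical name, and the 25 MB default comes from the summary, not from a measured limit. The size argument would be the `bytes/1024/1024` value from the `dba_segments` query shown in section 0.

```shell
# Suggest a load path from the Oracle segment size in MB.
# The 25 MB default threshold is the rule of thumb stated in the summary.
choose_load_path() {
    size_mb=$1
    threshold_mb=${2:-25}
    # Use awk for the comparison so fractional sizes are handled.
    if awk -v s="$size_mb" -v t="$threshold_mb" 'BEGIN { exit !(s + 0 > t + 0) }'; then
        echo "gpfdist"   # land a file and load it through gpfdist
    else
        echo "web"       # stream through the external web table
    fi
}
```

For the 3 MB test table, `choose_load_path 3` prints `web`.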
-EOF-