初试PL/SQL并行编程
来源:互联网 发布:基于sqoop数据交换平台 编辑:程序博客网 时间:2024/05/02 00:36
-----------------------------Cryking原创------------------------------
-----------------------转载请注明出处,谢谢!------------------------
并行处理能大大提高数据的处理速度,它依赖于硬件资源、网络资源等环境.
并行处理的硬件资源环境分为5大类:
1.传统的单台计算机、单处理器、单内核的机器.(无法进行并行处理,但是并行程序还是可以运行的,此时就和普通程序一样了)
2.单台计算机、单处理器、多内核的机器.(目前本人的测试环境,没钱买多处理器的)
3.单台计算机、多处理器、多内核的机器.
4.集群系统
5.分布式系统
PS:如何查看几个处理器、几核,常用的软件就是CPU-Z.
本人的测试机器CPU:
单CPU,双核心处理器
下面开始,编写一个普通的管道函数(这里演示的就是管道函数的并行处理):
CREATE OR REPLACE FUNCTION pipe_test( c_empno SYS_REFCURSOR ,p VARCHAR2 DEFAULT ',') RETURN EMP_element PIPELINED IS v_element VARCHAR2(1000);BEGIN FETCH c_empno INTO v_element; LOOP EXIT WHEN c_empno%NOTFOUND; PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1)); LOOP v_element := substr(v_element ,instr(v_element, p) + 1 ,length(v_element) - instr(v_element, p) + 1); EXIT WHEN instr(v_element, p) = 0; PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1)); END LOOP; PIPE ROW(v_element); FETCH c_empno INTO v_element; END LOOP; RETURN;END pipe_test;
注意需要先建立类型EMP_element,如下:
CREATE OR REPLACE TYPE EMP_element as table of varchar2(100);
然后我们建立一个并行处理的管道函数:
CREATE OR REPLACE FUNCTION parallel_test( p_empno SYS_REFCURSOR ,p VARCHAR2 DEFAULT ',') RETURN EMP_element PIPELINED PARALLEL_ENABLE(PARTITION p_empno BY ANY) IS v_element VARCHAR2(1000);BEGIN FETCH p_empno INTO v_element; LOOP EXIT WHEN p_empno%NOTFOUND; PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1)); LOOP v_element := substr(v_element ,instr(v_element, p) + 1 ,length(v_element) - instr(v_element, p) + 1); EXIT WHEN instr(v_element, p) = 0; PIPE ROW(substr(v_element, 0, instr(v_element, p) - 1)); END LOOP; PIPE ROW(v_element); FETCH p_empno INTO v_element; END LOOP; RETURN;END parallel_test;
留意下启用并行的关键字PARALLEL_ENABLE.
接下来我们构造大表进行测试,根据v$pq_sesstat视图的结果,我们来判断数据库系统是否真的进行了并行处理.
12:57:52 SYS@orcl> create table big_data as select 'Cc' a,'Dd' b from dual connect by level<1000000;表已创建。已用时间: 00: 00: 01.9212:59:00 SYS@orcl> select count(*) from big_data; COUNT(*)---------- 999999已选择 1 行。已用时间: 00: 00: 00.3113:47:08 SYS@orcl> set autot trace exp stat--普通管道函数耗时40.48秒13:48:42 SYS@orcl> select SCOTT.pipe_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;已选择 1 行。已用时间: 00: 00: 40.48执行计划----------------------------------------------------------Plan hash value: 2452824241------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 || 1 | VIEW | | 8168 | 414K| 29 (0)| 00:00:01 || 2 | COLLECTION ITERATOR PICKLER FETCH| PIPE_TEST | 8168 | | 29 (0)| 00:00:01 || 3 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |------------------------------------------------------------------------------------------------Note----- - dynamic sampling used for this statement (level=2)统计信息---------------------------------------------------------- 1000515 recursive calls 0 db block gets 1000325 consistent gets 1548 physical reads 0 redo size 6027769 bytes sent via SQL*Net to client 1214 bytes received via SQL*Net from client 10 SQL*Net roundtrips to/from client 7 sorts (memory) 0 sorts (disk) 1 rows processed13:53:24 SYS@orcl> alter system flush shared_pool;系统已更改。已用时间: 00: 00: 01.43--使用启用并行的管道函数,耗时40.32,与普通的管道函数耗时差不多--奇怪的是执行计划了多了一个BIG_DATA的全表扫描操作了13:53:26 SYS@orcl> select SCOTT.parallel_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;已选择 1 行。已用时间: 00: 00: 40.32执行计划----------------------------------------------------------Plan hash value: 1682567826----------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |----------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 || 1 | VIEW | | 8168 | 414K| 29 (0)| 00:00:01 || 2 | COLLECTION ITERATOR PICKLER FETCH| PARALLEL_TEST | 8168 | | 29 (0)| 00:00:01 || 3 | TABLE ACCESS FULL | BIG_DATA | 968K| 7565K| 422 (3)| 00:00:06 || 4 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |----------------------------------------------------------------------------------------------------Note----- - dynamic sampling used for this statement (level=2)统计信息---------------------------------------------------------- 1000867 recursive calls 0 db block gets 1000353 consistent gets 1560 physical reads 0 redo size 6026847 bytes sent via SQL*Net to client 1006 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 9 sorts (memory) 0 sorts (disk) 1 rows processed--通过v$pq_sesstat视图我们发现此时系统并没有进行并行处理,(其实通过执行计划也可看出没有使用并行)13:54:21 SYS@orcl> select * from v$pq_sesstat;STATISTIC LAST_QUERY SESSION_TOTAL------------------------------ ---------- -------------Queries Parallelized 0 0DML Parallelized 0 0DDL Parallelized 0 0DFO Trees 0 0Server Threads 0 0Allocation Height 0 0Allocation Width 0 0Local Msgs Sent 0 0Distr Msgs Sent 0 0Local Msgs Recv'd 0 0Distr Msgs Recv'd 0 0已选择11行。已用时间: 00: 00: 00.00 13:54:40 SYS@orcl> alter system flush buffer_cache;系统已更改。已用时间: 00: 00: 00.0013:55:27 SYS@orcl> alter system flush shared_pool;系统已更改。已用时间: 00: 00: 01.40--接下来我们加HINT,强制并行处理,此时耗时25.03秒,降低了一半13:55:30 SYS@orcl> select /*+ parallel */scott.PARALLEL_TEST(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;已选择 1 行。已用时间: 00: 00: 25.03执行计划----------------------------------------------------------Plan hash value: 3127237831-----------------------------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |-----------------------------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 | | | || 1 | PX COORDINATOR | | | | | | | | || 2 | PX SEND QC (RANDOM) | :TQ10000 | 8168 | 414K| 16 (0)| 00:00:01 | Q1,00 | P->S | QC (RAND) || 3 | VIEW | | 8168 | 414K| 16 (0)| 00:00:01 | Q1,00 | PCWP | || 4 | COLLECTION ITERATOR PICKLER FETCH| PARALLEL_TEST | 8168 | | 16 (0)| 00:00:01 | Q1,00 | PCWP | || 5 | PX BLOCK ITERATOR | | 968K| 7565K| 234 (3)| 00:00:03 | Q1,00 | PCWC | || 6 | TABLE ACCESS FULL | BIG_DATA | 968K| 7565K| 234 (3)| 00:00:03 | Q1,00 | PCWP | || 7 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 | | | |-----------------------------------------------------------------------------------------------------------------------------------Note----- - dynamic sampling used for this statement (level=2) - automatic DOP: Computed Degree of Parallelism is 2统计信息---------------------------------------------------------- 1000941 recursive calls 4 db block gets 1000516 consistent gets 1842 physical reads 0 redo size 6026847 bytes sent via SQL*Net to client 1006 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 9 sorts (memory) 0 sorts (disk) 1 rows processed--上面的执行计划及v$pq_sesstat都显示系统启用了并行处理,并行度为213:57:44 SYS@orcl> select * from v$pq_sesstat;STATISTIC LAST_QUERY SESSION_TOTAL------------------------------ ---------- -------------Queries Parallelized 0 1DML Parallelized 0 0DDL Parallelized 0 0DFO Trees 0 1Server Threads 0 0Allocation Height 0 0Allocation Width 0 0Local Msgs Sent 0 553Distr Msgs Sent 0 0Local Msgs Recv'd 0 553Distr Msgs Recv'd 0 0已选择11行。已用时间: 00: 00: 00.01
为了使示例更加有说服力,我们再来试一下普通管道函数+HINT强制并行看看:
16:02:36 SYS@orcl> select /*+ parallel */SCOTT.pipe_test(CURSOR(SELECT a||','||b from big_data)) FROM DUAL;已选择 1 行。已用时间: 00: 00: 43.34执行计划----------------------------------------------------------Plan hash value: 2452824241------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 || 1 | VIEW | | 8168 | 414K| 16 (0)| 00:00:01 || 2 | COLLECTION ITERATOR PICKLER FETCH| PIPE_TEST | 8168 | | 16 (0)| 00:00:01 || 3 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |------------------------------------------------------------------------------------------------Note----- - dynamic sampling used for this statement (level=2) - automatic DOP: Computed Degree of Parallelism is 1统计信息---------------------------------------------------------- 1003093 recursive calls 18 db block gets 1000973 consistent gets 1628 physical reads 0 redo size 6026843 bytes sent via SQL*Net to client 1006 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 100 sorts (memory) 0 sorts (disk) 1 rows processed
由于普通函数不支持并行,所以即使+HINT强制并行,系统还是没有使用并行处理.耗时依然为40多秒.
如果不使用函数,使用普通的SQL,又是可以使用HINT强制并行的,如下:
15:59:07 SYS@orcl> select /*+ parallel */ a||','||b from big_data;已选择999999行。已用时间: 00: 00: 04.09执行计划----------------------------------------------------------Plan hash value: 2638980575--------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |--------------------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 968K| 7565K| 234 (3)| 00:00:03 | | | || 1 | PX COORDINATOR | | | | | | | | || 2 | PX SEND QC (RANDOM)| :TQ10000 | 968K| 7565K| 234 (3)| 00:00:03 | Q1,00 | P->S | QC (RAND) || 3 | PX BLOCK ITERATOR | | 968K| 7565K| 234 (3)| 00:00:03 | Q1,00 | PCWC | || 4 | TABLE ACCESS FULL| BIG_DATA | 968K| 7565K| 234 (3)| 00:00:03 | Q1,00 | PCWP | |--------------------------------------------------------------------------------------------------------------Note----- - dynamic sampling used for this statement (level=2) - automatic DOP: Computed Degree of Parallelism is 2统计信息---------------------------------------------------------- 6 recursive calls 0 db block gets 1577 consistent gets 1520 physical reads 0 redo size 13467035 bytes sent via SQL*Net to client 733742 bytes received via SQL*Net from client 66668 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 999999 rows processed
通过上面的示例可以看到,并行处理可以大大加速数据的处理,几乎成倍的提升性能.
但并不是并行度为几,就会提示几倍的性能,它受硬件环境,优化器内部算法,PL/SQL引擎解析等等的约束.
总之,并行处理给我们带来的性能提升是很大的,所以在某些场景使用并行编程也是很必要的.
- 初试PL/SQL并行编程
- PL/SQL编程基础
- pl/sql 编程入门
- pl/sql编程
- Oracle PL/SQL编程
- oracle pl/sql 编程
- pl/sql编程
- PL/SQL编程
- 《PL/SQL编程》
- PL/SQL编程笔记
- PL/SQL编程
- oracle pl/sql 编程
- pl/sql编程
- PL/SQL编程入门
- ORACLE PL/SQL编程
- Oracle PL/SQL编程
- oracle-pl/sql编程
- PL/SQL编程2
- C指针详解和内存分配
- 用iArduino app+以太网插板实现“iPhone,iPad&iPod无线控制Arduino”!
- 经典算法7:动态规划之多段图
- 20部商科学生必须看的好莱坞影片
- 【GBT28181开发:SIP协议实践】之设备状态查询
- 初试PL/SQL并行编程
- LoadRunner结果分析笔记
- 身份证
- 创业企业在第一年常犯的12个错误
- J2EE领域的一些技术框架结构图
- Hadoop的安装、测试、以及为伪分布式下矩阵乘法的实现
- A标签触发onclick事件而不跳转
- linferay json webservice 身份验证问题
- 生活不坏