Sqlldr性能优化总结

来源：互联网发布：网络舆情管理工作制度编辑：程序博客网时间：2024/04/29 11:20

Usage: SQLLDR keyword=value [,keyword=value,...]

部分关键字:

userid -- ORACLE username/password

control -- 控制文件

log -- 记录的日志文件

bad -- 坏数据文件

data -- 数据文件

discard -- 丢弃的数据文件

discardmax -- 允许丢弃数据的最大值 (默认全部)

skip -- Number of logical records to skip (默认0)

load -- Number of logical records to load (默认全部)

errors -- 允许的错误记录数 (默认50)

rows --（每次提交的记录数，默认: 常规路径 64, 直接路径全部，所以使用直接路径的话，效率会比普通的好太多太多）

bindsize --（每次提交记录的缓冲区的大小，字节为单位，默认256000)

silent -- 禁止输出信息 (header,feedback,errors,discards,partitions)

direct -- 使用直通路径方式导入 (默认FALSE)

parfile -- parameter file: name of file that contains parameter specifications

parallel -- 并行导入

(默认FALSE，注意：parallel并不是让一个sqlldr语句起多个进程来加载数据,而是不锁住加载表,允许别的直接路径加载. 所以要使parallel起作用,应该先将要加载的数据文件分成多个,用多个sqlldr语句同时加载,如下例:

skip_unusable_indexes -- disallow/allow unusable indexes or index partitions(默认FALSE)

skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable(默认FALSE,这个最好设置为TRUE)

file -- file to allocate extents from

skip_unusable_indexes -- disallow/allow unusable indexes or index partitions (默认 FALSE)

skip_index_maintenance -- do not maintain indexes, mark affected indexes as unusable (默认 FALSE)

commit_discontinued -- commit loaded rows when load is discontinued (默认 FALSE)

readsize -- size of read buffer (默认 1048576)

external_table -- use external table for load; NOT_USED, GENERATE_ONLY, EXECUTE (默认 NOT_USED)

columnarrayrows -- number of rows for direct path column array (默认 5000)

streamsize -- size of direct path stream buffer in bytes (默认 256000)

multithreading -- use multithreading in direct path

resumable -- enable or disable resumable for current session (默认 FALSE)

resumable_name -- text string to help identify resumable statement

resumable_timeout -- wait time (in seconds) for RESUMABLE (默认 7200)

date_cache -- size (in entries) of date conversion cache (默认 1000)

0) 配置文件

　　************* 以下是4种装入表的方式

　　APPEND // 原先的表有数据就加在后面

　　INSERT // 装载空表如果原先的表有数据 sqlloader会停止默认值

　　REPLACE // 原先的表有数据原先的数据会全部删除

　　TRUNCATE // 指定的内容和replace的相同会用truncate语句删除现存数据

*************

import.ctl(含序列,不能设direct=true):

load data

infile '/home/oracle/data/import.dat'

append

into table sys_login_records_detail_1

fields terminated by ','

(

LOGIN_DATE DATE "yyyy-mm-dd" TERMINATED BY whitespace,

FILLER_1 FILLER,

GAME_ID,

FILLER_2 FILLER,

IP,

MAC,

NETBAR_ID,

PROVINCE,

CITY,

LOGIN_TIMES,

id POSITION(1:1) "sys_login_records_detail_seq_1.nextval" ---序列

)

import.ctl(不含序列,可设direct=true):

Load data

infile '/home/oracle/data/import.dat'

append

into table sys_login_records_detail_1

fields terminated by ','

(

LOGIN_DATE DATE "yyyy-mm-dd" TERMINATED BY whitespace,

FILLER_1 FILLER,

GAME_ID,

FILLER_2 FILLER,

IP,

MAC,

NETBAR_ID,

PROVINCE,

CITY,

LOGIN_TIMES

)

1) 老例导入

sqlldr user/password@dbname

control=/home/oracle/data/install.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

##Space allocated for bind array: 132352 bytes(64 rows)

##Elapsed time was: 00:01:45.85

2) 1000条提交一次

sqlldr user/password@dbname

control=/home/oracle/data/install.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

rows=1000 readsize=2068000 bindsize=2068000

##Space allocated for bind array: 2068000 bytes(1000 rows)

##Elapsed time was: 00:00:22.80

3) 10000条提交一次

sqlldr user/password@dbname

control=/home/oracle/data/install.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

rows=10000 readsize=20680000 bindsize=20680000

##Space allocated for bind array: 20680000 bytes(10000 rows)

##Elapsed time was: 00:00:20.25

4) 设置direct=true, 含序列

sqlldr user/password@dbname

control=/home/oracle/data/install.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

direct=true

##ORA-01400: cannot insert NULL into ("EMOA"."sys_login_records_detail_1"."ID")

5) 设置direct=true, 去掉序列字段, 最快的行动

#SQL> alter table emoa.sys_login_records_detail_1 drop column id;

sqlldr user/password@dbname

control=/home/oracle/data/export.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

direct=true

##Elapsed time was: 00:00:10.98

##但是假如有频频数据, 会把唯一索引置为unusable, 要确保数据源已剔重

6) 设置readsize和bindsize到最大值, 高效又稳妥的行动 , 终极回收方案

sqlldr user/password@dbname

control=/home/oracle/data/export.ctl

log=/home/oracle/data/export.log

bad=/home/oracle/data/export.bad

rows=100160 readsize=20971520 bindsize=20971520 PARALLEL=TRUE

##Space allocated for bind array: 20970240 bytes(10160 rows)

##Elapsed time was: 00:00:14.36

提高 SQL*Loader 的性能

　　1) 一个简单而容易忽略的问题是，没有对导入的表使用任何索引和/或约束(主键)。如果这样做，甚至在使用ROWS=参数时，会很明显降低数据库导入性能。

　　2) 可以添加 DIRECT=TRUE来提高导入数据的性能。当然，在很多情况下，不能使用此参数。

　　3) 通过指定 UNRECOVERABLE选项，可以关闭数据库的日志。这个选项只能和 direct 一起使用。

　　4) 可以同时运行多个导入任务。

　　常规导入与direct导入方式的区别

　　常规导入可以通过使用 INSERT语句来导入数据。Direct导入可以跳过数据库的相关逻辑(DIRECT=TRUE)，而直接将数据导入到数据文件中。

1.direct对性能的影响是巨大的，如果为Fasle（默认值），1万的记录需要100秒，太慢了，后来设置为TRUE，配合其他的设置，200万的记录，330秒搞定，这个速度和MySQL中的select ....into的效率差不多，应该还算很不错的了。不过由于我在Oracle中

对大对象进行了压缩，所以需要10分钟

2.如果表中有索引的话，是不能指定direct=TRUE的，除非使用skip_index_maintenance=TRUE，这个就是在导入的时候忽略索引，所以在数据导入完毕以后，查看索引的状态应该都是无效的，需要重建之,如下SQL语句

select * from dba_indexes where table_name='?'

alter idnex index_name rebuild

不过在我测试过程中，这种方式是比较合理的，重建索引比新建索引要快很多，如我200W的记录，重建主键只需1分钟多，新建的话则要7分钟。

3.在数据导入的过程，让该表不记录日志,数据库不开启归档日志

alter database noarchivelog

alter table BLOG nologging

--End--

0 0