从零开始,快速地在VMware虚拟机下搭建一个DB2 DPF环境

来源:互联网 发布:知天命之年的古诗词 编辑:程序博客网 时间:2024/05/01 21:30

说明:

在win7下,使用两个VMware Linux 虚拟机搭建一个具有两个物理节点,四个逻辑节点的DPF环境。Linux使用的是Ubuntu 14.04(因为ubuntu的安装介质比较小), DB2使用的是10.5FP8.
搭建之后的效果为:

虚拟机1: 主机名db2a, DPF节点号0、1
虚拟机2: 主机名db2b, DPF节点号2、3


测试环境:

Windows 7 Professional, 64-bit 6.1.7601, Service Pack 1
VMware10.0.1 build-1379776
SecureCRT Version 7.2.3


测试步骤:

测试过程中,输入的命令以蓝色表示,前面#表示用root用户执行,$表示实例用户执行。需要强调的地方用红色标出。

1. 下载Ubuntu介质

下载64bit的ubuntu 14.04,文件名为 ubuntu-14.04.5-server-amd64.iso
http://releases.ubuntu.com/trusty/

2. 安装一个ubuntu

在VMware里安装ubuntu的过程略

3. 安装ssh

安装ubuntu完成后,使用root登录,安装ssh:
# apt-get install openssh-server

完成之后就可以直接使用win7里的secureCRT或者PuTTY连接虚拟机,先在Ubuntu里使用ifconfig -a 查看到ubuntu的IP地址为192.168.187.138

4. 安装NFS

# apt-get install nfs-kernel-server

5. 安装ksh

DB2需要ksh --> 说明1

# apt-get install ksh

6. 安装DB2

先用工具,比如flashfxp,上传DB2安装包到机器上
root@ubuntu:~# ls
v10.5fp8_linuxx64_server_t.tar.gz
root@ubuntu:~# tar -zxvf v10.5fp8_linuxx64_server_t.tar.gz
root@ubuntu:~# cd server_t
root@ubuntu:~/server_t# ls
db2                     db2checkCOL.tar.gz  db2_deinstall  db2ls           db2setup  installFixPack
db2checkCOL_readme.txt  db2ckupgrade        db2_install    db2prereqcheck  ibm_im    nlpack
root@ubuntu:~/server_t# ./db2_install -f sysreq -->说明2
...<skip>...
Install into default directory (/opt/ibm/db2/V10.5) ? [yes/no] 
yes
 Specify one of the following keywords to install DB2 products.

  SERVER 
  CONSV 
  EXP 
  CLIENT 
  RTCL 
Enter "help" to redisplay product names.

Enter "quit" to exit.

***********************************************************
SERVER
***********************************************************
Do you want to install the DB2 pureScale Feature? [yes/no] 
no   

..

Task #48 start
Description: Updating global profile registry 
Estimated time 3 second(s) 
Task #48 end 


The execution completed successfully.


For more information see the DB2 installation log at
"/tmp/db2_install.log.9814".

7. 安装libaio.so.1 

# apt-get install libaio-dev <--说明3

8. 修改主机名

将主机名修改为db2a,方法为修改/etc/hostname的内容为db2a,并修改相应的/etc/hosts内容为如下,重启系统生效
# cat /etc/hostname
db2a
# cat /etc/hosts
127.0.0.1       localhost
127.0.1.1       db2a
192.168.187.138 db2a

9. 克隆一台机器

关掉db2a,使用VMare克隆一个机器,修改主机名为db2b,查看到IP地址为192.168.187.139,把这个信息也放到db2a的/etc/hosts里面,最后的效果如下:
db2a:
root@db2a:~# cat /etc/hosts
127.0.0.1       localhost
#127.0.1.1       db2a <--说明4
192.168.187.138 db2a
192.168.187.139 db2b

db2b:
root@db2b:~# cat /etc/hosts
127.0.0.1       localhost
#127.0.1.1       db2b <--说明4
192.168.187.138 db2a
192.168.189.139 db2b

这时已经可以在db2a和db2b上使用hostname ping对方了


10. 共享文件系统

目地是建立一个共享目录,使db2a, db2b都能访问,实例用户要创建在这个共享目录下(DPF环境中,实例是共享的)

db2a:
root@db2a:~# mkdir /share

在/etc/exports最后添加一行
/share *(rw,sync,no_root_squash,no_subtree_check)

重启服务:
root@db2a:~# service  nfs-kernel-server restart
 * Stopping NFS kernel daemon
   ...done.
 * Unexporting directories for NFS kernel daemon...
   ...done.
 * Exporting directories for NFS kernel daemon...
   ...done.
 * Starting NFS kernel daemon
   ...done.
root@db2a:~# showmount -e
Export list for db2a:
/share *

db2b上:
root@db2b:~# service rpcbind restart
rpcbind stop/waiting
rpcbind start/running, process 1457
root@db2b:~# mkdir /share
root@db2b:~#  mount -t nfs db2a:/share /share
root@db2b:~# df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev              492600       4    492596   1% /dev
tmpfs             100760     988     99772   1% /run
/dev/sda1       40120704 6603652  31455996  18% /
none                   4       0         4   0% /sys/fs/cgroup
none                5120       0      5120   0% /run/lock
none              503780       0    503780   0% /run/shm
none              102400       0    102400   0% /run/user
db2a:/share     40120704 6603648  31456000  18% /share

可以看到,db2b的/share目录已经挂载到db2a:/share上,两台机器可以并发地访问这个文件系统了。如果想要每次开机重启都自动挂载,可以在db2b的/etc/rc.local里添加一行:
sudo mount -t nfs db2a:/share /share

11. 创建DB2实例

db2a和db2b上都执行以下命令,创建用户:
# groupadd -g 999 db2iadm  
# groupadd -g 998 db2fadm
# groupadd -g 997 dasadm 
# useradd -u 1004 -g db2iadm -m -d /share/dpf105 dpf105   
# useradd -u 1003 -g db2fadm -m -d /share/db2fend db2fend
# useradd -u 1002 -g dasadm -m -d /share/dasusr dasusr 
# passwd dpf105 
# passwd db2fend
# passwd dasusr  

db2a上创建实例:
root@db2a:~# /opt/ibm/db2/V10.5/instance/db2icrt -u db2fend dpf105

创建成功之后,可以看到/etc/services文件里多了以下内容:
root@db2a:~# grep -i dpf105 /etc/services
DB2_dpf105      60000/tcp
DB2_dpf105_1    60001/tcp
DB2_dpf105_2    60002/tcp
DB2_dpf105_3    60003/tcp
DB2_dpf105_4    60004/tcp
DB2_dpf105_END  60005/tcp

在/etc/services里添加一行,作为db2实例对外服务的端口
dpf105inst      60006/tcp

db2a切换到实例用户dpf105:
root@db2a:~# su - dpf105
dpf105@db2a:~$ db2set db2comm=tcpip

db2b切换到实例用户dpf105:
root@db2b:~# su - dpf105
dpf105@db2b:~$ db2set db2comm=tcpip

DBI1306N  The instance profile is not defined.

Explanation: 

The instance is not defined in the target machine registry.

User response: 

Specify an existing instance name or create the required instance.

这个报错是因为实例dpf105在db2b上未创建过,解决办法是在db2b上把/share/dpf105/sqllib删掉,重新创建一次实例,db2b上,使用root用户:
root@db2b:~# rm -r /share/dpf105/sqllib
root@db2b:~# /opt/ibm/db2/V10.5/instance/db2icrt -u db2fend dpf105
root@db2b:~# grep -i dpf105 /etc/services
DB2_dpf105      60000/tcp
DB2_dpf105_1    60001/tcp
DB2_dpf105_2    60002/tcp
DB2_dpf105_3    60003/tcp
DB2_dpf105_4    60004/tcp
DB2_dpf105_END  60005/tcp

同样在/etc/services里添加一行,作为db2实例对外服务的端口:
dpf105inst      60006/tcp

切换到实例用户:
root@db2b:~# su - dpf105
dpf105@db2b:~$ db2set db2comm=tcpip
dpf105@db2b:~$ db2 update dbm cfg using svcename dpf105inst
DB20000I  The UPDATE DATABASE MANAGER CONFIGURATION command completed 
successfully.

这里注意一点,两个机器里的/etc/services里的端口要保持一致。另外一点,虽然实例目录是共享的,但实例是创建了两次,在db2a上创建完成之后,把sqllib删掉,再在db2b上创建一次。

12. 更新db2nodes.cfg

修改/share/dpf105/sqllib目录下的db2nodes.cfg为以下内容,在db2a或者db2b上修改都行,因为是共享的:
0 db2a 0
1 db2a 1
2 db2b 0
3 db2b 1

13. 配置ssh免密

为dpf105用户配置SSH免密,方法可以参考链接:
http://blog.csdn.net/qingsong3333/article/details/73695895

14. 测试

如果以下测试都成功(ssh不需要密码,ping都能ping通,date都能获取时间),说明配置的没有问题
db2a上:
dpf105@db2a:~$ ssh db2a date
dpf105@db2a:~$ ssh db2b date
dpf105@db2a:~$ ping db2a
dpf105@db2a:~$ ping db2b
dpf105@db2a:~$ db2_all "date"

db2b上也要做同样的测试。
(如果ssh成功,但db2_all失败,可以尝试db2set DB2RSHCMD=/usr/bin/ssh)

15. 启动实例

db2a或者db2b上都可以
dpf105@db2b:~$ db2start
06/15/2017 09:22:22     3   0   SQL1063N  DB2START processing was successful.
06/15/2017 09:22:24     2   0   SQL1063N  DB2START processing was successful.
06/15/2017 09:22:33     1   0   SQL1063N  DB2START processing was successful.
06/15/2017 09:22:35     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.

16. 创建数据库

切换到db2a上:
dpf105@db2a:~$ db2sampl

  Creating database "SAMPLE"...
  Connecting to database "SAMPLE"...
  Creating tables and data in schema "DPF105"...
  Creating tables with XML columns and XML data in schema "DPF105"...

  'db2sampl' processing complete.

dpf105@db2a:~$ db2 terminate
DB20000I  The TERMINATE command completed successfully.

dpf105@db2a:~$ db2 "connect to sample"

   Database Connection Information

 Database server        = DB2/LINUXX8664 10.5.8
 SQL authorization ID   = DPF105
 Local database alias   = SAMPLE

dpf105@db2a:~$ db2 "values (current dbpartitionnum)"

1          
-----------
          0

  1 record(s) selected.

dpf105@db2a:~$ db2 list tablespaces

           Tablespaces for Current Database

 Tablespace ID                        = 0
 Name                                 = SYSCATSPACE
 Type                                 = Database managed space
 Contents                             = All permanent data. Regular table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 1
 Name                                 = TEMPSPACE1
 Type                                 = System managed space
 Contents                             = System Temporary data
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 2
 Name                                 = USERSPACE1
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 3
 Name                                 = IBMDB2SAMPLEREL
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 4
 Name                                 = IBMDB2SAMPLEXML
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 5
 Name                                 = SYSTOOLSPACE
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

DB21011I  In a partitioned database server environment, only the table spaces 
on the current node are listed.

可以看到当前节点为0,上面有一个表空间SYSCATSPACE,这个表空间只在0号节点上存在,所以0号节点是Catalog Partition,我们再切换到3号节点上

dpf105@db2a:~$ DB2NODE=3
dpf105@db2a:~$ export DB2NODE
dpf105@db2a:~$ db2 terminate
DB20000I  The TERMINATE command completed successfully.
dpf105@db2a:~$ db2 "connect to sample"

   Database Connection Information

 Database server        = DB2/LINUXX8664 10.5.8
 SQL authorization ID   = DPF105
 Local database alias   = SAMPLE

dpf105@db2a:~$ db2 "values (current dbpartitionnum)"

1          
-----------
          3

  1 record(s) selected.

dpf105@db2a:~$ db2 list tablespaces

           Tablespaces for Current Database

 Tablespace ID                        = 1
 Name                                 = TEMPSPACE1
 Type                                 = System managed space
 Contents                             = System Temporary data
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 2
 Name                                 = USERSPACE1
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 3
 Name                                 = IBMDB2SAMPLEREL
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

 Tablespace ID                        = 4
 Name                                 = IBMDB2SAMPLEXML
 Type                                 = Database managed space
 Contents                             = All permanent data. Large table space.
 State                                = 0x0000
   Detailed explanation:
     Normal

DB21011I  In a partitioned database server environment, only the table spaces 
on the current node are listed.

看一下employee表的数据在各个节点上的分布情况:
dpf105@db2a:~$ db2 "SELECT DBPARTITIONNUM(EMPNO) as PARTITION_NUM, COUNT(*) as ROW_NUM FROM employee GROUP BY DBPARTITIONNUM(EMPNO) ORDER BY DBPARTITIONNUM(EMPNO) DESC"

PARTITION_NUM ROW_NUM    
------------- -----------
            3          12
            2          10
            1          13
            0           7

  4 record(s) selected.

最后停掉实例:
dpf105@db2a:~$ db2 connect reset

DB20000I  The SQL command completed successfully.

dpf105@db2a:~$ db2 list applications
SQL1611W  No data was returned by Database System Monitor.
dpf105@db2a:~$ db2stop
06/15/2017 22:10:26     0   0   SQL1064N  DB2STOP processing was successful.
06/15/2017 22:10:27     1   0   SQL1064N  DB2STOP processing was successful.
06/15/2017 22:10:27     2   0   SQL1064N  DB2STOP processing was successful.
06/15/2017 22:10:27     3   0   SQL1064N  DB2STOP processing was successful.
SQL1064N  DB2STOP processing was successful.

 
说明1:
如果不安装ksh,那么db2_all "date"测试的时候,会出现报错
-su: /share/dpf105/sqllib/bin/db2_all: /bin/ksh: bad interpreter: No such file or directory

说明2:
如果不加 '-f sysreq'选项,db2_install会由于下面的报错而中止,这个报错的意思是缺少32-bit的库文件,由于我们的安装的是64bit的DB2,如果没有32-bit的应用,可以忽略这个报错。加上'-f sysreq'之后,虽然也会有警告,但安装能完成。
DBT3514W  The db2prereqcheck utility failed to find the following 32-bit library file: "/lib/i386-linux-gnu/libpam.so*". 

说明3:
如果不安装libaio-dev,那么db2start会报错:
db2start: error while loading shared libraries: libaio.so.1: cannot open shared object file: No such file or directory

说明4:
这里要把这一行注释掉,否则最后一步创建数据库(db2sampl)的时候会一直HANG住,并且在db2diag.log中每隔3分钟就出现一次下面的消息:

2017-06-15-09.26.50.472345-420 E96186E683            LEVEL: Error (OS)
PID     : 30914                TID : 140413780354816 PROC : db2sysc 2
INSTANCE: dpf105               NODE : 002
HOSTNAME: db2b
EDUID   : 24                   EDUNAME: db2fcms 2
FUNCTION: DB2 UDB, oper system services, sqloPdbConnectSocket, probe:30
MESSAGE : ZRC=0x810F001D=-2129723363=SQLO_SOCKET_IN_USE
          "Socket bound to an address already"
CALLED  : OS, -, connect                          OSERR: EINVAL (22)
DATA #1 : String, 24 bytes
Unable to connect socket
DATA #2 : socket, 4 bytes
17
DATA #3 : sockaddr, 16 bytes
0x0000000200D94440 : 0200 EA60 C0A8 BB8A 0000 0000 0000 0000    ...`............

2017-06-15-09.26.50.480725-420 I96870E436            LEVEL: Severe
PID     : 30914                TID : 140413780354816 PROC : db2sysc 2
INSTANCE: dpf105               NODE : 002
HOSTNAME: db2b
EDUID   : 24                   EDUNAME: db2fcms 2
FUNCTION: DB2 UDB, fast comm manager, sqkfSendConduit::InitiateLinkConnect, probe:17
RETCODE : ZRC=0x81590070=-2124873616=SQLKF_CONN_INIT_FAIL
          "FCM connection initialization failed"

查看FCM监听端口,发现是在172.0.1.1上监听的:
dpf105@db2a:~$ netstat -an | grep -i 6000
tcp        0      0 127.0.1.1:60000         0.0.0.0:*               LISTEN     
tcp        0      0 127.0.1.1:60001         0.0.0.0:*               LISTEN    
tcp        0      0 0.0.0.0:60006           0.0.0.0:*               LISTEN

如果没有这一行的话,启动实例后就看到是在对应IP上监听的了,创建数据库也很快成功:
dpf105@db2a:~$ netstat -an | grep -i 6000
tcp        0      0 192.168.187.138:60000   0.0.0.0:*               LISTEN     
tcp        0      0 192.168.187.138:60001   0.0.0.0:*               LISTEN     
tcp        0      0 0.0.0.0:60006           0.0.0.0:*               LISTEN
阅读全文
0 0
原创粉丝点击