How to preserve data and logs when switching a Hadoop cluster between major versions



Note: when switching from 0.21.0 to 0.20.205.0, or the other way around, the built-in upgrade command cannot be used. (Many of the operations in this article are best written as scripts; doing them by hand is tedious.)

Please credit the source when reposting. Thanks; getting this to work really was tiring.

Setup before testing

Three machines were used for the test:

Namenode/secondary namenode: 192.168.1.39  slave039 (this node also has the external address 114.212.190.92).

Datanodes:  192.168.1.33   slave033

            192.168.1.34   slave034

Besides root, the hadoop group contains the hadoop account plus three users, user001, user002, and user003, each with a password identical to the account name. Some files were uploaded under each account and a few simple MapReduce jobs were run.

 

What happens when the namenode is formatted

  1. As we already know, the metadata files stored on the namenode are all regenerated.
  2. Running hadoop namenode -format changes the contents of hadoop_dir on the namenode, but it does not change hadoop_dir on the datanodes. In other words, as long as I have backed up the fsimage file from hadoop_dir/dfs/name/current on the namenode, I can still recover my file data (see the backup sketch below).
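
A minimal backup sketch, assuming dfs.namenode.name.dir is /home/hadoop/hadoop_dir/dfs/name as in the 0.21.0 layout listed later; the backup folder name is just an example:

BACKUP_DIR=~/fsimage_backup
mkdir -p "$BACKUP_DIR"
cp /home/hadoop/hadoop_dir/dfs/name/current/fsimage "$BACKUP_DIR"/
cp /home/hadoop/hadoop_dir/dfs/name/current/VERSION "$BACKUP_DIR"/   # keep VERSION too, for reference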

 

Candidate switching schemes

A step required by every scheme:

  1. First create a new folder, tmp205, under hadoop_dir/hadoop_d. Do not reuse the old tmp folder; that causes a series of problems, and in fact the version switch does not need anything in it preserved.

 

 

Scheme 1:

Keep the hadoop_dir folder and swap in a new hadoop_install folder to replace the old installation. Note that you must not run the format command after the swap; if you do, the data really cannot be found anymore.

The edits, VERSION, and other files in dfs/name/current on the namenode all need to be modified so that the metadata can be recognized, i.e. so that the new version can see which files existed in the old version and how large they are. The frustrating part is that, although this scheme is feasible in theory, there are quite a few files to convert; we did not get it right at the time, so in the end we could only see that the files existed but could not read their data.

Still, this scheme should be workable!
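
For reference, these are the files under dfs.name.dir/current that Scheme 1 would have to make the new version accept; a quick way to inspect them (path per the 0.21.0 layout below):

ls -l /home/hadoop/hadoop_dir/dfs/name/current/
#   fsimage  - checkpointed namespace image
#   edits    - edit log since the last checkpoint
#   fstime   - timestamp of the last checkpoint
#   VERSION  - layoutVersion, namespaceID, cTime, storageType
cat /home/hadoop/hadoop_dir/dfs/name/current/VERSION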

Scheme 2:

This scheme creates a hadoop_d folder on every node for hadoop namenode -format to use; after that, it is enough to copy a single file, hadoop_dir/dfs/name/current/fsimage, over from the original hadoop_dir folder.

 

Note how this scheme is configured: the datanode data files and so on still live in hadoop_dir, while the logs and pid files go into the new hadoop_d folder.

Here is the directory layout for each version:

0.20.205.0

hadoop.tmp.dir        /home/hadoop/hadoop_dir/tmp205
HADOOP_LOG_DIR        /home/hadoop/hadoop_d/log
HADOOP_PID_DIR        /home/hadoop/hadoop_d/pids
dfs.name.dir          /home/hadoop/hadoop_d/dfs/name
dfs.data.dir          /home/hadoop/hadoop_dir/dfs/data,/data/hadoop_dir/dfs/data
mapred.local.dir      /home/hadoop/hadoop_dir/mapred/local,/data/hadoop_dir/mapred/local
mapred.system.dir     /home/hadoop/hadoop_dir/mapred/system

 

0.21.0

hadoop.tmp.dir                    /home/hadoop/hadoop_dir/tmp21
HADOOP_LOG_DIR                    /home/hadoop/hadoop_dir/log
HADOOP_PID_DIR                    /home/hadoop/hadoop_dir/pids
dfs.namenode.name.dir             /home/hadoop/hadoop_dir/dfs/name
dfs.datanode.data.dir             /home/hadoop/hadoop_dir/dfs/data,/data/hadoop_dir/dfs/data
mapreduce.cluster.local.dir       /home/hadoop/hadoop_dir/mapred/local,/data/hadoop_dir/mapred/local
mapreduce.jobtracker.system.dir   /home/hadoop/hadoop_dir/mapred/system

 

The switching procedure

  1. Back up the fsimage file!
  2. Create the new folders

mkdir ~/hadoop_d

cd ~/hadoop_d
mkdir dfs; mkdir log; mkdir mapred; mkdir tmp205; mkdir tmp21   # everything is created up front so later redeployment is easier; some of these folders are not strictly necessary
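
The same folders are needed on every node, so a small loop saves some typing; a minimal sketch, assuming passwordless ssh as the hadoop user, with the host list from the test cluster above:

for node in 192.168.1.39 192.168.1.33 192.168.1.34; do
    ssh hadoop@"$node" 'mkdir -p ~/hadoop_d/dfs ~/hadoop_d/log ~/hadoop_d/mapred ~/hadoop_d/tmp205 ~/hadoop_d/tmp21'
done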

 

  3. Configure

Run tar -zxvf hadoop-0.20.205.0.tar.gz to extract into the new hadoop_install folder (put the tarball in the hadoop_install directory first).
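
In other words, assuming the tarball has already been copied into ~/hadoop_install:

cd ~/hadoop_install
tar -zxvf hadoop-0.20.205.0.tar.gz    # produces ~/hadoop_install/hadoop-0.20.205.0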

 

Modify the configuration files (backups of the corresponding configuration files already exist).

 

Files that need manual editing:

hadoop-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-1.6.0
export HADOOP_LOG_DIR=/home/hadoop/hadoop_d/log
export HADOOP_PID_DIR=/home/hadoop/hadoop_d/pids

masters:

192.168.1.39

slaves:

192.168.1.33

192.168.1.34

core-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

 

<property>

  <name>hadoop.tmp.dir</name>

  <!-- <value>/tmp/hadoop-${user.name}</value> -->

  <value>/home/hadoop/hadoop_dir/tmp205</value>

  <description>A base for other temporary directories.</description>

</property>

 

 

<property>

  <name>fs.default.name</name>

  <!-- <value>file:///</value> -->

  <value>hdfs://192.168.1.39:54310</value>

  <description>The name of the default file system.  A URI whose

  scheme and authority determine the FileSystem implementation.  The

  uri's scheme determines the config property (fs.SCHEME.impl) naming

  the FileSystem implementation class.  The uri's authority is used to

  determine the host, port, etc. for a filesystem.</description>

</property>

 

</configuration>

 

 

hdfs-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

 

<property>

  <name>dfs.name.dir</name>

  <value>/home/hadoop/hadoop_d/dfs/name</value>

  <description>Determines where on the local filesystem the DFS name node

      should store the name table(fsimage).  If this is a comma-delimited list

      of directories then the name table is replicated in all of the

      directories, for redundancy. </description>

</property>

 

<property>

  <name>dfs.data.dir</name>

  <value>/home/hadoop/hadoop_dir/dfs/data,/data/hadoop_dir/dfs/data</value>

  <description>Determines where on the local filesystem an DFS data node

  should store its blocks.  If this is a comma-delimited

  list of directories, then data will be stored in all named

  directories, typically on different devices.

  Directories that do not exist are ignored.

  </description>

</property>

 

<property>

  <name>dfs.datanode.data.dir.perm</name>

  <value>755</value>

</property>

 

<property>

  <name>dfs.replication</name>

  <value>3</value>

  <description>Default block replication.

  The actual number of replications can be specified when the file is created.

  The default is used if replication is not specified in create time.

  </description>

 

</property>

</configuration>

 

 

mapred-site.xml

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

 

<!-- Put site-specific property overrides in this file. -->

 

<configuration>

 

<property>

  <name>mapred.job.tracker</name>

  <value>hdfs://192.168.1.39:54311</value>

  <description>The host and port that the MapReduce job tracker runs

  at.  If "local", then jobs are run in-process as a single map

  and reduce task.

  </description>

</property>

 

<property>

  <name>mapred.local.dir</name>

  <value>/home/hadoop/hadoop_dir/mapred/local,/data/hadoop_dir/mapred/local</value>

  <description>The local directory where MapReduce stores intermediate

  data files.  May be a comma-separated list of

  directories on different devices in order to spread disk i/o.

  Directories that do not exist are ignored.

 

  !!WARNING: setting this value identically on all nodes causes a problem on the JobTracker
  (here the SecondaryMaster, 192.168.1.38): the second dir "/data/hadoop_dir/mapred/local" should not
  be listed there.

  </description>

</property>

 

<property>

  <name>mapred.system.dir</name>

  <value>/home/hadoop/hadoop_dir/mapred/system</value>

  <description>The directory where MapReduce stores control files.

  </description>

</property>

 

<property>

  <name>mapreduce.jobtracker.staging.root.dir</name>

  <!--  <value>${hadoop.tmp.dir}/mapred/staging</value> -->

  <value>/user/${user.name}/mapred/staging</value>

  <description>The root of the staging area for users' job files

  In practice, this should be the directory where users' home

  directories are located (usually /user)

  </description>

</property>

 

<property>

  <name>mapred.tasktracker.map.tasks.maximum</name>

  <value>6</value>

  <description>The maximum number of map tasks that will be run

  simultaneously by a task tracker.

  </description>

</property>

 

<property>

  <name>mapred.tasktracker.reduce.tasks.maximum</name>

  <value>4</value>

  <description>The maximum number of reduce tasks that will be run

  simultaneously by a task tracker.

  </description>

</property>

 

<property>

  <name>mapred.child.java.opts</name>

  <value>-Xmx4048m</value>

  <description>Java opts for the task tracker child processes. 

  The following symbol, if present, will be interpolated: @taskid@ is replaced

  by current TaskID. Any other occurrences of '@' will go unchanged.

  For example, to enable verbose gc logging to a file named for the taskid in

  /tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:

        -Xmx1024m -verbose:gc -Xloggc:/tmp/@taskid@.gc

 

  The configuration variable mapred.child.ulimit can be used to control the

  maximum virtual memory of the child processes.

  </description>

</property>

 

 

<property>

       <name>dfs.hosts.exclude</name>

       <value>/home/hadoop/hadoop_dir/slaves.exclude</value>

</property>

<property>

       <name>mapred.hosts.exclude</name>

       <value>/home/hadoop/hadoop_dir/slaves.exclude</value>

</property>

</configuration>

 

  4. Modify the .bash_profile file, i.e. repoint the paths:

export JAVA_HOME=/usr/lib/jvm/java-1.6.0

export HADOOP_HOME=/home/hadoop/hadoop_installs/hadoop-0.21.0

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

change it to

export JAVA_HOME=/usr/lib/jvm/java-1.6.0

export HADOOP_HOME=/home/hadoop/hadoop_install/hadoop-0.20.205.0

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

Then reload it: source .bash_profile
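
If you expect to flip between the two versions repeatedly, this edit can be scripted; a hypothetical helper, assuming .bash_profile contains exactly one export HADOOP_HOME= line:

NEW_HOME=/home/hadoop/hadoop_install/hadoop-0.20.205.0   # or the 0.21.0 install dir, to switch back
sed -i "s|^export HADOOP_HOME=.*|export HADOOP_HOME=$NEW_HOME|" ~/.bash_profile
source ~/.bash_profile
hadoop version   # confirm the switch took effect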

 

Copy the configured files to the corresponding user home directories and to the datanodes:

cp .bash_profile /home/user001

cp .bash_profile /home/user002

cp .bash_profile /home/user003

 

scp .bash_profile hadoop@192.168.1.33:/home/hadoop

scp .bash_profile hadoop@192.168.1.34:/home/hadoop

scp -r /home/hadoop/hadoop_install hadoop@192.168.1.33:/home/hadoop

scp -r /home/hadoop/hadoop_install hadoop@192.168.1.34:/home/hadoop
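
The same copies as a loop (a sketch; the user and datanode lists are the ones from this setup):

for u in user001 user002 user003; do cp ~/.bash_profile /home/$u/; done
for node in 192.168.1.33 192.168.1.34; do
    scp ~/.bash_profile hadoop@"$node":/home/hadoop
    scp -r /home/hadoop/hadoop_install hadoop@"$node":/home/hadoop
done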

 

 

  5. Log out of the current user and log back in; after running stop-all.sh, execute hadoop namenode -format (see the commands below).
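
In command form (run on the namenode as the hadoop user, after logging back in so the new PATH takes effect):

stop-all.sh                # stop any daemons still running from the old version
hadoop namenode -format    # formats the new dfs.name.dir, /home/hadoop/hadoop_d/dfs/name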

 

  6. On the namenode, replace the fsimage file in hadoop_d/dfs/name/current (overwrite it with the backup), change layoutVersion to -32 in hadoop_d/dfs/name/current/VERSION, and record the namespaceID value there as x. Then, on the datanodes, change layoutVersion to -32 and namespaceID to x in both hadoop_dir/dfs/data/current/VERSION and /data/hadoop_dir/dfs/data/current/VERSION (a script sketch follows).
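
Since the VERSION edits have to be repeated on every datanode and in both data directories, this step is worth scripting. A minimal sketch, run on the namenode, assuming passwordless ssh and that the fsimage backed up in step 1 sits at ~/fsimage_backup/fsimage:

NAME_CUR=/home/hadoop/hadoop_d/dfs/name/current
cp ~/fsimage_backup/fsimage "$NAME_CUR"/fsimage                        # restore the backed-up image
sed -i 's/^layoutVersion=.*/layoutVersion=-32/' "$NAME_CUR"/VERSION
NSID=$(grep '^namespaceID=' "$NAME_CUR"/VERSION | cut -d= -f2)         # this is the value x

for node in 192.168.1.33 192.168.1.34; do
    for d in /home/hadoop/hadoop_dir/dfs/data /data/hadoop_dir/dfs/data; do
        ssh hadoop@"$node" "sed -i -e 's/^layoutVersion=.*/layoutVersion=-32/' \
            -e 's/^namespaceID=.*/namespaceID=$NSID/' $d/current/VERSION"
    done
done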


Appendix:

If you want to trace which paths get opened, you can use:

strace -o output.txt -fe open start-dfs.sh


