Hadoop Installation Notes and Testing

I spent the last couple of days learning Hadoop and did a pseudo-distributed deployment in a VM. I'm jotting down the installation and test steps here as a memo (not yet cleaned up or formatted); I'll fill them out when I have more time. ^^ Also celebrating that I'm finally using all of a blog, Weibo, and WeChat (am I a bit out of date?)



tar -vxf hadoop-1.1.2.tar.gz
cd hadoop-1.1.2/conf
vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_26
export HADOOP_HOME=/home/centos/soft/hadoop-1.1.2
export PATH=$PATH:/home/centos/soft/hadoop-1.1.2/bin
. ./conf/hadoop-env.sh

vi core-site.xml
<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/centos/soft/hadooptmp/hadoop-${user.name}</value>
</property>

vi hdfs-site.xml
<property>
    <name>dfs.name.dir</name>
    <value>/home/centos/soft/hadoop/hdfs/name</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/home/centos/soft/hadoop/hdfs/data</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

vi mapred-site.xml
<property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
</property>

vi masters
127.0.0.1
vi slaves
127.0.0.1
vi /etc/hosts
127.0.0.1 localhost localhost.localdomain cxz.localdomain
127.0.0.1 master
127.0.0.1 slave

Create the directories referenced by the config (the paths must match dfs.name.dir and dfs.data.dir above):
mkdir -p /home/centos/soft/hadoop/hdfs/name
mkdir -p /home/centos/soft/hadoop/hdfs/data
mkdir -p /home/centos/soft/hadooptmp/

Format the NameNode
./bin/hadoop namenode -format
13/08/11 16:16:27 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = cxz.localdomain/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
Re-format filesystem in /home/centos/soft/hadoop/hdfs/name ? (Y or N) Y
13/08/11 16:16:29 INFO util.GSet: VM type       = 64-bit
13/08/11 16:16:29 INFO util.GSet: 2% max memory = 17.77875 MB
13/08/11 16:16:29 INFO util.GSet: capacity      = 2^21 = 2097152 entries
13/08/11 16:16:29 INFO util.GSet: recommended=2097152, actual=2097152
13/08/11 16:16:29 INFO namenode.FSNamesystem: fsOwner=centos
13/08/11 16:16:29 INFO namenode.FSNamesystem: supergroup=supergroup
13/08/11 16:16:29 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/08/11 16:16:29 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/08/11 16:16:29 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/08/11 16:16:29 INFO namenode.NameNode: Caching file names occuring more than 10 times 
13/08/11 16:16:30 INFO common.Storage: Image file of size 112 saved in 0 seconds.
13/08/11 16:16:30 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/home/centos/soft/hadoop/hdfs/name/current/edits
13/08/11 16:16:30 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/home/centos/soft/hadoop/hdfs/name/current/edits
13/08/11 16:16:30 INFO common.Storage: Storage directory /home/centos/soft/hadoop/hdfs/name has been successfully formatted.
13/08/11 16:16:30 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at cxz.localdomain/127.0.0.1
************************************************************/

Start Hadoop
$ ./bin/start-all.sh 
starting namenode, logging to /home/centos/soft/hadoop-1.1.2/logs/hadoop-centos-namenode-cxz.localdomain.out
127.0.0.1: Warning: $HADOOP_HOME is deprecated.
127.0.0.1: 
127.0.0.1: starting datanode, logging to /home/centos/soft/hadoop-1.1.2/logs/hadoop-centos-datanode-cxz.localdomain.out
127.0.0.1: Warning: $HADOOP_HOME is deprecated.
127.0.0.1: 
127.0.0.1: starting secondarynamenode, logging to /home/centos/soft/hadoop-1.1.2/logs/hadoop-centos-secondarynamenode-cxz.localdomain.out
starting jobtracker, logging to /home/centos/soft/hadoop-1.1.2/logs/hadoop-centos-jobtracker-cxz.localdomain.out
127.0.0.1: Warning: $HADOOP_HOME is deprecated.
127.0.0.1: 
127.0.0.1: starting tasktracker, logging to /home/centos/soft/hadoop-1.1.2/logs/hadoop-centos-tasktracker-cxz.localdomain.out

$ jps
13121 NameNode
13581 TaskTracker
13461 JobTracker
19761 Jps
13378 SecondaryNameNode

$ ./bin/hadoop namenode -report
13/08/11 17:57:04 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = cxz.localdomain/127.0.0.1
STARTUP_MSG:   args = [-report]
STARTUP_MSG:   version = 1.1.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.1 -r 1440782; compiled by 'hortonfo' on Thu Jan 31 02:03:24 UTC 2013
************************************************************/
Usage: java NameNode [-format [-force ] [-nonInteractive]] | [-upgrade] | [-rollback] | [-finalize] | [-importCheckpoint] | [-recover [ -force ] ]
13/08/11 17:57:04 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at cxz.localdomain/127.0.0.1
************************************************************/
Note: -report is not a NameNode option, which is why the usage message is printed above; the cluster status report command is ./bin/hadoop dfsadmin -report.

Run the pi example
$ ./bin/hadoop jar hadoop-examples-1.1.2.jar pi 4 2
Number of Maps  = 4
Samples per Map = 2
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Starting Job
13/08/11 19:45:06 INFO mapred.FileInputFormat: Total input paths to process : 4
13/08/11 19:45:07 INFO mapred.JobClient: Running job: job_201308111944_0001
13/08/11 19:45:08 INFO mapred.JobClient:  map 0% reduce 0%
13/08/11 19:45:33 INFO mapred.JobClient:  map 25% reduce 0%
13/08/11 19:45:38 INFO mapred.JobClient:  map 50% reduce 0%
13/08/11 19:46:15 INFO mapred.JobClient:  map 50% reduce 16%
13/08/11 19:47:10 INFO mapred.JobClient:  map 75% reduce 16%
13/08/11 19:47:18 INFO mapred.JobClient:  map 100% reduce 16%
13/08/11 19:47:21 INFO mapred.JobClient:  map 100% reduce 25%
13/08/11 19:47:25 INFO mapred.JobClient:  map 100% reduce 100%
13/08/11 19:47:33 INFO mapred.JobClient: Job complete: job_201308111944_0001
13/08/11 19:47:33 INFO mapred.JobClient: Counters: 30
13/08/11 19:47:33 INFO mapred.JobClient:   Job Counters 
13/08/11 19:47:33 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/11 19:47:33 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=249989
13/08/11 19:47:33 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/08/11 19:47:33 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/08/11 19:47:33 INFO mapred.JobClient:     Launched map tasks=4
13/08/11 19:47:33 INFO mapred.JobClient:     Data-local map tasks=4
13/08/11 19:47:33 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=111499
13/08/11 19:47:33 INFO mapred.JobClient:   File Input Format Counters 
13/08/11 19:47:33 INFO mapred.JobClient:     Bytes Read=472
13/08/11 19:47:33 INFO mapred.JobClient:   File Output Format Counters 
13/08/11 19:47:33 INFO mapred.JobClient:     Bytes Written=97
13/08/11 19:47:33 INFO mapred.JobClient:   FileSystemCounters
13/08/11 19:47:33 INFO mapred.JobClient:     FILE_BYTES_READ=94
13/08/11 19:47:33 INFO mapred.JobClient:     HDFS_BYTES_READ=964
13/08/11 19:47:33 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=290200
13/08/11 19:47:33 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=215
13/08/11 19:47:33 INFO mapred.JobClient:   Map-Reduce Framework
13/08/11 19:47:33 INFO mapred.JobClient:     Map output materialized bytes=112
13/08/11 19:47:33 INFO mapred.JobClient:     Map input records=4
13/08/11 19:47:33 INFO mapred.JobClient:     Reduce shuffle bytes=112
13/08/11 19:47:33 INFO mapred.JobClient:     Spilled Records=16
13/08/11 19:47:33 INFO mapred.JobClient:     Map output bytes=72
13/08/11 19:47:33 INFO mapred.JobClient:     Total committed heap usage (bytes)=631963648
13/08/11 19:47:33 INFO mapred.JobClient:     CPU time spent (ms)=34540
13/08/11 19:47:33 INFO mapred.JobClient:     Map input bytes=96
13/08/11 19:47:33 INFO mapred.JobClient:     SPLIT_RAW_BYTES=492
13/08/11 19:47:33 INFO mapred.JobClient:     Combine input records=0
13/08/11 19:47:33 INFO mapred.JobClient:     Reduce input records=8
13/08/11 19:47:33 INFO mapred.JobClient:     Reduce input groups=8
13/08/11 19:47:33 INFO mapred.JobClient:     Combine output records=0
13/08/11 19:47:33 INFO mapred.JobClient:     Physical memory (bytes) snapshot=950931456
13/08/11 19:47:33 INFO mapred.JobClient:     Reduce output records=0
13/08/11 19:47:33 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=5287145472
13/08/11 19:47:33 INFO mapred.JobClient:     Map output records=8
Job Finished in 146.894 seconds
Estimated value of Pi is 3.50000000000000000000

Run the wordcount example
$ ./bin/hadoop fs -mkdir input
$ ./bin/hadoop fs -ls
Found 2 items
drwxr-xr-x   - centos supergroup          0 2013-08-11 18:02 /user/centos/input
drwxr-xr-x   - centos supergroup          0 2013-08-11 18:02 /user/centos/output
$ ./bin/hadoop fs -put /home/centos/soft/hadoopdemo/*.txt input
$ ./bin/hadoop fs -ls input
Found 2 items
-rw-r--r--   1 centos supergroup         31 2013-08-11 19:58 /user/centos/input/demo1.txt
-rw-r--r--   1 centos supergroup         34 2013-08-11 19:58 /user/centos/input/demo2.txt
$ ./bin/hadoop jar /home/centos/soft/hadoop-1.1.2/hadoop-examples-1.1.2.jar wordcount input output
Note: if the output directory already exists, delete it first.
13/08/11 20:00:46 INFO input.FileInputFormat: Total input paths to process : 2
13/08/11 20:00:47 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/08/11 20:00:47 WARN snappy.LoadSnappy: Snappy native library not loaded
13/08/11 20:00:47 INFO mapred.JobClient: Running job: job_201308111944_0003
13/08/11 20:00:48 INFO mapred.JobClient:  map 0% reduce 0%
13/08/11 20:01:11 INFO mapred.JobClient:  map 50% reduce 0%
13/08/11 20:01:15 INFO mapred.JobClient:  map 100% reduce 0%
13/08/11 20:01:40 INFO mapred.JobClient:  map 100% reduce 100%
13/08/11 20:01:41 INFO mapred.JobClient: Job complete: job_201308111944_0003
13/08/11 20:01:41 INFO mapred.JobClient: Counters: 29
13/08/11 20:01:41 INFO mapred.JobClient:   Job Counters 
13/08/11 20:01:41 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/11 20:01:41 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=43019
13/08/11 20:01:41 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/08/11 20:01:41 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/08/11 20:01:41 INFO mapred.JobClient:     Launched map tasks=2
13/08/11 20:01:41 INFO mapred.JobClient:     Data-local map tasks=2
13/08/11 20:01:41 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=28852
13/08/11 20:01:41 INFO mapred.JobClient:   File Output Format Counters 
13/08/11 20:01:41 INFO mapred.JobClient:     Bytes Written=58
13/08/11 20:01:41 INFO mapred.JobClient:   FileSystemCounters
13/08/11 20:01:41 INFO mapred.JobClient:     FILE_BYTES_READ=118
13/08/11 20:01:41 INFO mapred.JobClient:     HDFS_BYTES_READ=293
13/08/11 20:01:41 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=174385
13/08/11 20:01:41 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=58
13/08/11 20:01:41 INFO mapred.JobClient:   File Input Format Counters 
13/08/11 20:01:41 INFO mapred.JobClient:     Bytes Read=65
13/08/11 20:01:41 INFO mapred.JobClient:   Map-Reduce Framework
13/08/11 20:01:41 INFO mapred.JobClient:     Map output materialized bytes=124
13/08/11 20:01:41 INFO mapred.JobClient:     Map input records=10
13/08/11 20:01:41 INFO mapred.JobClient:     Reduce shuffle bytes=124
13/08/11 20:01:41 INFO mapred.JobClient:     Spilled Records=18
13/08/11 20:01:41 INFO mapred.JobClient:     Map output bytes=105
13/08/11 20:01:41 INFO mapred.JobClient:     CPU time spent (ms)=10750
13/08/11 20:01:41 INFO mapred.JobClient:     Total committed heap usage (bytes)=310378496
13/08/11 20:01:41 INFO mapred.JobClient:     Combine input records=10
13/08/11 20:01:41 INFO mapred.JobClient:     SPLIT_RAW_BYTES=228
13/08/11 20:01:41 INFO mapred.JobClient:     Reduce input records=9
13/08/11 20:01:41 INFO mapred.JobClient:     Reduce input groups=7
13/08/11 20:01:41 INFO mapred.JobClient:     Combine output records=9
13/08/11 20:01:41 INFO mapred.JobClient:     Physical memory (bytes) snapshot=465526784
13/08/11 20:01:41 INFO mapred.JobClient:     Reduce output records=7
13/08/11 20:01:41 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3175239680
13/08/11 20:01:41 INFO mapred.JobClient:     Map output records=10

$ ./bin/hadoop fs -ls output
Found 3 items
-rw-r--r--   1 centos supergroup          0 2013-08-11 20:01 /user/centos/output/_SUCCESS
drwxr-xr-x   - centos supergroup          0 2013-08-11 20:00 /user/centos/output/_logs
-rw-r--r--   1 centos supergroup         58 2013-08-11 20:01 /user/centos/output/part-r-00000
$ ./bin/hadoop fs -cat /user/centos/output/part-r-00000
hadoop	3
java	1
mongdodb	1
office	2
redis	1
text	1
word	1
$ ./bin/hadoop fs -copyToLocal /user/centos/output/part-r-00000 ~/soft/test.txt
$ cat soft/test.txt 
hadoop	3
java	1
mongdodb	1
office	2
redis	1
text	1
word	1

Writing data
When starting the client program to write data, the following problem came up:
13/08/15 11:35:27 INFO ipc.Client: Retrying connect to server: 192.168.21.133/192.168.21.133:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
Analysis: core-site.xml ships with the default
<property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
</property>
With this setting, programs outside the VM cannot connect. Changing it to the concrete IP, as below, fixes the problem. Note that the IP address may change when the VM is restarted.
<property>
      <name>fs.default.name</name>
      <value>hdfs://192.168.21.133:9000</value>
</property>

$ ./bin/hadoop fs -mkdir tmp
$ ./bin/hadoop fs -ls
Found 3 items
drwxr-xr-x   - centos supergroup          0 2013-08-11 19:58 /user/centos/input
drwxr-xr-x   - centos supergroup          0 2013-08-11 20:01 /user/centos/output
drwxr-xr-x   - centos supergroup          0 2013-08-15 11:04 /user/centos/tmp

Running the client upload program then raised:
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create file /user/centos/tmp/test1.txt. Name node is in safe mode.
The reported blocks is only 0 but the threshold is 0.9990 and the total blocks 8. Safe mode will be turned off automatically.
Note: leave safe mode by running
$ ./bin/hadoop dfsadmin -safemode leave

Continuing, the next error was:
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hy-cxz, access=WRITE, inode="tmp":centos:supergroup:rwxr-xr-x
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
…
Note: grant write permission on the directory:
$ ./bin/hadoop fs -chmod 777 tmp
$ ./bin/hadoop fs -ls
Found 3 items
drwxr-xr-x   - centos supergroup          0 2013-08-11 19:58 /user/centos/input
drwxr-xr-x   - centos supergroup          0 2013-08-11 20:01 /user/centos/output
drwxrwxrwx   - centos supergroup          0 2013-08-15 11:04 /user/centos/tmp

When you run into something like:
13/08/11 19:05:44 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/centos/PiEstimator_TMP_3_141592654/in/part0 could only be replicated to 0 nodes, instead of 1
check whether the datanode is running. Each run of bin/hadoop namenode -format generates a new namespaceID for the namenode, but the datanode data under the tmp directory keeps the previous namespaceID; because the two no longer match, the datanode fails to start. The fix is to delete the "temporary directories" before each bin/hadoop namenode -format, i.e.:
1) stop all services
2) delete the temporary files (hadoop.tmp.dir) and the data files (dfs.name.dir, dfs.data.dir)
3) start all services
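A side note on the pi job above: it estimates π by placing sample points in a unit square and counting how many land inside the inscribed circle (Hadoop's PiEstimator uses a quasi-random Halton sequence; the sketch below uses plain pseudo-random sampling, which is an illustrative simplification, not Hadoop's exact scheme). With only 4 maps × 2 samples = 8 points, the estimate can only be a multiple of 0.5, which is why the job printed 3.5.

```python
import random

def estimate_pi(num_maps, samples_per_map, seed=42):
    """Estimate pi by sampling points in the unit square and counting
    the fraction that fall inside the inscribed quarter circle."""
    rng = random.Random(seed)
    total = num_maps * samples_per_map
    inside = 0
    for _ in range(total):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # area ratio: quarter circle / square = pi / 4
    return 4.0 * inside / total

# With 8 samples the estimate is inside/2, a multiple of 0.5 --
# the same coarseness as the job's "Estimated value of Pi is 3.5".
print(estimate_pi(4, 2))
# With a million samples the estimate lands near 3.14159.
print(estimate_pi(1000, 1000))
```

More maps or more samples per map tighten the estimate; the 8-point run is just a smoke test that the cluster schedules maps and reduces at all.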
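For reference, the wordcount job run above is the classic map/combine/reduce pattern: each mapper emits (word, 1) pairs per line, and the reducer sums them per word and writes the sorted result to part-r-00000. A minimal local sketch of the same logic (the input strings here are made-up stand-ins chosen to reproduce the output shown above, not the real contents of demo1.txt and demo2.txt):

```python
from collections import Counter

def wordcount(texts):
    """Map: split each line into words (each an implicit (word, 1) pair);
    reduce: sum the counts per word. Results are returned sorted by key,
    mirroring the order Hadoop writes into part-r-00000."""
    counts = Counter()
    for text in texts:
        for line in text.splitlines():
            counts.update(line.split())
    return dict(sorted(counts.items()))

# Hypothetical stand-ins for demo1.txt and demo2.txt
demo1 = "hadoop word\nhadoop text"
demo2 = "hadoop office office\njava redis mongdodb"
print(wordcount([demo1, demo2]))
# → {'hadoop': 3, 'java': 1, 'mongdodb': 1, 'office': 2,
#    'redis': 1, 'text': 1, 'word': 1}
```

On the cluster the same per-word summation runs twice: once in the combiner on each mapper's local output (hence Combine input records=10 / output records=9 in the counters) and once in the reducer across all mappers.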



