HBase Backup: Export and Import
In the previous article, "HBase Replication", we described how to set up master/slave clusters for real-time data backup. However, replication only covers data written after it is configured: rows inserted into the master cluster from that point on are copied to the slave cluster, but historical data written beforehand is not, and replication alone cannot recover it. This article shows how to use HBase's Export and Import tools to back up that historical data.
1) Export the HBase table data to a directory in HDFS:
- $ cd $HBASE_HOME/
- $ bin/hbase org.apache.hadoop.hbase.mapreduce.Export test_table /data/test_table
Here $HBASE_HOME is the HBase installation directory, test_table is the table to export, and /data/test_table is the target directory in HDFS.
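Export also accepts optional positional arguments for the number of cell versions and a start/end timestamp (epoch milliseconds), which makes time-windowed, incremental backups possible. A minimal sketch that only assembles the command line; the output directory and time window below are hypothetical:

```shell
# Incremental export sketch: copy only cells written inside a time window.
# Usage: Export <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
TABLE=test_table
OUTDIR=/data/test_table_inc   # hypothetical HDFS output directory
VERSIONS=1                    # how many cell versions to export
START=1406788000000           # window start, epoch milliseconds (hypothetical)
END=1406789000000             # window end, epoch milliseconds (hypothetical)

CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.Export $TABLE $OUTDIR $VERSIONS $START $END"
echo "$CMD"                   # run this from $HBASE_HOME on a real cluster
```

Each run can export only the window since the previous backup, keeping the MapReduce job small.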
The output is long, so only the tail is shown here:
- 2014-08-11 16:49:44,484 INFO [main] mapreduce.Job: Running job: job_1407491918245_0021
- 2014-08-11 16:49:51,658 INFO [main] mapreduce.Job: Job job_1407491918245_0021 running in uber mode : false
- 2014-08-11 16:49:51,659 INFO [main] mapreduce.Job: map 0% reduce 0%
- 2014-08-11 16:49:57,706 INFO [main] mapreduce.Job: map 100% reduce 0%
- 2014-08-11 16:49:57,715 INFO [main] mapreduce.Job: Job job_1407491918245_0021 completed successfully
- 2014-08-11 16:49:57,789 INFO [main] mapreduce.Job: Counters: 37
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=118223
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=84
- HDFS: Number of bytes written=243
- HDFS: Number of read operations=4
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=2
- Job Counters
- Launched map tasks=1
- Rack-local map tasks=1
- Total time spent by all maps in occupied slots (ms)=9152
- Total time spent by all reduces in occupied slots (ms)=0
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=84
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=201
- CPU time spent (ms)=5210
- Physical memory (bytes) snapshot=377470976
- Virtual memory (bytes) snapshot=1863364608
- Total committed heap usage (bytes)=1029177344
- HBase Counters
- BYTES_IN_REMOTE_RESULTS=87
- BYTES_IN_RESULTS=87
- MILLIS_BETWEEN_NEXTS=444
- NOT_SERVING_REGION_EXCEPTION=0
- NUM_SCANNER_RESTARTS=0
- REGIONS_SCANNED=1
- REMOTE_RPC_CALLS=3
- REMOTE_RPC_RETRIES=0
- RPC_CALLS=3
- RPC_RETRIES=0
- File Input Format Counters
- Bytes Read=0
- File Output Format Counters
- Bytes Written=243
Verify that the export files were written to HDFS:
- $ cd $HADOOP_HOME/
- $ bin/hadoop fs -ls /data/test_table
- Found 2 items
- -rw-r--r-- 3 hbase supergroup 0 2014-08-11 16:49 /data/test_table/_SUCCESS
- -rw-r--r-- 3 hbase supergroup 243 2014-08-11 16:49 /data/test_table/part-m-00000
For comparison later, check the schema and contents of the source table:
- $ cd $HBASE_HOME/
- $ bin/hbase shell
- 2014-08-11 17:05:52,589 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 0.98.2-hadoop2, r1591526, Wed Apr 30 20:17:33 PDT 2014
- hbase(main):001:0> describe 'test_table'
- DESCRIPTION ENABLED
- 'test_table', {NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => '1', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'} true
- 1 row(s) in 1.3400 seconds
- hbase(main):002:0> scan 'test_table'
- ROW COLUMN+CELL
- r1 column=cf:q1, timestamp=1406788229440, value=va1
- r2 column=cf:q1, timestamp=1406788265646, value=va2
- r3 column=cf:q1, timestamp=1406788474301, value=va3
- 3 row(s) in 0.0560 seconds
2) Import the data from HDFS into a table that has already been created in HBase. Note that the target table may have a different name from the original, but its schema must be identical. Here we pick a different name, test_copy. Create the table as follows:
- $ cd $HBASE_HOME/
- $ bin/hbase shell
- 2014-08-11 17:05:52,589 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 0.98.2-hadoop2, r1591526, Wed Apr 30 20:17:33 PDT 2014
- hbase(main):001:0> create 'test_copy', 'cf'
- 0 row(s) in 1.1980 seconds
- => Hbase::Table - test_copy
- $ cd $HBASE_HOME/
- $ bin/hbase org.apache.hadoop.hbase.mapreduce.Import test_copy hdfs://l-master.data/data/test_table
The Import job's output is also long, so only the tail is shown:
- 2014-08-11 17:13:08,706 INFO [main] mapreduce.Job: map 100% reduce 0%
- 2014-08-11 17:13:08,710 INFO [main] mapreduce.Job: Job job_1407728839061_0014 completed successfully
- 2014-08-11 17:13:08,715 INFO [main] mapreduce.Job: Counters: 27
- File System Counters
- FILE: Number of bytes read=0
- FILE: Number of bytes written=117256
- FILE: Number of read operations=0
- FILE: Number of large read operations=0
- FILE: Number of write operations=0
- HDFS: Number of bytes read=356
- HDFS: Number of bytes written=0
- HDFS: Number of read operations=3
- HDFS: Number of large read operations=0
- HDFS: Number of write operations=0
- Job Counters
- Launched map tasks=1
- Rack-local map tasks=1
- Total time spent by all maps in occupied slots (ms)=6510
- Total time spent by all reduces in occupied slots (ms)=0
- Map-Reduce Framework
- Map input records=3
- Map output records=3
- Input split bytes=113
- Spilled Records=0
- Failed Shuffles=0
- Merged Map outputs=0
- GC time elapsed (ms)=21
- CPU time spent (ms)=1110
- Physical memory (bytes) snapshot=379494400
- Virtual memory (bytes) snapshot=1855762432
- Total committed heap usage (bytes)=1029177344
- File Input Format Counters
- Bytes Read=243
- File Output Format Counters
- Bytes Written=0
- $ cd $HBASE_HOME/
- $ bin/hbase shell
- 2014-08-11 17:15:52,117 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
- HBase Shell; enter 'help<RETURN>' for list of supported commands.
- Type "exit<RETURN>" to leave the HBase Shell
- Version 0.98.2-hadoop2, r1591526, Wed Apr 30 20:17:33 PDT 2014
- hbase(main):001:0> scan 'test_copy'
- ROW COLUMN+CELL
- r1 column=cf:q1, timestamp=1406788229440, value=va1
- r2 column=cf:q1, timestamp=1406788265646, value=va2
- r3 column=cf:q1, timestamp=1406788474301, value=va3
- 3 row(s) in 0.3640 seconds
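Note that the scanned rows in test_copy keep the timestamps of the source table: Export writes out the raw cells and Import replays them, so cell timestamps survive the round trip. Beyond eyeballing a scan, a quick consistency check is to run the bundled RowCounter MapReduce job on both tables and compare the ROWS counter in each job's output. A sketch that only assembles the two commands, using the table names from the example above:

```shell
# Consistency check sketch: count rows in the source table and the copy,
# then compare the ROWS counter printed at the end of each job's output.
SRC_TABLE=test_table
DST_TABLE=test_copy

SRC_CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter $SRC_TABLE"
DST_CMD="bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter $DST_TABLE"

echo "$SRC_CMD"   # run both from $HBASE_HOME on a real cluster
echo "$DST_CMD"   # the ROWS counters should be equal
```

Matching row counts do not prove cell-level equality, but they catch the most common failure (a partial import) cheaply.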