HDFS Snapshots
来源:互联网 发布:linux使用搜狗输入法 编辑:程序博客网 时间:2024/05/29 18:41
HDFS Snapshots
- Overview
- Snapshottable Directories
- Snapshot Paths
- Upgrading to a version of HDFS with snapshots
- Snapshot Operations
- Administrator Operations
- Allow Snapshots
- Disallow Snapshots
- User Operations
- Create Snapshots
- Delete Snapshots
- Rename Snapshots
- Get Snapshottable Directory Listing
- Get Snapshots Difference Report
- Administrator Operations
Overview
HDFS Snapshots are read-only point-in-time copies of the file system. Snapshots can be taken on a subtree of the file system or the entire file system. Some common use cases of snapshots are data backup, protection against user errors and disaster recovery.
The implementation of HDFS Snapshots is efficient:
- Snapshot creation is instantaneous: the cost is O(1) excluding the inode lookup time.
- Additional memory is used only when modifications are made relative to a snapshot: memory usage is O(M), where M is the number of modified files/directories.
- Blocks in datanodes are not copied: the snapshot files record the block list and the file size. There is no data copying.
- Snapshots do not adversely affect regular HDFS operations: modifications are recorded in reverse chronological order so that the current data can be accessed directly. The snapshot data is computed by subtracting the modifications from the current data.
Snapshottable Directories
Snapshots can be taken on any directory once the directory has been set as snapshottable. A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. There is no limit on the number of snapshottable directories. Administrators may set any directory to be snapshottable. If there are snapshots in a snapshottable directory, the directory can be neither deleted nor renamed before all the snapshots are deleted.
Nested snapshottable directories are currently not allowed. In other words, a directory cannot be set to snapshottable if one of its ancestors/descendants is a snapshottable directory.
Snapshot Paths
For a snapshottable directory, the path component ".snapshot" is used for accessing its snapshots. Suppose /foo is a snapshottable directory, /foo/bar is a file/directory in /foo, and /foo has a snapshot s0. Then, the path
/foo/.snapshot/s0/bar
- Listing all the snapshots under a snapshottable directory:
hdfs dfs -ls /foo/.snapshot
- Listing the files in snapshot s0:
hdfs dfs -ls /foo/.snapshot/s0
- Copying a file from snapshot s0:
hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp
Note that this example uses the preserve option to preserve timestamps, ownership, permission, ACLs and XAttrs.
Upgrading to a version of HDFS with snapshots
The HDFS snapshot feature introduces a new reserved path name used to interact with snapshots: .snapshot. When upgrading from an older version of HDFS, existing paths named .snapshot need to first be renamed or deleted to avoid conflicting with the reserved path. See the upgrade section in the HDFS user guide for more information.
Snapshot Operations
Administrator Operations
The operations described in this section require superuser privilege.
Allow Snapshots
Allowing snapshots of a directory to be created. If the operation completes successfully, the directory becomes snapshottable.
- Command:
hdfs dfsadmin -allowSnapshot <path>
- Arguments:pathThe path of the snapshottable directory.
See also the corresponding Java API void allowSnapshot(Path path) in HdfsAdmin.
Disallow Snapshots
Disallowing snapshots of a directory to be created. All snapshots of the directory must be deleted before disallowing snapshots.
- Command:
hdfs dfsadmin -disallowSnapshot <path>
- Arguments:pathThe path of the snapshottable directory.
See also the corresponding Java API void disallowSnapshot(Path path) in HdfsAdmin.
User Operations
The section describes user operations. Note that HDFS superuser can perform all the operations without satisfying the permission requirement in the individual operations.
Create Snapshots
Create a snapshot of a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -createSnapshot <path> [<snapshotName>]
- Arguments:pathThe path of the snapshottable directory.snapshotNameThe snapshot name, which is an optional argument. When it is omitted, a default name is generated using a timestamp with the format "'s'yyyyMMdd-HHmmss.SSS", e.g. "s20130412-151029.033".
See also the corresponding Java API Path createSnapshot(Path path) and Path createSnapshot(Path path, String snapshotName) in FileSystem. The snapshot path is returned in these methods.
Delete Snapshots
Delete a snapshot of from a snapshottable directory. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -deleteSnapshot <path> <snapshotName>
- Arguments:pathThe path of the snapshottable directory.snapshotNameThe snapshot name.
See also the corresponding Java API void deleteSnapshot(Path path, String snapshotName) in FileSystem.
Rename Snapshots
Rename a snapshot. This operation requires owner privilege of the snapshottable directory.
- Command:
hdfs dfs -renameSnapshot <path> <oldName> <newName>
- Arguments:pathThe path of the snapshottable directory.oldNameThe old snapshot name.newNameThe new snapshot name.
See also the corresponding Java API void renameSnapshot(Path path, String oldName, String newName) in FileSystem.
Get Snapshottable Directory Listing
Get all the snapshottable directories where the current user has permission to take snapshtos.
- Command:
hdfs lsSnapshottableDir
- Arguments: none
See also the corresponding Java API SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing() in DistributedFileSystem.
Get Snapshots Difference Report
Get the differences between two snapshots. This operation requires read access privilege for all files/directories in both snapshots.
- Command:
hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot>
- Arguments:pathThe path of the snapshottable directory.fromSnapshotThe name of the starting snapshot.toSnapshotThe name of the ending snapshot.
- Results:+The file/directory has been created.-The file/directory has been deleted.MThe file/directory has been modified.RThe file/directory has been renamed.
A RENAME entry indicates a file/directory has been renamed but is still under the same snapshottable directory. A file/directory is reported as deleted if it was renamed to outside of the snapshottble directory. A file/directory renamed from outside of the snapshottble directory is reported as newly created.
The snapshot difference report does not guarantee the same operation sequence. For example, if we rename the directory "/foo" to "/foo2", and then append new data to the file "/foo2/bar", the difference report will be:
R. /foo -> /foo2 M. /foo/bar
See also the corresponding Java API SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot) in DistributedFileSystem.
- HDFS Snapshots
- HDFS Snapshots
- HDFS快照(HDFS Snapshots)
- hadoop2.7.2学习笔记20-HDFS Snapshots
- 云计算(十一)- HDFS快照(HDFS Snapshots)
- 云计算(十一)- HDFS快照(HDFS Snapshots)
- Hadoop——HDFS Federation、File System Snapshots、集中式缓存管理、Distributed Copy、YARN HA简单讲解
- Distributed Snapshots
- Features/Snapshots
- 关于LoadRunner的Snapshots
- sql Server snapshots
- Snapshots: The alternative backup
- RTX LDAP SYNCHRONIZER Snapshots
- Snapshots in HBase 0.96
- 浅析snapshots, blockcommit,blockpull
- OpenStack: Perform Consistent Snapshots
- Hadoop浅解SnapShots
- ORACLE 快照Snapshots
- 使用Altium Designer画PCB器件半落在视窗外无法选择移动解决办法
- C语言实现蛇形输出
- C语言 - 查找字符串中相同的字符函数的使用
- spring,mybatis 事务隔离级别管理
- 读取一个5*5数组,然后显示每行的和与每列的和
- HDFS Snapshots
- Wpf依赖属性
- spring+hibernate整合(基本配置)
- 你真的理解程序中的变量吗
- 20161218
- ftl页面打开后直接显示代码不显示空间的问题
- 51Nod - 1090
- Solr查询参数cache/cost
- Offline Edits Viewer Guide