How HDFS uses multiple disks
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 18:26
1 fs.default.name
To run HDFS, you need to designate one machine as a namenode. The
property fs.default.name is then set to an HDFS filesystem URI, whose host is the namenode’s
hostname or IP address, and whose port is the port that the namenode will listen on for RPCs.
If no port is specified, the default of 8020 is used.
The fs.default.name property also specifies the default filesystem. The
default filesystem is used to resolve relative paths, which are handy to use since they
save typing (and avoid hardcoding knowledge of a particular namenode’s address). For
example, with the default filesystem defined in Example 9-1, the relative URI /a/b is
resolved to hdfs://namenode/a/b.
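The resolution rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Hadoop's own code: the function names are hypothetical, the value hdfs://namenode/ is assumed to be what fs.default.name holds, and the sketch only handles paths that begin with a slash.

```python
from urllib.parse import urlsplit

DEFAULT_NAMENODE_PORT = 8020  # default RPC port named in the text


def namenode_address(default_fs):
    """Return (host, port) of the namenode encoded in a default-filesystem URI."""
    parts = urlsplit(default_fs)
    # Fall back to the default port when the URI does not carry one.
    return parts.hostname, parts.port or DEFAULT_NAMENODE_PORT


def resolve(default_fs, path):
    """Qualify a scheme-less path with the default filesystem (hypothetical helper)."""
    if "://" in path:
        return path  # already names its own filesystem
    if not path.startswith("/"):
        raise ValueError("this sketch only handles paths that start with '/'")
    parts = urlsplit(default_fs)
    return f"{parts.scheme}://{parts.netloc}{path}"


print(resolve("hdfs://namenode/", "/a/b"))   # hdfs://namenode/a/b
print(namenode_address("hdfs://namenode/"))  # ('namenode', 8020)
```

Because client code only ever names /a/b, pointing the cluster at a different namenode means changing one property rather than every path.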
2 dfs.name.dir
There are a few other configuration properties you should set for HDFS: those that set
the storage directories for the namenode and for datanodes. The property
dfs.name.dir specifies a list of directories where the namenode stores persistent
filesystem metadata (the edit log and the filesystem image). A copy of each of the metadata
files is stored in each directory for redundancy (that is, the namenode keeps identical data in every location listed in dfs.name.dir).
It’s common to configure dfs.name.dir so that the namenode metadata is written to one or two local disks and
a remote disk, such as an NFS-mounted directory. Such a setup guards against failure
of a local disk, and failure of the entire namenode, since in both cases the files can be
recovered and used to start a new namenode. (The secondary namenode takes only
periodic checkpoints of the namenode, so it does not provide an up-to-date backup of
the namenode.)
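To make the redundancy idea concrete, here is a minimal Python sketch, not the namenode's actual implementation. It writes one metadata file identically to every configured directory, then simulates the loss of a local disk and recovers the file from the surviving copy. The directory layout, the file name "fsimage" as used here, and both helper functions are illustrative assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def write_metadata(storage_dirs, filename, data):
    """Write an identical copy of one metadata file into every storage directory."""
    for directory in storage_dirs:
        Path(directory).mkdir(parents=True, exist_ok=True)
        (Path(directory) / filename).write_bytes(data)


def recover_metadata(storage_dirs, filename):
    """Return the file contents from the first directory that still holds a copy."""
    for directory in storage_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate.read_bytes()
    raise FileNotFoundError(filename)


# Hypothetical layout: one "local" directory and one "remote" (e.g. NFS) directory.
base = Path(tempfile.mkdtemp())
dirs = [base / "disk1" / "name", base / "remote" / "name"]

write_metadata(dirs, "fsimage", b"example image bytes")
shutil.rmtree(dirs[0])                      # simulate losing the local disk
recovered = recover_metadata(dirs, "fsimage")
shutil.rmtree(base)                         # clean up the temporary directories

print(recovered == b"example image bytes")  # True
```

Since every directory holds the full metadata, any single surviving directory is enough to start a new namenode, which is why mixing local and remote locations is useful.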
3 dfs.data.dir
You should also set the dfs.data.dir property, which specifies a list of directories for
a datanode to store its blocks. Unlike the namenode, which uses multiple directories
for redundancy, a datanode round-robins writes between its storage directories (so each location listed in dfs.data.dir holds different data). For
performance, you should therefore specify a storage directory for each local disk. Read performance
also benefits from having multiple disks for storage, because blocks will be spread
across them, and concurrent reads for distinct blocks will be correspondingly spread
across disks.
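The contrast with the namenode can be illustrated with a small Python sketch of round-robin placement. This is a simplified model under stated assumptions, not the datanode's real volume-selection code: the block IDs and directory paths are made up, and the sketch ignores factors such as free space.

```python
from collections import defaultdict
from itertools import cycle


def place_blocks(block_ids, storage_dirs):
    """Assign each block to the next storage directory in turn (round-robin)."""
    placement = defaultdict(list)
    next_dir = cycle(storage_dirs)
    for block_id in block_ids:
        placement[next(next_dir)].append(block_id)
    return dict(placement)


# Hypothetical directories, one per local disk.
data_dirs = ["/disk1/hdfs/data", "/disk2/hdfs/data", "/disk3/hdfs/data"]
blocks = ["blk_1", "blk_2", "blk_3", "blk_4", "blk_5", "blk_6"]

placement = place_blocks(blocks, data_dirs)
for directory, stored in placement.items():
    print(directory, stored)
# /disk1/hdfs/data ['blk_1', 'blk_4']
# /disk2/hdfs/data ['blk_2', 'blk_5']
# /disk3/hdfs/data ['blk_3', 'blk_6']
```

Each block lands in exactly one directory, so no directory duplicates another, and reads of different blocks tend to hit different disks.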
4 fs.checkpoint.dir
Finally, you should configure where the secondary namenode stores its checkpoints of
the filesystem. The fs.checkpoint.dir property specifies a list of directories where the
checkpoints are kept. Like the storage directories for the namenode, which keep redundant
copies of the namenode metadata, the checkpointed filesystem image is stored
in each checkpoint directory for redundancy.
Note that the storage directories for HDFS are under Hadoop’s temporary
directory by default (the hadoop.tmp.dir property, whose default
is /tmp/hadoop-${user.name}). Therefore it is critical that these properties
are set so that data is not lost by the system clearing out temporary
directories.
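As a rough illustration of setting all three storage properties explicitly, the following Python sketch generates a configuration document in the configuration/property/name/value layout that Hadoop configuration files use. The directory paths are hypothetical, the helper name is invented for this example, and each list of directories is assumed to be written as a comma-separated value; on a real cluster these entries would typically be placed in hdfs-site.xml.

```python
import xml.etree.ElementTree as ET


def build_config(properties):
    """Build a <configuration> element from {property name: list of directories}."""
    root = ET.Element("configuration")
    for name, directories in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        # A list of directories is written as one comma-separated value.
        ET.SubElement(prop, "value").text = ",".join(directories)
    return root


# Hypothetical paths, deliberately outside /tmp so they survive temp-directory cleanup.
storage = {
    "dfs.name.dir": ["/disk1/hdfs/name", "/remote/hdfs/name"],
    "dfs.data.dir": ["/disk1/hdfs/data", "/disk2/hdfs/data"],
    "fs.checkpoint.dir": ["/disk1/hdfs/namesecondary", "/disk2/hdfs/namesecondary"],
}

xml_text = ET.tostring(build_config(storage), encoding="unicode")
print(xml_text)
```

In this layout, dfs.name.dir mixes a local and a remote location for redundancy, while dfs.data.dir lists one directory per local disk so that writes and reads are spread across them.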