Why use HDFS over Lustre and / or GPFS?
Source: Internet. Editor: 程序博客网. Time: 2024/05/16 15:13
Apache Hadoop: Why use HDFS over Lustre and / or GPFS?
Curious to understand what some specific use cases would be for adopting HDFS as a file system over Lustre or GPFS, i.e., what kinds of storage / computing needs are better suited to one file system than the other. Thanks.
Allen Wittenauer, I make yellow elephants cry.
In general (and not really geared towards the specific file systems):
- Many large-scale file systems do not scale as high as HDFS. (GPFS, for example, maxes out at 4PB, IIRC.)
- The price/perf ratio may not be as good, especially if licensing is involved.
- The weaker price/perf can sometimes be attributed to POSIX support. POSIX isn't 'cheap' and will impact either the cost or the performance or both. Ignoring the Amazon Web Services and other cloud services cases, POSIX support is probably the #1 reason why one would not use HDFS when using Hadoop.
- Lack of MapReduce block locality support. (Of course, I'm assuming one would implement the ability to read specific blocks in the Hadoop FileSystem API. But that might be problematic for some systems as well...)
- Lack of replication (i.e., only one copy of the file on disk, so the ability to recover data becomes more important).
- Few 'real world' large-scale references. No one wants to be the first on the block to try Hadoop+other fs outside of a lab. This is made worse by file system vendors having a very limited view of what Hadoop actually does and why. (The older Lustre paper at http://wiki.lustre.org/images/1/... is a fun read.)
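To make the block locality and replication points a bit more concrete, here is a rough sketch of why they matter to a MapReduce-style scheduler. It is a toy greedy assignment, not Hadoop's actual scheduler: the function names (`assign_tasks`, `local_count`), the node names, and the slot counts are all made up for illustration. In a real cluster the block-to-host map would come from something like the `getFileBlockLocations` call in the Hadoop FileSystem API.

```python
def assign_tasks(block_hosts, slots_per_node):
    """Greedy locality-aware task assignment (illustrative only).

    block_hosts:    dict of block id -> list of hosts holding a copy
    slots_per_node: dict of host -> number of free task slots
    Returns a dict of block id -> (host, is_local).
    """
    free = dict(slots_per_node)  # copy so the caller's dict is untouched
    assignment = {}
    remote = []

    # Place the blocks with the fewest copies first: they have the
    # fewest chances of landing on a node that already holds the data.
    for block, hosts in sorted(block_hosts.items(), key=lambda kv: len(kv[1])):
        local = [h for h in hosts if free.get(h, 0) > 0]
        if local:
            host = max(local, key=lambda h: free[h])
            free[host] -= 1
            assignment[block] = (host, True)
        else:
            remote.append(block)

    # Whatever is left must run on a node that reads the block over the network.
    for block in remote:
        candidates = [h for h in free if free[h] > 0]
        if not candidates:
            raise RuntimeError("no free task slots left for block %r" % block)
        host = max(candidates, key=lambda h: free[h])
        free[host] -= 1
        assignment[block] = (host, False)

    return assignment


def local_count(assignment):
    """Number of tasks that were placed on a node holding their block."""
    return sum(1 for _host, is_local in assignment.values() if is_local)


if __name__ == "__main__":
    slots = {"n1": 1, "n2": 1, "n3": 1}

    # Three copies of every block, spread over the three nodes.
    replicated = {
        "b0": ["n1", "n2", "n3"],
        "b1": ["n1", "n2", "n3"],
        "b2": ["n2", "n3", "n1"],
    }
    # One copy of every block, all on the same node.
    single_copy = {"b0": ["n1"], "b1": ["n1"], "b2": ["n1"]}

    print("local tasks, replicated: ", local_count(assign_tasks(replicated, slots)))
    print("local tasks, single copy:", local_count(assign_tasks(single_copy, slots)))
```

With several copies per block the scheduler has more nodes to choose from, so more tasks can read from local disk. With a single copy, or with a file system that cannot report where a block lives, most reads go over the network.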