The Google File System, Part 1: ABSTRACT and INTRODUCTION

ABSTRACT
We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. 
It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.

While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. 
This has led us to reexamine traditional choices and explore radically different design points.
The file system has successfully met our storage needs.
It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. 
The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.
In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

Categories and Subject Descriptors

Distributed file systems

General Terms
Design, reliability, performance, measurement
Keywords: Fault tolerance, scalability, data storage, clustered storage
The authors can be reached at the following addresses:
{sanjay,hgobioff,shuntak}@google.com.

1. INTRODUCTION
We have designed and implemented the Google File System (GFS) to meet the rapidly growing demands of Google’s data processing needs. 
GFS shares many of the same goals as previous distributed file systems such as performance, scalability, reliability, and availability. 
However, its design has been driven by key observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system design assumptions.
We have reexamined traditional choices and explored radically different points in the design space.

First, component failures are the norm rather than the exception. 
The file system consists of hundreds or even thousands of storage machines built from inexpensive commodity parts and is accessed by a comparable number of client machines. 
The quantity and quality of the components virtually guarantee that some are not functional at any given time and some will not recover from their current failures. 
We have seen problems caused by application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking, and power supplies. 
Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system.
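
A rough back-of-the-envelope calculation makes the point concrete. The machine count below matches the scale cited in this paper, but the per-machine failure probability is an assumed illustrative figure, not a measurement; a minimal sketch:

# Illustrative arithmetic only: 'p_down' is an assumed probability that a
# given machine is unavailable at a given instant, not a figure from the paper.
machines = 1000
p_down = 0.001

expected_down = machines * p_down            # expected machines down right now
p_all_up = (1 - p_down) ** machines          # chance that every machine is up

print(f"expected machines down at any instant: {expected_down:.1f}")
print(f"probability all {machines} machines are up: {p_all_up:.1%}")  # about 36.8%

Even with an optimistic 0.1% per-machine figure, the cluster is running with at least one failed machine most of the time, which is why monitoring and recovery cannot be bolted on afterwards.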

Second, files are huge by traditional standards. Multi-GB files are common. 
Each file typically contains many application objects such as web documents. 
When we are regularly working with fast growing data sets of many TBs comprising billions of objects, it is unwieldy to manage billions of approximately KB-sized files even when the file system could support it. 
As a result, design assumptions and parameters such as I/O operation and block sizes have to be revisited.
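
To see why file count, and not just byte count, drives these choices, compare storing the same data set as KB-sized files versus multi-GB files. The sizes below are assumed round numbers and the per-file metadata cost is a hypothetical figure; only the ratio matters:

# Illustrative arithmetic only: dataset/object/bundle sizes are assumed round
# numbers, and 'meta_per_file' is a hypothetical per-file metadata cost.
dataset_bytes = 10 * 2**40        # a 10 TB data set
object_bytes = 2**10              # ~KB-sized application objects
bundle_bytes = 2**30              # objects bundled into large (here 1 GB) files
meta_per_file = 100               # hypothetical bytes of metadata per file

files_small = dataset_bytes // object_bytes
files_large = dataset_bytes // bundle_bytes

print(f"KB-sized files:  {files_small:,} files, "
      f"~{files_small * meta_per_file / 2**30:,.0f} GiB of metadata")
print(f"multi-GB files:  {files_large:,} files, "
      f"~{files_large * meta_per_file / 2**10:,.0f} KiB of metadata")

The first case is roughly ten billion files and about a terabyte of bookkeeping; the second is about ten thousand files, a far more manageable number for any file system to track.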

Third, most files are mutated by appending new data rather than overwriting existing data. 
Random writes within a file are practically non-existent. 
Once written, the files are only read, and often only sequentially. 
A variety of data share these characteristics. 
Some may constitute large repositories that data analysis programs scan through. 
Some may be data streams continuously generated by running applications. 
Some may be archival data. 
Some may be intermediate results produced on one machine and processed on another, whether simultaneously or later in time. 
Given this access pattern on huge files, appending becomes the focus of performance optimization and atomicity guarantees, while caching data blocks in the client loses its appeal.
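
The shape of this workload is easy to picture with ordinary local files. The sketch below only illustrates the append-once, read-sequentially pattern described above and is not GFS client code; the file name and record framing are made up for the example:

# Sketch of the workload shape only: a producer appends length-prefixed
# records, and a reader later streams through them strictly in order.
# There are no random writes and no in-place updates.
import struct

def append_record(path: str, payload: bytes) -> None:
    with open(path, "ab") as f:                      # append-only writes
        f.write(struct.pack(">I", len(payload)) + payload)

def scan_records(path: str):
    with open(path, "rb") as f:                      # sequential reads
        while header := f.read(4):
            (length,) = struct.unpack(">I", header)
            yield f.read(length)

append_record("crawl.log", b"page fetched")
append_record("crawl.log", b"links extracted")
print([rec.decode() for rec in scan_records("crawl.log")])

With reads like scan_records above, a client-side block cache sees essentially no reuse, which is why caching data blocks loses its appeal for this workload.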

Fourth, co-designing the applications and the file system API benefits the overall system by increasing our flexibility.
For example, we have relaxed GFS’s consistency model to vastly simplify the file system without imposing an onerous burden on the applications. 
We have also introduced an atomic append operation so that multiple clients can append concurrently to a file without extra synchronization between them. 
These will be discussed in more detail later in the paper.
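
The record append operation is only named here and specified later in the paper, but a rough sketch shows what it buys the applications. gfs_record_append below is a hypothetical stand-in for the real client call, implemented with a locked in-process list purely so the example runs; the point is that the producers themselves perform no locking:

# Hypothetical stand-in for an atomic record append: the "file system"
# (here a locked in-process list) serializes appends and picks the offset,
# so concurrent producers need no synchronization of their own.
import threading

_lock = threading.Lock()
_records = []                     # pretend contents of one shared file

def gfs_record_append(path: str, data: bytes) -> int:
    with _lock:                   # atomicity lives inside the file system
        offset = sum(len(r) for r in _records)
        _records.append(data)
        return offset

def producer(worker_id: int) -> None:
    for i in range(3):            # each worker appends its own results
        gfs_record_append("/merged/results", f"w{worker_id}-{i}\n".encode())

workers = [threading.Thread(target=producer, args=(w,)) for w in range(4)]
for t in workers: t.start()
for t in workers: t.join()
print(f"{len(_records)} records appended, none lost, no client-side locking")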

Multiple GFS clusters are currently deployed for different purposes. 
The largest ones have over 1000 storage nodes, over 300 TB of disk storage, and are heavily accessed by hundreds of clients on distinct machines on a continuous basis.
