经典Hadoop书籍介绍

来源:互联网 发布:linux装无线网卡驱动 编辑:程序博客网 时间:2024/04/25 14:42

1.Hadoop: The Definitive Guide(Hadoop权威指南)

这本书很全,Hadoop中的圣经级教材,不过看起来挺累。

内容简介

Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.

This revised edition covers recent changes to Hadoop, including new features such as Hive, Sqoop, and Avro. It also provides illuminating case studies that illustrate how Hadoop is used to solve specific problems. Looking to get the most out of your data? This is your book.

  • Use the Hadoop Distributed File System (HDFS) for storing large datasets, then run distributed computations over those datasets with MapReduce
  • Become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence
  • Discover common pitfalls and advanced features for writing real-world MapReduce programs
  • Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
  • Use Pig, a high-level query language for large-scale data processing
  • Analyze datasets with Hive, Hadoop’s data warehousing system
  • Take advantage of HBase, Hadoop’s database for structured and semi-structured data
  • Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems

"Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk."

2.Hadoop in Action

这本书是入门推荐。

内容简介

"Hadoop in Action" teaches readers how to use Hadoop and write MapReduce programs. The intended readers are programmers, architects, and project managers who have to process large amounts of data offline. "Hadoop in Action" will lead the reader from obtaining a copy of Hadoop to setting it up in a cluster and writing data analytic programs. The book begins by making the basic idea of Hadoop and MapReduce easier to grasp by applying the default Hadoop installation to a few easy-to-follow tasks, such as analyzing changes in word frequency across a body of documents. The book continues through the basic concepts of MapReduce applications developed using Hadoop, including a close look at framework components, use of Hadoop for a variety of data analysis tasks, and numerous examples of Hadoop in action. "Hadoop in Action" will explain how to use Hadoop and present design patterns and practices of programming MapReduce. MapReduce is a complex idea both conceptually and in its implementation, and Hadoop users are challenged to learn all the knobs and levers for running Hadoop. This book takes you beyond the mechanics of running Hadoop, teaching you to write meaningful programs in a MapReduce framework. This book assumes the reader will have a basic familiarity with Java, as most code examples will be written in Java. Familiarity with basic statistical concepts (e.g. histogram, correlation) will help the reader appreciate the more advanced data processing examples.

3.Pro Hadoop

这本书据说挺好。

商品描述

You've heard the hype about Hadoop: it runs petabyte-scale data mining tasks insanely fast, it runs gigantic tasks on clouds for absurdly cheap, it's been heavily committed to by tech giants like IBM, Yahoo , and the Apache Project, and it's completely open source (thus free). But what exactly is it, and more importantly, how do you even get a Hadoop cluster up and running? From Apress, the name you've come to trust for hands-on technical knowledge, Pro Hadoop brings you up to speed on Hadoop. You learn the ins and outs of MapReduce; how to structure a cluster, design, and implement the Hadoop file system; and how to build your first cloud-computing tasks using Hadoop. Learn how to let Hadoop take care of distributing and parallelizing your software--you just focus on the code, Hadoop takes care of the rest. Best of all, you'll learn from a tech professional who's been in the Hadoop scene since day one. Written from the perspective of a principal engineer with down-in-the-trenches knowledge of what to do wrong with Hadoop, you learn how to avoid the common, expensive first errors that everyone makes with creating their own Hadoop system or inheriting someone else's. Skip the novice stage and the expensive, hard-to-fix mistakes...go straight to seasoned pro on the hottest cloud-computing framework with Pro Hadoop. Your productivity will blow your managers away. What you'll learn Set up a stand-alone Hadoop cluster the smart way, laid out simply and step by step so you can get up and running quickly to build your next data center, collaborative, data-intensive Internet services application, Software as a Service (SaaS), and more. Optimize your Hadoop production tasks like an experienced pro. Work with time-proven, bulletproof standard patterns that have been tested and debugged in high-volume production. Understand just enough theoretical knowledge to know why something works in Hadoop, without getting bogged down in abstruse walls of theory. Get detailed explanations of not only how to do something with Hadoop, but also why, from a front-line coder with years in the Hadoop game. Turn someone else's expensive cluster-wide "wrong" into an orderly, productive "right" with professional-level debugging and testing. Who is this book for? IT professionals interested in investigating Hadoop and implementing it in their organizations, and existing Hadoop users who want to deepen their professional toolkits. About the Apress Pro Series The Apress Pro series books are practical, professional tutorials to keep you on and moving up the professional ladder. You have gotten the job, now you need to hone your skills in these tough competitive times. The Apress Pro series expands your skills and expertise in exactly the areas you need. Master the content of a Pro book, and you will always be able to get the job done in a professional development project. Written by experts in their field, Pro series books from Apress give you the hard-won solutions to problems you will face in your professional programming career.

三本书的下载地址:http://dl.dbank.com/c0h9pycwed

原创粉丝点击