HBase Recipes

Nallapati, Vidyasagar

APress

Computer Science & IT

Published: 1 October 2015

CHF 50,00

(incl. VAT)

Not yet available

Bibliographic data

ISBN/EAN: 9781484202272

Language: English

Extent: 350

Edition: 1st edition

Description

HBase is an open-source, non-relational, distributed database modeled after Google's BigTable and written in Java. It is built to provide a fault-tolerant way of storing large quantities of sparse data. HBase features compression, in-memory operation, and Bloom filters on a per-column basis, as outlined in the original BigTable paper. Tables in HBase can serve as the input and output for MapReduce jobs run in Hadoop, and can be accessed through the Java API as well as through REST, Avro or Thrift gateway APIs. Building a high-performance, scalable and dependable application around HBase demands a good understanding of, and practice with, its fundamentals and various moving parts.

HBase Recipes uses a problem-solution format to give you practical experience with columnar databases, distributed file systems, distributed HBase, ZooKeeper, and configuring, scaling, monitoring and tuning HBase, along with examples of building applications with this knowledge. The book presents pragmatic code 'recipes' for accomplishing specific tasks. The topics are arranged in general sections by subject matter, so you can quickly get up to speed and become productive on the topics and solutions that interest you.

For each topic, the problem is carefully defined, and a solution is provided in depth and explained thoroughly, so that you understand its core details and can apply them to your own problem or code whenever you come across a similar issue.

This book assumes a strong interest in learning about columnar databases, HBase in particular, and in architecting applications around them. A basic understanding of databases, programming languages and scalable systems will help you connect the dots. If you are using HBase or considering it, this book offers solutions to most of the problems you are likely to encounter.

About the author

Table of contents

Part 1: Working with HBase: The first things one should know

Chapter 1: Columnar Oriented Databases, Data Management and HBase
Column Oriented Databases; Data Management in column oriented databases; Applications where Columnar Databases are used; Data models in Column oriented Databases; Introduction of the existing column oriented databases; Basics of HBase; Hadoop and HBase; Installing HDFS and HBase; Running HDFS and HBase; Verifying Installation

Chapter 2: Fundamentals of HBase
Logging into HBase and checking data; Accessing HBase with command line; Storing data in HBase; Creating a table; Creating table schema; Modifying Data; Reading Data; Deleting Data; Versioning Data; Atomic Operations; ACID Semantics in HBase

Part 2: What is HBase, how it is designed and functions: Understanding HBase

Chapter 3: Design of HBase
Tables and Schema in HBase; Data types in HBase; Modeling Data for HBase; RowKey Design; Data storage and Internals of data manipulation in HBase; I/O considerations while designing applications; Column family design and things to be kept in mind; Block size/cache; Bloom Filters; Versioning, Implicit and Custom; Compression; Cache and Batch processes

Chapter 4: HBase Internal Architecture
HBase Internal Architecture, a bird's eye view; BigTable, and evolution of HBase; Design fundamentals in HBase; Data and write path in HBase; HFiles and KeyValue Format; Regions in HBase; WriteAhead Log mean; Replication of data; HBase and Yarn; Security in HBase

Part 3: Accessing HBase: Clients

Chapter 5: HBase Clients
Client access layer in HBase; Using HBase from Java; Using HBase from REST; Using HBase from Thrift; Using HBase with Avro; Accessing with any other clients possible; Native MapReduce and HBase; Data Manipulation with Pig; Data access with Hive; Programming Cascading with Hive

Part 4: Scaling HBase: Clustering, running, administration, monitoring and going distributed

Chapter 6: Distributed HBase and Clustering
Distributed HBase; ZooKeeper and its Role; HBase processes, RegionServer etc.; HBase on the cloud (Servers, AWS, S3 etc…); Apache Hadoop, Cloudera and Hortonworks Setup; Automating the Cluster setup; Setting Up Distributed Setup; Setting up HBase with Whirr; Puppet and Chef configuration for HBase Cluster Setup; Configurations for the cluster

Chapter 7: Tuning, Administration and Monitoring HBase
Tuning Data for performance in HBase; Tuning Processes in HBase; Load Balancing; Manual Performance Tests and Load Testing; Nodes addition, deletion in HBase; Logs checking and Debugging; Master, Region Server, JVM Metrics; Role of Ganglia and Nagios; Using Graphite for Monitor Charting; Backup of Data; Automating setup; Use of Vagrant for automation

Part 5: Advanced HBase: Extending HBase

Chapter 8: Extending HBase
HTable Utility; Filters in HBase; Comparison Filters; Dedicated Filters; Decorating Filters; FilterList; Writing custom Filters; Counters; Single and Multiple Counters; CoProcessors; CoProcessor class and loading; HTablePool?

Part 6: Development of Applications

Chapter 9: Appl