BigDataRiding: HBase Architecture

The HBase architecture consists of servers in a Master-Slave relationship. Typically, an HBase cluster has one Master node, called HMaster, and multiple Region Servers, called HRegionServers. Each Region Server contains multiple Regions (HRegions).

Just like in a Relational Database, data in HBase is stored in Tables and these Tables are stored in Regions. When a Table becomes too big, the Table is partitioned into multiple Regions. These Regions are assigned to Region Servers across the cluster. Each Region Server hosts roughly the same number of Regions.

The HMaster is responsible for:

Performing Administration
Managing and Monitoring the Cluster
Assigning Regions to the Region Servers
Controlling the Load Balancing and Failover
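
As a rough illustration of region assignment and failover, the sketch below hands each region to whichever server currently holds the fewest, and redistributes a dead server's regions the same way. This is a toy model, not the balancer HBase actually ships; the function names `assign_regions` and `fail_over` and the server names are hypothetical.

```python
def assign_regions(regions, servers):
    """Give each region to the server that currently hosts the fewest regions."""
    assignment = {server: [] for server in servers}
    for region in regions:
        target = min(servers, key=lambda s: len(assignment[s]))
        assignment[target].append(region)
    return assignment


def fail_over(assignment, dead_server):
    """Reassign the regions of a dead server to the surviving servers."""
    orphaned = assignment.pop(dead_server)
    survivors = list(assignment)
    for region in orphaned:
        target = min(survivors, key=lambda s: len(assignment[s]))
        assignment[target].append(region)
    return assignment


if __name__ == "__main__":
    regions = ["region-%d" % i for i in range(7)]
    cluster = assign_regions(regions, ["rs1", "rs2", "rs3"])
    print({server: len(hosted) for server, hosted in cluster.items()})
    cluster = fail_over(cluster, "rs1")
    print({server: len(hosted) for server, hosted in cluster.items()})
```

With seven regions and three servers, no server ends up more than one region ahead of another, which matches the "roughly the same number of Regions" property described earlier.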

The HRegionServer, on the other hand, performs the following work:

Hosting and managing Regions
Splitting the Regions automatically
Handling the read/write requests
Communicating with the Clients directly

Each Region Server contains a Write-Ahead Log (called HLog) and multiple Regions. Each Region in turn is made up of a MemStore and multiple StoreFiles (HFile). The data lives in these StoreFiles in the form of Column Families (explained below). The MemStore holds in-memory modifications to the Store (data).

The mapping of Regions to Region Servers is kept in a system table called .META. (the trailing dot is part of the name). When reading or writing data, clients look up the required Region information in the .META. table and then communicate directly with the appropriate Region Server. Each Region is identified by its start key (inclusive) and its end key (exclusive).

HBase Tables and Regions

A table is made up of any number of regions.

A region is specified by its startKey and endKey.

Empty table: (Table, NULL, NULL)
Two-region table: (Table, NULL, “com.ABC.www”) and (Table, “com.ABC.www”, NULL)
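
The key-range rule (start key inclusive, end key exclusive) can be sketched as a binary search over the sorted start keys. This is a simplified stand-in for what a client does with the region catalog, not the real client code; `REGIONS`, `find_region`, and the server names are hypothetical, and an empty byte string stands in for the NULL boundary.

```python
import bisect

# Hypothetical region list for one table, sorted by start key.
# Each entry is (start_key, end_key, server); b"" plays the role of NULL.
REGIONS = [
    (b"", b"com.ABC.www", "regionserver-1"),
    (b"com.ABC.www", b"", "regionserver-2"),
]


def find_region(regions, row_key):
    """Return the region whose [start_key, end_key) range contains row_key."""
    start_keys = [start for start, _end, _server in regions]
    # Pick the rightmost region whose start key is <= row_key.
    index = bisect.bisect_right(start_keys, row_key) - 1
    return regions[index]


if __name__ == "__main__":
    for key in (b"com.AAA.www", b"com.ABC.www", b"org.example.www"):
        print(key, "->", find_region(REGIONS, key)[2])
```

Because the start key is inclusive, the row "com.ABC.www" itself belongs to the second region, not the first.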

Each region may live on a different node and is made up of several HDFS files and blocks, each of which is replicated by Hadoop. HBase uses HDFS as its reliable storage layer; HDFS handles checksums, replication, and failover.

HBase Tables:

Tables are sorted by Row in lexicographical order
Table schema only defines its column families
Each family consists of any number of columns
Each column consists of any number of versions
Columns only exist when inserted, NULLs are free
Columns within a family are sorted and stored together
Everything except table names is byte[]
HBase table format: (Row, Family:Column, Timestamp) -> Value
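
To make the (Row, Family:Column, Timestamp) -> Value mapping concrete, here is a minimal in-memory sketch. It only models the data layout described in the list above; `TinyTable` and its methods are hypothetical names and are not part of any HBase API.

```python
class TinyTable:
    """Toy model of (row, family:qualifier, timestamp) -> value."""

    def __init__(self, families):
        # The schema only fixes the column families, not the columns.
        self.families = set(families)
        self.cells = {}  # (row, column) -> {timestamp: value}

    def put(self, row, column, timestamp, value):
        family = column.split(b":", 1)[0]
        if family not in self.families:
            raise KeyError("unknown column family: %r" % family)
        # A column comes into existence only when something is inserted.
        self.cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        versions = self.cells.get((row, column), {})
        return sorted(versions.items(), reverse=True)[:max_versions]

    def scan(self):
        """Yield cells sorted by row, then column, then newest timestamp first."""
        for row, column in sorted(self.cells):
            for timestamp, value in sorted(self.cells[(row, column)].items(), reverse=True):
                yield row, column, timestamp, value


if __name__ == "__main__":
    table = TinyTable([b"info"])
    table.put(b"row2", b"info:name", 1, b"Bob")
    table.put(b"row1", b"info:name", 1, b"Alice")
    table.put(b"row1", b"info:name", 2, b"Alicia")
    for cell in table.scan():
        print(cell)
```

A column that was never written simply has no entry, which is why NULLs cost nothing in this model.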

HBase consists of:

Java API, Gateway for REST, Thrift, Avro
Master manages cluster
RegionServers manage data
ZooKeeper is used as the “neural network” and coordinates the cluster

Data is stored in memory and flushed to disk at regular intervals or based on size

Small flushes are merged in the background to keep number of files small
Reads check the memory stores first and the disk-based files second
Deletes are handled with “tombstone” markers
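
The write, flush, read, and delete behaviour above can be sketched with a small in-memory model. This is a simplification under stated assumptions: the flush is triggered by entry count rather than bytes, and newer sources simply shadow older ones, whereas real HBase merges cells by timestamp. `TinyStore` and `TOMBSTONE` are hypothetical names.

```python
TOMBSTONE = object()  # marker meaning "this key was deleted"


class TinyStore:
    """Toy store: one in-memory buffer plus a list of immutable flushed files."""

    def __init__(self, flush_threshold=2):
        self.memstore = {}
        self.store_files = []  # oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def delete(self, key):
        # Flushed files are never modified; a delete is just another write.
        self.put(key, TOMBSTONE)

    def flush(self):
        if self.memstore:
            self.store_files.append(dict(sorted(self.memstore.items())))
            self.memstore = {}

    def get(self, key):
        # Memory first, then files from newest to oldest.
        for source in [self.memstore] + self.store_files[::-1]:
            if key in source:
                value = source[key]
                return None if value is TOMBSTONE else value
        return None


if __name__ == "__main__":
    store = TinyStore(flush_threshold=2)
    store.put("a", 1)
    store.put("b", 2)   # reaches the threshold and triggers a flush
    store.delete("a")   # tombstone sits in memory, old value stays on "disk"
    print(store.get("a"), store.get("b"), len(store.store_files))
```

The deleted value is still physically present in the flushed file; the tombstone only hides it from reads until a later rewrite of the files removes it.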

MemStores:

After data is written to the WAL, the RegionServer saves the KeyValues in the MemStore

Flush to disk is based on size, set by hbase.hregion.memstore.flush.size
Default size is 64MB in older releases (128MB in later ones)
Uses snapshot mechanism to write flush to disk while still serving from it and accepting new data at the same time
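
The snapshot mechanism can be pictured as swapping the active buffer for a fresh one: the old buffer stays readable while it is being written out, and new writes land in the new buffer. The sketch below is an assumption-laden illustration; `SnapshotMemStore` and its methods are hypothetical, sizes are approximated as key plus value length, and the 64MB default is the figure quoted in the text.

```python
class SnapshotMemStore:
    """Toy MemStore that keeps serving reads while a flush is in progress."""

    def __init__(self, flush_size_bytes=64 * 1024 * 1024):
        self.flush_size_bytes = flush_size_bytes
        self.active = {}      # receives all new writes
        self.active_bytes = 0
        self.snapshot = None  # buffer currently being flushed, if any

    def add(self, key, value):
        old = self.active.get(key)
        if old is not None:
            self.active_bytes -= len(key) + len(old)
        self.active[key] = value
        self.active_bytes += len(key) + len(value)

    def needs_flush(self):
        return self.active_bytes >= self.flush_size_bytes

    def take_snapshot(self):
        """Swap in an empty buffer and return the old one for writing to disk."""
        if self.snapshot is not None:
            raise RuntimeError("previous flush has not finished")
        self.snapshot, self.active = self.active, {}
        self.active_bytes = 0
        return self.snapshot

    def finish_flush(self):
        """Drop the snapshot once its contents are safely in a store file."""
        self.snapshot = None

    def get(self, key):
        if key in self.active:
            return self.active[key]
        if self.snapshot is not None:
            return self.snapshot.get(key)
        return None


if __name__ == "__main__":
    memstore = SnapshotMemStore(flush_size_bytes=10)
    memstore.add(b"k1", b"aaaa")
    memstore.add(b"k2", b"bbbb")
    if memstore.needs_flush():
        pending = memstore.take_snapshot()
        memstore.add(b"k3", b"c")            # accepted during the flush
        print(memstore.get(b"k1"), memstore.get(b"k3"), sorted(pending))
        memstore.finish_flush()
```

After `finish_flush`, the flushed keys are no longer found in memory; in a full store they would be read from the newly written file instead.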

Compactions:

Two types: Minor and Major Compactions

Minor Compactions

Combine last “few” flushes
Triggered by number of storage files

Major Compactions

Rewrite all storage files
Drop deleted data and those values exceeding TTL and/or number of versions
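
The difference between the two compaction types can be sketched as follows. A minor compaction merges the most recent files and keeps every cell, tombstones included; a major compaction rewrites everything and drops deleted, expired, and excess versions. This is a simplified model: cells are `(row, column, timestamp, value)` tuples, a `None` value is a tombstone that hides all versions at or before its timestamp, and the function names and parameters are hypothetical.

```python
def _cell_order(cell):
    row, column, timestamp, value = cell
    # Sort by row, column, newest first; tombstones before values on a tie.
    return (row, column, -timestamp, value is not None)


def minor_compact(store_files, count=3):
    """Merge the newest `count` files into one, keeping every cell."""
    if len(store_files) < count:
        return store_files
    older, recent = store_files[:-count], store_files[-count:]
    merged = sorted((cell for f in recent for cell in f), key=_cell_order)
    return older + [merged]


def major_compact(store_files, now, ttl, max_versions):
    """Rewrite all files into one, dropping deleted, expired, and excess cells."""
    cells = sorted((cell for f in store_files for cell in f), key=_cell_order)
    kept_versions = {}   # (row, column) -> number of versions kept so far
    deleted_up_to = {}   # (row, column) -> newest tombstone timestamp
    result = []
    for row, column, timestamp, value in cells:
        coord = (row, column)
        if value is None:
            deleted_up_to.setdefault(coord, timestamp)
            continue
        if coord in deleted_up_to and timestamp <= deleted_up_to[coord]:
            continue  # hidden by a tombstone
        if now - timestamp > ttl:
            continue  # older than the time-to-live
        if kept_versions.get(coord, 0) >= max_versions:
            continue  # too many versions already kept
        kept_versions[coord] = kept_versions.get(coord, 0) + 1
        result.append((row, column, timestamp, value))
    return [result]


if __name__ == "__main__":
    files = [
        [("r1", "cf:a", 1, "v1")],
        [("r1", "cf:a", 2, "v2"), ("r2", "cf:a", 2, "x")],
        [("r2", "cf:a", 3, None)],
        [("r1", "cf:a", 4, "v4")],
    ]
    print(len(minor_compact(files, count=2)))
    print(major_compact(files, now=10, ttl=100, max_versions=2))
```

Only the major compaction can safely discard tombstones, because only it sees every file that might still hold the deleted value.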

Key Cardinality:

The best performance is gained from using row keys

Time range bound reads can skip store files
So can Bloom Filters
Selecting column families reduces the amount of data to be scanned
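
Both skipping techniques rely on small pieces of metadata kept per store file. The sketch below assumes each file records its minimum and maximum timestamp plus a Bloom filter over its row keys; a file is opened only if it passes both checks. `TinyBloom`, `StoreFile`, and `files_to_read` are hypothetical names, and the hand-rolled filter is for illustration only.

```python
import hashlib


class TinyBloom:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for position in self._positions(key):
            self.bits |= 1 << position

    def might_contain(self, key):
        return all((self.bits >> position) & 1 for position in self._positions(key))


class StoreFile:
    """Toy store file holding (row, timestamp, value) cells plus metadata."""

    def __init__(self, name, cells):
        self.name = name
        self.cells = cells
        self.min_ts = min(ts for _row, ts, _value in cells)
        self.max_ts = max(ts for _row, ts, _value in cells)
        self.bloom = TinyBloom()
        for row, _ts, _value in cells:
            self.bloom.add(row)


def files_to_read(store_files, row, min_ts, max_ts):
    """Names of files that may hold `row` with a timestamp in [min_ts, max_ts)."""
    selected = []
    for store_file in store_files:
        if store_file.max_ts < min_ts or store_file.min_ts >= max_ts:
            continue  # time range does not overlap: skip without opening
        if not store_file.bloom.might_contain(row):
            continue  # row is definitely not in this file
        selected.append(store_file.name)
    return selected


if __name__ == "__main__":
    files = [
        StoreFile("old", [(b"row1", 10, b"a"), (b"row2", 20, b"b")]),
        StoreFile("new", [(b"row1", 110, b"c"), (b"row3", 120, b"d")]),
    ]
    print(files_to_read(files, b"row1", 100, 200))
```

A Bloom filter can occasionally say "maybe" for a row that is absent, so it only ever saves work; it never causes a file that holds the row to be skipped.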

Fold, Store, and Shift:

All values are stored with their full coordinates, including: Row Key, Column Family, Column Qualifier, and Timestamp

Folds columns into “row per column”
NULLs are cost free as nothing is stored
Versions are multiple “rows” in folded table
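
The folding step can be sketched as flattening one logical row into a list of cells, each carrying its full coordinates. This is an illustrative model only; `fold_row` and the nested-dictionary input shape are hypothetical, and a `None` value stands in for a NULL.

```python
def fold_row(row_key, logical_row):
    """Flatten {family: {qualifier: {timestamp: value}}} into one cell per version.

    Each cell is (row_key, family, qualifier, timestamp, value).
    NULL (None) values produce no cell at all.
    """
    cells = []
    for family, columns in logical_row.items():
        for qualifier, versions in columns.items():
            for timestamp, value in versions.items():
                if value is None:
                    continue  # nothing is stored for a NULL
                cells.append((row_key, family, qualifier, timestamp, value))
    # Sort by coordinates, newest version first within a column.
    cells.sort(key=lambda c: (c[0], c[1], c[2], -c[3]))
    return cells


if __name__ == "__main__":
    row = {
        "info": {
            "name": {1: "Alice", 2: "Alicia"},  # two versions -> two cells
            "email": {1: None},                 # NULL -> no cell
        },
    }
    for cell in fold_row("row1", row):
        print(cell)
```

Repeating the row key and column name in every cell is the price of this layout, which is one reason short row keys and family names are usually preferred.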

Tuesday, 31 December 2013