HBase is an open source, distributed, versioned, column-oriented store modeled after the system described in Google’s paper “Bigtable: A Distributed Storage System for Structured Data.” HBase cluster deployment is easy with pre-defined clusters and 1-Button Deploy™ technology on Datapipe.
Sample Production Cluster Topology
- Serving data to many users or applications – Relational databases are not inherently distributed. Therefore, as the number of users interacting with the database (i.e., reading and writing data) grows, the storage, memory, and CPU requirements can quickly grow beyond what a single machine can serve. HBase is distributed by design, which means it’s architected to leverage the storage, memory, and CPU resources of any number of servers (or nodes) in a “cluster” to scale the database horizontally as load and performance demands increase.
- Providing fast, random read-write access to users and applications – HDFS is a “write once read many” (WORM) file system that’s tuned for batch operations. The emphasis is on high throughput rather than low latency. HBase augments HDFS by providing record-based storage that lets users and applications perform fast, random reads and writes to data. Changes are first buffered in memory (and recorded in a write-ahead log for durability), then periodically flushed to HDFS as immutable files for persistence, which allows the Hadoop system to serve random reads and writes to users and applications across big tables in real time.
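
The write path described in the last point (buffer changes in memory, then push them down to write-once storage) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HBase's actual implementation or client API: the `ToyStore` class, its `flush_threshold` parameter, and the in-process lists standing in for files on HDFS are all hypothetical names chosen for illustration.

```python
from bisect import bisect_left


class ToyStore:
    """Toy model: an in-memory write buffer flushed to immutable sorted files."""

    def __init__(self, flush_threshold=3):
        # Hypothetical setting: flush once this many distinct keys are buffered.
        self.flush_threshold = flush_threshold
        self.memstore = {}       # recent writes, held in memory
        self.flushed_files = []  # immutable sorted (key, value) lists, oldest first

    def put(self, row_key, value):
        """Random write: lands in memory first, never rewrites a flushed file."""
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Persist the buffer as one new write-once, sorted file."""
        if self.memstore:
            self.flushed_files.append(sorted(self.memstore.items()))
            self.memstore = {}

    def get(self, row_key):
        """Random read: check memory, then flushed files from newest to oldest."""
        if row_key in self.memstore:
            return self.memstore[row_key]
        for flushed in reversed(self.flushed_files):
            # (row_key,) sorts just before any (row_key, value) pair,
            # so bisect_left lands on the matching entry if one exists.
            i = bisect_left(flushed, (row_key,))
            if i < len(flushed) and flushed[i][0] == row_key:
                return flushed[i][1]
        return None


store = ToyStore(flush_threshold=2)
store.put("row1", "a")
store.put("row2", "b")   # second key reaches the threshold and triggers a flush
store.put("row1", "c")   # newer value stays in memory and shadows the flushed one
print(store.get("row1"), store.get("row2"), store.get("row3"))  # c b None
```

The point of the design is that flushed files are never modified in place, which fits a write-once file system, while reads still see the latest value because newer data is consulted before older data.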