Set up a standalone HBase local instance and connect to it with Python Happybase

Andrea Guidi
Towards Data Science
4 min read · Feb 9, 2020


Recently, at the start of my adventure as a Data Science consultant, I was involved in a project where the data came from the famous Apache HBase.

HBase is a distributed, non-relational database built on top of Hadoop, and it is really powerful at handling massive amounts of data.

In the HBase data model, rows are identified by a row key, which is more or less the equivalent of a relational primary key, and rows are stored sorted by row key.

Columns are grouped into so-called column families, which are both a physical and a logical grouping of columns. The columns of one family are stored separately from the columns of another family, so a query only has to read the data it needs.
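To make the data model more concrete, here is a minimal in-memory sketch in plain Python. It is only an illustration of the two ideas above (rows sorted by row key, one separate store per column family) and not how HBase is implemented; the class name ToyTable and the sample data are made up for this example.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory stand-in for an HBase table (illustration only)."""

    def __init__(self, families):
        # One independent store per column family, mirroring the idea that
        # each family's data is kept apart from the others.
        self._stores = {family: defaultdict(dict) for family in families}

    def put(self, row_key, column, value):
        # Columns are addressed as 'family:qualifier'.
        family, qualifier = column.split(":", 1)
        if family not in self._stores:
            raise KeyError(f"unknown column family: {family}")
        self._stores[family][row_key][qualifier] = value

    def scan(self, family):
        """Yield (row_key, columns) for one family, sorted by row key."""
        store = self._stores[family]
        # Only the requested family's store is read here.
        for row_key in sorted(store):
            yield row_key, dict(store[row_key])


table = ToyTable(["info", "stats"])
table.put("user2", "info:name", "Bob")
table.put("user1", "info:name", "Ada")
table.put("user1", "stats:visits", "3")

# Rows come back ordered by row key, whatever the insertion order was.
print(list(table.scan("info")))
```

Scanning the "info" family never touches the "stats" store, which is the intuition behind reading only the data you need.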

You can read more about HBase here.

I decided to install a local HBase instance because I am endlessly curious and I love experimenting.

Step 1: Setting up a Linux Virtual Machine

The first thing I did was to create a Linux virtual machine with Alpine Linux, running on VMware. Once the VM was running, I logged in as root and started the installer by typing:

localhost login: root
setup-alpine

I followed the installer's instructions and, finally, rebooted the system. HBase needs Java in order to work, so I installed the OpenJDK 8 runtime:

apk add openjdk8-jre

After that, I created a couple of folders inside /home, just for convenience:

cd /home
mkdir downloads
mkdir andrea
cd downloads

So now I am inside /home/downloads and I can download the HBase compressed folder, uncompress it and change directory:

wget https://www.apache.org/dyn/closer.lua/hbase/2.2.3/hbase-2.2.3-bin.tar.gz
tar xzvf hbase-2.2.3-bin.tar.gz
cd hbase-2.2.3

Inside of this folder, there are two important folders: bin and conf.

I went inside conf (cd conf) and then set the JAVA_HOME environment variable inside hbase-env.sh to /usr/lib/jvm/openjdk8-jre (the folder might differ on your system):

export JAVA_HOME=/usr/lib/jvm/openjdk8-jre

Also, I overwrote the content of the file hbase-site.xml with the configuration reported here in Example 1 for Standalone HBase.
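As a rough idea of what such a standalone configuration contains, the sketch below renders an hbase-site.xml from a small Python dictionary. The three property names are the ones I recall from the HBase quickstart for standalone mode; the paths under /home/andrea are assumptions for this example, and the helper build_hbase_site is hypothetical, so check the result against the official example before using it.

```python
from xml.sax.saxutils import escape


def build_hbase_site(properties):
    """Render a dict of HBase properties as hbase-site.xml text."""
    lines = ['<?xml version="1.0"?>', "<configuration>"]
    for name, value in properties.items():
        lines += [
            "  <property>",
            f"    <name>{escape(name)}</name>",
            f"    <value>{escape(value)}</value>",
            "  </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"


# Assumed values: the directories below are placeholders, not the author's.
standalone = {
    "hbase.rootdir": "file:///home/andrea/hbase",
    "hbase.zookeeper.property.dataDir": "/home/andrea/zookeeper",
    "hbase.unsafe.stream.capability.enforce": "false",
}

site_xml = build_hbase_site(standalone)
print(site_xml)
```

The printed text can be pasted into conf/hbase-site.xml, or written there with open("hbase-site.xml", "w").write(site_xml) when run from inside the conf folder.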

Then, I navigated up one folder, back to the HBase main folder, and changed to the bin directory (cd ../bin).

To check that HBase is working, start it and open the HBase shell:

./start-hbase.sh
./hbase shell

Et voilà!

To see the tables, type list at the HBase shell prompt.

Of course, there are no tables, but we can create one with the following convention:

create 'table_name','columnFamily1',...,'columnFamilyN'

In Shell:

And now let’s add two rows:

To retrieve the table, let’s run a Scan:
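For reference, in the HBase shell a cell is written with put 'table_name','row_key','columnFamily:qualifier','value' and a table is read back with scan 'table_name'. The same create, put and scan sequence can also be driven from Python through Happybase, which this article installs and connects to a few steps further on. The sketch below is one possible way to do it, not the exact commands used here: the helper name create_and_fill and the table, family and row values in the usage note are made up, and connection is assumed to be a happybase.Connection.

```python
def create_and_fill(connection, table_name, family, rows):
    """Create a table with one column family, add rows, and scan it back.

    `connection` is assumed to be a happybase.Connection.
    `rows` maps row keys to {qualifier: value} dictionaries of strings.
    """
    # Same idea as the shell's: create 'table_name','family'
    connection.create_table(table_name, {family: dict()})

    table = connection.table(table_name)
    for row_key, cells in rows.items():
        # Column names are 'family:qualifier'; keys and values go over as bytes.
        data = {
            f"{family}:{qualifier}".encode("utf-8"): value.encode("utf-8")
            for qualifier, value in cells.items()
        }
        # Same idea as one shell `put` per cell of this row.
        table.put(row_key.encode("utf-8"), data)

    # Same idea as the shell's: scan 'table_name'
    return list(table.scan())
```

With a live connection, a call such as create_and_fill(connection, "people", "info", {"row1": {"name": "Ada"}, "row2": {"name": "Bob"}}) would create the table, add two rows and return the scanned (row key, columns) pairs.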

Now, the table is there and we can connect to it using the Python package Happybase, which talks to HBase through its Thrift API.

First, stop the HBase instance (./stop-hbase.sh) and install Python 3 and the Happybase package:

apk add python3
pip install happybase

Then, start Thrift server:

Then start HBase again with ./start-hbase.sh, as before.
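One common way to start the Thrift server from the bin folder is ./hbase-daemon.sh start thrift, although the exact command used here may have been different. Before opening a Happybase connection it can help to confirm that the server is actually listening. The sketch below does that with a plain socket; it assumes the Thrift server runs on localhost with its default port 9090, and the helper name thrift_is_reachable is made up for this example.

```python
import socket


def thrift_is_reachable(host="localhost", port=9090, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # A successful TCP connect only shows that the port is open,
        # not that the service behind it is a healthy Thrift server.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If thrift_is_reachable() returns False, the Thrift server is most likely not running yet, or it is listening on a different port.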

Now we open an Interactive Python session and make a connection to HBase to see that everything is working fine using Happybase APIs:
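The session could look roughly like the sketch below. It assumes the Thrift server is reachable on localhost at port 9090 (Happybase's default) and uses only Connection, tables(), table() and scan() from the Happybase API; the helper names open_connection, list_tables and read_table are made up for this example.

```python
def to_text(value):
    # Happybase generally returns bytes under Python 3;
    # plain strings pass through unchanged.
    return value.decode("utf-8") if isinstance(value, bytes) else value


def open_connection(host="localhost", port=9090):
    """Open a connection to the HBase Thrift server."""
    import happybase  # installed earlier with: pip install happybase

    return happybase.Connection(host, port=port)


def list_tables(connection):
    """Return the names of the tables HBase knows about, as strings."""
    return [to_text(name) for name in connection.tables()]


def read_table(connection, table_name, limit=10):
    """Scan up to `limit` rows and return {row_key: {column: value}}."""
    table = connection.table(table_name)
    return {
        to_text(row_key): {to_text(col): to_text(val) for col, val in columns.items()}
        for row_key, columns in table.scan(limit=limit)
    }
```

In the interactive session, connection = open_connection() followed by list_tables(connection) should show the table created in the shell, read_table(connection, 'table_name') should return its rows, and connection.close() ends the session.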

Thanks for reading, and reach out to me for anything!


I love data, machine learning and Autoencoders. Control systems/electronic engineer.