Installing Phoenix – A step by step tutorial

Phoenix is an open source SQL skin for HBase. You use the standard JDBC APIs instead of the regular HBase client APIs to create tables, insert data, and query your HBase data.

Prerequisites –

1. Java JDK (This demo uses JDK version 1.7.0_67)

Make sure the JAVA_HOME system environment variable points to the JDK. Make sure the java executable’s directory is in the PATH environment variable, i.e., %JAVA_HOME%\bin.

2. Make sure you have installed Hbase on your machine, for that refer my post Hbase Installation in Pseudo-distributed mode.

Install Phoenix –

1. Download Phoenix-4.7.0-Hbase-1.1  and expand the installation tar.

tar -zxvf phoenix-4.7.0-Hbase-1.1-bin.tar.gz

2. Add the phoenix-[version]-server.jar to the classpath of HBase region server and master and remove any previous version. An easy way to do this is to copy it into the HBASE_INSTALL_DIR/lib directory.

3. Add the phoenix-[version]-client.jar to the Phoenix client.

4. Restart Hbase.

5. Run an example to test everything is working fine :

Open the Command Line – A terminal interface to execute SQL from the command line is now  bundled with Phoenix. To start it, execute the following from the bin directory:

$ sqlline.py localhost

And if Zookeeper is running externally then , in our case zookeeper is running on master node. Therefore run,

   $ sqlline.py <master-hostname>:2181

a. First, let’s create a us_population.sql file, containing a table definition:

CREATE TABLE IF NOT EXISTS us_population (
 state CHAR(2) NOT NULL,
 city VARCHAR NOT NULL,
 population BIGINT
 CONSTRAINT my_pk PRIMARY KEY (state, city));

b. Now let’s create a us_population.csv file containing some data to put in that table:

 NY,New York,8143197
 CA,Los Angeles,3844829
 IL,Chicago,2842518
 TX,Houston,2016582
 PA,Philadelphia,1463281
 AZ,Phoenix,1461575
 TX,San Antonio,1256509
 CA,San Diego,1255540
 TX,Dallas,1213825
 CA,San Jose,912332

c. And finally, let’s create a us_population_queries.sql file containing a query we’d like to run on that data.

SELECT state as "State",count(city) as "City Count",sum(population) as "Population Sum"
 FROM us_population
 GROUP BY state
 ORDER BY sum(population) DESC;

d. Loading Data : In addition, you can use the bin/psql.py to load CSV data or execute SQL     scripts. For example:

./psql.py <your_zookeeper_quorum> us_population.sql us_population.csv 
us_population_queries.sql

You have created a table in phoenix , inserted the data and run the query.

 

This is it 🙂

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s