Distribute a Cloud Environment on Ubuntu 14.04 with Docker

Export and Import a Docker Image between Nodes in the Cluster

NOTE: My experiment environment runs on Windows Azure virtual machines.

The following figure shows how I plan to distribute my cloud cluster.

If you followed the previous sections, you now have a Docker container with the Hadoop environment. To build the Hadoop cluster, you need to duplicate that environment to the other endpoints you want to distribute it to.

Export a Docker container to an image.
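If the Hadoop environment currently lives only in a running container (not yet in an image), commit the container to an image first. This is a sketch; the container ID and repository:tag below are placeholders to replace with your own.

$ sudo docker ps
$ sudo docker commit <container_id> <image_repository:tag>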

Save the image as a tar file.

$ sudo docker save <image_repository:tag> > XXXX.tar
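The plain tar file can be several gigabytes for a Hadoop image. As an optional variant that is not part of the original steps, you can pipe docker save through gzip and decompress it on the receiving node before loading.

$ sudo docker save <image_repository:tag> | gzip > XXXX.tar.gz

## on the receiving node, decompress before docker load
$ gunzip XXXX.tar.gz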

After the export finishes, I use scp to transfer the image to the other nodes in the cluster.

$ scp -P [port] XXXX.tar [account]@[domain]:<where you want to store>
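For example, with hypothetical values (the port, account, host, and destination path are placeholders, not my actual setup):

$ scp -P 22 XXXX.tar ubuntu@master2.cloudapp.net:/home/ubuntu/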

Now, switch to master2 to show how to use the image we just transferred from master.

$ sudo docker load < XXXX.tar

After loading the tar file, we can check that the image has been imported into the local repository.

$ sudo docker images
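Roughly what you should see (the image ID, age, and size here are placeholders):

REPOSITORY              TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
<image_repository>      <tag>               d3c4b5a6e7f8        2 days ago          1.8 GB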

Now, we can start to use this image to distribute the cluster. In this example, I write a simple shell script to run the Docker image instead of using a Dockerfile, because I have not yet learned how to use a Dockerfile to build a Docker container.

Shell script

Because the --link option does not fit my situation, I use basic port mapping so the containers can connect with each other over the network.

$ vi bootstrap.sh
-> sudo docker run -i -t -p 2122:2122 -p 50020:50020 -p 50090:50090 -p 50070:50070 -p 50010:50010 -p 50075:50075 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8040:8040 -p 8042:8042 -p 49707:49707 -p 8088:8088 -p 8030:8030 -p 9000:9000 -p 9001:9001 -p 3888:3888 -p 2888:2888 -p 2181:2181 -p 16020:16020 -p 60010:60010 -p 60020:60020 -p 60000:60000 -p 9090:9090 -h <hostname> <repository:tag>
wq!
$ sh bootstrap.sh

In this example, I use master2 as the hostname and publish all the needed ports from the container to the endpoint machine.
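Since the same image is reused on every node, a parameterized version of the script saves editing it per host. This is a sketch under the assumption that the hostname is passed as the first argument; the port list is the same as above.

$ vi bootstrap.sh
-> #!/bin/sh
-> ## usage: sh bootstrap.sh <hostname>, e.g. sh bootstrap.sh slave1
-> PORTS="-p 2122:2122 -p 50020:50020 -p 50090:50090 -p 50070:50070 -p 50010:50010 -p 50075:50075 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8040:8040 -p 8042:8042 -p 49707:49707 -p 8088:8088 -p 8030:8030 -p 9000:9000 -p 9001:9001 -p 3888:3888 -p 2888:2888 -p 2181:2181 -p 16020:16020 -p 60010:60010 -p 60020:60020 -p 60000:60000 -p 9090:9090"
-> sudo docker run -i -t $PORTS -h $1 <repository:tag>
wq!
$ sh bootstrap.sh slave1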

Distribute with Docker Containers

Now, we can start to boot the Hadoop cluster. There are some steps to do before running start-dfs.sh.

  • Each container in the cluster needs to do the following steps
$ source /etc/profile
$ sudo vi /etc/hosts
-> put the IP address and hostname of every node in the cluster
## for example
127.0.0.5 master
10.0.0.2 master2
10.0.0.3 slave1
10.0.0.4 slave2
10.0.0.5 slave3
wq!

## restart ssh twice
$ sudo service ssh restart
$ sudo service ssh restart
  • The master container needs to do the following
## format hdfs namenode
$ hdfs namenode -format
$ <HADOOP_HOME>/sbin/start-dfs.sh
$ <HADOOP_HOME>/sbin/start-yarn.sh

## make the root directory of hbase
$ hadoop fs -mkdir /hbase

## start zookeeper
$ zkServer.sh start

## start hbase
$ start-hbase.sh

## TEST
$ jps

If it succeeds, you will see the following processes running on master.
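For illustration, this is the kind of jps output to expect, assuming HDFS, YARN, ZooKeeper, and HBase all started correctly on the master (the PIDs are placeholders, and SecondaryNameNode appears only if it is configured on this node):

$ jps
2130 NameNode
2310 SecondaryNameNode
2468 ResourceManager
2733 QuorumPeerMain
2891 HMaster
3120 Jps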