Distribute Cloud Environment on Ubuntu 14.04 with Docker

Possible Problems

Hadoop

  • ClockOutOfSyncException

This happens when the clocks of the hosts in the Hadoop cluster are out of sync. Run ntpdate asia.pool.ntp.org on each host to synchronize its date and time.
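The synchronization step can be sketched as a small shell helper that runs the same ntpdate command on every host over SSH. This is only a sketch under stated assumptions: the function name sync_clocks and the RUNNER variable are hypothetical, and it presumes password-less SSH as a user allowed to set the clock (ntpdate normally needs root).

```shell
# Sketch: run ntpdate on each host passed as an argument.
# RUNNER is a hypothetical override for dry runs; it defaults to ssh.
sync_clocks() {
  runner="${RUNNER:-ssh}"
  for host in "$@"; do
    # Same command as in the text, executed remotely on $host.
    "$runner" "$host" ntpdate asia.pool.ntp.org
  done
}

# Usage (master and slave1 are placeholder host names):
#   sync_clocks master slave1              # really sync the hosts
#   RUNNER=echo sync_clocks master slave1  # only print what would run
```

Setting RUNNER=echo gives a dry run, which is a cheap way to check the host list before touching any clock.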

  • ConnectionRefused

Check that the IP address and hostname entries in /etc/hosts are correct.
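One way to make this check repeatable is a small helper that tests whether a hosts file maps a given IP address to a given hostname. The sketch below is an illustration only: the function name host_maps_to is hypothetical, and the IP address and hostname in the usage line are placeholders for the values of your own cluster.

```shell
# Sketch: succeed if FILE contains a line mapping IP to HOSTNAME.
# Arguments: FILE IP HOSTNAME
host_maps_to() {
  awk -v ip="$2" -v name="$3" '
    $1 ~ /^#/ { next }                 # skip comment lines
    $1 == ip {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break           # stop at an inline comment
        if ($i == name) found = 1      # hostname or alias matches
      }
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Usage (172.17.0.2 and master are placeholder values):
#   host_maps_to /etc/hosts 172.17.0.2 master || echo "master is not mapped to 172.17.0.2"
```

Running it on every host for every node of the cluster shows quickly which /etc/hosts file is stale.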

HBase

  • If the IP address of a Docker container changes, how do I reconnect HBase?

$ stop-hbase.sh
$ zkServer.sh stop   # Stop ZooKeeper
$ rm -rf $ZOOKEEPER_HOME/data/<version-?>   # Remove ZooKeeper's version directory in the data folder
$ zkServer.sh start  # Start ZooKeeper
$ start-hbase.sh

  • If manipulating HBase through the native Java API fails but no error log is shown on the console, how do I solve the problem?

Check that the Hadoop version used by the client is the same as the one on the server.
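The comparison can be sketched in shell by reading the version number from the first line of the output of hadoop version, which has the form "Hadoop 2.6.0". The helper names hadoop_version_of and versions_match are hypothetical, and the sketch assumes the hadoop command is installed on both machines. For a Java client the version that matters is the one of the Hadoop JARs on the application's classpath, which may differ from the installed command-line tools.

```shell
# Sketch: read `hadoop version` output on stdin and print the version
# number taken from its first line ("Hadoop 2.6.0" -> "2.6.0").
hadoop_version_of() {
  awk 'NR == 1 && $1 == "Hadoop" { print $2 }'
}

# Succeed only if both version strings are non-empty and identical.
versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (master is a placeholder host name):
#   client=$(hadoop version | hadoop_version_of)
#   server=$(ssh master hadoop version | hadoop_version_of)
#   versions_match "$client" "$server" || echo "Hadoop version mismatch: $client vs $server"
```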

Spark

  • The firewall issue

http://stackoverflow.com/questions/28666183/submitting-jobs-to-spark-ec2-cluster-remotely

  • Manifest issue

http://apache-spark-user-list.1001560.n3.nabble.com/Invalid-signature-file-digest-for-Manifest-main-attributes-with-spark-job-built-using-maven-td14299.html

http://stackoverflow.com/questions/28459333/how-to-build-an-uber-jar-fat-jar-using-sbt-within-intellij-idea

To be continued...