Dionysios Logothetis edited this page Mar 26, 2014 · 7 revisions

Welcome to the okapi wiki!

Slides: https://dl.dropboxusercontent.com/u/20078054/slides/hack.pdf

Building Okapi

First build and install Giraph, then build Okapi:

git clone https://github.com/apache/giraph.git

cd giraph; mvn install -DskipTests

git clone https://github.com/grafos-ml/okapi.git

cd okapi; mvn package -DskipTests

Accessing AWS

These are instructions for getting access to the AWS cluster. The cluster will be set up and ready for you to start running jobs.

  1. Obtain the ssh key

Make sure you chmod 600 the key file.

  2. Use it to ssh to the frontend of the cluster

ssh -i /path/to/the/key hadoop@54.72.18.118

To run a job:

  1. Upload your jar file to the frontend using scp

Keep in mind that this is a shared cluster and everybody uses the same account to access it, so make sure to upload to a distinct location.
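One way to keep uploads apart on the shared account is to give each person a directory of their own. The sketch below assumes the jar sits in Maven's default `target/` directory and uses a hypothetical `uploads/<your local username>` layout on the frontend; neither is prescribed by the cluster setup. It only prints the two commands so you can check them before running them by hand.

```shell
# Assumed values: adjust the key path and jar location to your setup.
KEY="/path/to/the/key"
FRONTEND="hadoop@54.72.18.118"
JAR="target/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar"

# Hypothetical layout: one directory per person, named after the local user.
REMOTE_DIR="uploads/$(id -un)"

# Build both commands, then print them for review.
MKDIR_CMD="ssh -i $KEY $FRONTEND mkdir -p $REMOTE_DIR"
SCP_CMD="scp -i $KEY $JAR $FRONTEND:$REMOTE_DIR/"
echo "$MKDIR_CMD"
echo "$SCP_CMD"
```

Run the printed `ssh ... mkdir -p` command first so the target directory exists, then the `scp` command.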

  2. Run the hadoop job

You should already have instructions on how to do this.

For example:

HADOOP_PATH=okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar hadoop jar okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -Dmapred.job.name=OkapiTrainModelTask -Dmapred.reduce.tasks=0 -libjars okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar -Dmapred.child.java.opts=-Xmx1g -Dgiraph.zkManagerDirectory=okapi/_bsp -Dgiraph.useSuperstepCounters=false ml.grafos.okapi.cf.ranking.BPRRankingComputation -eif ml.grafos.okapi.cf.CfLongIdFloatTextInputFormat -eip okapi/data/7f11e5f748ee0bd129c49ec7085fb62e -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op okapi/BPR_output -w 1 -ca giraph.numComputeThreads=1 -ca minItemId=1 -ca maxItemId=870
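The example command repeats the jar path three times, which makes it easy to mistype when you switch to your own upload. A minimal sketch that writes each path once and assembles the same invocation is shown here. The variable names `JAR`, `INPUT`, `OUTPUT` and `JOB_CMD` are illustrative, and the values are copied from the example; replace them with your own locations.

```shell
# Paths copied from the example; replace with your own upload and data locations.
JAR="okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar"
INPUT="okapi/data/7f11e5f748ee0bd129c49ec7085fb62e"
OUTPUT="okapi/BPR_output"

# Assemble the same invocation with every path written once.
# A backslash-newline inside double quotes is dropped, so this is one line.
JOB_CMD="HADOOP_PATH=$JAR hadoop jar $JAR org.apache.giraph.GiraphRunner \
-Dmapred.job.name=OkapiTrainModelTask -Dmapred.reduce.tasks=0 -libjars $JAR \
-Dmapred.child.java.opts=-Xmx1g -Dgiraph.zkManagerDirectory=okapi/_bsp \
-Dgiraph.useSuperstepCounters=false ml.grafos.okapi.cf.ranking.BPRRankingComputation \
-eif ml.grafos.okapi.cf.CfLongIdFloatTextInputFormat -eip $INPUT \
-vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op $OUTPUT \
-w 1 -ca giraph.numComputeThreads=1 -ca minItemId=1 -ca maxItemId=870"

# Print the command for review before running it on the frontend.
echo "$JOB_CMD"
```

Since the account is shared, pointing `OUTPUT` at a directory of your own avoids clashing with other people's results.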

Accessing the Hadoop web interface

If you want to access the Hadoop web interface from outside the AWS cluster, you need to follow these instructions:

http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-web-interfaces.html
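One of the options that guide covers is an SSH tunnel with dynamic port forwarding, which opens a local SOCKS proxy that your browser can use. The sketch below assumes the frontend address used for ssh above is also the node serving the web interface, and picks 8157 as an arbitrary unused local port; check both against the guide. It prints the command rather than running it.

```shell
# Assumed values: same key and frontend as for the ssh login.
KEY="/path/to/the/key"
FRONTEND="hadoop@54.72.18.118"
SOCKS_PORT=8157   # arbitrary unused local port (assumption)

# -N: run no remote command; -D: open a local SOCKS proxy on the given port.
TUNNEL_CMD="ssh -i $KEY -N -D $SOCKS_PORT $FRONTEND"
echo "$TUNNEL_CMD"
```

While the tunnel is open, the browser has to be configured to use the SOCKS proxy at `localhost` on that port; the guide explains the browser side.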
