Home
Welcome to the okapi wiki!
Slides: https://dl.dropboxusercontent.com/u/20078054/slides/hack.pdf
First build and install Giraph, then build Okapi:
git clone https://github.com/apache/giraph.git
cd giraph; mvn install -DskipTests
git clone https://github.com/grafos-ml/okapi.git
cd okapi; mvn package -DskipTests
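The job command later on this page refers to a jar whose name ends in `-jar-with-dependencies.jar`. As a minimal sketch, assuming the build writes that jar to Maven's default `target` directory (an assumption, not something this page states), the hypothetical helper `find_okapi_jar` below prints its path:

```shell
# Print the first *-jar-with-dependencies.jar found in a build directory.
# The default directory "target" is an assumption (Maven's default output folder).
find_okapi_jar() {
    build_dir="${1:-target}"
    for candidate in "$build_dir"/*-jar-with-dependencies.jar; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no *-jar-with-dependencies.jar in $build_dir" >&2
    return 1
}
```

From inside the okapi directory, `JAR=$(find_okapi_jar target)` would store the path for later use.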
These instructions explain how to get access to the AWS cluster. The cluster will be set up and ready for you to start running jobs.
- Obtain the ssh key
Make sure you chmod 600 the key file.
- Use it to ssh to the frontend of the cluster
ssh -i /path/to/the/key hadoop@54.72.18.118
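The permission step matters because OpenSSH refuses to use a private key file that other users can read. A minimal sketch of the key preparation follows; the helper name `prepare_key` is hypothetical, and the key path you pass to it is whatever location you saved the key to:

```shell
# Restrict the key file to owner read/write (mode 600).
# Fails with a message if the file does not exist.
prepare_key() {
    key_file="$1"
    if [ ! -f "$key_file" ]; then
        echo "key file not found: $key_file" >&2
        return 1
    fi
    chmod 600 "$key_file"
}
```

After `prepare_key /path/to/the/key` succeeds, the ssh command above should accept the key.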
To run a job:
- Upload your jar file to the frontend using scp
Keep in mind that this is a shared cluster and everybody uses the same account to access it, so make sure to upload to a distinct location.
- Run the hadoop job
You should already have instructions on how to do this.
For example:
HADOOP_PATH=okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar \
hadoop jar okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dmapred.job.name=OkapiTrainModelTask \
  -Dmapred.reduce.tasks=0 \
  -libjars okapi/jar/ee42571209dad477aa913d2a8d428205/okapi-0.3.2-SNAPSHOT-jar-with-dependencies.jar \
  -Dmapred.child.java.opts=-Xmx1g \
  -Dgiraph.zkManagerDirectory=okapi/_bsp \
  -Dgiraph.useSuperstepCounters=false \
  ml.grafos.okapi.cf.ranking.BPRRankingComputation \
  -eif ml.grafos.okapi.cf.CfLongIdFloatTextInputFormat \
  -eip okapi/data/7f11e5f748ee0bd129c49ec7085fb62e \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op okapi/BPR_output \
  -w 1 \
  -ca giraph.numComputeThreads=1 \
  -ca minItemId=1 \
  -ca maxItemId=870
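Because everyone shares the same account, the upload step above benefits from a directory name that no one else will pick. The sketch below is one possible way to do that, not a convention this cluster prescribes: the `uploads/` prefix, the naming scheme, and both helper names are assumptions. It only prints the commands so you can review them before running anything.

```shell
# Build a per-user, per-run directory name from a user name and a timestamp.
# The "uploads/" prefix and the naming scheme are illustrative assumptions.
unique_upload_dir() {
    user_name="${1:-$(id -un)}"
    printf 'uploads/%s-%s\n' "$user_name" "$(date +%Y%m%d-%H%M%S)"
}

# Print (do not run) the commands that create the remote directory
# and copy the jar into it on the cluster frontend.
print_upload_commands() {
    key_file="$1"
    jar_file="$2"
    remote_dir="$3"
    host="hadoop@54.72.18.118"
    printf 'ssh -i %s %s mkdir -p %s\n' "$key_file" "$host" "$remote_dir"
    printf 'scp -i %s %s %s:%s/\n' "$key_file" "$jar_file" "$host" "$remote_dir"
}
```

For example, `print_upload_commands /path/to/the/key my.jar "$(unique_upload_dir)"` prints an `ssh ... mkdir -p` line followed by an `scp` line; the jar path you then pass to `hadoop jar` and `-libjars` would be the one inside that directory.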
If you want to access the Hadoop web interface from outside the AWS cluster, you need to follow these instructions:
http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-web-interfaces.html
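One common way to reach such web interfaces is an SSH tunnel with dynamic port forwarding (a SOCKS proxy). The following is a rough sketch under that assumption: the helper name `open_socks_tunnel`, the `SSH_BIN` override, and the local port 8157 are illustrative choices, and the linked guide remains the authoritative reference.

```shell
# Open a SOCKS proxy on a local port through the cluster frontend.
# -N: do not run a remote command; -D: dynamic (SOCKS) port forwarding.
# SSH_BIN is a hypothetical override, useful for dry runs; it defaults to ssh.
open_socks_tunnel() {
    key_file="$1"
    local_port="${2:-8157}"
    "${SSH_BIN:-ssh}" -i "$key_file" -N -D "$local_port" hadoop@54.72.18.118
}
```

While `open_socks_tunnel /path/to/the/key` is running, a browser configured to use the SOCKS proxy at localhost:8157 should be able to load the cluster's internal web pages.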