@a-roberts
A few API changes and lots of project file changes

The v2p0 folder contains the code that we'll run with Spark 2, so it holds the API changes. It's hard to tell what's new because these are entirely new files, but to summarise I have:

  • used foreachRDD instead of foreach for streaming
  • used awaitTerminationOrTimeout instead of awaitTermination
  • reduced the defaults in config/config.py to a tiny scale factor (I think people will want to build it and see that it works in a tiny environment before scaling up; the previous defaults included 20 GB of driver memory) and made it use your $SPARK_HOME
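As a sketch of the small-footprint defaults described above (the variable names here are illustrative, not copied from the actual config/config.py):

```python
import os

# Hedged sketch of the reduced defaults: look up the Spark installation
# via $SPARK_HOME instead of a hard-coded /root path, falling back to a
# local directory if the variable is unset.
SPARK_HOME = os.environ.get("SPARK_HOME", os.path.expanduser("~/spark"))

# Tiny defaults so a first build-and-run works on one small machine
# (the previous defaults included 20 GB of driver memory).
DRIVER_MEMORY = "1g"
SCALE_FACTOR = 0.001
```

The idea is that a first run should succeed on a laptop; users can scale these values back up once the build is verified.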

a-roberts added 16 commits June 20, 2016 14:30
This commit makes the default version of Spark "2.0.0-preview" and
consists of various configuration file changes and a couple of method changes.

We should remove the -preview from the project files once 2.0.0 is made
generally available (so we won't be relying on the preview builds).

* Several changes have been made including downloading Akka for streaming-tests

* Scala 2.11.8 is used

* config/config.py now looks for $SPARK_HOME instead of /root

* foreachRDD is used instead of foreach for a DStream

* awaitTerminationOrTimeout is used instead of awaitTermination for a StreamingContext

* json4s render call is removed owing to API changes
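The two streaming changes in the list above amount to a one-line rename plus a bounded wait; sketched in pseudocode (the timeout is in milliseconds in the Scala StreamingContext API):

```
# Spark 1.x style:
dstream.foreach(processBatch)
ssc.awaitTermination()

# Style used in this PR for Spark 2:
dstream.foreachRDD(processBatch)
ssc.awaitTerminationOrTimeout(timeoutMillis)
```

Using awaitTerminationOrTimeout means a benchmark run ends after a fixed duration instead of blocking until the context is stopped externally.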
@a-roberts a-roberts mentioned this pull request Aug 26, 2016
@a-roberts a-roberts changed the title [WIP] Spark 2.0.0 support Spark 2.0.0 support Aug 26, 2016
@a-roberts (Author)
There's a problem here: if we do `sbt package` from the mllib directory and then run the 25 mllib tests, there are four failures because certain test properties have been removed; thanks to Yves Leaute for pointing this out over email. I'll resolve this by updating the PR, and will also adjust the version setting in config.py to prevent artifact resolution problems.

@TiagoPerez
Hi Robert, great work with HiBench and Spark-Perf... Ever thought of doing something similar for SparkBench? https://github.com/SparkTC/spark-bench

# * Don't use PREP_MLLIB_TESTS = True; instead manually run `cd mllib-tests; sbt/sbt -Dspark.version=1.5.0-SNAPSHOT clean assembly` to build perf tests

- MLLIB_SPARK_VERSION = 2.0
+ MLLIB_SPARK_VERSION = 2.0.0
@harschware Oct 11, 2016
My environment is a little customized by the time I run this code (so line numbers don't match), but I am seeing this:

Loading configuration from <snip>/spark-perf/config/config.py
Traceback (most recent call last):
  File "./bin/../lib/sparkperf/main.py", line 40, in <module>
    config = imp.load_source("config", "", cf)
  File "", line 410
    MLLIB_SPARK_VERSION = 2.0.0
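The failure above is a plain Python syntax error: `2.0.0` is not a valid numeric literal, so config.py fails to parse as soon as it is loaded. A minimal demonstration (quoting the value is one way to make the file parse; whether spark-perf then expects a numeric value is a separate question, which is why the author says he'll adjust the version setting):

```python
# "2.0.0" unquoted is not a valid Python numeric literal, so assigning it
# makes the whole config file fail to compile.
broken = "MLLIB_SPARK_VERSION = 2.0.0"
fixed = 'MLLIB_SPARK_VERSION = "2.0.0"'

def parses(src):
    """Return True if src compiles as Python source."""
    try:
        compile(src, "<config>", "exec")
        return True
    except SyntaxError:
        return False

print(parses(broken))  # False
print(parses(fixed))   # True
```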

@zhang051 left a comment
I would like to run spark-perf with spark-2.0.0 support. Is there a beta version of spark-perf (other than updating individual files) that I can download?

Thanks,

Shuxia
