20.7.16

Running spark-perf from a container

I recently created a spark-perf container (pull request: https://github.com/databricks/spark-perf/pull/111/files) which runs spark-perf as a microservice and prints all of the results to standard out. A few modifications had to be made to spark-perf to do this:
  • defaulting HOME so that root privileges aren't needed (current spark-perf defaults to root's home)
  • updating the Python wrappers to grab the Spark master URL from the environment, to avoid ambiguity
  • creating a driver script which checks for the required environment variables
  • adding a SPARK_USER default for containers that run without a named user
  • running the sbt bootstrap as part of the container build, and updating sbt in spark-perf
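The environment handling above can be sketched as a small POSIX shell fragment. The variable names SPARK_MASTER_URL, HOME, and SPARK_USER come from the post; the function name and the `anonymous` default are illustrative assumptions, not the actual code in the PR:

```shell
#!/bin/sh
# Sketch of the environment checks the driver script performs in the container.
# check_env is an illustrative name, not the real function from the PR.
check_env() {
  # The Spark master URL has no sane default inside a container, so fail fast.
  if [ -z "$SPARK_MASTER_URL" ]; then
    echo "SPARK_MASTER_URL is required (e.g. spark://spark-master:7077)" >&2
    return 1
  fi
  # Default HOME so nothing needs to run as root.
  export HOME="${HOME:-/opt}"
  # Default SPARK_USER for containers that have no named user.
  export SPARK_USER="${SPARK_USER:-anonymous}"
}
```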
Here is how you create a Kubernetes Job to run the Spark performance tests.

The container works with a Job manifest like this:
apiVersion: extensions/v1beta1
kind: Job
metadata:
  name: sparkperfjobname
spec:
  selector:
    matchLabels:
      app: sparkperfjob
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: sparkperfjob
      labels:
        app: sparkperfjob
    spec:
      containers:
      - name: sparkperfjob
        image: jayunit100/spark-perf
        env:
        - name: "SPARK_MASTER_URL"
          value: "spark://spark-master-rwvm:7077"
        - name: "HOME"
          value: "/opt"
        command: ["/bin/sh","-c","SPARK_USER=jayunit100 /opt/driver-script.sh || find results/ -exec cat {} +"]
        imagePullPolicy: Always
      restartPolicy: Never
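The `command` line packs the whole run into one shell invocation: run the driver script, and if it exits non-zero, cat every file under results/ so any partial results still land in the Job's logs. The pattern in isolation, with `false` standing in for a failing /opt/driver-script.sh and a made-up results file:

```shell
# Demonstrate the run-or-dump pattern from the Job's command above.
mkdir -p results
echo "demo result file" > results/demo.txt  # stand-in content, not real output

# "false" stands in for a failing /opt/driver-script.sh; on failure, dump
# everything under results/ so it ends up on the container's stdout.
false || find results/ -type f -exec cat {} +
```

Submit the manifest with `kubectl create -f` and, once the Job finishes, read the results straight out of `kubectl logs` for the Job's pod.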

Disclaimer

This doesn't yet run perfectly for me, but it might run perfectly at least on some systems. In particular, I've seen a NoSuchMethodError occurring in the spark-perf test runners:

 
16/07/20 12:48:57 INFO TaskSchedulerImpl: Removed TaskSet 9.0, whose tasks have all completed, from pool
16/07/20 12:48:57 INFO DAGScheduler: ResultStage 9 (count at SchedulerThroughputTest.scala:60) finished in 7.361 s
16/07/20 12:48:57 INFO DAGScheduler: Job 9 finished: count at SchedulerThroughputTest.scala:60, took 7.383331 s
16/07/20 12:48:58 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 10.1.4.2:33010 in memory (size: 1212.0 B, free: 13.1 GB)
16/07/20 12:48:58 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 10.1.0.7:59244 in memory (size: 1212.0 B, free: 457.9 MB)
16/07/20 12:48:58 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 10.1.1.3:54012 in memory (size: 1212.0 B, free: 457.9 MB)
16/07/20 12:48:58 INFO BlockManagerInfo: Removed broadcast_9_piece0 on 10.1.4.7:43934 in memory (size: 1212.0 B, free: 457.9 MB)
Exception in thread "main" java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.render(Lorg/json4s/JsonAST$JValue;)Lorg/json4s/JsonAST$JValue;
        at spark.perf.TestRunner$.main(TestRunner.scala:47)
        at spark.perf.TestRunner.main(TestRunner.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:724)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
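A `NoSuchMethodError` on `JsonMethods.render` is typically a binary-compatibility clash: the spark-perf jar was compiled against one json4s version while the Spark runtime bundles another (the `render` signature changed between json4s releases). One way to start diagnosing it is to list which json4s jars the driver can actually see; the paths below are assumptions about a typical Spark install, not something this container guarantees:

```shell
# List every json4s jar visible in common Spark/ivy locations (assumed paths).
for d in /opt/spark*/lib /opt/spark*/jars "$HOME/.ivy2"; do
  [ -d "$d" ] && find "$d" -name 'json4s*.jar'
done
true  # exit 0 even if none of the candidate directories exist
```

If two different versions show up, rebuilding spark-perf against the json4s version that ships with the target Spark release is the usual fix.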
