Scala Compiler on GraalVM: slower on macOS than Linux #745

Open
lrytz opened this issue Oct 17, 2018 · 6 comments

lrytz commented Oct 17, 2018

The Scala compiler ships with its own JVM bytecode optimizer. While working on that feature I got curious about what its effect would be when running on GraalVM.

I benchmarked the Scala compiler built without and with its own optimizer enabled, and compared the performance of these two compilers on HotSpot (1.8.0_181), Graal CE RC7 and Graal EE RC7.

Results when running on Linux (i7-6700, 64 GB RAM, Debian 8.8). Times are in ms/op; the factor relative to the baseline (HotSpot, compiler built without the optimizer) is in parentheses:

VM         2.13.0-pre-84de40d-noopt   2.13.0-pre-8d17277-SNAPSHOT
HotSpot    742.931 ± 2 (1.0)          678.911 ± 2 (0.914)
Graal CE   695.825 ± 4 (0.937)        660.256 ± 2 (0.889)
Graal EE   578.633 ± 3 (0.779)        552.111 ± 2 (0.743)

(The Scala version numbers are not important; the first column is a compiler built without the optimizer, the second is built with the optimizer enabled.)

Graal is doing really well here: the unoptimized Scala compiler takes 22% less time per operation on Graal EE than on HotSpot. The Scala optimizer helps on HotSpot (about 9% less time) and somewhat less on GraalVM (about 5% on both CE and EE).
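
For anyone checking the numbers, the factors in parentheses are just each measurement divided by the HotSpot no-optimizer baseline. A minimal sketch of that arithmetic, where the helper names `factor` and `reduction_pct` are made up for illustration and are not part of the benchmark setup:

```shell
# Hypothetical helper: ratio of a measurement to the baseline, 3 decimals.
factor() {
  awk -v m="$1" -v b="$2" 'BEGIN { printf "%.3f\n", m / b }'
}

# Hypothetical helper: percentage by which time per operation shrinks
# relative to the baseline, 1 decimal.
reduction_pct() {
  awk -v m="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - m / b) * 100 }'
}

baseline=742.931                    # HotSpot, no optimizer, Linux (ms/op)
factor 578.633 "$baseline"          # Graal EE, no optimizer: prints 0.779
reduction_pct 578.633 "$baseline"   # prints 22.1
```

The same two calls with the macOS numbers (baseline 743.920) give the factors in the second table.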

I ran the same benchmarks on my MacBook Pro (i7-8850H, 16 GB RAM, macOS 10.14) and got very different results:

VM         2.13.0-pre-84de40d-noopt   2.13.0-pre-8d17277-SNAPSHOT
HotSpot    743.920 ± 7 (1.0)          694.129 ± 6 (0.933)
Graal CE   840.872 ± 9 (1.130)        803.120 ± 7 (1.080)
Graal EE   741.984 ± 7 (0.997)        711.624 ± 6 (0.957)

Graal CE is 13% slower than HotSpot, and Graal EE is only marginally faster (0.3%), when running the non-optimized Scala compiler. The optimized Scala compiler runs faster on HotSpot than on either Graal edition.

We were quite surprised by these results, so we ran the benchmarks on an iMac (i7-6700K) and on something-like-a-Mac-Mini (i7-7700, macOS 10.13.6). The results were the same as on my MacBook Pro.

At the suggestion of @axel22, I ran the benchmarks in a Linux VM (Debian 8.11, 8 CPUs, 4 GB RAM) on my MacBook Pro. The results matched those on the non-VM Linux machine: GraalVM does well there.

So it looks like GraalVM has suboptimal performance on macOS.


To reproduce the benchmarks

jabba use 1.8.191
for sv in 2.13.0-pre-84de40d-noopt 2.13.0-pre-8d17277-SNAPSHOT; do
  for vm in 1.8.191 1.8.172-graal-ce-rc7 1.8.172-graal-ee-rc7; do
    echo "----- $sv ----- $vm -----"
    sbt \
      'set resolvers in ThisBuild   = List("scala-integration" at "https://scala-ci.typesafe.com/artifactory/scala-integration/", "scala-release-temp" at "https://scala-ci.typesafe.com/artifactory/scala-release-temp/", "scala-pr" at "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/")' \
      "set scalaVersion in ThisBuild := \"$sv\"" \
      clean \
      "hot -jvm $(jabba which $vm)/bin/java -psource=scalap -f 1 -wi 20 -w 10 -i 10 -r 10"
  done
done

dougxc commented Oct 18, 2018

Hi @lrytz, thanks for the report. Would it be possible for you to provide some Java Flight Recorder profiles for the macOS and Linux executions on Graal CE? The first step is to see where time is being spent.


lrytz commented Oct 19, 2018

On Graal CE I get:

$> java -XX:FlightRecorderOptions=loglevel=info
Unrecognized VM option 'FlightRecorderOptions=loglevel=info'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

I'll do it with Graal EE.


lrytz commented Oct 19, 2018

I added -prof jmh.extras.JFR to sbt-jmh, so the full command was:

sbt \
'set resolvers in ThisBuild   = List("scala-integration" at "https://scala-ci.typesafe.com/artifactory/scala-integration/", "scala-release-temp" at "https://scala-ci.typesafe.com/artifactory/scala-release-temp/", "scala-pr" at "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/")' \
"set scalaVersion in ThisBuild := \"2.13.0-pre-84de40d-noopt\"" \
clean \
"hot -jvm $(which java) -psource=scalap -f 1 -wi 20 -w 10 -i 10 -r 10 -prof jmh.extras.JFR"

This makes sbt-jmh run jfr with settings=profile.

Here are the two profiles: https://drive.google.com/drive/folders/1obTnbb__irvpmOua21xcp2s_j9F8uZa7?usp=sharing. Let me know if that's useful.
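
For reference, a recording with the same settings can also be started by hand. The sketch below prints the flags that an Oracle JDK 8 based VM accepts for this; it assumes Graal EE RC7 behaves like Oracle JDK 8 here, and it is not necessarily what the jmh.extras.JFR profiler passes internally. The helper name `jfr_flags` and the output file names are hypothetical.

```shell
# Sketch: print JVM flags that start a JFR recording with settings=profile
# on an Oracle JDK 8 based VM. The default file name is a placeholder.
jfr_flags() {
  local out="${1:-scalac.jfr}"
  printf '%s\n' \
    "-XX:+UnlockCommercialFeatures" \
    "-XX:+FlightRecorder" \
    "-XX:StartFlightRecording=settings=profile,filename=${out}"
}

# Usage (assumes `java` on the PATH is a JDK 8 build that ships JFR):
#   java $(jfr_flags graal-ee.jfr) -version
jfr_flags graal-ee.jfr
```

A VM without JFR support would be expected to reject these options at startup, much like the "Unrecognized VM option" error above.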


dougxc commented Oct 19, 2018

Hmm, the profiles indicate roughly the same hot methods on both platforms, although there seem to be almost twice as many samples taken on Linux. @thomaswue @tkrodriguez @gilles-duboscq how do we further diagnose this?

thomaswue self-assigned this Oct 19, 2018

thomaswue commented

I will take a look and try to reproduce on my machine.

tkrodriguez commented

I would guess it's a recompilation loop from stale profiles. I'll see if LogCompilation output supports that guess.
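
One rough way to check for that from the outside is to count how often each method appears as a compilation task in the log written by -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:LogFile=compilation.log. The sketch below assumes the JDK 8 log layout, where each compilation is a `<task ... method='...'>` element, and that compilations done by Graal show up the same way; the helper name `recompile_counts` and the log file name are made up for illustration.

```shell
# Sketch: list methods by how many compilation tasks they have in a HotSpot
# LogCompilation file. Methods compiled many times are candidates for a
# recompilation loop. Assumes <task ... method='...'> elements (JDK 8 layout).
recompile_counts() {
  grep -o "<task [^>]*method='[^']*'" "$1" \
    | sed "s/.*method='\([^']*\)'.*/\1/" \
    | sort | uniq -c | sort -rn
}

# Usage (compilation.log is a hypothetical file name):
#   recompile_counts compilation.log | head -n 20
```

Comparing the top of this list between the macOS and Linux runs would show whether one platform keeps recompiling the same methods.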
