Scala rich wrapping performance

Posted by Mathias in [scala]

27 Aug 2010

Recently I have repeatedly found myself wondering whether relying on Scala's rich wrappers for convenience actually comes with a performance cost. Not that it matters in all but a few really time-critical edge cases. However, I remember reading that modern JVMs can completely optimize these wrapping constructs away (by means of so-called “escape analysis”?), so maybe there is no performance cost involved at all.

When writing something like "123".toInt you actually trigger an implicit conversion that creates a StringOps object, whose toInt method is then immediately called. The toInt method simply forwards the call to Integer.parseInt, which you could, of course, invoke directly yourself with Integer.parseInt("123"). The created StringOps object serves no purpose other than providing the invocation environment for its toInt method and can be garbage collected right after the call. I wondered whether either the Scala compiler or the executing JVM is able to recognize this “useless” object creation and remove it completely, so I quickly whipped up the following small benchmark:

import scala.testing.Benchmark

object Tester {
  // A: goes through the implicit String => StringOps conversion
  val a = new Benchmark {
    def run = for (i <- 1 to 10000000) { "123".toInt }
  }

  // B: calls the underlying Java method directly, no wrapper involved
  val b = new Benchmark {
    def run = for (i <- 1 to 10000000) { Integer.parseInt("123") }
  }

  // Middle element of the sorted run times (in ms)
  def median(list: List[Long]) = list.sorted.apply(list.size / 2)

  def main(args: Array[String]): Unit = {
    Console.println("Benchmarking A against B, Median of 11:")
    Console.println("with wrapper: " + median(a.runBenchmark(11)) + " ms")
    Console.println("w/o  wrapper: " + median(b.runBenchmark(11)) + " ms")
  }
}
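
Before the numbers, a side note on what version A really does. The sketch below spells out the implicit conversion by hand, assuming the compiler picks Predef.augmentString (the standard library's String-to-StringOps conversion). The object and method names are made up for illustration and are not part of the benchmark.

```scala
object WrapperDesugaring {
  // What you write: the compiler silently inserts a conversion before .toInt
  def viaImplicit(s: String): Int = s.toInt

  // Roughly what the compiler generates (assumption: augmentString is the
  // conversion chosen): wrap the String in a StringOps, then call toInt on it
  def viaAugmentString(s: String): Int = Predef.augmentString(s).toInt

  // The wrapper-free call used in version B of the benchmark
  def direct(s: String): Int = Integer.parseInt(s)
}
```

All three methods return the same value for the same input. The only difference is whether a short-lived wrapper object sits between the call site and Integer.parseInt.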

These are the benchmark results on my machine, running Java 6 on OS X:

Benchmarking A against B, Median of 11:
with wrapper: 343 ms
w/o  wrapper: 310 ms

So, leaving aside the obvious answer to the question of which of the two options is faster to write and easier on the eyes, Scala's rich wrapping actually does come at a performance cost: about 10% in this benchmark (343 ms vs. 310 ms). Compiling with -optimise doesn’t have any impact on the results, and neither does switching the order in which the two versions are run.
The result may be different when the code is run on other JVMs, so if someone wants to report other numbers I’d be happy to include them here.
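
If you want to reproduce the comparison without depending on scala.testing.Benchmark (in case that class is not available in your Scala version), a hand-rolled harness along the following lines should do. This is only a sketch: WrapperTiming, timeMs and medianOf are names invented here, and it uses a while loop with a result accumulator instead of the for loop above, so absolute numbers will not be directly comparable to mine.

```scala
object WrapperTiming {
  // Runs `body` once and returns the elapsed wall-clock time in milliseconds
  def timeMs(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1000000L
  }

  // Same median as in the benchmark above: middle element of the sorted samples
  def median(samples: List[Long]): Long = samples.sorted.apply(samples.size / 2)

  // Times `body` `runs` times and returns the median duration
  def medianOf(runs: Int)(body: => Unit): Long =
    median(List.fill(runs)(timeMs(body)))

  def main(args: Array[String]): Unit = {
    val n = 10000000
    var sink = 0L // accumulate results so the loop bodies have a visible effect

    val withWrapper = medianOf(11) {
      var i = 0
      while (i < n) { sink += "123".toInt; i += 1 }
    }
    val withoutWrapper = medianOf(11) {
      var i = 0
      while (i < n) { sink += Integer.parseInt("123"); i += 1 }
    }

    println("with wrapper: " + withWrapper + " ms")
    println("w/o  wrapper: " + withoutWrapper + " ms")
    println("(checksum: " + sink + ")")
  }
}
```

As with any micro-benchmark on the JVM, the early runs include JIT warm-up, which is one reason for taking the median rather than the mean.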

Cheers,
Mathias


Update (2010-08-30)

As some readers have pointed out, a JVM that supports “Escape Analysis” can indeed make the performance penalty for Scala's rich wrapping disappear. The reason I wasn’t seeing this on my machine is that the current version of the Java 6 JVM available on OS X 10.6 (1.6.0_20 as of this writing) does not (yet) support Escape Analysis.

Simply put, Escape Analysis is a technique used by the VM's JIT compiler to determine whether an object can be created on the stack instead of the heap. The JIT compiler analyses all references to an object and checks whether any of them can “escape” the current scope (e.g. by being returned from the method). If there is no way for an object reference to “escape” from the current method, the object creation generally qualifies for the “stack for heap” optimization.
Since Scala's “rich wrapper objects” are normally created implicitly, without any reference to them being kept, they are prime candidates for stack-based object creation (which essentially results in no “real” object being created).
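
To make the distinction concrete, here is a small sketch of an object that stays inside its method versus one that escapes. IntText is a hypothetical wrapper standing in for something like StringOps, and EscapeDemo is likewise made up. Whether the JIT actually removes a given allocation also depends on inlining and other decisions it makes at runtime, so treat the comments as the general idea rather than a guarantee.

```scala
// Hypothetical wrapper class, standing in for a rich wrapper like StringOps
final class IntText(val repr: String) {
  def parsed: Int = Integer.parseInt(repr)
}

object EscapeDemo {
  // The wrapper is created, used and dropped inside this method.
  // No reference leaves the method, so the allocation is a candidate
  // for the "stack for heap" optimization.
  def nonEscaping(s: String): Int = new IntText(s).parsed

  // The wrapper is returned to the caller: the reference escapes the method.
  def escapingByReturn(s: String): IntText = new IntText(s)

  // The wrapper is stored in a field that other code can read later:
  // the reference escapes as well.
  private var last: IntText = null
  def escapingByField(s: String): Int = {
    val w = new IntText(s)
    last = w
    w.parsed
  }
  def lastSeen: Option[IntText] = Option(last)
}
```

The expression "123".toInt has the same shape as nonEscaping: the wrapper only exists for the duration of a single method call.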

In July 2010 Sun released Update 21 of its Java 6 JRE, which among other things includes the new version 17 of its HotSpot VM. This latest HotSpot release comes with a number of new optimizations, one of them being “Escape Analysis”. Sun had actually already included this optimization in the VM shipped with Java 6 Update 14 but decided to pull it again due to some remaining issues. Now, with Update 21, Escape Analysis is back and it seems to be working nicely, especially for Scala code relying on implicit rich wrappers.
The optimization is only available with the HotSpot Server VM, so make sure you do not run the Client VM when testing this feature. (Under Windows the Server VM is not shipped with the JRE distribution; you will have to download the JDK to get the option of running the Server VM with the -server switch.)
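
If you are unsure which VM and flags your benchmark is actually running with, you can ask the JVM itself. The following sketch uses the standard java.vm.name system property and the RuntimeMXBean from java.lang.management. VmCheck is a name invented here. Note that it only detects a flag passed explicitly on the command line; it says nothing about what the VM enables by default.

```scala
import java.lang.management.ManagementFactory

object VmCheck {
  // Name of the running VM; for HotSpot this normally tells Client from Server
  def vmName: String = System.getProperty("java.vm.name")

  // Arguments passed to the JVM itself (not to the main method)
  def jvmArgs: List[String] = {
    val it = ManagementFactory.getRuntimeMXBean.getInputArguments.iterator
    var out = List.empty[String]
    while (it.hasNext) out = it.next() :: out
    out.reverse
  }

  def main(args: Array[String]): Unit = {
    println("VM: " + vmName)
    println("-XX:+DoEscapeAnalysis passed: " + jvmArgs.contains("-XX:+DoEscapeAnalysis"))
  }
}
```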

Running the above benchmark with the Windows Hotspot Server VM from Java 6 Update 21 with -XX:+DoEscapeAnalysis yields the following:

Benchmarking A against B, Median of 11:
with wrapper: 297 ms
w/o  wrapper: 297 ms

And without -XX:+DoEscapeAnalysis:

Benchmarking A against B, Median of 11:
with wrapper: 344 ms
w/o  wrapper: 297 ms

This shows two things. Firstly, the Escape Analysis optimization that comes with the latest HotSpot release can completely remove the overhead introduced by Scala's rich wrappers in this benchmark. Secondly, the Sun VM for Windows performs noticeably better than Apple's implementation. I generated the numbers above in a Windows 7 virtual machine running on my MacBook Pro, and even with the overhead introduced by VMware Fusion the benchmark runs faster than on the OS X Java 6 JVM on “bare metal”.

Summing up, the take-away message from this short analysis should be: there is really no reason not to rely on rich wrappers in your Scala code. The latest release of Sun's HotSpot VM executes such constructs very efficiently and shows once again that worrying about micro-optimizations is not worth it. The JVM technology available today (for free!) is excellent and will continue to drive performance increases, especially for languages with a higher level of abstraction like Scala.

Cheers,
Mathias
