Specializing for primitive types

Reposted from http://www.scala-notes.org/2011/04/specializing-for-primitive-types/

One interesting feature that was added to Scala in version 2.8 is specialization, using the @specialized annotation. First, a little background information.

Generics in Java and the JVM, and consequently also in Scala, are implemented by using type erasure. That means that if you have an instance of a generic class, for example List[String], then the compiler will throw away the information about the type argument, so that at runtime it will look like List[Object].
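As a small illustration of erasure, the sketch below compares the runtime classes of two lists with different element types. The object and method names (ErasureDemo, sameRuntimeClass) are made up for this example.

```scala
object ErasureDemo {
  // Both lists are instances of the same runtime class;
  // the element type is not recorded anywhere at runtime.
  def sameRuntimeClass: Boolean = {
    val strings: List[String] = List("a", "b")
    val numbers: List[Int] = List(1, 2)
    strings.getClass == numbers.getClass
  }

  def main(args: Array[String]): Unit =
    println(sameRuntimeClass) // prints "true"
}
```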

In Java, primitive types are treated very differently from reference types. One of the things that you cannot do in Java is use primitive types to fill in type parameters – for example, you cannot make a List<int>. If you want to create a list that holds integers, you’ll have to use the wrapper class Integer. A drawback of this approach is that you need to box each of your primitive ints into an Integer object that takes up a lot more memory. I’ve done some tests and found that on my JVM a double only takes up 8 bytes, but a Double object takes up 24 bytes. Also, the boxing and unboxing takes up processor time, which can become a serious bottleneck if you’re dealing with large collections of numbers.

In Scala, the distinction between primitive types and reference types is much smaller than in Java, and you can use types like Int directly as type arguments. But a List[Int] is still converted to a List[Object] at runtime because of type erasure. Since primitive types are not subclasses of Object on the JVM, Scala still boxes the Ints into objects of a wrapper class, which makes a List[Int] in Scala just as inefficient as a List<Integer> in Java.
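The boxing can be made visible with a minimal sketch like the following. The helper runtimeClassOf is a hypothetical name introduced only for this illustration; it takes a value through an unspecialized type parameter and reports the class of the object that actually arrives.

```scala
object BoxingDemo {
  // T is erased, so x is passed as a reference.
  // For an Int argument that means a boxed java.lang.Integer.
  def runtimeClassOf[T](x: T): Class[_] = x.asInstanceOf[AnyRef].getClass

  def main(args: Array[String]): Unit = {
    println(runtimeClassOf(42))              // prints "class java.lang.Integer"
    println(runtimeClassOf(List(1, 2).head)) // prints "class java.lang.Integer"
  }
}
```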

The @specialized annotation and compiler support are meant to get rid of this inefficiency. Iulian Dragos explains it very clearly in this video. If you create a generic class and you use the @specialized annotation, for example like this:

class Container[@specialized(Int) T](value: T) {
  def apply(): T = value
}

the compiler will actually generate two versions of the class: the normal, generic one, in which the type parameter is erased, and a special subclass that uses the primitive type Int, without the need to box or unbox the value. You can see this when you compile the code using the -print option of the compiler:

01 jesper@jesper-desktop:~/Projects/boxing$ scalac -print Container.scala
02 [[syntax trees at end of cleanup]]// Scala source: Container.scala
03 package <empty> {
04   class Container extends java.lang.Object with ScalaObject {
05     <paramaccessor> protected[this] val value: java.lang.Object = _;
06     def apply(): java.lang.Object = Container.this.value;
07     <specialized> def apply$mcI$sp(): Int = scala.Int.unbox(Container.this.apply());  // #1
08     def this(value: java.lang.Object): Container = {
09       Container.this.value = value;
10       Container.super.this();
11       ()
12     }
13   };
14   <specialized> class Container$mcI$sp extends Container {
15     <paramaccessor> <specialized> protected[this] val value$mcI$sp: Int = _;
16     override <specialized> def apply(): Int = Container$mcI$sp.this.apply$mcI$sp();
17     override <specialized> def apply$mcI$sp(): Int = Container$mcI$sp.this.value$mcI$sp;  // #2
18     override <bridge> <specialized> def apply(): java.lang.Object = scala.Int.box(Container$mcI$sp.this.apply());
19     <specialized> def this(value$mcI$sp: Int): Container$mcI$sp = {
20       Container$mcI$sp.this.value$mcI$sp = value$mcI$sp;
21       Container$mcI$sp.super.this(scala.Int.box(value$mcI$sp));
22       ()
23     }
24   }
25 }

In this output you see that there is a class Container, which is the regular generic version with erased type parameters, and a class named Container$mcI$sp, which is a subclass of the regular Container. That class is the version that’s specialized for Int. Notice that both classes have a method apply$mcI$sp(). In the regular Container, this method (line 7, marked #1) calls apply() and unboxes the Object that’s in the container. In the specialized subclass, apply() delegates to apply$mcI$sp(), which directly returns the Int without unboxing (line 17, marked #2).
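To see which of the two classes gets instantiated, a minimal sketch like the one below can help. It assumes a Scala 2 compiler (where @specialized takes effect) and that Container is defined at the top level. ContainerDemo is a name introduced for this example.

```scala
class Container[@specialized(Int) T](value: T) {
  def apply(): T = value
}

object ContainerDemo {
  def main(args: Array[String]): Unit = {
    val ints = new Container(42)        // candidate for the Int-specialized subclass
    val strings = new Container("text") // String is not specialized: generic class

    // Print the runtime class names to check which variant was chosen.
    println(ints.getClass.getName)
    println(strings.getClass.getName)

    println(ints() + 1) // prints "43"
  }
}
```

With specialization active, the first name printed would be expected to carry the $mcI$sp suffix, while the second stays plain Container.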

However, life is not always so simple. Here is an example in which specialization is not as effective as you might think at first sight.

object Boxing01 {
  // Generic clamp: works for any T with an Ordering, specialized for Double
  def clamp[@specialized(Double) T : Ordering](value: T, low: T, high: T): T = {
    import Ordered._  // brings orderingToOrdered into scope, enabling < and >
    if (value < low) low else if (value > high) high else value
  }

  def main(args: Array[String]): Unit = {
    val a = clamp(25.0, 13.0, 40.0)
    println(a)
  }
}

I wanted the clamp() method to be generic, so that I could use it on anything for which there is an Ordering available, but I wanted to avoid it having to box and unbox when using it with Double. At first sight you’ll probably think that using @specialized will do the job here. But let’s look at the output of compiling this code with the -print option. Here it is (with some parts omitted):

// The normal, type-erased version
def clamp(value: java.lang.Object, low: java.lang.Object, high: java.lang.Object, evidence$1: scala.math.Ordering): java.lang.Object = if (scala.package.Ordered().orderingToOrdered(value, evidence$1).<(low))
  low
else
  if (scala.package.Ordered().orderingToOrdered(value, evidence$1).>(high))
    high
  else
    value;

// The specialized version
<specialized> def clamp$mDc$sp(value: Double, low: Double, high: Double, evidence$1: scala.math.Ordering): Double = if (scala.package.Ordered().orderingToOrdered(scala.Double.box(value), evidence$1).<(scala.Double.box(low)))
  low
else
  if (scala.package.Ordered().orderingToOrdered(scala.Double.box(value), evidence$1).>(scala.Double.box(high)))
    high
  else
    value;

Again, the compiler generated two versions of the clamp() method: the normal one that works on java.lang.Objects, and a version named clamp$mDc$sp() that’s specialized for Double. Take a closer look at the specialized version. You’ll see that it still contains calls to scala.Double.box() for every comparison. Using the @specialized annotation didn’t help at all here!

The reason for this is that the method orderingToOrdered() (defined in object scala.math.Ordered) and the trait scala.math.Ordered are not specialized. So, to call orderingToOrdered() and the < and > methods in trait Ordered, the Double values must be boxed. Instead of getting rid of the boxing, all that we have achieved is that it happens somewhere else: inside clamp$mDc$sp() rather than at the call site.
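One possible way around this, sketched below, is to route the comparison through a type class that is itself specialized, so that the specialized clamp can call a specialized comparison method. This is an assumption-laden sketch and not something the standard library provides: the trait Cmp, its instance DoubleCmp, and the object Boxing02 are hypothetical names. Whether all boxing really disappears should be checked with scalac -print.

```scala
// Hypothetical specialized comparison type class (not part of the standard library).
trait Cmp[@specialized(Double) T] {
  def lt(a: T, b: T): Boolean
}

object Cmp {
  // Instance for Double; placed in the companion so it is found implicitly.
  implicit object DoubleCmp extends Cmp[Double] {
    def lt(a: Double, b: Double): Boolean = a < b
  }
}

object Boxing02 {
  // Same clamp logic as before, but comparing through the specialized Cmp
  // instead of the unspecialized Ordered/Ordering.
  def clamp[@specialized(Double) T](value: T, low: T, high: T)(implicit cmp: Cmp[T]): T =
    if (cmp.lt(value, low)) low
    else if (cmp.lt(high, value)) high
    else value

  def main(args: Array[String]): Unit =
    println(clamp(25.0, 13.0, 40.0)) // prints "25.0"
}
```

The trade-off is that every type used with clamp now needs its own Cmp instance, whereas the original version could reuse any existing Ordering.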

In Scala’s standard library (version 2.8.1), @specialized is used very sparingly: only a handful of traits and classes are specialized. It would be especially nice if the collection classes were specialized for the primitive types. A drawback of specialization is that it causes lots of extra classes and methods to be generated. If all collection classes were specialized for all primitive types, the Scala standard library would suddenly become about ten times as large. Hopefully, future versions of Scala will add more specialization to the library in the places where it really matters.