嵌套方法的成本

在Scala中,可以在其他方法中定义方法。 这将它们的使用范围限制在定义块的内部。 我使用它们来提高使用多个高阶函数的代码的可读性。 与匿名函数文字相反,这允许我在给他们有意义的名字之前通过他们。

例如:

class AggregatedPerson extends HashSet[PersonRecord] { def mostFrequentName: String = { type NameCount = (String, Int) def moreFirst(a: NameCount, b: NameCount) = a._2 > b._2 def countOccurrences(nameGroup: (String, List[PersonRecord])) = (nameGroup._1, nameGroup._2.size) iterator.toList.groupBy(_.fullName). map(countOccurrences).iterator.toList. sortWith(moreFirst).head._1 } } 

是否有运行成本,因为我应该知道嵌套的方法定义?

封闭的答案是否有所不同?

在编译期间,嵌套函数moveFirstcountOccurences被移出到与countOccurences相同的级别。 他们得到编译器综合名称: moveFirst$1countOccurences$1

另外,当你参考这些没有参数列表的方法之一时,它就被提升为一个函数。 所以map(countOccurences)和写map((a: (String, List[PersonRecord])) => countOccurences(a)) 。 这个匿名函数被编译为一个单独的类AggregatedPerson$$anonfun$mostFrequentName$2 ,它只不过是转发到countOccurences$

作为一个方面说明,将该方法提升为函数的过程称为Eta Expansion。 如果在需要函数types的上下文中忽略参数列表,或者使用_代替整个参数列表,或者代替每个参数( val f1 = countOccurences _ ; val f2 = countOccurences(_)

如果代码直接在闭包中,那么在栈中只需要less一个方法调用,而生成一个更less的合成方法。 在大多数情况下,性能影响可能为零。

我觉得这是一个非常有用的工具来构build代码,并考虑你的例子非常习惯斯卡拉。

另一个有用的工具是使用小块来初始化一个val:

 val a = { val temp1, temp2 = ... f(temp1, temp2) } 

您可以使用scalac -print来准确查看Scala代码是如何转换成为JVM准备好的。 下面是你程序的输出:

 [[syntax trees at end of cleanup]]// Scala source: nested-method.scala package <empty> { class AggregatedPerson extends scala.collection.mutable.HashSet with ScalaObject { def mostFrequentName(): java.lang.String = AggregatedPerson.this.iterator().toList().groupBy({ (new AggregatedPerson$$anonfun$mostFrequentName$1(AggregatedPerson.this): Function1) }).map({ { (new AggregatedPerson$$anonfun$mostFrequentName$2(AggregatedPerson.this): Function1) } }, collection.this.Map.canBuildFrom()).$asInstanceOf[scala.collection.MapLike]().iterator().toList().sortWith({ { (new AggregatedPerson$$anonfun$mostFrequentName$3(AggregatedPerson.this): Function2) } }).$asInstanceOf[scala.collection.IterableLike]().head().$asInstanceOf[Tuple2]()._1().$asInstanceOf[java.lang.String](); final def moreFirst$1(a: Tuple2, b: Tuple2): Boolean = scala.Int.unbox(a._2()).>(scala.Int.unbox(b._2())); final def countOccurrences$1(nameGroup: Tuple2): Tuple2 = new Tuple2(nameGroup._1(), scala.Int.box(nameGroup._2().$asInstanceOf[scala.collection.SeqLike]().size())); def this(): AggregatedPerson = { AggregatedPerson.super.this(); () } }; @SerialVersionUID(0) @serializable final <synthetic> class AggregatedPerson$$anonfun$mostFrequentName$1 extends scala.runtime.AbstractFunction1 { final def apply(x$1: PersonRecord): java.lang.String = x$1.fullName(); final <bridge> def apply(v1: java.lang.Object): java.lang.Object = AggregatedPerson$$anonfun$mostFrequentName$1.this.apply(v1.$asInstanceOf[PersonRecord]()); def this($outer: AggregatedPerson): AggregatedPerson$$anonfun$mostFrequentName$1 = { AggregatedPerson$$anonfun$mostFrequentName$1.super.this(); () } }; @SerialVersionUID(0) @serializable final <synthetic> class AggregatedPerson$$anonfun$mostFrequentName$2 extends scala.runtime.AbstractFunction1 { final def apply(nameGroup: Tuple2): Tuple2 = AggregatedPerson$$anonfun$mostFrequentName$2.this.$outer.countOccurrences$1(nameGroup); <synthetic> <paramaccessor> private[this] val $outer: AggregatedPerson = _; final <bridge> def apply(v1: java.lang.Object): java.lang.Object = AggregatedPerson$$anonfun$mostFrequentName$2.this.apply(v1.$asInstanceOf[Tuple2]()); def this($outer: AggregatedPerson): AggregatedPerson$$anonfun$mostFrequentName$2 = { if ($outer.eq(null)) throw new java.lang.NullPointerException() else AggregatedPerson$$anonfun$mostFrequentName$2.this.$outer = $outer; AggregatedPerson$$anonfun$mostFrequentName$2.super.this(); () } }; @SerialVersionUID(0) @serializable final <synthetic> class AggregatedPerson$$anonfun$mostFrequentName$3 extends scala.runtime.AbstractFunction2 { final def apply(a: Tuple2, b: Tuple2): Boolean = AggregatedPerson$$anonfun$mostFrequentName$3.this.$outer.moreFirst$1(a, b); <synthetic> <paramaccessor> private[this] val $outer: AggregatedPerson = _; final <bridge> def apply(v1: java.lang.Object, v2: java.lang.Object): java.lang.Object = scala.Boolean.box(AggregatedPerson$$anonfun$mostFrequentName$3.this.apply(v1.$asInstanceOf[Tuple2](), v2.$asInstanceOf[Tuple2]())); def this($outer: AggregatedPerson): AggregatedPerson$$anonfun$mostFrequentName$3 = { if ($outer.eq(null)) throw new java.lang.NullPointerException() else AggregatedPerson$$anonfun$mostFrequentName$3.this.$outer = $outer; AggregatedPerson$$anonfun$mostFrequentName$3.super.this(); () } } } 

运行成本很低。 你可以在这里观察(长码道歉):

 object NestBench { def countRaw() = { var sum = 0 var i = 0 while (i<1000) { sum += i i += 1 var j = 0 while (j<1000) { sum += j j += 1 var k = 0 while (k<1000) { sum += k k += 1 sum += 1 } } } sum } def countClosure() = { var sum = 0 var i = 0 def sumI { sum += i i += 1 var j = 0 def sumJ { sum += j j += 1 var k = 0 def sumK { def sumL { sum += 1 } sum += k k += 1 sumL } while (k<1000) sumK } while (j<1000) sumJ } while (i<1000) sumI sum } def countInner() = { var sum = 0 def whileI = { def whileJ = { def whileK = { def whileL() = 1 var ksum = 0 var k = 0 while (k<1000) { ksum += k; k += 1; ksum += whileL } ksum } var jsum = 0 var j = 0 while (j<1000) { jsum += j; j += 1 jsum += whileK } jsum } var isum = 0 var i = 0 while (i<1000) { isum += i; i += 1 isum += whileJ } isum } whileI } def countFunc() = { def summer(f: => Int)() = { var sum = 0 var i = 0 while (i<1000) { sum += i; i += 1 sum += f } sum } summer( summer( summer(1) ) )() } def nsPerIteration(f:() => Int): (Int,Double) = { val t0 = System.nanoTime val result = f() val t1 = System.nanoTime (result , (t1-t0)*1e-9) } def main(args: Array[String]) { for (i <- 1 to 5) { val fns = List(countRaw _, countClosure _, countInner _, countFunc _) val labels = List("raw","closure","inner","func") val results = (fns zip labels) foreach (fl => { val x = nsPerIteration( fl._1 ) printf("Method %8s produced %d; time/it = %.3f ns\n",fl._2,x._1,x._2) }) } } } 

有四种不同的方法来求和整数:

  • 原始while循环(“原始”)
  • 而循环内部方法是closures总和variables
  • while循环返回部分和的内部方法
  • 一个while-and-sum嵌套函数

我们在内部循环中以纳秒为单位在我的机器上看到结果:

 scala> NestBench.main(Array[String]()) Method raw produced -1511174132; time/it = 0.422 ns Method closure produced -1511174132; time/it = 2.376 ns Method inner produced -1511174132; time/it = 0.402 ns Method func produced -1511174132; time/it = 0.836 ns Method raw produced -1511174132; time/it = 0.418 ns Method closure produced -1511174132; time/it = 2.410 ns Method inner produced -1511174132; time/it = 0.399 ns Method func produced -1511174132; time/it = 0.813 ns Method raw produced -1511174132; time/it = 0.411 ns Method closure produced -1511174132; time/it = 2.372 ns Method inner produced -1511174132; time/it = 0.399 ns Method func produced -1511174132; time/it = 0.813 ns Method raw produced -1511174132; time/it = 0.411 ns Method closure produced -1511174132; time/it = 2.370 ns Method inner produced -1511174132; time/it = 0.399 ns Method func produced -1511174132; time/it = 0.815 ns Method raw produced -1511174132; time/it = 0.412 ns Method closure produced -1511174132; time/it = 2.357 ns Method inner produced -1511174132; time/it = 0.400 ns Method func produced -1511174132; time/it = 0.817 ns 

因此,底线是:在简单的情况下嵌套函数并不会伤害到你–JVM会发现这个调用可以被内联(因此rawinner给出相同的时间)。 如果采取更多function的方法,函数调用不能被完全忽略,但所花费的时间是非常小的(每个调用大约需要0.4 ns)。 如果你使用了大量的闭包,那么closures它们会给每次调用带来1ns的开销,至less在这种情况下,只写一个可变variables。

你可以修改上面的代码来find其他问题的答案,但最重要的是它是非常快速的,从“没有任何惩罚”直到“只担心在最紧密的内部循环,否则有最小的工作做”。

(PS为了比较,在我的机器上创build一个单一的小物体需要约4 ns。)

截至2014年1月

目前的基准是〜3岁,热点和编译器已经有了显着的发展。 我也在使用Google Caliper执行基准testing。

 import com.google.caliper.SimpleBenchmark class Benchmark extends SimpleBenchmark { def timeRaw(reps: Int) = { var i = 0 var result = 0L while (i < reps) { result += 0xc37e ^ (i * 0xd5f3) i = i + 1 } result } def normal(i: Int): Long = 0xc37e ^ (i * 0xd5f3) def timeNormal(reps: Int) = { var i = 0 var result = 0L while (i < reps) { result += normal(i) i = i + 1 } result } def timeInner(reps: Int) = { def inner(i: Int): Long = 0xc37e ^ (i * 0xd5f3) var i = 0 var result = 0L while (i < reps) { result += inner(i) i = i + 1 } result } def timeClosure(reps: Int) = { var i = 0 var result = 0L val closure = () => result += 0xc37e ^ (i * 0xd5f3) while (i < reps) { closure() i = i + 1 } result } def normal(i: Int, j: Int, k: Int, l: Int): Long = i ^ j ^ k ^ l def timeUnboxed(reps: Int) = { var i = 0 var result = 0L while (i < reps) { result += normal(i,i,i,i) i = i + 1 } result } val closure = (i: Int, j: Int, k: Int, l: Int) => (i ^ j ^ k ^ l).toLong def timeBoxed(reps: Int) = { var i = 0 var result = 0L while (i < reps) { closure(i,i,i,i) i = i + 1 } result } } 

结果

 benchmark ns linear runtime Normal 0.576 = Raw 0.576 = Inner 0.576 = Closure 0.532 = Unboxed 0.893 = Boxed 15.210 ============================== 

令人惊讶的是封闭testing比其他testing快了约4ns。 这似乎是热点而不是执行环境的特质,多次运行返回相同的趋势。

使用封闭执行装箱是一个巨大的性能打击,它需要约3.579ns执行一个拆箱和重新装箱,足以做很多原始的math操作。 在这个特定的位置上,在一个新的优化器上完成工作可能会变得更好。 在一般情况下,小拳击可以减轻拳击。

编辑:新的优化器在这里没有真正的帮助,它使Closure速度慢了0.1 ns, Boxed速度快了0.1 ns:

  benchmark ns linear runtime Raw 0.574 = Normal 0.576 = Inner 0.575 = Closure 0.645 = Unboxed 0.889 = Boxed 15.107 ============================== 

用magarciaEPFL / scala的2.11.0-20131209-220002-9587726041执行

版本

JVM

 java version "1.7.0_51" Java(TM) SE Runtime Environment (build 1.7.0_51-b13) Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode) 

Scalac

 Scala compiler version 2.10.3 -- Copyright 2002-2013, LAMP/EPFL