Thursday, September 22, 2016

Await.result is faster than pure Futures

This gentlemen says that no waiting must be used in your code -- just pure futures. However, in Scala IRC, I asked how to deal with a strategy when your actions create more actions, you will end up with more actions. Some actions become obsolete -- how do you cancel unnecessary ones. I was advised to wait for my computation to finish. I therefore decided to assess the cost of waiting in fibonacci sequence, which is easily parallelizable, you can spawn 2 threads at every recursive loop, to the extent that it becomes unproductive. I therefore limited the paralellization to the very highest levels and I can either wait until the two (parallel) child computations to finish before returning the result to the caller or return our parent a future so that it finishes itself when children are ready. The first one awaits for the children in the loop whereas the second collects all computations into one future and waits only once for this future to complete. It seems that waiting in the loop is faster. With the following definitions

import scala.concurrent._ ; import scala.concurrent.duration._; import ExecutionContext.Implicits.global
def fib(n: Int): Int = if (n < 2) n else fib(n-1)+fib(n-2)
def timeit[R](code: => R) = {val t1 = System.currentTimeMillis; val r = code; (r, (System.currentTimeMillis - t1) millis)}

println("running on " + Runtime.getRuntime().availableProcessors
   + " processors") // gives 8

I tried

val grainSize = 42 ; val fibArg = 45 ; 

def awaitFib = {
def par(n: Int): Int = if (n < grainSize) fib(n) else {
val fl = List(1,2) map (i => Future{par(n-i)})
Await.result(Future.sequence(fl), Duration.Inf) sum
}
par(fibArg)
}

vs

def futureFib = {
  def par(n: Int): Future[Int] = 
    if (n < grainSize) future{fib(n)} else {
      val flist = List(1,2).map(i => par(n-i))
      Future.sequence(flist) map (_.sum)
    }
  Await.result(par(fibArg), Duration.Inf)
}


Both compute fib(45) where bottom 42 levels are computed sequentially and upper levels 45 downto 43 start a parallel computation. Both end up in around 6-7 seconds but first seems to show lower time (needs to be proved with statisitcal test):

(1 to 10) foreach (_ => println(List(awaitFib _, futureFib _)
  .map (fib => timeit(fib())._2)))

gives 


List(4944 milliseconds, 5879 milliseconds)
List(6465 milliseconds, 6551 milliseconds)
List(4979 milliseconds, 5427 milliseconds)
List(4701 milliseconds, 4630 milliseconds)
List(4851 milliseconds, 6820 milliseconds)
List(5160 milliseconds, 6433 milliseconds)
List(4844 milliseconds, 4565 milliseconds)
List(5483 milliseconds, 6665 milliseconds)
List(5430 milliseconds, 6408 milliseconds)
List(4737 milliseconds, 4457 milliseconds)


With my t-interval utility it was established that average distance of 624.1 lies within 97% confidence interval (5.453 to 1243), which excludes 0. That is, difference is significant.



No comments: