Sunday, June 29, 2014

Map vs flatMap

In Scala, both map and flatMap are defined on lists. It would be easier to learn them if we consider that first maps every element of the list to another element of the list and collects the results, so that map can translate sentence "abc" into (1,2,3). The length of result is the same as the source since every element is mapped to exactly one.

Flat map, in contrast, not just concatenates images into one list. This may sound weird when you hear it for the first time. Why do we need this operation so often that we embedded it into our core? Why it becomes the first function that you will use next? It gets a perfect sense when you realize that the one-to-one correspondence between the elements is not necessary. It can happen that a is mapped to 1 whereas b is mapped to 11, like Huffman compressor does mapping characters to variable number of bits. It may even happen that some elements have no image at all. You produce empty lists in this case.

It is much more difficult in case of singletone monads, eg. custom probability distribution or Futures

image/svg+xml usdQuote flatMap { usd => chfQuote map {chf => usd+chf }}

I have copy-pasted the notorious example of combining Futures from the standard Scala manual. After looking under the hood, into implementation of the Future

  def map[S](f: T => S)(implicit executor: ExecutionContext): Future[S] = {
    val p = Promise[S]()
    onComplete { v => p complete (v map f) }
it can easily be understood that map creates a (green) future through a promise. The parent blue chfQuote future, in its onComplete routine, will translate its result using pink function chf => chf+usd and will communicate result of this translation to the (green) future through the promise. That will be result of the green future, which will complete immediately. That is my first question: why do you need to create a (map) future if it does not do anything but "concurrently" waits for a promise? In this particular case, green future is waiting until the blue terminates and terminates immediately after. What is the point? Where is the asynchronous action of the green future?

I see that blue and red futures are created in advance and they do some, allegedly lengthly job. Furthermore, they do it concurrently, as futures are supposed to operate. The latter also means that chfQuote can complete before usdQuote is mapped. The flatMap of usdQuote takes a gray function, which will be executed once usdQuote completes, and, likewise the map, produces the yellow future using the p.future. This raises another question: how is the flatMap different? It is much more complex

  def flatMap[S](f: T => Future[S])(implicit executor: ExecutionContext): Future[S] = {
    import impl.Promise.DefaultPromise
    val p = new DefaultPromise[S]()
    onComplete {
      case f: Failure[_] => p complete f.asInstanceOf[Failure[S]]
      case Success(v) => try f(v) match {
        // If possible, link DefaultPromises to avoid space leaks
        case dp: DefaultPromise[_] => dp.asInstanceOf[DefaultPromise[S]].linkRootOf(p)
        case fut => fut.onComplete(p.complete)(internalExecutor)
      } catch { case NonFatal(t) => p failure t }
The complexity is further boosted by the fact that we need to deal with nested futures, mapped to other futures.

The first thing to spot is that onComplete handler of the usdQuote (red) future starts by mapping result of this feature, v = usd, using function f, the (gray) argument of the flat map, to a (green) future and register completion handler for the it. That is, it tells that whenever the green future completes, the yellow completes with the same result!

Ugh. The (yellow) seems not used anywhere. It is a result of computation and, I believe, it is a useless future, since again, it does not compute anything (i.e. has not body) but its only purpose is to wait for a promise.

No comments: