I thought I would share my experience trying out Scalaz Task and then reverting to Scala Future because of a perceived limitation. I now realize that was a mistake and would like to share my insight in order to help prevent others from falling into the same trap.
I first learnt about Scalaz Task about a year ago when I was working on a project that required some special error handling. What got my attention was the ability to decouple concurrent execution from error handling. Scalaz defines both a Task and a Future; Task simply adds error handling to the Future. This means you can add your own error handling on top of Future when required, which was exactly what we needed to do. Although I could not have put words on it at the time, we built a Monad transformer in order to combine Future and Validation into a single type.
To Cache or not to Cache
At first, this worked out great for us, until we ran into an issue where some of our Tasks were being run multiple times. Let's look at some sample code for the kinds of things we were trying to express:
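A minimal sketch of the kind of code in question follows. The small Task class in it is a hypothetical, simplified stand-in for Scalaz's Task (a lazy description that only runs when asked, with no concurrency or error handling), and databaseCall, first, second and the dbCalls counter are illustrative names rather than our production code.

```scala
object RunTwiceSketch {
  // Hypothetical stand-in, NOT scalaz.concurrent.Task: a lazy description of a
  // computation. Nothing happens until run() is called, and every run() starts over.
  final class Task[A](thunk: () => A) {
    def run(): A = thunk()
    def map[B](f: A => B): Task[B] = new Task(() => f(thunk()))
    def flatMap[B](f: A => Task[B]): Task[B] = new Task(() => f(thunk()).run())
  }
  object Task {
    def delay[A](a: => A): Task[A] = new Task(() => a)
  }

  // Counts how many times the pretend database is hit.
  var dbCalls = 0

  val databaseCall: Task[Int] = Task.delay { dbCalls += 1; 21 }

  // Two derived tasks that both depend on the same database call.
  val first: Task[Int]  = databaseCall.map(_ + 1)
  val second: Task[Int] = databaseCall.map(_ * 2)

  // The combined task contains databaseCall twice, so running it hits the database twice.
  val resultTask: Task[Int] =
    for {
      a <- first
      b <- second
    } yield a + b
}
```

Running RunTwiceSketch.resultTask.run() returns 64 and leaves dbCalls at 2.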
Now if you try this out with Task, you will notice that the database call is run twice. That is because a Task is an immutable description of a computation that does not cache its result: every place it is used runs it again. However, if you were to rewrite this example with Scala Future, the database access would only be performed once. Scala Futures follow a different programming model. A Scala Future is generally already “running” when you have a reference to one, its result is kept once it completes, and all references refer to the exact same object.
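For comparison, here is a sketch of the same shape written with the standard library's scala.concurrent.Future. The names (databaseCall, first, second, dbCalls) are again illustrative, and the global execution context and five second timeout are assumptions made to keep the example short.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import java.util.concurrent.atomic.AtomicInteger

object FutureOnceSketch {
  /** Returns (result, number of database calls). */
  def demo(): (Int, Int) = {
    val dbCalls = new AtomicInteger(0)

    // The body is scheduled as soon as the Future is created,
    // and the Future remembers its result once it completes.
    val databaseCall: Future[Int] = Future { dbCalls.incrementAndGet(); 21 }

    // Both derived Futures observe the same, single execution.
    val first: Future[Int]  = databaseCall.map(_ + 1)
    val second: Future[Int] = databaseCall.map(_ * 2)

    val resultFuture: Future[Int] =
      for {
        a <- first
        b <- second
      } yield a + b

    (Await.result(resultFuture, 5.seconds), dbCalls.get())
  }
}
```

FutureOnceSketch.demo() returns (64, 1): the same result, but the database is only hit once.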
Computations that are run concurrently are typically long running, because the time required to perform the computation must outweigh the overhead of running it on a separate thread. One could then argue that caching the results of these computations is quite a desirable property. This is why we decided, after some reflection, to revert to Scala Future at the time.
It turns out that the appropriate way to use Task to solve this problem is to construct our resultTask in a different way. Consider the following:
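One possible shape for this, sketched with the same hypothetical stand-in for Task (not the Scalaz class itself). Here first and second are assumed to be functions of the row that databaseCall returns, so databaseCall appears exactly once in resultTask.

```scala
object RunOnceSketch {
  // Hypothetical stand-in, NOT scalaz.concurrent.Task: a lazy description of a
  // computation that is re-run from scratch on every run().
  final class Task[A](thunk: () => A) {
    def run(): A = thunk()
    def map[B](f: A => B): Task[B] = new Task(() => f(thunk()))
    def flatMap[B](f: A => Task[B]): Task[B] = new Task(() => f(thunk()).run())
  }
  object Task {
    def delay[A](a: => A): Task[A] = new Task(() => a)
  }

  var dbCalls = 0

  val databaseCall: Task[Int] = Task.delay { dbCalls += 1; 21 }

  // The derived steps now take the database result as an argument
  // instead of each embedding their own copy of databaseCall.
  def first(row: Int): Task[Int]  = Task.delay(row + 1)
  def second(row: Int): Task[Int] = Task.delay(row * 2)

  // databaseCall is sequenced once; its result is shared by both steps.
  val resultTask: Task[Int] =
    for {
      row <- databaseCall
      a   <- first(row)
      b   <- second(row)
    } yield a + b
}
```

Running RunOnceSketch.resultTask.run() still returns 64, but dbCalls ends at 1.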
By constructing resultTask in such a way, databaseCall will only be executed once. I would even argue that, for this specific example, the code is prettier this way. That being said, a bigger example might involve many nested maps and flatMaps, which could become unwieldy. I believe that is probably the main reason why this solution did not occur to us at the time, since our composite Tasks were quite large.
Referential transparency to the rescue
You might be thinking: okay, so both Task and Future can solve the above problem. Before coming to that conclusion, let’s consider some code that in some ways presents the opposite problem:
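A sketch of such a program, once more using a hypothetical, simplified stand-in for Task rather than the Scalaz class. The readLine task, the prompt text and the program name are illustrative assumptions.

```scala
import scala.io.StdIn

object PromptTwiceSketch {
  // Hypothetical stand-in, NOT scalaz.concurrent.Task: a lazy description of a
  // computation that is re-run from scratch on every run().
  final class Task[A](thunk: () => A) {
    def run(): A = thunk()
    def map[B](f: A => B): Task[B] = new Task(() => f(thunk()))
    def flatMap[B](f: A => Task[B]): Task[B] = new Task(() => f(thunk()).run())
  }
  object Task {
    def delay[A](a: => A): Task[A] = new Task(() => a)
  }

  // A description of "prompt for and read one integer".
  // Nothing is printed or read until the task is run.
  val readLine: Task[Int] = Task.delay {
    print("Enter a number: ")
    StdIn.readLine().trim.toInt
  }

  // The same description is used twice, so the user is prompted twice.
  val program: Task[Int] =
    for {
      first  <- readLine
      second <- readLine
    } yield first + second

  def main(args: Array[String]): Unit =
    println("Sum: " + program.run())
}
```

Given the inputs 3 and 4, program.run() prompts twice and returns 7.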
In the above example, the program will ask the user for two numbers and print the result of adding those two numbers together. I think most people would expect the program to prompt the user twice, and that is indeed what happens when using Task. The situation is a little more complicated if we rewrite this example to use Scala Future. In that case, if readLine were a function that returns a new Future whenever we call it, then the program would behave the same way. However, if readLine is simply a reference to a Future, then the user will only be prompted once and second will refer to the same value as first.
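The difference between the two Future variants can be sketched as follows. To keep the example deterministic, the hypothetical Inputs class stands in for prompting the user: each call to next() counts as one prompt and returns the next number (1, then 2, and so on). The global execution context and the timeout are also assumptions.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import java.util.concurrent.atomic.AtomicInteger

object FutureValVsDefSketch {
  // Hypothetical stand-in for the user: every next() is one "prompt".
  final class Inputs {
    private val counter = new AtomicInteger(0)
    def next(): Int = counter.incrementAndGet()
    def prompts: Int = counter.get()
  }

  /** readLine is a def: each use creates a new Future. Returns (sum, prompts). */
  def withDef(): (Int, Int) = {
    val in = new Inputs
    def readLine: Future[Int] = Future(in.next())
    val sum = for {
      first  <- readLine
      second <- readLine
    } yield first + second
    (Await.result(sum, 5.seconds), in.prompts)
  }

  /** readLine is a val: both uses share one already-started Future. Returns (sum, prompts). */
  def withVal(): (Int, Int) = {
    val in = new Inputs
    val readLine: Future[Int] = Future(in.next())
    val sum = for {
      first  <- readLine
      second <- readLine
    } yield first + second
    (Await.result(sum, 5.seconds), in.prompts)
  }
}
```

withDef() returns (3, 2), two prompts yielding 1 and 2, while withVal() returns (2, 1), a single prompt whose value is used twice. The only change between the two is def versus val, which is exactly the kind of substitution that should not change a program's meaning.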
So the moral of the story is that Task is much more “pure” when it comes to functional programming. To put a more precise term on it, Task is referentially transparent: you can replace a call to a function that returns a Task by the Task itself and the program behaves the same. In order to change the semantics of the program, one must explicitly modify the structure of the program, not just replace a function call with its value.
This does not just apply to Task. Referential transparency is what makes a program purely functional, and I believe you should strive to make your code as referentially transparent as possible (unfortunately, Scala, unlike Haskell, does not enforce referential transparency, but to be honest, that can sometimes be a good thing).
An interesting consequence of writing referentially transparent code is that we tend to end up writing code that describes our program instead of code that performs it. You can then run your description in a different context than the one in which it was created, you can persist your program description, and finally you can write multiple interpreters for the same description of a computation.
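To make the idea of one description with several interpreters concrete, here is a small sketch. The Prog type and both interpreters are hypothetical and invented for this illustration; they are not part of Scalaz.

```scala
object DescriptionSketch {
  // A tiny description language: a program either reads an Int and continues,
  // or is finished with a value. Building a Prog performs no I/O.
  sealed trait Prog[A]
  final case class ReadInt[A](next: Int => Prog[A]) extends Prog[A]
  final case class Done[A](value: A) extends Prog[A]

  // A description of "read two numbers and add them".
  val addTwo: Prog[Int] =
    ReadInt[Int](a => ReadInt[Int](b => Done(a + b)))

  // Interpreter 1: feeds the program from a fixed list, handy for tests.
  def runWith[A](p: Prog[A], inputs: List[Int]): A = p match {
    case Done(v)       => v
    case ReadInt(next) => runWith(next(inputs.head), inputs.tail)
  }

  // Interpreter 2: feeds the same program from standard input.
  def runStdin[A](p: Prog[A]): A = p match {
    case Done(v)       => v
    case ReadInt(next) => runStdin(next(scala.io.StdIn.readLine().trim.toInt))
  }
}
```

DescriptionSketch.runWith(DescriptionSketch.addTwo, List(3, 4)) returns 7 without touching the console, while runStdin runs the very same value against real input.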
Hopefully I have managed to convince you that Task is a useful abstraction and one that should be favored over Scala Future. And if this wasn’t enough, I encourage you to read Runar’s blog post Easy performance wins with Scalaz for a perspective with more emphasis on performance.