
Wednesday, April 6, 2011

Scala IO versus Guava: The Basics

A friend of mine once said that everything in life was about search and sort. Thinking about it for a while, it seems he's right. Almost. The rest is about IO.

IO in Java

The question is how you do IO. Long, long ago, probably before Java 1.2, Java's IO classes were sketchy, to say the least. Later versions solved some of that (introducing Readers and Writers), and eventually, with Java 1.4, we got Java NIO. If all goes well, we will have the new NIO soon.

IO in external libraries

Nevertheless, in many cases, people still rely on an external library to make their lives a little easier. Commons IO has been a popular choice for some time, and at some point, Guava also added some IO abstractions to its libraries.

IO in Scala

It makes you wonder about Scala's IO classes. At first sight, it doesn't look too good. The 'scala.io' package has a Source class that eases reading files and does some automatic resource management. That's good. But then it turns out the abstraction returned is an Iterator, and you don't want an Iterator traversing the contents of your file: if it bails out halfway, you are left with an open file handle, leaving your file open for the rest of the life of your VM instance. In fact, if you search StackOverflow, you will quickly find many complaints about scala.io being broken, or about scala.io still being broken.
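A common workaround is to wrap Source in a small "loan" helper of your own that guarantees the file gets closed, even if processing bails out halfway. A minimal sketch (withSource is my own name, not a standard library method):

```scala
import scala.io.Source

// A minimal "loan pattern" sketch: the helper owns the Source's lifecycle,
// so the caller can never leak the underlying file handle.
def withSource[A](path: String)(f: Source => A): A = {
  val source = Source.fromFile(path)
  try f(source) finally source.close()
}
```

The caller passes a function operating on the Source, and the handle is released in the finally block no matter how that function exits.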

Scala's New IO

But there might be hope out there. There is a Scala library that seems to address some of the concerns normally addressed by the libraries I mentioned, including decent support for automatic resource management. The name of the library: scala-io. I know. It might be a good idea to change the name.

What does it give you?

Scala IO, first of all, is built on top of scala-arm, the library providing the foundation for automatic resource management. On top of that, it gives you quite a bit of goodness for reading and writing bytes and text. In this post, I will go over some of its features, comparing it to how the same thing is done in Guava:

Copying an InputStream into a Byte Array

This is how it's done in Guava:

InputStream in = ...;
byte[] buffer = ByteStreams.toByteArray(in);

And this is the same thing, done in Scala IO:

val in: InputStream = ...
Resource.fromInputStream(in).byteArray

Similar, but there is one big difference: in the first case, the stream is not closed; in the second case, it is.

InputSuppliers

Guava has an abstraction that allows you to pass around an object providing access to an InputStream. The InputStream itself is not opened yet; it only gets opened once you ask the object to give you the input. The good thing about it is that the code that opens the stream can also be responsible for closing it, without having to know how the stream got opened:

public interface InputSupplier<T> {
    T getInput() throws IOException;
}

In a way, a Scala IO Resource is an InputSupplier and/or an OutputSupplier. However, there is no need to implement an interface to defer the construction of the actual underlying object providing or accepting bytes. Instead, you just pass in a block of code that will get evaluated right before you are about to read or write your bytes, leveraging Scala's by-name parameters.

So you could do something like this:

Resource.fromInputStream(new FileInputStream(...))

...without the file already getting opened. As a consequence you can access the Resource multiple times without running into trouble. The FileInputStream will be closed after you have acted on it, but you can still 'reopen' it afterwards.
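The deferral mechanism can be illustrated with a small sketch of my own (ManagedResource and acquireFor are illustrative names, not the actual scala-io or scala-arm API). Because the constructor parameter is by-name, the stream is built afresh on every use:

```scala
// Illustrative sketch, not the real scala-io API: `open` is a by-name
// parameter, so it is re-evaluated (the stream re-opened) on every call
// to acquireFor, and closed again when the block finishes.
class ManagedResource[A <: java.io.Closeable](open: => A) {
  def acquireFor[B](f: A => B): B = {
    val resource = open // evaluated here, not at construction time
    try f(resource) finally resource.close()
  }
}
```

This is why a Resource built around `new FileInputStream(...)` can be acted upon, closed, and then 'reopened' again.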

Filling a byte array

In some cases, all you want to do is fill an existing byte array. In Guava, this is how you would do it:

InputStream in = null;
try {
  in = ...
  byte[] buffer = new byte[100];
  ByteStreams.readFully(in, buffer);
} finally {
  Closeables.closeQuietly(in);
}

In Scala IO, it's quite a bit easier:

val in: InputStream = ...
val buffer = new Array[Byte](100)
Resource.fromInputStream(in).bytes.copyToArray(buffer)

Note the absence of a try-finally block. First a Resource is created, then we obtain a bytes view on that object, and then we use Traversable's copyToArray method to copy the data into the array.

Copy InputStream to OutputStream

This is how you do it in Java using Guava:

InputStream in = ...;
OutputStream out = ...;
try {
  ByteStreams.copy(in, out);
} finally {
  Closeables.closeQuietly(in);
  Closeables.closeQuietly(out);
}

This is the same thing done in Scala IO:

val in: InputStream = ...
val out: OutputStream = ...
Resource.fromInputStream(in).copyData(Resource.fromOutputStream(out))

Seems rather verbose. And as a matter of fact, it doesn't need to be this way. If you import a number of implicits, then the above can be expressed like this as well:

in.asInput.copyData(out.asOutput)

There are implicits turning the InputStream into an Input object with the copyData operation, and a similar implicit conversion turning the OutputStream into an Output object that copyData accepts.
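The underlying trick is Scala's implicit-conversion ("pimp my library") pattern. A hedged sketch of how such an enrichment could be wired up (RichInput, asBytes, and enrichInputStream are made-up names for illustration, not the real scala-io conversions):

```scala
import java.io.InputStream
import scala.language.implicitConversions

// Made-up enrichment class: wraps an InputStream with one extra operation.
class RichInput(in: InputStream) {
  def asBytes: Array[Byte] =
    Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
}

// The implicit conversion makes asBytes appear to be a method on any InputStream.
implicit def enrichInputStream(in: InputStream): RichInput = new RichInput(in)
```

With the conversion in scope, `in.asBytes` compiles as if InputStream had the method itself, which is exactly how `in.asInput` and `out.asOutput` read in the snippet above.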

Reading a String

This is how it's done in Guava:

InputStream in = ...;
String content = null;
try {
  content = CharStreams.toString(new InputStreamReader(in, "UTF-8"));
} finally {
  Closeables.closeQuietly(in);
}

... and this is the same thing, done in Scala IO:

val in: InputStream = ...
val content = Resource.fromInputStream(in).slurpString(Codec.UTF8)

or, alternatively:

val in: InputStream = ...
val content = in.asInput.slurpString(Codec.UTF8)

Monday, March 28, 2011

JIT Adjusted Map

One of the things I am currently working on is a Scala Kyoto Cabinet API. Kyoto Cabinet is a C++ library for accessing a very fast persistent key value store. It has a Java library, and as a consequence you can use it from Scala without a problem, but the standard Java API isn't really all that Scala-esque.

Kyoto Cabinet DB as a Map

A key value store is not all that different from a mutable Map in Scala. You pass in a key, and a value comes out. That means you can actually wrap a Kyoto Cabinet DB object in an implementation of Scala's mutable Map interface.

However, the Kyoto Cabinet API only supports storing two types of values: Strings and byte arrays. In both cases, the key and the value need to be of the same type. That means that - without doing any transformations - you can only wrap its DB object inside a Map[String, String] or a Map[Array[Byte], Array[Byte]]. That clearly leaves a lot to be desired.


Adapters


So, whatever I am going to do, it should at least (1) allow me to wrap a DB object in a Map and (2) allow me to transform keys and values to the appropriate type. I want to be able to access a Kyoto Cabinet DB as a Map[Int, Date], if I feel like it.

Instead of addressing both concerns inside a single class, I eventually opted for factoring them out into separate classes. It seemed that a mutable Map abstraction that transforms its keys and/or values to an alternative type on the fly would be useful in other circumstances as well.

The result works like this:

import scala.collection.mutable.Map
import nl.flotsam.collectionables.mutable.AdaptableMap._
val original = Map("1" -> "a", "2" -> "b", "3" -> "e")
val adapted = original.mapKey(_.toInt)(_.toString)
adapted += (1 -> "foobar")

How is this different from just mapping it?


Note that this is definitely not the same as this:

val adapted = original.map{ case (x, y) => (x.toInt, y) }
adapted += (1 -> "foobar")

In the second case, the entire original map is replaced by a new map; after the transformation, all keys have been converted to an Int. In the first case, an operation on 'adapted' taking a key of type Int translates the key on the fly into the same operation taking a key of type String on the underlying Map. In some cases, transforming the entire Map in a single go might be the better option. But if you have a Kyoto Cabinet database with millions of records, then that is the last thing you want to do.

Show me the code

This is the latest version of the code. It is still in flux, but you get the picture:

class AdaptedMap[A, B, AA, BB](decorated: Map[AA, BB],
                               a2aa: (A) => AA,
                               b2bb: (B) => BB,
                               aa2a: (AA) => A,
                               bb2b: (BB) => B) extends Map[A, B] {

  def iterator = new AdaptedMapIterator[A, B, AA, BB](decorated.iterator, aa2a, bb2b)

  def get(key: A) = decorated.get(a2aa(key)) map (bb2b(_))

  def -=(key: A) = {
    decorated -= a2aa(key)
    this
  }

  def +=(kv: (A, B)) = {
    val (key, value) = kv
    val adapted = (a2aa(key), b2bb(value))
    decorated += adapted
    this
  }

  def mapKey[C](a2c: (A) => C)(implicit c2a: (C) => A) = {
    def c2aa(c: C) = a2aa(c2a(c))
    def aa2c(aa: AA) = a2c(aa2a(aa))
    new AdaptedMap[C, B, AA, BB](decorated, c2aa, b2bb, aa2c, bb2b)
  }

  def mapValue[C](b2c: (B) => C)(implicit c2b: (C) => B) = {
    def c2bb(c: C) = b2bb(c2b(c))
    def bb2c(bb: BB) = b2c(bb2b(bb))
    new AdaptedMap[A, C, AA, BB](decorated, a2aa, c2bb, aa2a, bb2c)
  }

}

/**
 * Provides the implicit conversion allowing you to transform an existing mutable Map into an AdaptedMap, allowing
 * you to invoke mapKey and mapValue on it.
 */
object AdaptedMap {

  implicit def map2adaptable[A, B](map: Map[A, B]) =
    new AdaptedMap[A, B, A, B](map, identity, identity, identity, identity)

}

Sunday, March 27, 2011

Groovy Int operations in Scala

Today, I briefly opened a book on Groovy, looked at the first line, and it said something like this:

10.times { print it }

... and I realized Scala doesn't have it. Now obviously, you can do this:

(1 to 10) foreach(println(_))

... but that seems slightly more complicated than what Groovy has to offer. No worries. Let's fix that:

class SmartInt(i: Int) {
  def times(block: Int => Unit): Unit = (1 to i) foreach { j => block(j) }
  def times(block: => Unit): Unit = (1 to i) foreach { _ => block }
}
implicit def int2SmartInt(i: Int) = new SmartInt(i)

Now, I can call times on an Int, passing in either a parameterless block, or a function accepting an Int.

3 times println("foo")
3 times { println("foo") }
3 times { println(_) }
3 times { i => println(i) }

Note that SmartInt defines two overloads of times(...): one whose block takes an Int parameter, and one whose block takes none. I figured that, in the case of times(...), it would be pretty normal to pass a function that ignores the current value. If we only had the first overload, you would always have to capture the parameter and then ignore it. With the second version of times(...), you can pass in an arbitrary expression that ignores the parameter carrying the current element.

Saturday, March 26, 2011

Scala Roles

Not sure how I could ever have missed it, but for some reason the 2008 paper on Scala Roles (think DCI done in Scala) is starting to pop up all over the Internet. I just read it, and it actually looks pretty sensible. There are just a couple of things that I haven't been able to figure out yet, which I am keeping here for future reference.

Disconnected Roles

Without going into too much detail, the general idea is that collaborations are instantiated with their roles. So if I instantiate a ThesesSupervision instance, I get an instance of each of the associated roles for free. These roles are stateful. The Student for instance has motivation and wisdom. The SuperVisor has the capability to advise and grade the Student.

Suppose we have two persons: Peter and Paul. If Paul is Peter's supervisor, then on advising Peter, he basically steps into the role of SuperVisor. In this role, he is able to advise Peter.

The corresponding Scala code:

(peter as phd.supervisor).grade


What I found surprising is that the roles continue to be unaware of the objects playing them. Therefore, an action affecting the Student will never affect its 'personality'. In reality, it obviously would. If I were a student and got a bad grade, it would affect my personal life as well; my happiness would drop, for instance. The solution outlined in the paper seems unable to address that concern.

Links

An updated version of the library is here on GitHub. The original paper can be downloaded here.

Thursday, March 24, 2011

Scalatra, SBT and MongoDB

Last week I did a presentation on NoSQL at bol.com. In order to make it a little bit more compelling, I figured I would throw in a demo on how to use MongoDB for real - but I obviously didn't feel like doing it using Java.

So, behold, here is the entire catalog.

import javax.servlet.ServletConfig
import com.mongodb.casbah.Imports._
import scala.xml._
import org.scalatra._
import scala.util.control.Exception._

class WebApp extends ScalatraServlet {

  val missing = "http://cdn2.iconfinder.com/data/icons/august/PNG/Help.png"
  val mongo = MongoConnection()
  val coll = mongo("amazon")("products")

  get("/products") {
    val numberFormat = catching(classOf[NumberFormatException])
    val limit = numberFormat opt request.getParameter("limit").toInt
    val offset = numberFormat opt request.getParameter("offset").toInt
    <html>
    <head>
      <style type="text/css">
        body {{ font-family: Calibri; }}
      </style>
      <title>Products</title>
    </head>
    <body>
    <ul>
    {
      val items = coll.find().limit(limit getOrElse 10).skip(offset getOrElse 0)
      for (item <- items) yield {
        val set = item.as[DBObject]("ItemAttributes")
        val authors = set.getAs[BasicDBList]("Author") map(_.mkString(", ")) getOrElse("No authors")
        val title = set.as[String]("Title")
        val publisher = set.getAs[String]("Publisher") getOrElse("No publisher")
        val img: String = item.getAs[DBObject]("SmallImage") flatMap(_.getAs[String]("URL")) getOrElse(missing)
        <li>
          <img src={img}/>
          <b>{title}</b>
          <span> ({publisher})</span>
          <em> {authors}</em>
        </li>
      }
    }
    </ul>
    </body>
    </html>
  }

}

Okay, it's just a single page, but the first lesson learned is that the combination of Scalatra, SBT and MongoDB gives you a lot of bang for the buck.

Now, I could easily imagine that it is quite hard to digest everything in a single go, so I am going to explain a couple of things.

Lesson learned 2: Dealing with exceptions

One way of dealing with exceptions in Scala is to use a try-catch block. I am not even going to discuss that, because it's pretty much the same as in Java, apart from the fact that in Scala it's less code.

In my particular case however, I had to see if some parameters would be present in the request. I could have created a complicated conditional block containing a try-catch block to capture NumberFormatExceptions, but that would be a lot of code.

Instead I did this:


    val numberFormat = catching(classOf[NumberFormatException])
    val limit = numberFormat opt request.getParameter("limit").toInt
    val offset = numberFormat opt request.getParameter("offset").toInt

First I defined an object called numberFormat by calling a factory method on the Exception object, passing in the type of exception I want to have handled. The object returned gives me several options for handling blocks of code that may throw that exception. The method I am using here is opt.

The 'opt' method takes a by-name parameter that will be evaluated by the operation itself. Once it is evaluated, the result is wrapped in a Some, and that Option is returned. That is, unless a NumberFormatException occurred; in that case it returns a None. Later on, I call getOrElse(...) on that Option to supply a default value in case it is a None.
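This is easy to try in isolation, since scala.util.control.Exception is part of the standard library:

```scala
import scala.util.control.Exception.catching

// A Catch object that handles exactly one exception type.
val numberFormat = catching(classOf[NumberFormatException])

// opt wraps a successful result in Some, and turns the exception into None.
val ok  = numberFormat opt "42".toInt   // Some(42)
val bad = numberFormat opt "abc".toInt  // None
```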

So in terms of Java, I am doing this:

int limit = 0;
try {
  limit = Integer.parseInt(request.getParameter("limit"));
} catch (NumberFormatException nfe) {
  limit = 10;
}

The whole construct in Scala is getting reduced to:

val limit = 
  (numberFormat opt request.getParameter("limit").toInt) getOrElse(10)

To me, that looks a lot more sensible. The entire policy for dealing with the exception has now been encoded in a library class.

Lesson learned 3: Accessing MongoDB from Scala is Easy


Accessing MongoDB from Scala is pretty easy; the Casbah library takes care of that. One of the things that I found a little hard to grasp at first is what to expect from the object model returned from MongoDB. If you don't have a clue what the MongoDB Java drivers would normally have returned, then figuring out what to expect from Casbah can be a little challenging. I think I'm getting the hang of it now, though.

These expressions might seem a little bewildering at first:

val publisher = set.getAs[String]("Publisher") getOrElse("No publisher")
val img = item.getAs[DBObject]("SmallImage") flatMap(_.getAs[String]("URL")) getOrElse(missing)

but actually Scala is helping a lot in these cases. In my database schema, a lot of fields are optional. In Java, you would have no option other than getting the value, storing it in a variable, checking if it is null, and then continuing based on the outcome. If your data is tucked away deeply inside your document, then you would have pages of code in no time.

In Scala, with its support for Options, it is actually quite easy. There is no need to capture results in variables before being able to move on: the Option allows you to keep chaining operations onto the result of previous operations. (By the way, the flatMap operation on the second line makes sure that instead of getting an Option[Option[String]], I end up with an Option[String]. On that result, I can invoke getOrElse and pass a default value.)
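The nesting issue is easy to demonstrate in isolation. In this self-contained example, a plain Map stands in for the MongoDB sub-document:

```scala
// The outer Option models a possibly missing sub-document,
// the inner lookup a possibly missing field within it.
val smallImage: Option[Map[String, String]] =
  Some(Map("URL" -> "http://example.com/img.png"))

// map would yield Option[Option[String]]; flatMap collapses one level,
// so getOrElse can supply the fallback directly.
val url: String = smallImage.flatMap(_.get("URL")).getOrElse("missing.png")
```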

Lesson learned 4: Scalatra is simple

In all honesty, I have only scratched its surface, and it's questionable whether you would ever create a huge web application this way, but it really is a 'hit-the-ground-running' experience.

Lesson learned 5: SBT makes it even sweeter


This is the way it works. You start sbt, and then type:

jetty-run
~prepare-webapp

From that point on, SBT will watch your sources for changes, and for every change it will immediately recompile your code and replace the existing web app. Way faster than you would imagine.


Saturday, February 12, 2011

Properties of relations, done in Scala

I am currently reading "Elements of Distributed Computing". In order to eventually explain different models of distributed computing, the book starts by introducing the notions of total order and partial order. A relation defines a reflexive partial order if it is reflexive, antisymmetric and transitive.

If X is a set of things, then a relation R over X is a subset of X * X. For example, let

scala> val X = Set('a, 'b, 'c)
X: scala.collection.immutable.Set[Symbol] = Set('a, 'b, 'c)

Then, one possible relation is:

scala> val R = Set(('a, 'c), ('a, 'a), ('b, 'c), ('c, 'a))
R: scala.collection.immutable.Set[(Symbol, Symbol)] = Set(('a,'c), ('a,'a), ('b,'c), ('c,'a))

A relation is reflexive if for each x belonging to X, (x, x) belongs to R. How would you check if a relation is reflexive in Scala? The trouble is, with the "natural" way of defining a relation as a Set of tuples in Scala, there is no way of telling which values are part of X, just like that. However, if the relation is completely defined by all values in R, then we can determine the values in X simply by combining the values in the domain (the first value in each tuple) and the range (the second value in each tuple).

If X is not provided, then we need a way to derive X from R. In order to get there, I am going to write two functions:

scala> def toSet[T](xs: (T, T)): Set[T] = Set(xs._1, xs._2)  
toSet: [T](xs: (T, T))Set[T]

scala> def domainAndRangeOf[T](xs: Set[(T, T)]): Set[T] = xs flatMap(toSet) 
domainAndRangeOf: [T](xs: Set[(T, T)])Set[T]

So, the set on which the binary relation R is defined is:

scala> val X = domainAndRangeOf(R)                                         
X: Set[Symbol] = Set('a, 'c, 'b)

A relation is reflexive if x R x for all x in X.

scala> def isReflexive[T](xs: Set[(T, T)]): Boolean = 
  domainAndRangeOf(xs) forall(x => xs contains (x, x))
isReflexive: [T](xs: Set[(T, T)])Boolean

scala> isReflexive(Set((1, 1), (1, 2), (2,2)))        
res17: Boolean = true

A relation is antisymmetric if for all x and y with x R y and y R x, it follows that x equals y.

scala> def flip[T](t: (T,T)) = (t._2, t._1)                                                             
flip: [T](t: (T, T))(T, T)

scala> def isAntiSymmetric[T](xs: Set[(T,T)]): Boolean = 
  xs forall (x => x == flip(x) || !(xs contains flip(x)))
isAntiSymmetric: [T](xs: Set[(T, T)])Boolean

scala> isAntiSymmetric(Set((1, 2), (1, 3), (1, 4)))
res19: Boolean = true

scala> isAntiSymmetric(Set((1, 2), (1, 3), (2, 1)))
res20: Boolean = false

A relation is transitive if for all x, y, z for which x R y and y R z, x R z also holds.

scala> def isTransitive[T](xs: Set[(T,T)]) = {
     |   xs forall { x =>
     |     xs filter(y => y._1 == x._2) forall (y => xs contains ((x._1, y._2)))
     |   }
     | }
isTransitive: [T](xs: Set[(T, T)])Boolean

scala> isTransitive(Set((1, 2), (2, 3)))
res21: Boolean = false

scala> isTransitive(Set((1, 2), (2, 3), (1, 3)))
res22: Boolean = true

A relation R defines a reflexive partial order if it is reflexive, transitive and antisymmetric:

scala> def isReflexivePartialOrder[T](xs: Set[(T,T)]): Boolean = 
     | isReflexive(xs) && isTransitive(xs) && isAntiSymmetric(xs)

scala> isReflexivePartialOrder(Set((1, 2), (1, 4), (2, 4), (1, 1), (2, 2), (4, 4)))
res31: Boolean = true

Wednesday, January 5, 2011

Clojure versus Scala (part 2)

In my previous post, I went over all of the basics introduced by the authors of "Clojure: functioneel programmeren". In the second part of their first article, they build a Last.fm client, based on the programming concepts introduced before. Let me do the same thing for Scala.

Build environment


Clojure has Leiningen, but I bet Maven is supported as well. Same goes for Scala: there are people using Rake or Gradle, and of course there's SBT (discussed before). However, for people coming from a Java world, Maven works just as well.

So, to start a Scala Maven project, just type this on the commandline:

mvn archetype:generate -DarchetypeCatalog=http://nexus.scala-tools.org/content/groups/public

... and choose the simple Scala project. Fill out the basic details, and you will have something working. (Now, this is a command that you're going to use more often. This might be a good time to turn it into a keyboard macro.)

In order to make sure you can download the proper libraries, you obviously need to add the repo and a dependency:


        
<repositories>
    <repository>
        <id>xebia-maven</id>
        <url>http://os.xebia.com/maven2</url>
    </repository>
</repositories>
...
<dependencies>
    <dependency>
        <groupId>net.roarsoftware</groupId>
        <artifactId>last.fm-bindings</artifactId>
        <version>1.0</version>
    </dependency>
</dependencies>

Namespace


The next thing the authors do is talk about namespaces for a while. They mention that in Clojure, namespaces are first-class citizens. I guess the same applies to Scala as well. However, you cannot add new symbols to a package, as Clojure allows you to do.

Listing the top tracks


This is the Clojure version:

(defn top-tracks
  [user-name api-key]
  (User/getTopTracks user-name api-key))

This is the Scala version:

def topTracks(user: String, apiKey: String) =
  getTopTracks(user, apiKey).toSeq

Now, the above only works because I imported all of User's functions somewhere else (so getTopTracks has been pulled into scope). And I can only invoke toSeq on the result of getTopTracks (normally a java.util.Collection) because of an import of some implicit conversions:

import net.roarsoftware.lastfm.User._
import net.roarsoftware.lastfm.Track
import scala.collection.JavaConversions._

Converting Track to a String


This is the Clojure version:

(defn track-to-str
  [track]
  (let [track-name (.getName track)
        artist-name (.getArtist track)]
    (str track-name " by " artist-name)))

This is the Scala version:

def trackToString(track: Track) =
    track.getName + " by " + track.getArtist

Numbering a list of items


This is the way the authors do it in Clojure:

(defn number-a-sequence
  [seeq]
  (map-indexed #(str (+ 1 %1) " " %2) seeq))

This is the Scala version. Basically, what it does is first create a sequence of tuples, each consisting of the element itself followed by its index, and then map each individual item to a String.

def numberASequence(seq: Seq[Any]) =
  seq.zipWithIndex.map({
    case (elem, index) => (index + 1) + " " + elem
  })

Building HTML


Again, Clojure:

(defn to-html
  [str-seeq]
  (let [ header "<html><body>"
         footer "</body></html>"]
    (str header (reduce str (map #(str % "<br />") str-seeq)) footer)))

And this is Scala:

def toHtml(list: Traversable[Any]) =
  <html>
    <body>{list.map(item => <p>{item}</p>)}</body>
  </html>

In this case, it might be worth noting that the Scala version is actually building XML, whereas the Clojure version is generating a String. Building XML is a little safer: if the text included in your XML contains special characters, then Scala's XML support will guarantee that those special characters are getting escaped properly. (Who knows, perhaps there is an artist called "".)
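The escaping guarantee is easy to check; the artist name below is just a made-up example:

```scala
// Text interpolated into a Scala XML literal is escaped automatically,
// so special characters cannot corrupt the resulting markup.
val artist = "Simon & Garfunkel <remastered>"
val node = <p>{artist}</p>
val rendered = node.toString // the & and < come out as &amp; and &lt;
```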

Conclusion


This basically constitutes everything discussed in the Clojure article. They conclude that a Clojure program like this required less than 25 lines of code. I think it's fair to say that both Scala and Clojure are in good shape in that regard. I have counted the LoC of the Scala version, and it adds up to 22.

So, which one is the winner? I think it's inconclusive. I like the fact that Scala is statically typed, without a significant penalty. The number of lines of code is roughly the same. What do you think?

(Full source code is here.)

Tuesday, January 4, 2011

Clojure versus Scala (part 1)

This is just a brain dump, after having read an excellent article on Clojure by Maurits and Sander in the Dutch Java magazine. Admittedly, without having access to the original article, it probably isn't of any use, but I just wanted to jot it down here, for future reference.

Hello World


Let's start with their simple Hello world example. This is the Clojure version:

(def hello (fn [target] (println (str "Hello " target))))
(hello "Clojure")

or the short version:

(defn hello [target] (println (str "Hello " target)))
(hello "Clojure")

... and this is the Scala version - even shorter:

def hello(target: Any) = println("Hello " + target) 
hello("Scala")

I'm happy to say that the Scala version is shorter in number of characters, even though the type of parameter 'target' had to be specified explicitly.

Doc String


Next they explain the purpose of the doc string. Now this is something that I truly miss in Scala. It would be totally awesome to have the ability to pull up documentation on a function from the REPL, but it doesn't exist. Scala does have scaladoc, but all of that is thrown away at compilation time. It should be possible to store some of it in a class file attribute, but that doesn't happen.

This is clearly an area in which Clojure's ancestry of LISP and Scala's ancestry of Java shows.

First class functions


Before going any further, we first need to have square function, as defined in the article like this:

(defn square [x] (* x x))

Now, I would love to say that defining square in Scala is just as easy. If you would only define square for integers, it would be:

def square(x: Int) = x * x

... but we obviously want it to work for doubles as well, as I suspect the Clojure version does.

Now, this is the Scala version that supports any numeric type:

def square[T](x: T)(implicit numeric: Numeric[T]): T = 
  numeric.times(x, x)

In Scala 2.8.1 there is a shorthand notation for the same thing:

def square[T: Numeric](x: T) = 
  implicitly[Numeric[T]].times(x, x)

Not quite as intuitive as you would have expected, but it works quite well:

scala> square(4.0)
res1: Double = 16.0

scala> square(5)
res2: Int = 25

Once the square function has been defined, the article explains that functions like square could be passed to a function called twice, with twice being defined like this:

(defn twice [f a] (f (f a))) 
(twice square 2)

This is the Scala version:

def twice[T](a: T)(f: (T) => T) = f(f(a)) 
twice(4)(square)

It may look a little awkward at first, but this is the only way to get it to work without having to pass in additional type information. (Check this for more information.)

Here's an alternative that does not work:

scala> def twice[T](a: T, f: (T) => T) = f(f(a)) 
twice: [T](a: T,f: (T) => T)T

scala> twice(4, square)         
:8: error: could not find implicit value for evidence parameter of type Numeric[T]
       twice(4, square) 
                ^

If you insist on defining twice like this, perhaps because of its resemblance to the Clojure version, then the only option you have is to call twice like this:

scala> twice(square[Int], 2)
res70: Int = 16

Data structures


Onward to lists. Clojure example to produce a list:

(list 1 2 3)

... and then a couple of alternatives for doing a Scala List:

List(1, 2, 3)
1 :: 2 :: 3 :: Nil

Adding an item to a list in Clojure:

(conj (list 1 2 3) 4)

... and the same in Scala:

4 :: List(1, 2, 3)

In Scala, Lists are typed. You can only mix values of different types if the result is a List[Any]. So, this is fine:

scala> 1 :: "a" :: 2 :: "b" :: Nil
res74: List[Any] = List(1, a, 2, b)

scala> List(1, "a", 2, "b")
res75: List[Any] = List(1, a, 2, b)

While Clojure does not allow you to directly access one of the elements unless you use a vector, Scala does allow you to get the nth element of a list:

scala> val list = List(1, "a", 2, "b")
list: List[Any] = List(1, a, 2, b)

scala> list(0)
res80: Any = 1

scala> list(2)
res81: Any = 2

scala> list(1)
res82: Any = a

Now, even though Scala does allow you to do it, that doesn't mean you should feel encouraged to code like that; Scala also defines a Vector, which, just like the one in Clojure, is far better optimized for random access.
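For comparison:

```scala
// Vector is an immutable, indexed sequence: element access and
// "updates" (which return a new Vector) are effectively constant-time.
val v = Vector(1, "a", 2, "b")
val third = v(2)           // indexed access
val v2 = v.updated(0, 0)   // new Vector; the original is unchanged
```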

Maps


This is the Clojure version of defining a map:

(def mymap '{:aap "monkey" :ezel "donkey" :walvis "whale" :onbekend "platypus"})

(:ezel mymap)

... and the Scala version:

val mymap = Map("aap"->"monkey", 
  "ezel"->"donkey", 
  "walvis"->"whale", 
  "onbekend"->"platypus")
mymap("ezel")

Updating the map is a little different from what you are used to if you come from the Java space. In fact, in that sense, it's not unlike Clojure. In Clojure, you update a map like this:

(assoc mymap :onbekend "unknown")

In Scala it's done slightly differently, but the net effect is the same: a new map, containing all of the previously defined entries plus a new one.

scala> mymap + ("onbekend" -> "unknown")
res90: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map((aap,monkey), (ezel,donkey), (walvis,whale), (onbekend,unknown))

Monday, January 3, 2011

Day 7: Asynchronous Http Client

Since I'm trying to get into the habit of doing *everything* in Scala, and I needed to write a Confluence client, I wanted to see how hard it would be to do that in Scala as well. That quickly deteriorated into an attempt to write a convenience layer around async-http-client. I'm still not sure if this is going anywhere, but I will show you what I have anyway.

Async Http Client


... is an awesome asynchronous HTTP client. Wherever I looked, I couldn't find a Scala wrapper for it. There is a Clojure wrapper, though, but I didn't take much time to draw inspiration from it.

How to use it?


This is what the wrapper library allows you to do. First of all, it allows you to "GET" a resource and return only part of it.

val http = new Http
val statusCode = 
  http.get("http://www.xebia.com/")
      .returning(Http.statusCode)
statusCode.get should equal (200)

Now, in reality, you obviously want something else. You want to get a couple of values, for instance. That also works. So, instead of returning the full underlying response class, it just returns the values you're interested in, as a tuple.

val http = new Http
val result = 
  http.get("http://www.xebia.com/")
      .returning(Http.statusCode, Http.contentType)
val (statusCode, contentType) = result.get
statusCode should equal (200)
contentType should startWith ("text/html")

And last but not least, if that's not what you are after, then you can still get the full response object.

val http = new Http
val result = 
  http.get("http://www.xebia.com/")
      .returning(Http.fullResponse)
result.get.getContentType 
  should startWith ("text/html")

Currently, the returning method takes functions like these:

(Response) => T

... and if you pass in a couple, you get response tuples consisting of values matching the return types of those functions.
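To give an impression of how such extractor functions could be wired up, here is a self-contained sketch. The Response class below is a stand-in for the real async-http-client response type, and all names are only illustrative:

```scala
// Stand-in for the real response type; only here to make the sketch runnable.
case class Response(statusCode: Int, contentType: String)

// Extractors are just plain functions of type Response => T.
object Http {
  val statusCode: Response => Int     = _.statusCode
  val contentType: Response => String = _.contentType
}

// Passing in a couple of extractors yields a tuple of their results.
def returning[A, B](fa: Response => A, fb: Response => B)(r: Response): (A, B) =
  (fa(r), fb(r))
```

In the real library the tuple-building would have to be overloaded (or abstracted) for each arity; this only shows the two-extractor case.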

How useful is this?


Honestly? Probably not all that much. I need to look at it again in some time to see if there is any point in doing it like this.

Sunday, November 28, 2010

Day 4: Trying units

The previous two posts have just been about getting a little bit more comfortable with the tooling around Scala. Now it's time to write some code for real. (I am actually writing the code with Emacs Ensime. More on that in a future post.)

If I write code in order to learn a new language, there are actually a couple of rules I observe:

  • I don't write 'Hello world';
  • I want to write code that not only explores some concept, but also proves its value for real;
  • I want to be able to reuse that code for something real in the future.

In this particular case, I decided to explore an area in which I suspect Scala will really shine, which is...

Units


I remember having looked at JSR 275 in the past. It left me with a feeling of admiration on one hand ("interesting from a hacker point of view"), and yet at the same time with a feeling of disappointment ("this is probably the best we will be able to get, *sigh*").

The fact is that dealing with numerical data based on a certain unit (money, temperature, size, coordinates) never really feels natural in Java. From the start, it seemed that Scala would actually be able to solve this pretty naturally.

So last weekend I sat down and gave it a go.

Temperatures


I'm from Europe. We measure our temperatures in degrees Celcius (or if you are in science, in Kelvin). However, if you are from the US, you will be used to Fahrenheit instead. It always hurts my brain to turn degrees Fahrenheit into degrees Celcius and the other way around. Let's see if Scala can make it a little easier.

Fahrenheit and Celcius classes


Let's start by defining Fahrenheit and Celcius as case classes that only embody their value.

case class Celcius(temperature: Double);
case class Fahrenheit(temperature: Double);

Clearly not as much code as you would need in Java. And even creating an instance is easier: Celcius(35) in Scala says exactly the same as the slightly more verbose new Celcius(35) in Java.

Making it a little bit more readable


Scala by default might be a little bit more compact, but still reading Celcius(35) doesn't read all that well. Perhaps adding a little bit of syntactic sugar will improve it.

To that end, I am introducing a new class Temperature that has two operations: fahrenheit() and celcius(). The first one returns a temperature in Fahrenheit, and the other one a temperature in Celcius. This class is going to serve as an aid in helping me do what I really want to do, which is to type:

34 fahrenheit

... and get a Fahrenheit temperature of 34 as a result. Basically what needs to happen is this:

34 fahrenheit

should be interpreted as

34.fahrenheit()

... and therefore I need to introduce a fahrenheit() operation to Double. That's not doable just like that. The Temperature class is therefore going to serve as an intermediate step. Whenever I type 34 fahrenheit, I want Scala to go look for a type that has a fahrenheit() operation, and see if there is an implicit conversion from Double to an instance of that type.

If I define Temperature like this:

class Temperature(value: Double) {

  def fahrenheit() = Fahrenheit(value);
  def celcius() = Celcius(value);

}

... then the only thing I need to do to make it work is define an implicit conversion:

implicit def doubleToTemperature(value: Double) = new Temperature(value);

... and I'm in business. If I now type "34 fahrenheit", this is what the REPL tells me:

scala> 34 fahrenheit
res8: org.scalateral.sample.units.Fahrenheit = 34.0 °F

Well, for you it will probably print the value slightly differently: I added custom implementations of toString to the definitions of Celcius and Fahrenheit:

case class Fahrenheit(temperature: Double) {
  override def toString() = temperature.toString() + " \u00B0F";
}

... just to get something a little bit more human friendly.

Comparing temperatures


Although we've got something going, it really isn't all that helpful yet. First of all, what's the point of an immutable temperature if all you can do is extract its value from it? It would clearly be way more sensible to define some operations on it as well.

The first thing I want to do is compare temperatures. The current temperature classes (Celcius and Fahrenheit) are not 'comparable' or - in Scala terms - ordered. So let's make them ordered. (I will stick to the Fahrenheit example, but as you can imagine, the Celcius example is exactly the same.)

case class Fahrenheit(temperature: Double) extends Ordered[Fahrenheit] {
  override def toString() = temperature.toString() + " \u00B0F";
  override def compare(that: Fahrenheit): Int = temperature.compare(that.temperature);
}

Because of this I can now type this:

scala> (34 fahrenheit) > (25 fahrenheit)
res12: Boolean = true

No big shakes. This is actually no more than what you would expect. However, it would also be nice to compare temperatures using different units. Like, checking if a temperature in Celcius is higher than a certain temperature in Fahrenheit.

In order to be able to do that, there are two options. The first option would be to implement a toCelcius on Fahrenheit, and a toFahrenheit on Celcius. However, I am going for the second option, which is IMHO a little bit more extensible. (Adding a new unit of temperature would be easier.) I am going to add two more implicit conversions:

implicit def fahrenheitToCelcius(value: Fahrenheit) 
  = Celcius((value.temperature - 32) / 1.8);
  
implicit def celciusToFahrenheit(value: Celcius) 
  = Fahrenheit((value.temperature * 1.8) + 32);

Once I've done this, comparing temperatures based on different units now all of sudden starts to make sense:

scala> (34 fahrenheit) < (32 celcius)   
res13: Boolean = true

The same goes for operations that take a Fahrenheit. I can now simply pass in a temperature in Celcius, and still get valid results:

scala> def printFahrenheit(value: Fahrenheit) = println(value);
printFahrenheit: (value: org.scalateral.sample.units.Fahrenheit)Unit

scala> printFahrenheit(-34 celcius)                            
-29.200000000000003 °F
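To illustrate the extensibility claim made earlier, here is a hypothetical sketch (not from the post) of adding Kelvin as a third unit; Celcius is repeated so the snippet stands on its own:

```scala
import scala.language.implicitConversions

// As defined earlier in the post (toString omitted for brevity).
case class Celcius(temperature: Double)

// A new unit only needs its case class plus conversions to and from
// one existing unit; it then works wherever a Celcius is expected,
// and vice versa.
case class Kelvin(temperature: Double) {
  override def toString() = temperature.toString() + " K"
}

implicit def kelvinToCelcius(value: Kelvin): Celcius =
  Celcius(value.temperature - 273.15)

implicit def celciusToKelvin(value: Celcius): Kelvin =
  Kelvin(value.temperature + 273.15)
```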

Conclusion

If you are working with units, then Scala can be a life-saver. With just a little bit of work, Scala allows you to write code that is just way more readable than its Java counterpart.

Word of caution

If you happen to prefer a more natural implementation of toString(), like I do, then you need to be aware that the Scala REPL will by default use the platform encoding when printing to the console. That means that - unless you explicitly override it - on Mac OS it will use MacRoman. Unfortunately, MacRoman doesn't have the degree symbol in its character table. As a consequence, you will get the platform-specific representation of a non-printable character.

In order to get something a little bit more readable, pass -Dfile.encoding=UTF-8 on the commandline. (In my case, I run the REPL from SBT. So I start SBT like this: java -Dfile.encoding=UTF-8 -jar ~/local/sbt/sbt-launch-0.7.5.RC0.jar).

Thursday, November 25, 2010

Day 3: SBT dependencies and Scala versions

Some people sell Scala saying that it will boost your productivity. Allow me to disagree. It may help a little, but I doubt it's significant compared to the productivity boost you get from using libraries other people have written. So, regardless of what you are going to do with Scala, mastering the art of dealing with dependencies is going to make a huge difference. And therefore - before doing anything fancy with SBT - I first want to understand how to do that using SBT.

Dependencies in Scala


Dependencies in Java are simple compared to dependencies in Scala. There, I said it. It's harder to manage dependencies on Scala libraries since different versions of Scala are not compatible. (You might want to read that line again.) The bytecode of a class in Scala 2.8.* just looks different than the bytecode of a class in Scala 2.7.*. It uses different ways to translate syntax constructs in Scala to bytecode.

Things like closures are simply not supported by the Java virtual machine, so there will always be a translation of concepts like those to concepts the Java VM does support. And different versions of Scala do it in different ways.

Phrased otherwise: if you are working with Scala 2.8.*, then trying to link to a library that was built using Scala 2.7.* will fail. And there is no way to work around it (yet).

Java doesn't have that problem, which is one of the blessings (and probably the only one) of Sun/Oracle's conservative approach on extending Java.

The Consequences


As a consequence, when you have a dependency on a certain Scala library, you always need to link to a version of that library built for the version of Scala you are running. That makes things slightly more complicated than what you would typically do in Maven.

The good news is: many of the Scala libraries are getting pushed to the central Maven repository. And SBT has a certain scheme for resolving the appropriate versions of your Scala libraries. Let's see how that works.

Customizing an SBT build


Adding dependencies to our SBT project is our first customization of a default project. Before going ahead, it's important that you understand the SBT approach a little bit better. An SBT project is basically nothing but a default implementation of a class coordinating the build. The name of that class is DefaultProject. In order to add your own customization, you will need to create a new class that extends DefaultProject, and override some of its code, or add some more behavior. Your own extension of DefaultProject needs to be placed in project/build.

So, without any further ado: let's add a customization to our SBT project created in the previous post:

import sbt._

class SbtExampleProject(info: ProjectInfo) 
  extends DefaultProject(info)

If you already happened to be running SBT, this is the time to type reload, which will (no surprise) reload the project. So that wasn't all that hard - albeit a bit awkward at first sight. You now have your own customized project, even though nothing has been customized for real yet. Your build should still execute in the same old way.

Adding a dependency on a Java library


Let's first add a Java library, to start the easy way. Let's add Guava. Now, this may come as a surprise, but really, the only thing you need to do is add a val with any name you like, with a value that is composed of three parts (the groupId, the artifactId and the version). Like this:

import sbt._

class SbtExampleProject(info: ProjectInfo) 
extends DefaultProject(info) {
  
  val guava = "com.google.guava" % "guava" % "r07"

}

After that, in order to make sure SBT learns about your changes, first enter 'reload' and then 'update' to update its dependencies. (Remember that last step. Without it, nothing will be changed at all.)

> reload
[info] Recompiling project definition...
[info]    Source analysis: 1 new/modified, 0 indirectly invalidated, 0 removed.
[info] Building project test 1.0 against Scala 2.8.1
[info]    using SbtExampleProject with sbt 0.7.5.RC0 and Scala 2.7.7
> update
[info] 
[info] == update ==
[info] downloading http://repo1.maven.org/maven2/com/google/guava/guava/r07/guava-r07.jar ...
[info]  [SUCCESSFUL ] com.google.guava#guava;r07!guava.jar (3176ms)
[info] :: retrieving :: test#test_2.8.1 [sync]
[info]  confs: [compile, runtime, test, provided, system, optional, sources, javadoc]
[info]  1 artifacts copied, 0 already retrieved (1052kB/233ms)
[info] == update ==
[success] Successful.
[info] 
[info] Total time: 9 s, completed Nov 20, 2010 4:07:53 PM

Now, to prove that we actually can start using Guava, let's modify our HelloWorld class a little. Before doing that, I type '~compile' in SBT. This will make SBT go into a mode in which it continuously recompiles source code whenever it changes:

> ~compile
[info] 
[info] == compile ==
[info]   Source analysis: 0 new/modified, 0 indirectly invalidated, 0 removed.
[info] Compiling main sources...
[info] Nothing to compile.
[info]   Post-analysis: 2 classes.
[info] == compile ==
[success] Successful.
[info] 
[info] Total time: 0 s, completed Nov 20, 2010 4:18:18 PM
1. Waiting for source changes... (press enter to interrupt)

Just to show it works, I'm changing HelloWorld into this:

import com.google.common.base._
import scala.collection.JavaConversions._

object HelloWorld {
  def main(args: Array[String]) {
    Splitter.on(',').split("foo,bar").foreach(println);
  }
}

… and hit 'run':

[info] == run ==
[info] Running HelloWorld 
foo
bar
[info] == run ==
[success] Successful.

Good, so that worked out fine. Now that was adding a dependency on a Java library. But as I said, adding Scala libraries is a little bit more complicated. So, let's try that as well.

Adding a Scala library dependency


As I said, if you have dependencies on Scala libraries, then you need to carefully consider how these dependencies should be resolved, in order to make sure you get the libraries that match the version of Scala used in your build.

If you're wondering which version of Scala is used during your build, then SBT provides a simple command that you can run, which will give you all the information you need. You just type current and hit ENTER.

> current
Current project is test 1.0
Current Scala version is 2.8.0
Current log level is info
Stack traces are enabled

So apparently, we are compiling our sources using Scala 2.8.0. Now, say that we would like to give the specs library a try. For that, we would have to find a build of the library specific to Scala 2.8.0. I am not aware of a place where you can search for Scala libraries specifically, so I will stick to mvnrepository.com.

If you search for org.scala-tools.testing on mvnrepository.com, you will see many different versions of the specs library. These are the versions that have been built using specific versions of Scala, named specs_2.8.0, specs_2.8.1.RC0, etc.

There are two ways we can refer to these libraries. The first approach is to link to these versions directly, by adding this to the SbtExampleProject class:

val specs = "org.scala-tools.testing" % "specs_2.8.0" % "1.6.5"

If I run reload and update after that, everything is fine: the new dependency got included.

However, if I switch the target version (++2.7.7), then I'm still stuck with a version of specs that has been specifically built for Scala 2.8.0. Not so nice. (Adding some real test code to the project doesn't reveal a real problem yet, so you will have to wait for another post on the actual problems you might run into.)

It turns out, SBT offers a way out. You can make your build a little bit more tolerant towards version differences. In this particular case, it just requires a small modification. Instead of the dependency as illustrated above, you include a dependency as given below. By replacing the '%' with a '%%' (wonder what the right name for that operator would be), you essentially tell SBT to go and look for a version of specs 1.6.5 that was built for your current version of Scala.

val specs = "org.scala-tools.testing" %% "specs" % "1.6.5"
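Conceptually, %% just appends the Scala version of the current build to the artifact id before resolution. An illustrative one-liner (the real SBT implementation is of course more involved):

```scala
// Roughly what %% does: "specs" becomes "specs_2.8.0" when building
// against Scala 2.8.0, "specs_2.8.1" against 2.8.1, and so on.
def crossArtifact(artifactId: String, scalaVersion: String): String =
  artifactId + "_" + scalaVersion
```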

Now, in theory, this should solve everything. However, reality is a little harder on us. It turns out that if I do switch to version 2.8.1 of Scala, that particular version does not exist:

[error] sbt.ResolveException: unresolved dependency: org.scala-tools.testing#specs_2.8.1;1.6.5: not found

Looks like I'm back on square one. Although... not entirely. It would be awesome if the entire Scala world would maintain builds of their libraries for every version of Scala that would ever come about. In absence of that, you might be able to use various versions of that library, assuming that these different versions have been built with different versions of Scala. (Unfortunately, there is not always an easy way to tell which version of Scala was used to build these libraries.)

If that's what you want to do - pick different versions of a library based on the version of Scala you are running - then SBT allows you to state that as well. And this is where it comes in handy that SBT is really just Scala at work. You can just code your dependency policy in Scala:

  val specs =
    buildScalaVersion match {
      case "2.8.0" => "org.scala-tools.testing" %% "specs" % "1.6.5"
      case "2.8.1" => "org.scala-tools.testing" %% "specs" % "1.6.6"
      case x => error("Unsupported Scala version " + x)
    }

Switching to Scala 2.8.1 and running update now gives:

> update
[info] 
[info] == update ==
[info] downloading http://repo1.maven.org/maven2/org/scala-tools/testing/specs_2.8.1/1.6.6/specs_2.8.1-1.6.6.jar ...
[info]  [SUCCESSFUL ] org.scala-tools.testing#specs_2.8.1;1.6.6!specs_2.8.1.jar (7135ms)
[info] :: retrieving :: test#test_2.8.1 [sync]
[info]  confs: [compile, runtime, test, provided, system, optional, sources, javadoc]
[info]  1 artifacts copied, 1 already retrieved (2793kB/110ms)
[info] == update ==
[success] Successful.
[info] 
[info] Total time: 11 s, completed Nov 25, 2010 8:43:16 PM

Hurray! It managed to get a version of specs that a) is available on the Internet, and b) is compatible with the latest version of Scala. Nice!

Summary


This turned out to be a long (!) post. In fact, it took me way more time to put this one together than I had anticipated. I just comfort myself that this is probably one of the harder areas of working with Scala. Once you master dealing with different versions of Scala, the rest will be relatively easy. (Right?)

In working with different versions of Scala, SBT seems to be a blessing, especially with regard to the not-so-much-standardized namespace used for managing Scala dependencies and Scala versions. At least you are able to encode your own policy inside the SBT build file - something that is probably a lot harder to do in Maven.

All in all, I have to say that this second time I am looking at SBT, it actually feels quite ok. That is, I will definitely look further at other solutions in the future (stay tuned), but for now, it's time to move on to something different. Let's see how well all of this works in combination with some of the Scala-supporting IDEs, in particular Ensime, which will be the topic of the next post.

Thursday, November 18, 2010

Day 2: Starting SBT

I want to program Scala for real. Like, big projects. And I don't want to feel like being cast back into the dark ages, of relying on IDEs to take care of automated builds. As far as I am concerned, all of the goodness that I can rely on in the Java world should be present in the Scala world as well. It may be sub-optimal for a while, but I need to have something to hold on to.

Having a tool for automating your builds is a must. I don't feel like going to Make or Rake or even Gradle. It just doesn't feel right. I'd be interested to work with Maven, but then again, that would be a little bit too comfortable, and the whole idea is that I would step out of my comfort zone, to get used to something else. Other than that, the word is out that Scala builds can be quite slow compared to Java builds, and it turns out that using the Maven plugins aggravates this. (Devoxx Scala BOF.) So, let's try something else. Let's try simple-build-tool (SBT).

Giving it a go


So, down below you see me giving it a go. For some reason, Jing didn't record the audio, and I didn't really feel like doing it all over again, so I will explain what's going on below in a few bullets:

  • First, I am downloading SBT. SBT isn't packaged as a tar file with scripts, so you will have to create an installation directory yourself. In my case, I create the installation directory, download the jar, and then create a softlink in order to be able to address the directory a little easier.
  • Next, I am running SBT in an example directory. This will create the project files, and in fact copy loads of jar files into the directory. That feels a little funny at first, when you're used to Maven, but there will probably be a reason for it.
  • Notice that in this particular case, when SBT asks me about creating project files, I don't answer with 'y' (yes), but instead type 's'. Normally, SBT will assume files to exist in src/main/scala and src/test/scala. If I enter 's' at project construction time, it will add the project root to the list of folders containing Scala files as well - which is convenient if all you want to do is play around with SBT.
  • I already added a HelloWorld class, so once SBT has started, just typing compile will compile the class. Typing run will run it. Notice that - once I run 'run' - the compilation phase will detect that the classes have already been compiled and the sources haven't changed. Which is one reason why SBT is a little bit more convenient than the Scala Maven plugins.

(Embedded screencast - Adobe Flash is required to view it.)

Wednesday, November 17, 2010

Day 1: Biting the bullet

So, here I am. I desperately want to switch over to Scala, big time, but then I am also just way too comfortable programming Java. Something needs to happen. Now.

This is my stab at it. I will force myself to explain a little bit of Scala to you every day. I don't know who you are. I don't know if you know anything about Scala at all. In fact, I don't even know if you are there at all.

It's just that I found over the course of the last couple of years that nothing helps you get into a subject more than forcing yourself to explain it to someone else. So, although it may seem you could eventually benefit from my attempts to get my head around certain Scala-related subjects, in reality, you are helping me. Even if you never comment on what I am trying to say here, there will always be the illusion of someone at the other end of the line. And I am certain that will help me push forward.