Victus Spiritus

I'm having way too much fun reading a programming book - Beginning Scala

31 Oct 2009

What's so special about Beginning Scala by David Pollak?

Beginning Scala is a recently published (2009) introductory Scala book, and so far it has made learning to code and design with Scala a pleasure. I've been aware of Scala for a few months now (since I started web programming), but I've been busy juggling high-priority tasks and hadn't made time to try the language or read the book. My first memory of Scala is hearing how Twitter moved much of its Ruby on Rails code over to Scala to improve performance. Since then I've sampled 14 million other web technologies ;).

why I'm loving the book

After the intro and background (who developed Scala: Martin Odersky; and why: Java needed a successor - ty sisyphus), Dave shows us sample code, which is exactly how I prefer to learn something. When I want to learn about a topic, I prefer to dive in head first. I want to experience it for myself, to really understand it, and make it my own. I like to cut to the chase and then backtrack to fill in the knowledge gaps, fueled by the curiosity that naturally arises from going through something interesting for the first time. If it's not interesting, I can drop it with minimal time lost. This is precisely how I feel about Scala, and Dave Pollak gets it. Even in his first real sample program we get a line-by-line description covering a nice range of syntax and functionality. Here's an excerpt from the book (from chapter 2).

Listing 2-1. Sum.scala

[Image: Listing 2-1, Sum.scala source code]

Let’s go through this file in detail.

Importing Stuff

The import scala.io._ code imports all the classes from the scala.io package. This is the same as Java’s import scala.io.*;. Scala uses the _ rather than the * as a wildcard. Coming from Javaland, it takes a little getting used to, but it’ll soon make sense.
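As a small illustration of the wildcard import, here is a minimal sketch. The object name `ImportDemo` and the helper `linesOf` are made up for this example and are not part of the book's listing; `Source.fromString` and `getLines` come from `scala.io`.

```scala
// Wildcard import: brings every member of scala.io into scope
// (Java would write scala.io.*).
import scala.io._

// ImportDemo and linesOf are hypothetical names used only for illustration.
object ImportDemo {
  // Source can be used unqualified thanks to the wildcard import above.
  def linesOf(text: String): List[String] =
    Source.fromString(text).getLines().toList

  def main(args: Array[String]): Unit =
    println(linesOf("1\n2\n3")) // prints List(1, 2, 3)
}
```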

Parsing a String to an Int

Next, we define the toInt method, which takes a single parameter called in. That parameter has the type String:

def toInt(in: String): Option[Int] =

In Scala, method definitions begin with the def keyword. The method name follows, along with the method’s parameter list. In this case, the toInt method takes one parameter: in, whose type declaration follows it rather than precedes it. In some cases, the Scala compiler can figure out or infer the type of a variable or the return type of a method. You need to declare the parameter types for a Scala method, but we may omit the return type if the return type can be inferred and the method is not recursive (see footnote 2). We declare the return type as Option[Int]. In general, if the return type is not immediately obvious, it’s an act of kindness and good citizenship to your fellow programmers and your future self to declare the return type.

What’s Option and what are those funky square brackets around Int? Option is a container that holds one or zero things. If it holds zero elements, it’s None, which is a singleton, which means that there is only one instance of None. If the Option holds one element, it’s Some(theElement). The funky square brackets denote the type of thing that’s held by the Option. In this case, the Option holds an Int. In Scala, everything is an instance of a class, even Int, Char, Boolean, and the other JVM primitive types. The Scala compiler puts primitive types in instance boxes (boxing) only when necessary. The result is that you can treat all classes uniformly in Scala, but if your primitive data does not require boxing, you’ll see the same program performance you see using primitives in Java. If your primitive does require boxing, the Scala compiler does all the boxing and unboxing for you, and it even does null testing when it unboxes: nice and polite.

So, Option[Int] is a container that holds zero or one Int value. Using Option is one of the ways that Scala lets you avoid null pointer exceptions and explicit null testing. How? You can apply your business logic over all the elements in the Option. If the Option is None, then you apply your logic over zero elements. If the Option is Some, then you apply your business logic over one element. Option can be used and nested in the for comprehension. We’ll explore Option in more depth in Chapter 3.
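To make "apply your logic over zero or one element" concrete, here is a minimal sketch. The names `OptionDemo`, `findAge`, `ageNextYear`, and `combinedAge`, along with the sample data, are invented for illustration and do not come from the book.

```scala
// All names and data here are hypothetical, chosen only to illustrate Option.
object OptionDemo {
  // A lookup that may or may not find a value.
  def findAge(name: String): Option[Int] =
    Map("ann" -> 31, "bob" -> 45).get(name)

  // map applies the logic to zero or one element; no null check is needed.
  def ageNextYear(name: String): Option[Int] =
    findAge(name).map(_ + 1)

  // Options compose in a for comprehension:
  // the result is None as soon as either lookup is None.
  def combinedAge(a: String, b: String): Option[Int] =
    for {
      x <- findAge(a)
      y <- findAge(b)
    } yield x + y

  def main(args: Array[String]): Unit = {
    println(ageNextYear("ann"))               // Some(32)
    println(ageNextYear("zed").getOrElse(-1)) // -1, a fallback for the None case
    println(combinedAge("ann", "bob"))        // Some(76)
  }
}
```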

When I’m writing code, I return Option from any method that, based on business logic, might return some value or might return none. In this case, converting a String to an Int might succeed if the String can be parsed or might fail if the String cannot be parsed into an Int. If the String cannot be parsed, it is not something that’s worthy of an exception because it’s not an exceptional situation. It is merely a calculation that has no legal value, thus it makes sense to return None if the String cannot be parsed. This mechanism also avoids the Java patchwork of sometimes returning null when there’s no legal value to return and sometimes throwing an exception.

Speaking of exceptions, that’s exactly what Integer.parseInt does when it cannot parse the String into an Int. So, in our code, we wrap a try/catch around

Some(Integer.parseInt(in.trim)).

If the Integer.parseInt method succeeds, a new instance of Some will be created and returned from the toInt method. There’s no explicit return statement as the last expression evaluated in the method is its return value.

2. A recursive method is a method that calls itself.

3. Option[Int] is a “variant type” or “sum type” with None as one variant and Some[Int] as the other. Neither None nor Some[Int] is the same as Int, but if you’re working with an Option[Int] that happens to be of the variant Some[Int] then you can extract the actual Int from it by calling the get method.

If Integer.parseInt throws an exception, it will be caught by the catch block. The catch block looks different from Java’s catch. In Scala, there’s a single catch and a series of patterns to match the exception. Pattern matching is a language-level Scala construct, and it’s uniformly applied across the language. In this code, we have

case e: NumberFormatException => None

This pattern matches the exception to NumberFormatException and returns the expression None, which is the last expression in the method. Thus toInt will return None if parseInt throws a NumberFormatException. To summarize: toInt takes a String and attempts to convert it to an Int. If it succeeds, toInt returns Some(convertedValue), otherwise it returns None.
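Putting the quoted fragments together, toInt could look roughly like the sketch below. It is assembled from the pieces quoted in the walkthrough rather than copied from the book's listing, and the wrapping object `ToIntDemo` and its `main` method are additions for illustration.

```scala
// ToIntDemo is a hypothetical wrapper; toInt is assembled from the quoted fragments.
object ToIntDemo {
  // Parse a String into an Int, returning None instead of throwing on bad input.
  def toInt(in: String): Option[Int] =
    try {
      Some(Integer.parseInt(in.trim))
    } catch {
      // A single catch block holding pattern-matching cases.
      case e: NumberFormatException => None
    }

  def main(args: Array[String]): Unit = {
    println(toInt(" 42 "))      // Some(42)
    println(toInt("forty-two")) // None
  }
}
```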

Summing Things Up

Next, let’s tackle the sum method. We define our method:

def sum(in: Seq[String]) = {

We don’t declare the return type for sum because the compiler can figure it out and the method is short enough that a quick glance at the code shows us that the return type is an Int. The in parameter is a Seq[String]. A Seq is a trait (which is like a Java interface) that is inherited by many different collections classes. A Seq is a supertrait to Array, List, and other sequential collections. As Option[Int] is an Option of Int, Seq[String] is a sequence of String elements.
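A quick sketch of what typing a parameter as Seq buys you: the same method accepts different sequential collections. `SeqDemo` and `totalLength` are hypothetical names. The explicit `toSeq` on the array is a defensive assumption on my part, since newer Scala versions reach Seq from Array through a conversion rather than direct inheritance.

```scala
// SeqDemo and totalLength are hypothetical names used only for illustration.
object SeqDemo {
  // Accepts any sequential collection of Strings,
  // because the parameter is typed as the Seq trait.
  def totalLength(in: Seq[String]): Int =
    in.map(_.length).sum

  def main(args: Array[String]): Unit = {
    println(totalLength(List("a", "bb")))        // 3
    println(totalLength(Vector("ccc")))          // 3
    println(totalLength(Array("d", "ee").toSeq)) // 3, with an explicit conversion
  }
}
```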

A trait has all the features of the Java interface construct. But traits can have implemented methods on them. If you are familiar with Ruby, traits are similar to Ruby’s mixins. You can mix many traits into a single class. Traits cannot take constructor parameters, but other than that they behave like classes. This gives you the ability to have something that approaches multiple inheritance without the diamond problem (http://en.wikipedia.org/wiki/Diamond_problem). The first line of the sum method transforms the Seq[String] to Seq[Int] and assigns the result to a val named ints:

val ints = in.flatMap(s => toInt(s))

This maps and flattens each element by calling the toInt method for each String in the sequence. toInt returns a collection of zero or one Int. flatMap flattens the result such that each element of the collection, the Option, is appended to the resulting sequence. The result is that each String from the Seq[String] that can be converted to an Int is put in the ints collection.
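The difference between map and flatMap is easiest to see side by side. In this sketch, `FlatMapDemo`, `wrapped`, and `flattened` are names made up for the example, and toInt is repeated so the block stands on its own.

```scala
// FlatMapDemo, wrapped and flattened are hypothetical names for illustration.
object FlatMapDemo {
  def toInt(in: String): Option[Int] =
    try Some(Integer.parseInt(in.trim))
    catch { case _: NumberFormatException => None }

  // map keeps one Option per input, including the failures.
  def wrapped(in: Seq[String]): Seq[Option[Int]] =
    in.map(s => toInt(s))

  // flatMap drops each None and unwraps each Some, leaving only the parsed Ints.
  def flattened(in: Seq[String]): Seq[Int] =
    in.flatMap(s => toInt(s))

  def main(args: Array[String]): Unit = {
    val raw = Seq("1", "two", " 3 ")
    println(wrapped(raw))   // three Options: Some(1), None, Some(3)
    println(flattened(raw)) // only the two parsed values: 1 and 3
  }
}
```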

In Scala, you can declare variables as assign-once or assign-many. Assign-once Scala variables are the same as Java’s final variables. They are identified with the val keyword. Assign-multiple variables in Scala are the same as Java variables and are identified with the var keyword. Because I’m not changing the value of ints after I set it, I chose the val keyword. I use val in my programs unless there’s a compelling reason to use var, because the fewer things that can change, the fewer defects that can creep into my code.
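A minimal sketch of the val/var distinction, using a hypothetical `countdown` helper inside a made-up `ValVarDemo` object:

```scala
// ValVarDemo and countdown are hypothetical names used only for illustration.
object ValVarDemo {
  def countdown(from: Int): List[Int] = {
    val start = from           // assign-once: writing `start = 0` later would not compile
    var current = start        // assign-many: reassigned in the loop below
    var seen = List.empty[Int] // also reassigned as values are appended
    while (current > 0) {
      seen = seen :+ current
      current -= 1
    }
    seen
  }

  def main(args: Array[String]): Unit =
    println(countdown(3)) // prints List(3, 2, 1)
}
```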

Another fancy thing that we’ve done is create a function that calls the toInt method and passes it to the flatMap method. flatMap calls the function for each member of the sequence, in. In our example, we defined a function that takes a single parameter, s, and calls toInt with that parameter. We pass this function as the parameter to flatMap, and the compiler infers that s is a String. Thus, an anonymous function is created, and an instance of that function is passed to the flatMap method. Additionally, Scala sees that the return type of toInt is an Option[Int], so it infers that the ints variable has the type Seq[Int]. So, you’ve done your first bit of functional programming. Woo-hoo! The next line sums up the Seq[Int]:

ints.foldLeft(0)((a, b) => a + b)

foldLeft takes a seed value, 0 in this case, and applies the function to the seed and the first element of the sequence, ints. It takes the result and applies the function to the result and the next value in the sequence repeatedly until there are no more elements in the sequence. foldLeft then returns the resulting accumulated value. foldLeft is useful for calculating any accumulated value. In math, sum, prod, min, max, and so on can be implemented easily with foldLeft. In this case, we defined a simple function that takes two parameters, a and b, and returns the sum of those parameters. We did not have to declare the types of a or b, because the Scala compiler infers that they are both Ints. The foldLeft line is the last expression in the method, and the sum method returns its results.
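To illustrate the claim that sum, product, and max all fall out of foldLeft, here is a small sketch. `FoldDemo` and its method names are invented for this example. Seeding max with `Int.MinValue` is one possible choice, and it means an empty sequence yields that seed.

```scala
// FoldDemo and its methods are hypothetical names used only for illustration.
object FoldDemo {
  // Seed 0, combine by addition.
  def sum(xs: Seq[Int]): Int = xs.foldLeft(0)((a, b) => a + b)

  // Seed 1, combine by multiplication.
  def product(xs: Seq[Int]): Int = xs.foldLeft(1)((a, b) => a * b)

  // Seed with the smallest Int, keep the larger of accumulator and element.
  def max(xs: Seq[Int]): Int =
    xs.foldLeft(Int.MinValue)((a, b) => if (b > a) b else a)

  def main(args: Array[String]): Unit = {
    val xs = Seq(1, 2, 3, 4)
    println(sum(xs))     // 10
    println(product(xs)) // 24
    println(max(xs))     // 4
  }
}
```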

Program Body

The following defines the input variable:

val input = Source.fromInputStream(System.in)

Its type is Source, a source of input, which wraps the JVM’s System.in InputStream. In this case, we didn’t have to do anything fancy to access a Java class. We used it just as we might have from a Java program. This illustrates the awesome interoperability between Scala and Java.

The next line gets the lines from our source and collects them into a Seq[String]:

val lines = input.getLines.collect
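Because Source wraps any Java InputStream, the same code can be tried without typing at a console. The sketch below feeds it a `java.io.ByteArrayInputStream`. The names `SourceDemo` and `readLines` are hypothetical. The `collect` call above matches the Scala version of the book's era; as far as I know, later versions expect a partial function there, so this sketch uses `getLines().toList` instead.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.io.Source

// SourceDemo and readLines are hypothetical names used only for illustration.
object SourceDemo {
  // Works with System.in or any other InputStream.
  // toList stands in for the book's collect (an assumption for newer Scala versions).
  def readLines(in: InputStream): List[String] =
    Source.fromInputStream(in).getLines().toList

  def main(args: Array[String]): Unit = {
    // A plain Java class used directly from Scala, standing in for System.in.
    val fake = new ByteArrayInputStream("10\n20\n".getBytes("UTF-8"))
    println(readLines(fake)) // prints List(10, 20)
  }
}
```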

Finally, we print a message on the console with the sum of the lines with parsable integers on them:

println("Sum "+sum(lines))

To run the program, type

> scala Sum.scala

When you’re prompted, enter some lines with numbers. When you’re done, press Ctrl-D (Unix/Linux/Mac OS X) or Ctrl-C (Windows), and the program will display the sum of the numbers.

Great, you’ve written a Scala program that makes use of many of Scala’s features including function passing, immutable data structures, and type inference. Now, let’s look more deeply into Scala’s syntax.
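For reference, the fragments quoted in the walkthrough can be reassembled into a complete program roughly as follows. This is a reconstruction, not the book's verbatim Listing 2-1: the wrapping object, the `sumOf` helper, and the `main` method are my additions so the code compiles as a regular source file, and `getLines().toList` replaces `getLines.collect` on the assumption that a newer Scala version is in use.

```scala
import java.io.InputStream
import scala.io._

// Reconstruction from the quoted fragments; the object wrapper,
// sumOf and main are additions that are not in the book's script.
object Sum {
  // Parse a String to an Int, or None if it is not a number.
  def toInt(in: String): Option[Int] =
    try {
      Some(Integer.parseInt(in.trim))
    } catch {
      case e: NumberFormatException => None
    }

  // Keep the lines that parse as Ints and add them up.
  def sum(in: Seq[String]): Int = {
    val ints = in.flatMap(s => toInt(s))
    ints.foldLeft(0)((a, b) => a + b)
  }

  // Hypothetical helper: read every line from a stream and sum it.
  def sumOf(stream: InputStream): Int = {
    val input = Source.fromInputStream(stream)
    val lines = input.getLines().toList
    sum(lines)
  }

  def main(args: Array[String]): Unit =
    println("Sum " + sumOf(System.in))
}
```

Lines that do not parse are skipped silently, so input of 4, foo, and 5 on separate lines gives a sum of 9.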

what's special about Scala

It's a programming language designed to unify the object-oriented and functional paradigms. We are guided to think (and code) differently when writing libraries versus building specific applications. Martin Odersky describes the design goals of the language best in this Scalazine interview, The Goals of Scala:

Martin Odersky: The first thing we cared about was to have as clean an integration of functional and object-oriented programming as possible. We wanted to have first-class functions in there, function literals, closures. We also wanted to have the other attributes of functional programming, such as types, generics, pattern matching. And we wanted to integrate the functional and object-oriented parts in a cleaner way than what we were able to achieve before with the Pizza language. That was something we deeply cared about from the start.

Later on, we discovered that this was actually very easy, because functional languages have a fixed set of features. They had been well researched and well proven, so the question was only how to best integrate that in object-oriented programming. In Pizza we did a clunkier attempt, and in Scala I think we achieved a much smoother integration between the two. But then we found out that on the object-oriented side there remained lots of things to be developed. Object-oriented programming, at least when you throw in a static type system, was very much terra incognita. There was some work we could look at and use, but almost all languages that we found had made a lot of compromises.

So as we developed Scala, we started to discover how we could mix objects together with traits and mixin composition, how we could abstract the self types, how we could use abstract type members, and how we could let this all play together. Up to then there had been a couple of research languages that addressed a few of these aspects in specialized ways, but there wasn't much in terms of mainstream languages that covered the whole spectrum of making it all work. In the end it turned out that the main innovations in Scala were on the object-oriented side, and that's also something we really cared about.
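To give a rough feel for the features named in that last paragraph (traits with mixin composition, self types, abstract type members), here is a tiny sketch. Every name in it is made up for illustration; it is not taken from the interview or the book.

```scala
// All names here are hypothetical; this only illustrates the language features.
object MixinDemo {
  // A trait with an abstract type member and an implemented method.
  trait Container {
    type Item                    // abstract type member, fixed by implementers
    def items: List[Item]
    def size: Int = items.length // implemented directly in the trait
  }

  // Self type: Describing can only be mixed into something that is also a Container,
  // which is why it can call size.
  trait Describing { self: Container =>
    def describe: String = s"container with $size item(s)"
  }

  // Mixin composition: one class, several traits.
  class Names(val items: List[String]) extends Container with Describing {
    type Item = String
  }

  def main(args: Array[String]): Unit =
    println(new Names(List("a", "b")).describe) // container with 2 item(s)
}
```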

what's special about Dave Pollak

He's an active developer who designed and built Lift with the help of a great open source group. Lift is a full-featured web framework built with Scala and focused on developing web applications quickly and effectively. Here's Dave's personal blog. From the liftweb site:

Lift is an expressive and elegant framework for writing web applications. Lift stresses the importance of security, maintainability, scalability and performance, while allowing for high levels of developer productivity. Lift is open source software licensed under an Apache 2.0 license.

I first discovered Dave on Lambda the Ultimate, passionately describing why he preferred Scala's static typing to his development time with Ruby on Rails (and its heavy type-testing cost). The thread gives a good feel for Dave's thinking, along with some incredible cross evaluation and discussion from other group members.