Functional Programming
Functional Programming (FP) is a programming paradigm that (re-)gained quite some traction in the past years. One of the most populuar functional programming languages these days is Scala, which combines object-oriented and functional aspects with a very neat syntax and high portability: it is executed on the Java VM and can thus integrate seamlessly with any existing Java libraries.
If you prefer reading a text book, I can recommend Functional Programming in Java by Venkat Subramaniam.
Detour: The Beautiful Syntax of Scala
Scala has a beautiful syntax:
- It follows the substitution principle, where the result of the last instruction is the return value.
- It has built-in operators for list operations (add, split, etc.)
With scala, insertion sort can be written in just a few lines of code:
// to sort a list...
def isort(xs: List[Int]): List[Int] = xs match {
// an empty list is sorted
case Nil => Nil
// a list with a single element is also sorted
case List(x) => List(x)
// otherwise, cut off the first element (y) and
// insert it into the sorted remaining list (ys)
case y :: ys => insert(isort(ys), y)
}
// to insert an element into a (sorted) list...
def insert(xs: List[Int], x: Int): List[Int] = xs match {
// if the list was empty, return a new list with just x
case Nil => List(x)
// otherwise: cut off the first element of xs and ...
case y :: ys =>
if (x < y) x :: xs // prepend x to xs
else y :: insert(ys, x) // insert x into ys
}
But back to FP in Java.
(Pure) FP is based on two principles: Immutability (all objects are unchangeable) and functions as first-class citizens (functions are objects and can thus be passed as arguments).
Immutability is a concept already found in the Java language.
For example, String
s are immutable since there is no method to change their content, and variables can be declared final
.
nota bene:
final
on method signatures indicate that a method cannot be overwritten anymore.
Java 8 added syntactical sugar to make the language more functional: lambda expressions and functional interfaces (@FunctionalInterface
).
As you know, everything (aside from primitive types) in Java is an Object
.
@FunctionalInterface
is an annotation to interfaces indicating that there is exactly one non-default method to be implemented.
Anonymous instances to functional interfaces can be written as lambda expressions:
@FunctionalInterface
interface Function<A, B> {
B apply(A obj);
}
Function<Integer, Integer> square1 = new Function<Integer, Integer>() {
@Override
public Integer apply(Integer i) {
return i * i;
}
}
// or shorter as lambda expression (arglist) -> { block; }
Function<Integer, Integer> square2 = (Integer i) -> { return i * i };
// or even shorter, for single instructions
Function<Integer, Integer> square3 = i -> i * i;
The types are usually automatically inferred. For single instructions, you can omit the curly braces and
return
.
So what’s the big deal with functional programming?
- Since objects are immutable, parallelization is (almost) trivial (you may have heard of map-reduce).
- Separation of Concerns (SoC): FP helps you to separate the data traversal (how you iterate the data) from the business logic (what you do with the data).
Let me illustrate this with a simple example (will go through it in detail at the end of the chapter). Say you want to
- retrieve all students from a database,
- filter out those who took Softwarearchitektur,
- load (all of) their transcript of records from another database
- print all class names
// code in src/fplive/{Database,Student,Transcript,Record}.java
// data in src/resources/{students,tors}.json
for (Student s : Database.getStudents()) {
if (s.getClasses().contains("Softwarearchitektur")) {
Transcript tr = Database.getToR(s.getMatrikel());
for (Record r : tr)
System.out.println(r);
}
}
As you can see, the traversal logic is closly tied with the business logic. With functional programming, we will be able to formulate this in a much cleaner way, so stay tuned!
Immutability
Java’s built-in java.lang.String
class is immutable: there is no way to change the contents of a String
object once it has been assigned.
For the remainder of this (and the next) chapter, let’s consider a new (generic) immutable list class:
class List<T> {
final T head;
final List<T> tail;
private List(T el, List<T> tail) {
this.head = el;
this.tail = tail;
}
boolean isEmpty() {
return head == null;
}
// factory methods for convenience...
static <T> List<T> empty() {
return new List<T>(null, null);
}
static <T> List<T> list(T elem, List<T> xs) {
return new List<>(elem, xs);
}
static <T> List<T> list(T... elements) {
if (elements.length == 0)
return empty();
int i = elements.length - 1;
List<T> xs = list(elements[i], empty());
while (--i >= 0)
xs = list(elements[i], xs);
return xs;
}
}
Here’s an example usage:
import static List.empty;
import static List.list;
List<Integer> sequence = list(1, 2, 3, 4, 5);
List<Integer> emptyList = empty();
List<Integer> prepend = list(0, empty());
System.out.println(sequence.isEmpty()); // "false"
System.out.println(emptyList.isEmpty()); // "true"
System.out.println(prepend.isEmpty()); // "false"
By now, you probably already realized the main issue with this class: once it’s initialized, there is no way to change it. That means: all mutations on this list will have to create a new list.
Even “worse”: if variables can’t be changed, there is no for
/while
iteration!
This brings us right to the techique central to functional programming:
Recursion
To understand recursion, you first must understand recursion.
(Even Google knows that!)
To warm up, let’s formulate a recursive toString()
method for our List
class.
Remember: When writing recursive functions, you ned to make sure to capture the terminal cases (the ones where you know the answer) and the recursion cases (the ones where you make the recursive calls).
For convenience, we’ll make this a member function:
class List<T> {
// ...
/**
* Either 'nil' if the list is empty, or
* '( head tail.toString )' otherwise
*/
@Override
public String toString() {
if (isEmpty()) return "nil";
else return "(" + head + " " + tail + ")";
}
}
System.out.println(list(7, 3, 1, 3)); // "(7 (3 (1 (3 nil))))"
Note: For the remainder of this chapter, there is absolutely no
for
/while
, onlyif
statements and recursive calls.
Simple Recursion
Similarly, we can formulate a contains
method that checks if a list contains an element: Either the head
matches, or it may be contained in tail
.
static <T> boolean contains(List<T> xs, T obj) {
if (xs.isEmpty()) return false;
else if (xs.head.equals(obj)) return true;
else return contains(xs.tail, obj);
}
The same with the length of a list: An empty list has length zero, any other list is one plus the length of its tail.
static <T> int length(List<T> xs) {
if (xs.isEmpty()) return 0;
else return 1 + length(xs.tail);
}
Recursion with List Generation
Things become a bit more tricky if we want to mutate lists (or more precicely:
get new lists which differ from the old ones).
For example, consider the take(int i)
and drop(int i)
functions that return a list with the first i
or the sublist following the i
-th element, respectively.
// recursion with list generation
static <T> List<T> take(List<T> xs, int n) {
if (n <= 0 || xs.isEmpty()) return empty();
else return list(xs.head, take(xs.tail, n-1));
}
static <T> List<T> drop(List<T> xs, int n) {
if (n <= 0 || xs.isEmpty()) return xs;
else return drop(xs.tail, n-1);
}
Appending to a list recursively is actually similar to the iterative way: if the target list is empty, the new list is the appendix; otherwise we make a new list where we keep the head but append to the tail.
static <T> List<T> append(List<T> xs, List<T> y) {
if (xs.isEmpty()) return y;
else return list(xs.head, append(xs.tail, y));
}
Since we know how to append to a list, list reversal becomes trivial: just make a new list of the current head, and append that to the reversal of the tail.
static <T> List<T> reverse(List<T> xs) {
if (xs.isEmpty()) return xs;
else return append(reverse(xs.tail), list(xs.head, empty()));
}
Recursive Sort Algorithms
Some sort algorithms can be formulated particularly simple using recursion.
Insertion Sort
The idea of insertion sort is that inserting an element x
into an already sorted list xs
trivial:
Skip all elements smaller then x
before inserting.
In our immutable list scenario this means:
“Copy” all values while smaller then x
, then insert x
and append the remaining list.
The actual insertion sort method then just inserts the head into the sorted remaining list.
static <T extends Comparable<T>> List<T> isort(List<T> xs) {
if (xs.isEmpty()) return xs;
else return insert(xs.head, isort(xs.tail));
}
private static <T extends Comparable<T>> List<T> insert(T x, List<T> xs) {
if (xs.isEmpty()) return list(x, empty());
else {
if (x.compareTo(xs.head) < 0) return list(x, xs);
else return list(xs.head, insert(x, xs.tail));
}
}
Merge Sort
Merge sort is a divide-and-conquer algorithm where the key idea is that merging two already sorted lists is trivial: Keep adding the smaller of both lists to your result list until all items have been added. The actual merge sort method then (recursively) splits the input lists into halves until they only contain a single element or none at all – those are already sorted.
static <T extends Comparable<T>> List<T> msort(List<T> xs) {
if (xs.isEmpty()) return xs; // no element at all
else if (xs.tail.isEmpty()) return xs; // only single element
else {
int n = length(xs);
List<T> a = take(xs, n/2);
List<T> b = drop(xs, n/2);
return merge(msort(a), msort(b));
}
}
private static <T extends Comparable<T>> List<T> merge(List<T> xs, List<T> ys) {
if (xs.isEmpty()) return ys;
else if (ys.isEmpty()) return xs;
else {
if (xs.head.compareTo(ys.head) < 0)
return list(xs.head, merge(xs.tail, ys));
else
return list(ys.head, merge(xs, ys.tail));
}
}
Detour: Tail Recursion
Let’s take another short detour.
If you worked out all the above examples yourself, you probably got intimately familiar with StackOverflowError
(link) – the “infinite loop” of recursion.
The reason is that you had that many recursive calls that you ran out of stack memory to store your call data (arguments, variables, etc.).
This is also why some (less informed) programmers call recursive (and to that extent: functional) programming “inefficient”.
They are right: if done wrong, recursive programming can be very inefficient.
Consider for example recursive definition of the Fibonnaci numbers:
long fib(int i) {
if (i < 2) return i;
else return fib(i-1) + fib(i-2)
}
This quickly gets out of hands:
the two recursive calls on the return
statement either take ages to terminate (try 40!) or results in a StackOverflowError
.
There are just too many calls of fib
for larger i
.
Here’s a version of fib
that has only a single unnested recursive call:
long fib(long a, long b, int i) {
if (i == 0) return a;
else if (i == 1) return b;
else return fib(b, a+b, i-1);
}
This not only terminates quickly even for large i
, but some languages and VMs can optimize for this special situation.
Since the single recursive call is always at the end (just with altered parameters), all stack variables for this function can be reused, effectively turning tail recursive functions into for
loops.
While Scala supports this, there is no official support for this in the JVM yet.
Functions as First-Class Citizens
Enough of the torture with recursion, let’s talk about more relevant aspects of functional programming in Java, and how functions as (kind of) first-class citizens can dramatically improve the readability of your code.
for-each
A functional way to traverse our List<T>
and print every element would be:
static <T> void print(List<T> xs) {
if (xs.isEmpty()) return;
else {
System.out.println(xs.head); // (1)
print(xs.tail);
}
}
Obviously, printing is just one thing we might want to do with those elements.
Maybe we want to write them to a file, or send them via network?
But instead of duplicating the code and effectively only changing line (1) of the above example, we can separate the concerns by adding a Consumer<T>
(link) argument to our traversal function to be executed on each element.
@FunctionalInterface
interface Consumer<T> {
void accept(T t);
}
static <A> void forEach(List<A> xs, Consumer<A> c) {
if (xs.isEmpty()) return;
else {
c.accept(xs.head);
forEach(xs.tail, c);
}
}
And here’s a Consumer
that prints elements to System.out
:
List<Integer> xs = list(1, 2, 3, 4);
forEach(xs, new Consumer<Integer>() {
@Override
public void accept(Integer i) {
System.out.println(i);
}
});
// or shorter with lambda
forEach(xs, i -> System.out.println(i));
// or even shorter with method references
forEach(xs, System.out::println);
filter
A different yet very frequent use of lists is to filter them by a particular predicate.
The result of filter
is a list that contains only elements that satisfy some condition.
Let’s do this right away with a helper “function”, a Predicate
(link)
@FunctionalInterface
interface Predicate<T> {
boolean test(T t);
}
static <A> List<A> filter(List<A> xs, Predicate<A> p) {
if (xs.isEmpty()) return xs;
else if (p.test(xs.head)) return list(xs.head, filter(xs.tail, p_));
else return filter(xs.tail, p);
}
List<Integer> xs = list(1, 2, 3, 4);
List<Integer> lt3 = filter(xs, i -> i < 3);
map
The last functional concept for this class is map
.
When working with data, you often need to transform one type of data into another.
For example, you might retrieve a list of Student
, but you actually need only a list of their family names.
That is: given a list of type Student
, you want a list of type String
.
static List<String> familyNames(List<Student> xs) {
if (xs.isEmpty()) return empty();
else return list(xs.head.getFamilyName(), familyNames(xs.tail));
}
Well, this seems fairly generic, doesn’t it? You want to map one object to another, given some function. Let’s try this again, with the logic moved to a functional interface:
@FunctionalInterface
interface Function<A, B> {
B apply(A a);
}
static <A, B> List<B> map(List<A> xs, Function<A, B> f) {
if (xs.isEmpty()) return empty();
else return list(f.apply(xs.head), map(xs.tail, f));
}
List<Student> xs = ...;
List<String> fns = map(xs, s -> s.getFamilyName());
List<String> fns = map(xs, Student::getFamilyName); // even shorter
Our functional approach to traversing, filtering and mapping data successfully separated the “iteration” from the actual logic, which was provided as a “function as an object”.
FP in Java: Streams
So far, we did all the exercises with a pretty useless list class.
But filter
, map
and forEach
are essential tools to process data.
In Java, these functional aspects are not attached (and thus limited) to lists, but to a more general concept of (possibly infinite) data streams. This is more appropriate, since the data may originate from very different sources: web APIs, database result sets, or plain text files.
We’ll talk more about Stream
s next week, but for now, please take note of the following methods:
Stream<T>.filter(Predicate<? super T> p)
Stream<T>.map(Function<? super T, ? extends R>)
Stream<T>.forEach(Consumer<T> consumer)
In our examples above, the filter
and map
methods returned new lists.
Here, these intermediate methods return Stream
s.
Our forEach
method had return type void
; here, this terminal operation also returns void
.
Recall the (iterative) example from the very top: retrieve a list of students, find those who attended a certain class, and then print out the names of the classes on their transcript of records.
for (Student s : Database.getStudents()) {
if (s.getClasses().contains("Softwarearchitektur")) {
Transcript tr = Database.getToR(s.getMatrikel());
for (Record r : tr)
System.out.println(r);
}
}
In functional Java, this becomes (in a most detailed variant)
Database.getStudents().stream()
.filter(s -> s.getClasses().contains("Softwarearchitektur"))
.map(Student::getMatrikel)
.map(Database::getToR)
.flatMap(t -> t.records.stream()) // stream of lists to single list
.forEach(System.out::println);
Isn’t that much more precise as the nested for
loops with if
and method calls?
Lazy Evaluation
One last word on efficiency.
The stream methods are lazy in a sense that the downstream operations are only applied to the actual results of the previous steps.
To stick with the example above, the Student::getMatrikel
would only be applied to those who were passed on by filter
.
In other words: the terminal operation (here: forEach
) pulls data from the streams, all the way to the originating stream.
∎