HomeDigital EditionSys-Con RadioSearch Java Cd
Advanced Java AWT Book Reviews/Excerpts Client Server Corba Editorials Embedded Java Enterprise Java IDE's Industry Watch Integration Interviews Java Applet Java & Databases Java & Web Services Java Fundamentals Java Native Interface Java Servlets Java Beans J2ME Libraries .NET Object Orientation Observations/IMHO Product Reviews Scalability & Performance Security Server Side Source Code Straight Talking Swing Threads Using Java with others Wireless XML
 

In creating the HotScheme interpreter (JDJ Vol. 4, Issue 1), we decided to employ functional programming concepts to Java, our implementation language, whenever it was practical. Functional programming has a number of advantages over more traditional procedural code, which we will enumerate below. The common thread uniting these advantages is an attempt to create code that's conceptually transparent. Employing this functional style directly in Java allowed us to define many Scheme functions in Java the same as they'd be written in Scheme itself. We wouldn't recommend attempting to program Java in a completely functional style -- the resulting Java code would run poorly and appear bizarre and convoluted to most Java programmers. But we do feel that many of the benefits of functional programming can be brought to Java, if some discernment is used about when to apply the style.

What Is Functional Programming?
The functional programming style is characterized by an extensive (sometimes even exclusive) use of "pure" functions. Input to pure functions comes only from their arguments, not on state information stored in local or global variables. (Of course, Java does not have global variables per se, but public static class variables play essentially the same role.) Pure functions produce no side effects, such as setting a counter or a flag, and are called only for their return value. In the idiom of functional programming, procedures that don't meet these strict criteria are called pseudofunctions. The theory of functional programming has a firm foundation in the theory of mathematical notation, where centuries of experience have confirmed the value of a concise, transparent notation for expressing abstract ideas.

There are six key properties of code written in the functional style, which also hold for mathematical statements in their traditional form. (This list of properties is based on Bruce J. MacLennan's Functional Programming, Addison Wesley, 1990.)

  1. Value is independent of the evaluation order.
    In evaluating a pure function, it doesn't matter whether the arguments are evaluated from left to right, from right to left or even from the middle out. Since each expression depends only on inputs and doesn't create any side effects, no order dependencies can ever arise.

  2. Expressions can be evaluated in parallel.
    The first property implies that multiple expressions can be evaluated at the same time, on different processors. Since the final result doesn't depend on evaluation order, neither the programmer nor the compiler needs to worry about the sort of complicated dependencies between different procedural statements that make parallel computation so notoriously difficult.

  3. Referential transparency
    This means that any expression is independent of its context. If you see f(7, 12) in two different places, you can be sure f will return the same value both times.

  4. Absence of side effects
    Fundamentally, this entails avoiding assignments. Assignments are the "go-tos" of data. They create order dependence in code execution, hidden state and nonlocal relationships between pieces of code.

  5. Inputs to an operation are obvious from the written form.
    Because pure functions don't keep state or use global variables, their input appears right there in the function call.

  6. Effects of an operation are obvious from the written form.
    Since a pure function doesn't set any state or produce any other side effects, to use the return value makes it the direct argument to another function. For example, g(f(7, 12), h(1)). It's clear in this code snippet that f is called to supply g with its first argument, and h is called to supply it with its second.

Why Employ It?
Functional programming can aid in the creation of extremely concise code, which, once you are familiar with the style, is easier to reason with than procedural code. Someone unfamiliar with the style might, when viewing a functional program for the first time, suspect that the programmer has been deliberately cryptic. With a great deal of meaning packed into each line of code, the density can be intimidating. However, 2,500 years of mathematical history show that once the language of mathematical expressions is comprehended, mathematicians can glance at a few symbols and grasp what the untutored won't understand in an hour of verbose explanation. Mastery of the functional style having been achieved, economy of expression, which is inseparable from the properties listed above, will facilitate the comprehension of complex programs tremendously.

This economy of expression means that algorithms are expressed more directly in the functional style than in procedural code. Because the code isn't busy maintaining state and creating side effects, each expression tends to directly model a step in the algorithm itself.

C.S. Peirce pointed out that the form of mathematical expressions is iconic in that the arrangement of terms in an expression is a picture, or icon, of their relationship in our minds (Philosophical Writings of Peirce, Dover, 1940. [reprinted]). Mathematical expressions are the original "visual programming," for these expressions are diagrams of the logical relationship of their terms. So it is with functional programming: the return of a pure function depends only on what you see when looking at its written form, namely, its arguments. The use of its return value is clearly visible from its context, where it will appear as an argument to some other function. In short, a pure function has manifest interfaces. Two examples will illustrate the difference between a pure function and a pseudofunction in this regard.

The following is a pure function. The return depends only on the value of the parameter x:

int succ (int x) { return x + 1; }

The following is a pseudofunction. The return depends on the value of x and the "hidden" value of a:

int plusA(int x) { return x + a; }

Feeding the function 7 as an argument won't always return the same thing. For you to understand what will happen when plusA() is called, it's not enough to look at the local contexts of the call and the function definition. You must also hunt through the code to where a is defined and then determine where and how its value is set. If the value of a depends on some other global or state variable, then you can soon find yourself reading an entire program in order to understand a single statement.

The manifest interfaces of pure functions allow easier proof of correctness. You can imagine how having to follow the state of persistent variables around hundreds of lines of code adds tremendous complexity to a formal proof and pressure for the program to correctly implement its requirements.

This can be expressed more formally by saying that the syntax graph and the dataflow graph for a functional language have identical, treelike structures. Subexpressions that communicate data to each other are always found to be adjacent in the syntax tree. This is not true with procedural code as assignments allow nonlocal communication - an assignment in one module of a program can have an effect in an entirely different module when the variable is finally used. Note that encapsulation does not make this any less true; just because you're accessing the variable through a function, rather than directly, doesn't change the degree of nonlocality.

The absence of state makes functional programming inherently "thread safe." There are no persistent variables to worry about locking, and therefore no critical code portions that can't be entered simultaneously by different instances of the same function. Since no pure function maintains state or creates side effects, it is, by definition, safe to execute as many of them in parallel as the environment cares to run.

Java has many features that reduce bug count significantly. It forces you to have accurate arguments and returns, catches or throws for all thrown exceptions and so forth. The absence of pointers eliminates the source of many bugs in C and C++ programs. There is also runtime trapping (array bounds, etc.), so bugs that slip by the compiler are often found in the first test run of the program rather than lurking silently until a real user hits some peculiar condition and the program blows up.

Even so, functional programming offers a whole other level of "bug protection." The lack of assignments (in a pure functional program) removes the need to worry about state, and the possibility of bugs arising from failing to account for all possible pseudofunction states disappears. The inherent thread safety also gets rid of many of the difficulties inherent in debugging threaded programs that are written in a procedural language.

How to Do It in Java
As we mentioned earlier, we don't view functional programming as a panacea, nor do we recommend its exclusive use, especially when working in a nonfunctional language such as Java. But for those times when you do want to use the functional style in Java, we can recommend the following measures:

Have functional primitives available early on, then use them in higher-level functions designed later.

For instance, very early in the project we implemented Length(), first(), second() and restl(), all of which operate on a list argument and return the length, the first item, the second item and a list of all items but the first of that argument, respectively. We also defined a cons() function to build a new list from an object and a list.

Once these functions were done, we used them frequently. See Listing 1 for an example of how we used each of these primitives several times in a fairly small class.
Avoid assignments.
Instead of writing:
int f4(a, b, c, d) {
int x = f1(a, b);
int y = f2(c, d);
int z = f3(x, y);
return z;
}

Write:
int f4(a, b, c, d) {
return f3(
f1(a, b),
f2(c, d));
}

You may find some who claim you are being obscure by writing the second version. However, note that it precisely illustrates the use of all the arguments and subcalls in one statement, while the first version has four times as many lines and three new, unnecessary variables.

Use auxiliary functions as another substitute for assignments.
See Listing 2 for an example.


Use recursion instead of looping.
public final long Length() throws SchemeException
{
// the last object will be #f -- see Length there!
return 1L + cdr.Length();
}

Can we do functional programming in Java and still take advantage of its object-oriented features? We'll examine this question in terms of polymorphism, inheritance, encapsulation and the use of class libraries.

Polymorphism is a great aid in writing functional Java. Many of the core functions in the functional style (those for forming and pulling apart lists, for mapping functions over lists and for searching lists) can accept many data types as arguments and may return an object of a different type, depending on what is passed to them. Polymorphism obviates the need to create versions of these functions for all the different permutations of argument and return type.

Inheritance doesn't lose any of its applicability in "functional Java." You can still subclass and override methods in subclasses as long as the new methods are pure functions; then you haven't lost any purity of functional style.

Encapsulation becomes less important in functional programming since there are fewer state variables and less to encapsulate. However, when you break the functional mold and do employ state variables for performance or code simplicity, encapsulation plays an important role in hiding and localizing the nonfunctional parts of the program.

The standard Java library is quite extensive and is partly responsible for the language's success. It and other third-party libraries can be used in one of two ways in a (mostly) functional Java program. The first possibility is to encapsulate them in a functional layer, then use that layer for the rest of the program. The other is simply to accept them as being among the nonfunctional parts of the program wherever they need to be used.

Results and Trade-offs
Using the functional style may result in some performance loss in some situations as a result of the increased number of function calls and recursion. In other cases performance may increase because of fewer temporary assignments.

Code may be hard to understand for those unfamiliar with style, a difficulty that is exacerbated by the fact that Java wasn't designed as a functional language. In a language like Scheme, the notation:

(print (eval (make term START global_env)))

is easier to decipher than the OOP version:
SchemeObject.make(term, SchemeObject.START,
global_env).Eval(global_env).Print();

However, we were generally satisfied using the functional style in the HotScheme interpreter. We were able to implement new Scheme functions and even syntactic constructs with remarkably few lines of code, and they were almost always correct the first time we wrote them. Also, to code in this style, we had to think like a Scheme interpreter, so there was a unity of conceptualization between the application and the code that implemented it.

About the Authors
Gene Callahan, president of St. George Technologies, designs and implements Internet projects. He has written articles for several national and international industry publications. Gene can be reached at [email protected]

Robert Dodson is a software developer who writes options-trading software in Java and C++ for OTA Limited Partnership. Previous projects include weather analysis software, tactical programs for Navy submarines, and code for electronic shelf labels. He can be reached at [email protected]

	

Listing 1.
 
// this is the class that 
// implements the Scheme ımapı command 
class Map extends BuiltIn 
{ 
    public SchemeObject Apply(SchemeObject args, Environment env) 
    { 
        return map_aux(args.first(), args.second(), 
            (args.Length() > 2) ? (args.restl()).restl() : null, 
            env); 
    } 
    private SchemeObject map_aux(SchemeObject f, 
        SchemeObject ls, SchemeObject more, Environment env) 
    { 
        if(more == null) return map_aux1(f, ls, env); 
        else             return map_aux_more(f, ls, more, env); 
    } 
    private SchemeObject map_aux1(SchemeObject f, 
        SchemeObject ls, Environment env) 
    { 
        if(ls.Nullp()) return SchemeObject.False; 
        else 
            return SchemeObject.cons( 
                f.Apply(SchemeObject.cons(ls.first(), 
                SchemeObject.False), 
                env), 
                map_aux1(f, ls.restl(), env)); 
    } 
    private SchemeObject map_aux_more(SchemeObject f, 
        SchemeObject ls, SchemeObject more, Environment env) 
    { 
        if(ls.Nullp()) return SchemeObject.False; 
        else 
            return SchemeObject.cons( 
                f.Apply( 
                    SchemeObject.cons( 
                        ls.first(), 
                        map_aux(First.car, more, null, env) 
                    ), 
                    env), 
                    map_aux_more(f, ls.restl(), 
                        map_aux(Rest.cdr, more, null, env), 
                        env)); 
    } 
} 
  

Listing 2.
 
// This code implements the Scheme "begin" syntax, which 
//  executes a number of expressions in sequence. 
//  Syntax: (begin exp1 Š) 
public final SchemeObject Apply(SchemeObject args, 
    Environment env) 
  throws SchemeException 
{ 
  return begin_aux(args, args.Length(), env); 
} 
private final SchemeObject begin_aux(SchemeObject args, int len, 
    Environment env) 
  throws SchemeException 
{ 
  switch(len) 
  { 
    case 0: return False; 
    case 1: return args.first().Eval(env); 
    default: 
      args.first().Eval(env); 
// we evaluate the first statement, then call begin_aux 
// recursively on the rest of the statements 
      return begin_aux(args.restl(), len - 1, env); 
  } 
} 


  

 

All Rights Reserved
Copyright ©  2004 SYS-CON Media, Inc.
  E-mail: [email protected]

Java and Java-based marks are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. SYS-CON Publications, Inc. is independent of Sun Microsystems, Inc.