Functions, Messages, and Words

Overloading and punning are two related types of abstraction.

Abstraction permits us to consider only the essential use of a tool. Overloading allows us to have one named tool that serves multiple purposes. Punning, as in nil punning or type punning, is the practice of using a tool in multiple contexts that, at first glance, seem to be in different domains, but whose interplay proves elegant and expressive.

Recently I realized that several of the most novel, most influential, and highest signal-to-noise-ratio programming languages employ a fundamental kind of overloading and punning that represents a spectrum of generality for what it means to express high-level action in a language.

By high-level action, I mean the evaluation of named subroutines.

The languages in question are Lisp, Smalltalk, and Forth. Whether these languages support concrete function or method overloading is not the point; it is that these languages overload the fundamental indirection of letting names stand for actions in a program.

Functions

In Lisp, code is written as lists. Lists are delimited syntactically with parentheses ().

The first item in a Lisp list-as-code must be a function, because it will be invoked. The remaining items in the list (if any) are passed as arguments to that function.
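As a rough illustration of that rule, here is a minimal sketch in Python rather than Lisp. Nested Python lists stand in for Lisp lists, and the FUNCTIONS table and evaluate helper are hypothetical names invented for this example; a real Lisp resolves names through environments and handles far more than numbers.

```python
import operator

# Hypothetical table of named functions, invented for this sketch.
FUNCTIONS = {"+": operator.add, "*": operator.mul, "max": max}

def evaluate(form):
    """Evaluate a nested list as code: the first item names the function,
    the remaining items are evaluated and passed as its arguments."""
    if not isinstance(form, list):
        return form  # anything that is not a list evaluates to itself
    head, *args = form
    fn = FUNCTIONS[head]
    return fn(*[evaluate(arg) for arg in args])

# The Lisp expression (+ 1 (* 2 3)) written as a Python list.
result = evaluate(["+", 1, ["*", 2, 3]])
print(result)  # 7
```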

There are three types of function-like elements in Lisps: functions, which are invoked at evaluation time (what we think of as regular, standalone functions); macros, which are functions invoked at macro-expansion time, before the code is evaluated, and which return code that replaces the macro invocation; and a small closed set of special forms provided by the language that implement unique evaluation semantics necessary for fundamental language features.

Syntactically, nothing differentiates these functional entities when they are invoked. But a built-in function might be implemented in machine code rather than Lisp itself; a single macro call might expand into hundreds of sub-expressions; and special forms can implement things that neither functions nor macros can, like control flow or hooks into exception handling.
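To make the three kinds of head concrete, here is a toy evaluator sketched in Python, not a real Lisp. The names FUNCTIONS, MACROS, expand_unless, and the unless macro itself are assumptions made up for this example. Note that the calling syntax is the same list shape in every case; only the evaluator knows which kind of thing the head refers to.

```python
import operator

# Hypothetical tables invented for this sketch.
FUNCTIONS = {"+": operator.add, "<": operator.lt}

def expand_unless(test, then, otherwise):
    # Macro: receives unevaluated code and returns new code.
    # (unless test a b) becomes (if test b a)
    return ["if", test, otherwise, then]

MACROS = {"unless": expand_unless}

def evaluate(form):
    if not isinstance(form, list):
        return form  # non-list values evaluate to themselves
    head, *rest = form
    if head == "if":
        # Special form: only the chosen branch is ever evaluated,
        # which a regular function could not arrange.
        test, then, otherwise = rest
        return evaluate(then if evaluate(test) else otherwise)
    if head in MACROS:
        # Macro: expand first, then evaluate the resulting code.
        return evaluate(MACROS[head](*rest))
    # Function: evaluate every argument, then call.
    return FUNCTIONS[head](*[evaluate(arg) for arg in rest])

print(evaluate(["unless", ["<", 1, 2], "small", "big"]))  # big
```

The untaken branch of if is never evaluated, so it may even name a function that does not exist in this toy without causing an error.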

Importantly, user-defined functions are on completely equal syntactic footing with all of the built-in functions, macros, and special forms.

Lisps like Clojure extend this concept even further by allowing certain data structures to be treated as functions. When invoked, a Clojure vector is a function with a domain of its indices and a range of its values; a map is a function with a domain of its keys and a range of its values; a set is a function of its members; and keywords, when invoked, know how to look themselves up in an associative data structure.
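The flavor of this can be approximated in Python by making collections callable. This is only a sketch: Vec, Map, Set, and Keyword are hypothetical classes invented here, and they mimic the behavior described above rather than Clojure's actual implementation.

```python
class Vec(list):
    """A list that, when called, maps an index to its value."""
    def __call__(self, index):
        return self[index]

class Map(dict):
    """A dict that, when called, maps a key to its value."""
    def __call__(self, key, default=None):
        return self.get(key, default)

class Set(frozenset):
    """A set that, when called, returns the member if present, else None."""
    def __call__(self, member):
        return member if member in self else None

class Keyword(str):
    """A name that, when called, looks itself up in a mapping."""
    def __call__(self, mapping, default=None):
        return mapping.get(self, default)

person = Map({"name": "Ada", "langs": Vec(["lisp", "smalltalk", "forth"])})
name = Keyword("name")

print(person("name"))      # Ada
print(name(person))        # Ada
print(person("langs")(2))  # forth
print(Set({1, 2})(2))      # 2
```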

A further way that Clojure extends the reach of the function abstraction is by supporting overloading by valence. Clojure functions support a separate implementation for each arity defined, and callers need only provide the correct number of arguments to engage the desired function body. While these different arities usually perform related operations, this isn't a requirement.
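Python has no built-in overloading by arity, but the idea can be sketched by choosing a body based on how many arguments arrive. The greet function and its three bodies are made up for this illustration; in Clojure each arity would simply be declared as its own clause of a single function definition.

```python
def greet(*args):
    """Pick an implementation based on the number of arguments passed."""
    bodies = {
        0: lambda: greet("world"),                        # zero-arity delegates to one-arity
        1: lambda name: f"Hello, {name}!",
        2: lambda greeting, name: f"{greeting}, {name}!",
    }
    try:
        body = bodies[len(args)]
    except KeyError:
        raise TypeError(f"greet has no body for {len(args)} arguments") from None
    return body(*args)

print(greet())             # Hello, world!
print(greet("Ada"))        # Hello, Ada!
print(greet("Hi", "Ada"))  # Hi, Ada!
```

Here the arities happen to be related, with the shorter ones filling in defaults, but nothing in the dispatch itself requires that.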

The context in which Lisp's general abstraction for "action" operates is invocation. However, Lisp also treats functions as first-class values. When we consider the use of functions as values, their generality in Lisp begins to break down. While you can treat regular functions as values, Lisps do not permit treating macros or special forms in the same way, e.g., you cannot pass a macro as an argument to a function.

Messages

In Smalltalk, everything is an object, objects can receive messages, and actions take place by sending messages to objects.

Users define methods in classes that map to the messages the instances of those classes know how to respond to.

Existing classes can be extended with user-defined methods, so that use-cases not foreseen by the class authors can be implemented directly within foundational classes (classes for numbers, collections, etc.).

If an object's class does not implement a message directly, the lookup continues through its parent classes in the inheritance hierarchy. Failing that, Smalltalk's root classes provide a mechanism (the doesNotUnderstand: message) for users to decide dynamically what to do with messages for which no class in the hierarchy has an implementation. Since classes are also objects, inspecting their definition is trivial.
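The lookup-then-fallback behavior can be sketched in Python with an explicit send method. Everything here (SmalltalkObject, send, does_not_understand, and the Animal, Dog, and Ghost classes) is a hypothetical model invented for this example, not how a Smalltalk virtual machine is implemented.

```python
class SmalltalkObject:
    def send(self, selector, *args):
        """Deliver a message: look the selector up through the class
        hierarchy, and fall back to does_not_understand if nothing matches."""
        method = getattr(type(self), selector, None)
        if method is None:
            return self.does_not_understand(selector, args)
        return method(self, *args)

    def does_not_understand(self, selector, args):
        # Default behavior at the root of the hierarchy: signal an error.
        raise AttributeError(f"{type(self).__name__} does not understand {selector}")

class Animal(SmalltalkObject):
    def name(self):
        return "animal"

class Dog(Animal):
    def speak(self):
        return "Woof"

class Ghost(SmalltalkObject):
    # A class may decide dynamically what to do with unknown messages.
    def does_not_understand(self, selector, args):
        return f"ignored {selector}"

print(Dog().send("speak"))    # Woof (found on Dog)
print(Dog().send("name"))     # animal (found on the parent class)
print(Ghost().send("fly", 1)) # ignored fly (handled dynamically)
```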

On the one hand, Smalltalk's message abstraction at first appears less general than Lisp's functional abstraction, in that one agent is required in Lisp (the function) but two are required in Smalltalk: a receiver object and a message. On the other hand, the uniformity of treating all values as objects; messages as the only public interface to perceiving and acting on those values; the constructs used to define and create objects also being objects available for user code to interact with; and allowing dynamic interpretation of messages at runtime all combine into an overall more uniform and general abstraction for expressing action in programs.

In comparing Lisp and Smalltalk in this way, I reflect on the importance of program context and the position embodied by these two languages.

From a reductionist perspective, there are many legitimate programs in which there is no context worth naming or formalizing. Lisp excels for these cases.

From a compositional perspective, avoiding unnecessary contexts and creating ruthlessly general ones allows one to compose contexts in ways unforeseen by the original program authors. If the salient aspects of a context are reified in an associative data structure (e.g., a hash map) that is built into the language, that built-in functions know how to interact with, and that common protocols in the language ecosystem also compose with, then it becomes trivial to integrate that context into new contexts. This composition can be complete (relying on the whole original context) or partial, focusing on data as context. Clojure facilitates just this kind of data-first system definition.

In systems of sufficient complexity, however, great care has to be taken to introduce names for and boundaries around important contexts. Objects and their messages in Smalltalk provide a general mechanism for introducing both, whereas in Clojure it is left up to the language user to select appropriate tools. An object system like Smalltalk's within a Clojure world of immutability and functional programming would be a powerful asset in this regard.

Words

Both Lisp and Smalltalk have a dedicated language context in which their general abstraction for action operates (function invocation, message sending). Forth wins the trophy for most general by requiring no special context.

Forth is a concatenative language, meaning that source code for programs consists of a concatenation of forms that are evaluated in order. Except for specialized forms that support program evaluation itself, all forms are space-separated words that can add items to or remove items from a shared stack.

When evaluated, a word can perform any action. It might take nothing from the stack; it might clear the stack; it might take four items from the stack, perform a calculation, and put two new items on the stack.

Because Forth words manipulate a shared, implicit stack, there is no lexical scope or localized data flow. This means that programs can be factored rather than refactored: draw a line around any sequence of words, replace them with a single word whose definition is that sequence of words, and you have an equivalent program.
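The factoring claim can be checked with a toy concatenative interpreter, sketched here in Python. The run function, the WORDS table, and the square word are assumptions invented for this example, and the toy understands only integers and a handful of words, unlike a real Forth.

```python
def run(source, words, stack=None):
    """Evaluate space-separated words in order against a shared stack."""
    stack = [] if stack is None else stack
    for token in source.split():
        if token in words:
            definition = words[token]
            if callable(definition):
                definition(stack)              # primitive word
            else:
                run(definition, words, stack)  # word defined as a sequence of words
        else:
            stack.append(int(token))           # anything else is an integer literal
    return stack

def dup(stack):
    stack.append(stack[-1])

def mul(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)

def add(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

WORDS = {"dup": dup, "*": mul, "+": add}

# Original program: square 3, square 4, add the results.
original = run("3 dup * 4 dup * +", WORDS)

# Factored program: draw a line around "dup *" and give it a name.
factored = run("3 square 4 square +", {**WORDS, "square": "dup *"})

print(original, factored)  # [25] [25]
```

Because every word reads from and writes to the same implicit stack, the definition of square needs no parameter list; the sequence "dup *" means the same thing wherever it appears.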

Conclusion

Recently, I have been spending more of my programming time in Pharo Smalltalk from inside Glamorous Toolkit. While writing Smalltalk code, I was struck with the realization of this spectrum of generality from Lisp functions to Smalltalk messages to Forth words. This post is just a meditation on that spectrum; it isn't a call for developers to use one system over another, given there are advantages and disadvantages to each at conceptual and operational levels far beyond this scope.


Tags: factor-language clojure pharo-smalltalk programming-language functional-programming smalltalk forth-language concatenative-language glamorous-toolkit

Copyright © 2024 Daniel Gregoire