Homoiconicity revisited

Jun 02, 2020 — Tags: Musings

In which we explore the possible meaning of “Homoiconic” by ignoring all pre-existing definitions and providing one of our own.

In an earlier article, I concluded that you probably shouldn’t use the word “homoiconic”: starting with the original definition of the word, I noted that this definition is problematic for a number of reasons, and that the best we can say is that there is a degree to which a language is homoiconic: languages that have a smaller conceptual distance between their program text and machine operation are more homoiconic, and vice versa.

That article also explored some of the plethora of competing, often mutually exclusive definitions in active use – a fact that should come as a warning to anyone whose intention is to clearly communicate their ideas.

It is therefore somewhat disappointing (though not surprising) that when the article was discussed on Hacker News, the top-voted comment started like this:

I’m going to keep using the word ‘homoiconic’ because it is a useful term.

True to form, that commenter neither explained why the term would be useful, nor provided a definition of their own.

Given that people will continue to use the word, what are we to understand if they do? Let’s revisit the topic, but start at the other end: instead of taking the official definition as a starting point, let’s say a word means what the people who use it mean by it. Even if that meaning has little to do with the original definition, and even if they are not willing to put forward a definition of their own.

If we don’t want to start with the official definition, we’re faced with a bit of a challenge though: what are we to take as a starting point for the discussion? To get around it, we’ll simply start with a definition of our own, and then match this back to typical examples and counter-examples of homoiconicity to see how it fits.

Homoiconic, a working definition

Homoiconic usually refers to some combination of the following (weights assigned to the bullets below will depend on the speaker):

Strong language support for simple, composable (i.e. tree-like) data structures, preferably using a minimal syntax (i.e. literals).
The language semantics are directly defined in terms of such data structures and the programs are formed using instances of these structures.
The structure of such data structures is explicitly reflected in their visual representation on-screen; it is immediately apparent to humans.
High similarity between representations of the program in (some of):
- the head of the programmer
- the visual representation on-screen
- the formal semantics of the language
- the implementation of the (virtual) machine

The benefit of homoiconicity is then: first, the structured manipulation of programs, possibly by other programs, becomes trivial and natural. Second, there is little mental effort spent in mapping the program text to the program structure, and in imagining how the machine operates on said structure.

Examples and counter-examples

Lisps tick all, or most, of the boxes:

S-expressions take the role of the simple, composable data structures.
Lisps’ semantics are directly defined in terms of s-expressions and lisp programs are formed using s-expressions.
S-expressions are trees, and the nesting of items is immediately apparent: ( denotes the start of a child, and ) its end. Children are nested inside their parents.
Lisp programmers think in s-expressions, which are always on-screen, and form the basis of the semantics of the language (always) and the implementation of the VM (if it is an interpreter).
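To make the four points above a little more concrete, here is a minimal sketch in Python rather than in a Lisp. It assumes a hypothetical mini-language in which a program is either a number or a nested list of the form [operator, operand, ...]; the names evaluate, rename_operator and OPERATORS are made up for this illustration. The point is only that the same nested structure is both what gets evaluated and what another program can rewrite.

```python
import operator

# Hypothetical mini-language: a program is either a number or a nested
# list of the form [operator_name, operand, operand, ...].
OPERATORS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested-list expression by walking the structure directly."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPERATORS[op](result, value)
    return result


def rename_operator(expr, old, new):
    """Return a copy of the program with operator `old` replaced by `new`."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    head = new if op == old else op
    return [head] + [rename_operator(arg, old, new) for arg in args]


program = ["+", 1, ["*", 2, 3]]                  # reads like (+ 1 (* 2 3))
rewritten = rename_operator(program, "+", "*")   # a program rewriting a program

print(evaluate(program))    # 7
print(rewritten)            # ['*', 1, ['*', 2, 3]]
print(evaluate(rewritten))  # 6
```

Note that no parser appears anywhere in the sketch: the program literal is already the tree that the evaluator and the rewriter operate on.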

Other languages tick only some of the boxes, or none of them.

JavaScript has strong language support for composable, tree-like structures (bullet 1) and the structure is immediately apparent by looking at the pairs of { and } brackets (bullet 3). However, the language semantics are not defined in such terms (bullet 2) and a JavaScript program is not provided as a JavaScript object (or JSON object).
For machine code, little mapping between the representations for the human and the machine is required, for the simple reason that no special representation for humans exists (bullet 4). However, machine code fails on bullets 1 to 3. I’d argue that few people who praise the benefits of homoiconicity actually think of machine code while doing so, though they might admit machine code fits some definition of homoiconicity when pressed.
XSLT is defined in XML, i.e. in composable, tree-like structures, so it gets a pass on bullets 1, 2 & 3 (with the exception of “simple syntax”). This is another one for “homoiconic, but not a poster child”.
TRAC probably fails on bullets 1, 2 & 3, even though it was the language in the context of which the term homoiconic was coined. This is simply a reflection of the fact that the term has evolved, and that modern-day usage is different from the original meaning.
Java has no strong language support for literal representation of simple composable data in the language (bullet 1) and also fails to meet the other criteria.
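The gap described for JavaScript and Java (program text on one side, structured data on the other) can be sketched with Python as a stand-in, since Python is in a similar position: it has good literals for nested data, but a program is written as text rather than as such a literal. The sketch below uses the standard library ast module; the variable names are only for illustration.

```python
import ast

# The program as the programmer writes it: plain text, not a data literal.
source = "1 + 2 * 3"

# A separate parsing step is needed to recover the tree structure.
tree = ast.parse(source, mode="eval")
top = tree.body  # the outermost node: a binary operation

outer_op = type(top.op).__name__        # operator of the outer expression
inner_op = type(top.right.op).__name__  # operator of the nested expression

print(outer_op, inner_op)
print(eval(compile(tree, "<sketch>", "eval")))
```

The tree exists, and the language can manipulate it, but its shape (node classes such as BinOp with left, op and right fields) is something the reader has to know about separately; it is not what is on screen.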

What Lispers don’t tell you

So is the above definition new in any way? I think it is, in the sense that it spells out some things that might be so obvious to Lispers that they forget to include them in their definitions. (“What’s water?” says the fish.)

Take for example the definition on Ward’s Wiki:

In a homoiconic language, the primary representation of programs is also a data structure in a primitive type of the language itself.

A common objection against this definition being meaningful is that this is or can be the case for most languages: e.g. in Java the primary representation of programs is as text (strings), and strings are also a primitive type of the language itself. Such an objection (understandably) glosses over the meaning implied by “a data structure in”.

For non-Lispers, that phrase might not evoke much, or may perhaps suggest an implementation detail (since “data structure” is often used to refer to the implementation of data types, e.g. a string might be implemented as an array, in which case the array is the underlying data structure).

For Lispers, this part is essential though, because for them it is naturally a reference to s-expressions. And thus for Lispers it evokes 2 parts of the definition that I made explicit in the above under bullets 1 and 2, namely:

the compositional nature of the data type under consideration
the fact that data and programs can be composed in the same way

As a second example, consider the opening line of the current definition on Wikipedia:

A language is homoiconic if a program written in it can be manipulated as data using the language

Here, again, the objection can be raised that this is true for all languages. That is, all languages allow for programs written in them to be manipulated by them “as data” (i.e. as strings). After all, strings are data.

Well… not for Lispers, to whom “data” will likely evoke something more structured, such as s-expressions (again). In other words, the key property here is that the program is a piece of hierarchically structured data (bullets 1 & 2 in the above).
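The difference between “data” as a string and “data” as a hierarchical structure can be sketched as follows. The example assumes a made-up expression, (max x (exp x)), held once as plain text and once as a nested Python list; rename_symbol is a hypothetical helper written for this illustration.

```python
# The same (made-up) program, once as flat text and once as a tree.
program_text = "(max x (exp x))"
program_tree = ["max", "x", ["exp", "x"]]


def rename_symbol(expr, old, new):
    """Rename a symbol by walking the tree and comparing whole symbols."""
    if isinstance(expr, list):
        return [rename_symbol(item, old, new) for item in expr]
    return new if expr == old else expr


# String manipulation has no notion of symbols, only of characters,
# so it also rewrites the "x" inside "max" and "exp".
as_string = program_text.replace("x", "y")

# Tree manipulation only touches elements that are exactly the symbol "x".
as_tree = rename_symbol(program_tree, "x", "y")

print(as_string)  # (may y (eyp y))
print(as_tree)    # ['max', 'y', ['exp', 'y']]
```

Both versions manipulate the program “as data”, but only the second one works at the level of the program’s structure, which is presumably what the definition is getting at.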

Finally, let’s consider how Wikipedia continues:

and thus the program’s internal representation can be inferred just by reading the program itself.

Here, the typical counterpoint would be that this is either never true (because the inner workings of the machine are hidden from us) or always true (if we assume that we are being provided the specification).

Fitting this sentence back to bullets 3 & 4 makes more sense though: the key point is not that the inner representation can be inferred, but rather that a “good enough model” is easy enough to imagine while looking at the program text.

“Homo” reinterpreted

If we indeed accept the working definition in the above as what most people who speak about homoiconicity actually mean, it seems that an interesting shift has occurred. Take one more look at the original definition:

Because TRAC procedures and text have the same representation inside and outside the processor, the term homo-iconic is applicable, from homo meaning the same, and icon meaning representation.

Now compare this with the working definition in the above: bullets 1, 2, and 3 are not at all concerned with sameness of internal and external representation. However, they are concerned with different kinds of sameness: mostly, the fact that data structures in the code and the program text are represented in the same way.

Conclusions

I’d still argue that homoiconicity is a concept that confuses more than it clarifies, mostly because of the many competing definitions.

Anyone who wants to get a point across is better off simply referring to more direct properties of a language such as “has a nice literal syntax for structured data” or “the shape of the AST is immediately apparent from looking at your screen”.

Still, faced with continued usage, the working definition above might at least serve as a dictionary for the perplexed.