Looking for Golems in all the wrong places...

"...one has only learnt to get the better of words
For the thing one no longer has to say, or the way in which
One is no longer disposed to say it."

T.S. Eliot; East Coker


"...when you finally have the pleasure of saying the thing you mean to say at the moment you mean to say it, remorse inevitably follows."

Nora Ephron; You've Got Mail

A young woman seated at her computer sends romantic messages to her unseen beau, a Golem, seated at his own computer and responding in kind.

I know I've been talking about Golems rather a lot in this series, but honestly, the parallels are just too compelling to ignore.

The Golem was an inanimate object in the shape of a human, given life by the power of written language. A Large Language Model is a computer system that has effectively been "brought to life" (if you want to be poetic about it) by the sum total of everything expressed in language... more or less. (You can read my thoughts about the Golem's "forbidden name of God" as an analog of A.I. training data by clicking here.)

Like the Golem, a Large Language Model is not "alive" in any human sense. But it acts like it's alive... or to be precise, it talks like it's alive. 

"It turned out that when you fed the sum total of virtually all available written material through a massive array of silicon wood chippers, the resulting model figured out on its own how to extrude sensible text on demand."

Gideon Lewis-Kraus, writing in the New Yorker magazine

I am not claiming for one moment that A.I. is even close to achieving sentience (although plenty of other people seem to be less certain about that point). If Large Language Models sound human, it is simply because they are getting very good at putting the right words in the right order, and human language has been specifically built for humans to use.

Let me explain that for a moment.

English (along with most modern languages) is constructed according to human perceptions of Reality. English grammar is built around an implicit "sense of self": there is "me" (the person speaking) and there is "not me" (everything else, including whoever is out there reading this). Thus English gives us first-person (me), second-person (you) and third-person (them). When I construct a sentence using the first person (such as this very sentence) you (the sentient being on the receiving end of the sentence) intuit that I am possessed of a Sense of Self ("I think, therefore I am...") and I am describing something about that Self. If I use a verb (like "construct") you understand that I am describing an act performed by this specific self. Verb tenses tell you that "I" experience the linear passage of time, and can distinguish between Reality as it exists right now, and Reality as it occurred in some past moment (to say nothing of the Reality yet to occur). 

When Large Language Models generate coherent sentences in English, they are going to sound human simply because the language itself sounds human. If a Language Model generates the sentence "I have enjoyed our chat" we immediately form a mental image of a coherent entity with a sense of self ("I") who has experienced (present perfect tense, ergo the passage of time) an emotional response ("enjoyed") to an exchange of ideas with a discrete entity that it can recognise as "not itself".

None of these things may be true. A ChatBot (probably) has no sense of self, and it (very likely) doesn't experience emotions. And if it has no sense of self then it's unlikely to comprehend the idea of "other". The only thing "it" has is the ability to generate sentences like "I have enjoyed our chat" because that's what language does.

And this of course is the distinctive feature of language: it will never be abstract; it always needs to signify something. That's what makes it language.

"Words Without Thoughts..."

When musicologists talk about music, they distinguish between "programme music" (music that tells a story) and what they call "absolute music" or "pure music". Pure music has no narrative; it doesn't describe ardent lovers or wayward mischief-makers or walking brooms or anything at all: it simply is.

Mickey Mouse animates the broomstick in "The Sorcerer's Apprentice" as depicted in "Fantasia"

For a long time there was a general tendency to look down on programme music, as if there was something inherently inferior about music that "signified" something outside itself. (Can you guess where I'm going with this?) Even the very term "pure" suggests that programme music is somehow... impure? Corrupted? Tainted?

The Harvard Brief Dictionary of Music didn't mince words when discussing the relative merits of programme music:

"There is a certain weakness inherent in the underlying principle of program music. Music is basically an art in its own right, and too great a reliance on extraneous associations is likely to weaken rather than enhance the artistic value of the composition. The great vogue which program music enjoyed during the 19th Century led to a deplorable misunderstanding of the fundamental nature and purpose of music."

The Harvard Brief Dictionary of Music; 1960 edition

I would strongly disagree with their highly elitist dismissal of programme music, but that's a conversation for another time. My point right now is that unlike music, language must always refer to something outside itself. Shakespeare may have talked about "Words without thoughts", but can a Signifier be a signifier if it has no Signified? A word that doesn't signify something isn't a word. 

’Twas brillig, and the slithy toves
      Did gyre and gimble in the wabe:
All mimsy were the borogoves,
      And the mome raths outgrabe.

Lewis Carroll; Jabberwocky

The word "DOG" is only a dog if everyone agrees that it signifies "🐕". Language only works if it refers to the Reality outside itself (experiments like Lewis Carroll's Jabberwocky notwithstanding), but Large Language Models are not privy to that external Reality. Their ability to generate language is a function of structures and patterns rather than concepts and meanings, and in that sense they are treating language as if it were "pure music", generating coherent speech from the inside out rather than the outside in. The Sorcerer's Apprentice is about magic brooms and out-of-control flooding, but the music is music: it's F#s and B♭s and time signatures and chord progressions.

Would it be hypothetically possible to compose a work that coherently tells the story of the Sorcerer's Apprentice if you know everything about music, but have no concept of "water" or "brooms" (or anything else in physical Reality)? If you simply generate the music, one note at a time, following the rules and conventions of Western harmony and structure and the sympathetic vibrations of sound waves; could the narrative (the Signified) just emerge, all by itself?

the splinters of wood form into an army of animated broomsticks in "The Sorcerer's Apprentice"

This is the paradox of a system that generates language without any comprehension of meaning... because language is only language if it signifies something. Language by definition is programmatic. Without meaning it's just organised sounds (or organised shapes on a page). René Descartes may have reduced the human condition to "I think, therefore I am," but a Large Language Model's existence is more Gertrude Stein: "There's no there there."

ChatBots have mastered the internal organisation of Language without any awareness of the exterior meaning; they have effectively achieved "pure" language. The Harvard Dictionary of Music would be thrilled.

But here's where it gets really interesting: none of this actually matters.

The ChatBots may not exist in a world of Signifieds, but we do... and more importantly, language itself does. Valid sentences don't just contain subjects, verbs and objects; they contain meanings. Any system that can generate language is going to generate meaning, whether it knows it or not. "I think, therefore I am" is a philosophical precept, but it's also a sentence, with Signifiers and Signifieds.

I can stand in front of a mirror and say "I think, therefore I am," because I am a human with a sense of self. But what if a ChatBot says "I think, therefore I am"? Does the sentence still have meaning? What if I put those words into the (non-existent) mouth of a fictional (unreal) character?

"I think, therefore I am," said Hamlet.

For that matter, what happens when René Descartes says this? Descartes has been dead for 376 years; he no longer thinks, and also he isn't. And yet here he is, still saying it:

"I think, therefore I am," said Descartes.

(Actually he said it in Latin, but let's not get distracted; this is tortuous enough already.)

When those words are spoken by anything non-human (a fictional character; a ChatBot; a dead philosopher) the language itself is effectively "creating" an illusion of personhood. The letters of the alphabet are manifesting an entity with a sense of self.

Remind you of anything?

The Golem follows orders in "Der Golem"

"I think, therefore I am," said the Golem.

"The Sentience of the Sentence"

In one of the letters preserved in 84 Charing Cross Road, Helene Hanff expresses her disapproval of fiction as a general concept.

"I never can get interested in things that didn't happen to people who never lived," she says to Frank. Except of course she doesn't say that to Frank, because she and Frank never met each other, and never spoke. Instead, Helene wrote a sentence on a piece of paper, and then (presumably weeks later) Frank read that sentence and added it to his mental image of the hypothetical construct named "Helene Hanff".

Frank's conception of Helene is not the same as the real Helene, and the image he forms of "her" (based on the words she has elected to share) is no more or less alive than, say, his mental image of Samuel Pepys, or Anne Frank, or Jonathan Harker. 

Helene's dislike of fiction is mildly ironic, because the act of "expressing" herself exclusively through language essentially means she is rebuilding herself as a completely new entity; an entity that exists as words on a page. The "entity" she sends to Frank might resemble the real Helene Hanff, or it might be something quite different. She chooses the words, so she can make herself anything she wishes to be.

This is not unique to Helene Hanff. Generations of writers have discovered that "expressing" themselves through writing allows them to manifest in any way they choose. They can make themselves male or female without taking a single hormone or chopping off a single body-part, or even purchasing a single article of clothing. They can be older, younger, blacker, whiter, or sexually liberated. It's all words, and creating life with words is easy. Even a computer can do it, it seems.

Suddenly, everyone's a Golem.

The Accidental Catfish

In 1937 the Hungarian playwright Miklós László wrote Parfumerie, a gentle comedy about two people who form an enthusiastic romance via (anonymous) written correspondence, completely unaware that they are co-workers who detest each other in real life.

Margaret Sullavan and James Stewart argue in "The Shop Around the Corner"

Parfumerie was eventually filmed as the classic 1940 Ernst Lubitsch comedy The Shop Around the Corner, starring James Stewart and Margaret Sullavan, then again as a 1949 musical with Judy Garland and Van Johnson (In the Good Old Summertime). It was also adapted for stage as the musical She Loves Me in 1963 (starring Barbara Cook) and then formed the basis for Nora Ephron's You've Got Mail in 1998.

Tom Hanks and Meg Ryan argue in "You've Got Mail"

It's You've Got Mail that I want to showcase this week, as Nora Ephron's take on the story updates the scenario to that specific moment in modern history when the nascent internet was about to turn Western culture upside down.

Meg Ryan checks her mail in "You've Got Mail"

Like Helene and Frank in 84 Charing Cross Road, Kathleen and Joe (Meg Ryan and Tom Hanks) are veteran book lovers who bond (in their anonymous emails) over a love of language. Unfortunately, they happen to be rival bookstore owners in real life who have no idea that their digital alter egos (their written constructs) are growing ever closer and more intimate.

Nora Ephron was the daughter of Henry and Phoebe Ephron, the screenwriting duo who (forty years earlier) had written Desk Set for Katharine Hepburn and Spencer Tracy.

Katharine Hepburn and Spencer Tracy in "Desk Set"

In that film (a prophetic and prescient examination of the impact computers were to have on the workplace) Tracy pays Hepburn what he considers the ultimate compliment: "I bet you write wonderful letters."

Tom Hanks and his dog read their latest email in "You've Got Mail"

You've Got Mail could almost be read as a further exploration of that moment in Desk Set. Kathleen and Joe can't stand each other, but their emails are in love.

Written personas are not the same as real people, but that doesn't affect our emotional reactions. We respond to written text whether it comes from an anonymous emailer, a long-dead historical figure, a fictional character, or a ChatBot. The meaning is in the language, no matter the source.

We've been wrong about the Golem for all these years.

The Golem isn't the clay figure; it's the words.

We will screen You've Got Mail at 7.30 on Thursday, the 5th of March, at the Victoria Park Baptist Church.

The poster for "You've Got Mail".
