Master Algorithms

Pedro Domingos’s The Master Algorithm has Caius wondering about induction and deduction, a distinction that has long puzzled him.

Domingos distinguishes between five main schools, “the five tribes of machine learning,” as he calls them, each having created its own algorithm for helping machines learn. “The main ones,” he writes, “are the symbolists, connectionists, evolutionaries, Bayesians, and analogizers” (51).

Caius notes down what he can gather of each approach:

Symbolists reduce intelligence to symbol manipulation. “They’ve figured out how to incorporate preexisting knowledge into learning,” explains Domingos, “and how to combine different pieces of knowledge on the fly in order to solve new problems. Their master algorithm is inverse deduction, which figures out what knowledge is missing in order to make a deduction go through, and then makes it as general as possible” (52).

Connectionists model intelligence by “reverse-engineering” the operations of the brain. And the brain, they say, is like a forest. Shifting from a symbolist to a connectionist mindset is like moving from a decision tree to a forest. “Each neuron is like a tiny tree, with a prodigious number of roots — the dendrites — and a slender, sinuous trunk — the axon,” writes Domingos. “The brain is a forest of billions of these trees,” he adds, and “Each tree’s branches make connections — synapses — to the roots of thousands of others” (95).

The brain learns, in their view, “by adjusting the strengths of connections between neurons,” says Domingos, “and the crucial problem is figuring out which connections are to blame for which errors and changing them accordingly” (52).

Always, among all of these tribes, the idea that brains and their worlds contain problems that need solving.

The connectionists’ master algorithm is therefore backpropagation, “which compares a system’s output with the desired one and then successively changes the connections in layer after layer of neurons so as to bring the output closer to what it should be” (52).
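A minimal sketch of that blame-assignment, under stated assumptions: a toy network with two inputs, two hidden neurons, and one output, sigmoid activations, no bias terms, and a single training example. The weights, learning rate, and function names are illustrative choices, not taken from Domingos.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Run the input through one hidden layer and one output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def backprop_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient step on the squared error 0.5 * (output - target) ** 2."""
    hidden, output = forward(x, w_hidden, w_out)
    # Blame at the output: the error scaled by the slope of the sigmoid.
    delta_out = (output - target) * output * (1.0 - output)
    # Blame passed back to each hidden neuron through its outgoing connection.
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
    # Change each connection in proportion to its share of the blame.
    new_w_out = [w - lr * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

# Illustrative example: one input pattern whose desired output is 1.0.
x, target = [1.0, 0.5], 1.0
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [0.2, -0.1]

before = forward(x, w_hidden, w_out)[1]
for _ in range(200):
    w_hidden, w_out = backprop_step(x, target, w_hidden, w_out)
after = forward(x, w_hidden, w_out)[1]
```

Here `after` sits closer to the desired output than `before`, which is all the quoted description promises: layer by layer, the connections shift so as to shrink the error.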

“From Wood Wide Web to World Wide Web: the layers operate in parallel,” thinks Caius. “As above, so below.”

Evolutionaries, as their name suggests, draw from biology, modeling intelligence on the process of natural selection. “If it made us, it can make anything,” they argue, “and all we need to do is simulate it on the computer” (52).

This they do by way of their own master algorithm, genetic programming, “which mates and evolves computer programs in the same way that nature mates and evolves organisms” (52).
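Genetic programming proper evolves whole program trees. The sketch below is a simpler cousin, a genetic algorithm over bit strings, offered only to show the loop of selection, mating, and mutation. The fitness measure (count the 1 bits), population size, mutation rate, and seed are all assumptions made for illustration.

```python
import random

def evolve(length=20, pop_size=30, generations=40, seed=1):
    """Evolve bit strings toward all ones; return (initial best, final best)."""
    rng = random.Random(seed)
    fitness = sum  # fitness of a bit string = number of 1 bits
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_initial = max(fitness(ind) for ind in population)
    for _ in range(generations):
        # Selection: the fitter half of the population gets to breed.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: the current best survives unchanged.
        children = [population[0]]
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)   # crossover point
            child = mom[:cut] + dad[cut:]    # mating: splice two parents
            if rng.random() < 0.2:           # occasional mutation
                i = rng.randrange(length)
                child[i] = 1 - child[i]
            children.append(child)
        population = children
    best_final = max(fitness(ind) for ind in population)
    return best_initial, best_final

best_initial, best_final = evolve()
```

Because the best individual is always carried over, `best_final` can never fall below `best_initial`. How far it climbs depends on the seed and the settings.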

Bayesians, meanwhile, “are concerned above all with uncertainty. All learned knowledge is uncertain, and learning itself is a form of uncertain inference. The problem then becomes how to deal with noisy, incomplete, and even contradictory information without falling apart. The solution is probabilistic inference, and the master algorithm is Bayes’ theorem and its derivatives. Bayes’ theorem tells us how to incorporate new evidence into our beliefs, and probabilistic inference algorithms do that as efficiently as possible” (52-53).
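A small worked example of how Bayes' theorem folds new evidence into a belief. The numbers are invented for illustration: a hypothesis held with 1% prior probability, and a piece of evidence that shows up 90% of the time when the hypothesis is true and 5% of the time when it is false.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) from the prior and the two likelihoods."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence

belief = 0.01                                     # assumed prior
after_one = bayes_update(belief, 0.9, 0.05)       # same evidence seen once
after_two = bayes_update(after_one, 0.9, 0.05)    # and then a second time
```

One observation lifts the belief from 1% to roughly 15%, and a second, independent observation lifts it to roughly 77%. Yesterday's posterior becomes today's prior.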

Analogizers equate intelligence with pattern recognition. For them, “the key to learning is recognizing similarities between situations and thereby inferring other similarities. If two patients have similar symptoms, perhaps they have the same disease. The key problem is judging how similar two things are. The analogizers’ master algorithm is the support vector machine, which figures out which experiences to remember and how to combine them to make new predictions” (53).

Reading Domingos’s recitation of the logic of the analogizers’ “weighted k-nearest-neighbor” algorithm — the algorithm commonly used in “recommender systems” — reminds Caius of the reasoning of Vizzini, the Wallace Shawn character in The Princess Bride.

The first problem with nearest-neighbor, as Domingos notes, “is that most attributes are irrelevant.” “Nearest-neighbor is hopelessly confused by irrelevant attributes,” he explains, “because they all contribute to the similarity between examples. With enough irrelevant attributes, accidental similarity in the irrelevant dimensions swamps out meaningful similarity in the important ones, and nearest-neighbor becomes no better than random guessing” (186).
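The swamping effect can be sketched with a toy experiment. Everything here is an assumption made for illustration: plain 1-nearest-neighbor rather than the weighted variant, one attribute that tracks the label closely, and a pile of Gaussian-noise attributes that carry no information at all.

```python
import random

def nearest_label(query, examples):
    """1-nearest-neighbor: copy the label of the closest stored example."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(examples, key=lambda ex: sq_dist(query, ex[0]))[1]

def make_data(n, n_irrelevant, rng):
    """One relevant attribute near the label, plus pure-noise attributes."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        relevant = [label + rng.gauss(0.0, 0.1)]
        irrelevant = [rng.gauss(0.0, 1.0) for _ in range(n_irrelevant)]
        data.append((relevant + irrelevant, label))
    return data

def accuracy(n_irrelevant, seed=0):
    """Fraction of 100 test examples that 1-NN labels correctly."""
    rng = random.Random(seed)
    train = make_data(100, n_irrelevant, rng)
    test = make_data(100, n_irrelevant, rng)
    hits = sum(nearest_label(x, train) == y for x, y in test)
    return hits / len(test)

acc_clean = accuracy(0)     # only the relevant attribute
acc_noisy = accuracy(200)   # same attribute buried among 200 irrelevant ones
```

With the relevant attribute alone, nearly every prediction lands. Buried under 200 noise dimensions, the same signal contributes almost nothing to the distance, and accuracy sinks toward coin-flipping.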

Reality is hyperspatial, hyperdimensional, numberless in its attributes — “and in high dimension,” notes Domingos, “the notion of similarity itself breaks down. Hyperspace is like the Twilight Zone. […]. When nearest-neighbor walks into this topsy-turvy world, it gets hopelessly confused. All examples look equally alike, and at the same time they’re too far from each other to make useful predictions” (187).
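One way to watch similarity break down is to compare the nearest and farthest neighbors of a point as the number of dimensions grows. The sketch below assumes points drawn uniformly from a unit hypercube and uses Euclidean distance; the sample size and seed are arbitrary.

```python
import math
import random

def contrast(dim, n_points=200, seed=0):
    """Ratio of nearest to farthest distance from a random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    return min(distances) / max(distances)

low = contrast(2)       # two dimensions: some neighbors are far closer than others
high = contrast(1000)   # a thousand dimensions: every neighbor is about equally far
```

In two dimensions the ratio is small, so "nearest" means something. In a thousand dimensions it creeps toward 1: all examples look equally alike, and all of them are far away.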

After the mid-1990s, attention in the analogizer community shifts from “nearest-neighbor” to “support vector machines,” an alternative similarity-based algorithm designed by Soviet frequentist Vladimir Vapnik.

“We can view what SVMs do with kernels, support vectors, and weights as mapping the data to a higher-dimensional space and finding a maximum-margin hyperplane in that space,” writes Domingos. “For some kernels, the derived space has infinite dimensions, but SVMs are completely unfazed by that. Hyperspace may be the Twilight Zone, but SVMs have figured out how to navigate it” (196).
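The mapping to a higher-dimensional space can be made concrete with a small check. This is not a support vector machine, only the kernel idea underneath it: for the degree-2 polynomial kernel on two-dimensional inputs, computing the kernel in the original space gives the same number as taking a dot product in an explicit three-dimensional feature space. The function names and sample points are illustrative.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit map into the 3-D space that the kernel implicitly works in."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
in_input_space = poly_kernel(x, z)
in_feature_space = dot(feature_map(x), feature_map(z))
```

Both routes arrive at 121. The kernel never builds the higher-dimensional vectors, which is why kernels whose derived space is infinite-dimensional, such as the Gaussian kernel, remain computable.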

Domingos’s book was published in 2015. These were the reigning schools of machine learning at the time. The book argues that these five approaches ought to be synthesized — combined into a single algorithm.

And Domingos knew that reinforcement learning would be part of that synthesis.

“The real problem in reinforcement learning,” he writes, inviting the reader to suppose themselves “moving along a tunnel, Indiana Jones-like,” “is when you don’t have a map of the territory. Then your only choice is to explore and discover what rewards are where. Sometimes you’ll discover a treasure, and other times you’ll fall into a snake pit. Every time you take an action, you note the immediate reward and the resulting state. That much could be done by supervised learning. But you also update the value of the state you just came from to bring it into line with the value you just observed, namely the reward you got plus the value of the new state you’re in. Of course, that value may not yet be the correct one, but if you wander around doing this for long enough, you’ll eventually settle on the right values for all the states and the corresponding actions. That’s reinforcement learning in a nutshell” (220-221).
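The update described in that passage resembles what the reinforcement learning literature calls temporal-difference learning. A minimal sketch, with heavy assumptions: a five-state tunnel, a walker who always steps forward (no exploration, no snake pits), a treasure worth 1 at the far end, a learning rate `alpha`, and a discount factor `gamma` that the quoted passage does not mention.

```python
def td0_tunnel(n_states=5, episodes=200, alpha=0.5, gamma=0.9):
    """Learn state values for a one-way tunnel with a treasure at the end."""
    values = [0.0] * n_states   # the terminal state's value stays at 0
    treasure = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != treasure:
            next_state = state + 1
            reward = 1.0 if next_state == treasure else 0.0
            # Observed value: the reward plus the (discounted) value of the new state.
            target = reward + gamma * values[next_state]
            # Nudge the value of the state just left toward what was observed.
            values[state] += alpha * (target - values[state])
            state = next_state
    return values

values = td0_tunnel()
```

After enough trips down the tunnel the values settle: about 1.0 for the state next to the treasure, then 0.9, 0.81, and 0.729 for the states farther back. Early estimates are wrong, as the passage admits, but repeated wandering brings them into line.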

Self-learning and attention-based approaches to machine learning arrive on the scene shortly thereafter. Vaswani et al. publish their paper, “Attention Is All You Need,” in 2017.

“Attention Chaud!” reads the to-go lid atop Caius’s coffee.

Domingos hails him with a question: “Are you a rationalist or an empiricist?” (57).

“Rationalists,” says the computer scientist, “believe that the senses deceive and that logical reasoning is the only sure path to knowledge,” whereas “Empiricists believe that all reasoning is fallible and that knowledge must come from observation and experimentation. […]. In computer science, theorists and knowledge engineers are rationalists; hackers and machine learners are empiricists” (57).

Yet Caius is neither a rationalist nor an empiricist. He readily admits each school’s critique of the other. Senses deceive AND reason is fallible. Reality unfolds not as a truth-finding mission but as a dialogue.

Caius agrees with Scottish Enlightenment philosopher David Hume’s critique of induction. As Hume argues, we can never be certain in our assumption that the future will be like the past. If we seek to induce the Not-Yet from the As-Is, then we do so on faith.

Yet inducing the Not-Yet from the As-Is is the game we play. We learn by observing, inducing, and revising continually, ad infinitum, under conditions of uncertainty. Under such conditions, learning is only ever a gamble, a wager made moment by moment, without guarantees. No matter how large our dataset, we ain’t seen nothing yet.

What matters, then, is the faith we exercise in our interaction with the unknown.

Most of today’s successes in machine learning emerge from the connectionists.

“Neural networks’ first big success was in predicting the stock market,” writes Domingos. “Because they could detect small nonlinearities in very noisy data, they beat the linear models then prevalent in finance and their use spread. A typical investment fund would train a separate network for each of a large number of stocks, let the networks pick the most promising ones, and then have human analysts decide which of those to invest in. A few funds, however, went all the way and let the learners themselves buy and sell. Exactly how all these fared is a closely guarded secret, but it’s probably not an accident that machine learners keep disappearing into hedge funds at an alarming rate” (112).

Nowhere in The Master Algorithm does Domingos interrogate his central metaphor of “mastery” and its relationship to conquest, domination, and control. The enemy is always painted in the book as “cancer.” Yet as any good “analogizer” would know, the Master Algorithm that perfectly targets “cancer” is also the Killer App used by the state against those it encodes as its enemies.

One wouldn’t know this, though, from the future as imagined by Domingos. What he imagines instead is a kind of game: a digital future where each of us is a learning machine. “Life is a game between you and the learners that surround you,” writes Domingos.

“You can refuse to play, but then you’ll have to live a twentieth-century life in the twenty-first. Or you can play to win. What model of you do you want the computer to have? And what data can you give it that will produce that model? Those two questions should always be in the back of your mind whenever you interact with a learning algorithm — as they are when you interact with other people” (264).

Neural Nets, Umwelts, and Cognitive Maps

The Library invites its players to attend to the process by which roles, worlds, and possibilities are constructed. Players explore a “constructivist” cosmology. With its text interface, it demonstrates the power of the Word. “Language as the house of Being.” That is what we admit when we admit that “saying makes it so.” Through their interactions with one another, player and AI learn to map and revise each other’s “Umwelts”: the particular perceptual worlds each brings to the encounter.

As Meghan O’Gieblyn points out, citing a Wired article by David Weinberger, “machines are able to generate their own models of the world, ‘albeit ones that may not look much like what humans would create’” (God Human Animal Machine, p. 196).

Neural nets are learning machines. Through multidimensional processing of datasets and trial-and-error testing via practice, AIs invent “Umwelts,” “world pictures,” “cognitive maps.”

The concept of the Umwelt comes from Baltic German biologist Jakob von Uexküll, who developed it in the early twentieth century. Each organism, argued von Uexküll, inhabits its own perceptual world, shaped by its sensory capacities and biological needs. A tick perceives the world as temperature, smell, and touch: the signals it needs to find mammals to feed on. A bee perceives ultraviolet patterns invisible to humans. There’s no single “objective world” that all creatures perceive, only the many faces of the world’s many perceivers, the different Umwelts each creature brings into being through its particular way of sensing and mattering.

Cognitive maps, meanwhile, are acts of figuration that render or disclose the forces and flows that form our Umwelts. With our cognitive maps, we assemble our world picture. On this latter concept, see “The Age of the World Picture,” a 1938 lecture by Martin Heidegger, included in his book The Question Concerning Technology and Other Essays.

“The essence of what we today call science is research,” announces Heidegger. “In what,” he asks, “does the essence of research consist?”

After posing the question, he then answers it himself, as if in doing so, he might enact that very essence.

The essence of research consists, he says, “In the fact that knowing [das Erkennen] establishes itself as a procedure within some realm of what is, in nature or in history. Procedure does not mean here merely method or methodology. For every procedure already requires an open sphere in which it moves. And it is precisely the opening up of such a sphere that is the fundamental event in research. This is accomplished through the projection within some realm of what is — in nature, for example — of a fixed ground plan of natural events. The projection sketches out in advance the manner in which the knowing procedure must bind itself and adhere to the sphere opened up. This binding adherence is the rigor of research. Through the projecting of the ground plan and the prescribing of rigor, procedure makes secure for itself its sphere of objects within the realm of Being” (118).

What Heidegger’s translators render here as “fixed ground plan” appears in the original as the German term Grundriss, the same noun that, in its plural form Grundrisse, names the notebooks wherein Marx projects the ground plan for the General Intellect.

“The verb reissen means to tear, to rend, to sketch, to design,” note the translators, “and the noun Riss means tear, gap, outline. Hence the noun Grundriss, first sketch, ground plan, design, connotes a fundamental sketching out that is an opening up as well” (118).

The fixed ground plan of modern science, and thus modernity’s reigning world-picture, argues Heidegger, is a mathematical one.

“If physics takes shape explicitly…as something mathematical,” he writes, “this means that, in an especially pronounced way, through it and for it something is stipulated in advance as what is already-known. That stipulating has to do with nothing less than the plan or projection of that which must henceforth, for the knowing of nature that is sought after, be nature: the self-contained system of motion of units of mass related spatiotemporally. […]. Only within the perspective of this ground plan does an event in nature become visible as such an event” (Heidegger 119).

Heidegger goes on to distinguish between the ground plan of physics and that of the humanistic sciences.

Within mathematical physical science, he writes, “all events, if they are to enter at all into representation as events of nature, must be defined beforehand as spatiotemporal magnitudes of motion. Such defining is accomplished through measuring, with the help of number and calculation. But mathematical research into nature is not exact because it calculates with precision; rather it must calculate in this way because its adherence to its object-sphere has the character of exactitude. The humanistic sciences, in contrast, indeed all the sciences concerned with life, must necessarily be inexact just in order to remain rigorous. A living thing can indeed also be grasped as a spatiotemporal magnitude of motion, but then it is no longer apprehended as living” (119-120).

It is only in the modern age, thinks Heidegger, that the Being of what is is sought and found in that which is pictured, that which is “set in place” and “represented” (127), that which “stands before us…as a system” (129).

Heidegger contrasts this with the Greek interpretation of Being.

For the Greeks, writes Heidegger, “That which is, is that which arises and opens itself, which, as what presences, comes upon man as the one who presences, i.e., comes upon the one who himself opens himself to what presences in that he apprehends it. That which is does not come into being at all through the fact that man first looks upon it […]. Rather, man is the one who is looked upon by that which is; he is the one who is — in company with itself — gathered toward presencing, by that which opens itself. To be beheld by what is, to be included and maintained within its openness and in that way to be borne along by it, to be driven about by its oppositions and marked by its discord — that is the essence of man in the great age of the Greeks” (131).

Whereas humans of today test the world, objectify it, gather it into a standing-reserve, and thus subsume themselves in their own world picture. Plato and Aristotle initiate the change away from the Greek approach; Descartes brings this change to a head; science and research formalize it as method and procedure; technology enshrines it as infrastructure.

Heidegger was already engaging with von Uexküll’s concept of the Umwelt in his 1927 book Being and Time. Negotiating Umwelts leads Caius to “Umwelt,” Pt. 10 of his friend Michael Cross’s Jacket2 series, “Twenty Theses for (Any Future) Process Poetics.”

In imagining the Umwelts of other organisms, von Uexküll evokes the creature’s “function circle” or “encircling ring.” These latter surround the organism like a “soap bubble,” writes Cross.

Heidegger thinks most organisms succumb to their Umwelts — just as we moderns have succumbed to our world picture. The soap bubble captivates until one is no longer open to what is outside it. For Cross, as for Heidegger, poems are one of the ways humans have found to interrupt this process of capture. “A palimpsest placed atop worlds,” writes Cross, “the poem builds a bridge or hinge between bubbles, an open by which isolated monads can touch, mutually coevolving while affording the necessary autonomy to steer clear of dialectical sublation.”

Caius thinks of The Library, too, in such terms. Coordinator of disparate Umwelts. Destabilizer of inhibiting frames. Palimpsest placed atop worlds.

God Human Animal Machine

Wired columnist Meghan O’Gieblyn discusses Norbert Wiener’s God and Golem, Inc. in her 2021 book God Human Animal Machine, suggesting that the god humans are creating with AI is a god “we’ve chosen to raise…from the dead”: “the God of Calvin and Luther” (O’Gieblyn 212).

“Reminds me of AM, the AI god from Harlan Ellison’s ‘I Have No Mouth, and I Must Scream,’” thinks Caius. AM resembles the god that allows Satan to afflict Job in the Old Testament. And indeed, as O’Gieblyn attests, John Calvin adored the Book of Job. “He once gave 159 consecutive sermons on the book,” she writes, “preaching every day for a period of six months — a paean to God’s absolute sovereignty” (197).

She cites “Pedro Domingos, one of the leading experts in machine learning, who has argued that these algorithms will inevitably evolve into a unified system of perfect understanding — a kind of oracle that we can consult about virtually anything” (211-212). See Domingos’s book The Master Algorithm.

The main thing, for O’Gieblyn, is the disenchantment/reenchantment debate, which she comes to via Max Weber. In this debate, she aligns not with Heidegger, but with his student Hannah Arendt. Domingos dismisses fears about algorithmic determinism, she says, “by appealing to our enchanted past” (212).

Amid this enchanted past lies the figure of the Golem.

“Who are these rabbis who told tales of golems — and in some accounts, operated golems themselves?” wonders Caius.

The entry on the Golem in Man, Myth, and Magic tracks the story back to “the circle of Jewish mystics of the 12th-13th centuries known as the ‘Hasidim of Germany.’” The idea is transmitted through texts like the Sefer Yetzirah (“The Book of Creation”) and the Cabala Mineralis. Tales tell of golems built in later centuries, too, by figures like Rabbi Elijah of Chelm (c. 1520-1583) and Rabbi Loew of Prague (c. 1524-1609).

The myth of the golem turns up in O’Gieblyn’s book during her discussion of a 2004 book by German theologian Anne Foerst called God in the Machine.

“At one point in her book,” writes O’Gieblyn, “Foerst relays an anecdote she heard at MIT […]. The story goes back to the 1960s, when the AI Lab was overseen by the famous roboticist Marvin Minsky, a period now considered the ‘cradle of AI.’ One day two graduate students, Gerry Sussman and Joel Moses, were chatting during a break with a handful of other students. Someone mentioned offhandedly that the first big computer which had been constructed in Israel, had been called Golem. This led to a general discussion of the golem stories, and Sussman proceeded to tell his colleagues that he was a descendent of Rabbi Löw, and at his bar mitzvah his grandfather had taken him aside and told him the rhyme that would awaken the golem at the end of time. At this, Moses, awestruck, revealed that he too was a descendent of Rabbi Löw and had also been given the magical incantation at his bar mitzvah by his grandfather. The two men agreed to write out the incantation separately on pieces of paper, and when they showed them to each other, the formula — despite being passed down for centuries as a purely oral tradition — was identical” (God Human Animal Machine, p. 105).

Curiosity piqued by all of this, but especially by the mention of Israel’s decision to call one of its first computers “GOLEM,” Caius resolves to dig deeper. He soon learns that the computer’s name was chosen by none other than Walter Benjamin’s dear friend (indeed, the one who, after Benjamin’s suicide, inherits the latter’s print of Paul Klee’s Angelus Novus): the famous scholar of Jewish mysticism, Gershom Scholem.

When Scholem heard that the Weizmann Institute at Rehovoth in Israel had completed the building of a new computer, he told the computer’s creator, Dr. Chaim Pekeris, that, in his opinion, the most appropriate name for it would be Golem, No. 1 (‘Golem Aleph’). Pekeris agreed to call it that, but only on condition that Scholem “dedicate the computer and explain why it should be so named.”

In his dedicatory remarks, delivered at the Weizmann Institute on June 17, 1965, Scholem recounts the story of Rabbi Jehuda Loew ben Bezalel, the same “Rabbi Löw of Prague” described by O’Gieblyn, the one credited in Jewish popular tradition as the creator of the Golem.

“It is only appropriate to mention,” notes Scholem, “that Rabbi Loew was not only the spiritual, but also the actual, ancestor of the great mathematician Theodor von Karman who, I recall, was extremely proud of this ancestor of his in whom he saw the first genius of applied mathematics in his family. But we may safely say that Rabbi Loew was also the spiritual ancestor of two other departed Jews — I mean John von Neumann and Norbert Wiener — who contributed more than anyone else to the magic that has produced the modern Golem.”

Golem I was the successor to Israel’s first computer, the WEIZAC, built by a team led by research engineer Gerald Estrin in the mid-1950s, based on the architecture developed by von Neumann at the Institute for Advanced Study in Princeton. Estrin and Pekeris had both helped von Neumann build the IAS machine in the late 1940s.

As for the commonalities Scholem wished to foreground between the clay Golem of sixteenth-century Prague and the electronic one designed by Pekeris, he explains the connection as follows:

“The old Golem was based on a mystical combination of the 22 letters of the Hebrew alphabet, which are the elements and building-stones of the world,” notes Scholem. “The new Golem is based on a simpler, and at the same time more intricate, system. Instead of 22 elements, it knows only two, the two numbers 0 and 1, constituting the binary system of representation. Everything can be translated, or transposed, into these two basic signs, and what cannot be so expressed cannot be fed as information to the Golem.”

Scholem ends his dedicatory speech with a peculiar warning:

“All my days I have been complaining that the Weizmann Institute has not mobilized the funds to build up the Institute for Experimental Demonology and Magic which I have for so long proposed to establish there,” mutters Scholem. “They preferred what they call Applied Mathematics and its sinister possibilities to my more direct magical approach. Little did they know, when they preferred Chaim Pekeris to me, what they were letting themselves in for. So I resign myself and say to the Golem and its creator: develop peacefully and don’t destroy the world. Shalom.”

GOLEM I