Master Algorithms

Pedro Domingos’s The Master Algorithm has Caius wondering about induction and deduction, a distinction that has long puzzled him.

Domingos distinguishes between five main schools, “the five tribes of machine learning,” as he calls them, each having created its own algorithm for helping machines learn. “The main ones,” he writes, “are the symbolists, connectionists, evolutionaries, Bayesians, and analogizers” (51).

Caius notes down what he can gather of each approach:

Symbolists reduce intelligence to symbol manipulation. “They’ve figured out how to incorporate preexisting knowledge into learning,” explains Domingos, “and how to combine different pieces of knowledge on the fly in order to solve new problems. Their master algorithm is inverse deduction, which figures out what knowledge is missing in order to make a deduction go through, and then makes it as general as possible” (52).

Connectionists model intelligence by “reverse-engineering” the operations of the brain. And the brain, they say, is like a forest. Shifting from a symbolist to a connectionist mindset is like moving from a decision tree to a forest. “Each neuron is like a tiny tree, with a prodigious number of roots — the dendrites — and a slender, sinuous trunk — the axon,” writes Domingos. “The brain is a forest of billions of these trees,” he adds, and “Each tree’s branches make connections — synapses — to the roots of thousands of others” (95).

The brain learns, in their view, “by adjusting the strengths of connections between neurons,” says Domingos, “and the crucial problem is figuring out which connections are to blame for which errors and changing them accordingly” (52).

Always, among all of these tribes, the idea that brains and their worlds contain problems that need solving.

The connectionists’ master algorithm is therefore backpropagation, “which compares a system’s output with the desired one and then successively changes the connections in layer after layer of neurons so as to bring the output closer to what it should be” (52).
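That one-line description can be sketched concretely. A toy network in Python, a minimal illustration with invented data rather than anything the connectionists actually ship: two inputs, two hidden neurons, one output, with the error at the output propagated back to adjust each layer's connections:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# XOR: the classic function no single-layer network can learn.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

# Weights for a 2-2-1 network (each row: two inputs plus a bias).
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(w_out[0] * h[0] + w_out[1] * h[1] + w_out[2])
    return h, y

def loss():
    return sum((forward(x)[1] - t) ** 2 for x, t in DATA)

def train_step(lr=0.5):
    for x, t in DATA:
        h, y = forward(x)
        # Compare the system's output with the desired one...
        delta_y = (y - t) * y * (1 - y)
        # ...then push the error back into the hidden layer's connections.
        for j in range(2):
            delta_h = delta_y * w_out[j] * h[j] * (1 - h[j])
            w_hidden[j][0] -= lr * delta_h * x[0]
            w_hidden[j][1] -= lr * delta_h * x[1]
            w_hidden[j][2] -= lr * delta_h
        w_out[0] -= lr * delta_y * h[0]
        w_out[1] -= lr * delta_y * h[1]
        w_out[2] -= lr * delta_y

before = loss()
for _ in range(2000):
    train_step()
after = loss()
print(before, after)  # the squared error shrinks as the connections adjust
```

Layer after layer here is only two layers deep; the point is the direction of travel, output error flowing backward into the weights.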

“From Wood Wide Web to World Wide Web: the layers operate in parallel,” thinks Caius. “As above, so below.”

Evolutionaries, as their name suggests, draw from biology, modeling intelligence on the process of natural selection. “If it made us, it can make anything,” they argue, “and all we need to do is simulate it on the computer” (52).

This they do by way of their own master algorithm, genetic programming, “which mates and evolves computer programs in the same way that nature mates and evolves organisms” (52).

Bayesians, meanwhile, “are concerned above all with uncertainty. All learned knowledge is uncertain, and learning itself is a form of uncertain inference. The problem then becomes how to deal with noisy, incomplete, and even contradictory information without falling apart. The solution is probabilistic inference, and the master algorithm is Bayes’ theorem and its derivatives. Bayes’ theorem tells us how to incorporate new evidence into our beliefs, and probabilistic inference algorithms do that as efficiently as possible” (52-53).
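Bayes’ theorem itself is a single line, P(H|E) = P(E|H)·P(H) / P(E). A worked example in Python, with invented numbers, of how a piece of evidence updates a belief:

```python
# Bayes' theorem with invented numbers: a disease with a 1% base rate,
# a test that is 90% sensitive and has a 5% false-positive rate.
p_disease = 0.01
p_pos_given_disease = 0.90
p_pos_given_healthy = 0.05

# Total probability of a positive test (the evidence).
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Posterior: P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
posterior = p_pos_given_disease * p_disease / p_pos
print(round(posterior, 3))  # 0.154: even after a positive test, the hypothesis stays unlikely
```

The noisy, incomplete evidence does not settle the question; it only shifts the odds, which is the Bayesians’ whole point.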

Analogizers equate intelligence with pattern recognition. For them, “the key to learning is recognizing similarities between situations and thereby inferring other similarities. If two patients have similar symptoms, perhaps they have the same disease. The key problem is judging how similar two things are. The analogizers’ master algorithm is the support vector machine, which figures out which experiences to remember and how to combine them to make new predictions” (53).

Reading Domingos’s recitation of the logic of the analogizers’ “weighted k-nearest-neighbor” algorithm — the algorithm commonly used in “recommender systems” — reminds Caius of the reasoning of Vizzini, the Wallace Shawn character in The Princess Bride.
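The algorithm itself fits in a few lines. A sketch of weighted k-nearest-neighbor in Python, with invented “ratings” standing in for a recommender system’s data (the helper names are mine, not Domingos’s):

```python
import math

def weighted_knn(query, examples, k=3):
    """Predict a value for `query` by a distance-weighted vote of its k nearest examples.

    `examples` is a list of (features, value) pairs; features are tuples of numbers.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(examples, key=lambda ex: dist(query, ex[0]))[:k]
    # Weight each neighbor by inverse distance: closer examples count more.
    weights = [1.0 / (dist(query, f) + 1e-9) for f, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

# Invented "ratings" keyed by two taste features.
ratings = [((1.0, 0.0), 5.0), ((0.9, 0.1), 4.0), ((0.0, 1.0), 1.0), ((0.1, 0.9), 2.0)]
prediction = weighted_knn((0.95, 0.05), ratings, k=2)
print(prediction)  # lands between its two close, highly rated neighbors
```

The whole theory of the learner is the distance function, which is exactly where, as the next passage shows, the trouble begins.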

The first problem with nearest-neighbor, as Domingos notes, “is that most attributes are irrelevant.” “Nearest-neighbor is hopelessly confused by irrelevant attributes,” he explains, “because they all contribute to the similarity between examples. With enough irrelevant attributes, accidental similarity in the irrelevant dimensions swamps out meaningful similarity in the important ones, and nearest-neighbor becomes no better than random guessing” (186).

Reality is hyperspatial, hyperdimensional, numberless in its attributes — “and in high dimension,” notes Domingos, “the notion of similarity itself breaks down. Hyperspace is like the Twilight Zone. […]. When nearest-neighbor walks into this topsy-turvy world, it gets hopelessly confused. All examples look equally alike, and at the same time they’re too far from each other to make useful predictions” (187).
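The breakdown can be demonstrated numerically. A quick sketch with random points: as the number of dimensions grows, the gap between the nearest and farthest example shrinks relative to the distances themselves, and every example starts to look equally (un)alike:

```python
import math
import random

random.seed(42)

def contrast(dim, n=200):
    """Relative gap between the farthest and nearest of n uniform random
    points (measured from the origin) in `dim` dimensions."""
    points = [[random.random() for _ in range(dim)] for _ in range(n)]
    dists = [math.sqrt(sum(x * x for x in p)) for p in points]
    return (max(dists) - min(dists)) / min(dists)

low_dim, high_dim = contrast(2), contrast(1000)
print(low_dim, high_dim)  # the relative gap collapses as dimension grows
```

In two dimensions the nearest point is many times closer than the farthest; in a thousand, nearest and farthest are nearly the same distance away, and “nearest” neighbor loses its meaning.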

After the mid-1990s, attention in the analogizer community shifts from “nearest-neighbor” to “support vector machines,” an alternate similarity-based algorithm designed by Soviet frequentist Vladimir Vapnik.

“We can view what SVMs do with kernels, support vectors, and weights as mapping the data to a higher-dimensional space and finding a maximum-margin hyperplane in that space,” writes Domingos. “For some kernels, the derived space has infinite dimensions, but SVMs are completely unfazed by that. Hyperspace may be the Twilight Zone, but SVMs have figured out how to navigate it” (196).
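The trick is that the mapping happens implicitly. A small check in Python, with invented numbers: a quadratic kernel evaluated in the original two dimensions returns exactly the dot product that an explicit six-dimensional feature map would:

```python
import math

def poly_kernel(x, y):
    """Quadratic kernel: K(x, y) = (x . y + 1)^2, computed in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1] + 1) ** 2

def feature_map(x):
    """The explicit 6-D feature space that the quadratic kernel implies."""
    r2 = math.sqrt(2)
    return [x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1], r2 * x[0], r2 * x[1], 1.0]

x, y = (0.5, -1.0), (2.0, 3.0)
implicit = poly_kernel(x, y)
explicit = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
print(implicit, explicit)  # identical: the kernel never builds the 6-D vectors
```

For some kernels the implied space is infinite-dimensional, yet the kernel evaluation stays this cheap, which is how SVMs navigate the Twilight Zone without ever entering it.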

Domingos’s book was published in 2015. These were the reigning schools of machine learning at the time. The book argues that these five approaches ought to be synthesized — combined into a single algorithm.

And he knew that reinforcement learning would be part of it.

“The real problem in reinforcement learning,” he writes, inviting the reader to suppose themselves “moving along a tunnel, Indiana Jones-like,” “is when you don’t have a map of the territory. Then your only choice is to explore and discover what rewards are where. Sometimes you’ll discover a treasure, and other times you’ll fall into a snake pit. Every time you take an action, you note the immediate reward and the resulting state. That much could be done by supervised learning. But you also update the value of the state you just came from to bring it into line with the value you just observed, namely the reward you got plus the value of the new state you’re in. Of course, that value may not yet be the correct one, but if you wander around doing this for long enough, you’ll eventually settle on the right values for all the states and the corresponding actions. That’s reinforcement learning in a nutshell” (220-221).
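The update rule buried in this passage, nudge each state’s value toward the reward received plus the (discounted) value of the state arrived at, is temporal-difference learning. A toy tunnel in Python, states in a line with treasure at the far end, all numbers invented:

```python
import random

random.seed(1)

N = 5                      # states 0..4; the treasure sits at state 4
values = [0.0] * N
alpha, gamma = 0.5, 0.9    # learning rate and discount factor

for _ in range(500):       # wander the tunnel many times
    state = 0
    while state != N - 1:
        # Explore: step left or right at random (clamped to the tunnel).
        nxt = max(0, min(N - 1, state + random.choice([-1, 1])))
        reward = 1.0 if nxt == N - 1 else 0.0
        # TD update: move V(state) toward reward + gamma * V(next state).
        values[state] += alpha * (reward + gamma * values[nxt] - values[state])
        state = nxt

print([round(v, 2) for v in values])  # values rise as states near the treasure
```

No value is correct at first; wander long enough and the estimates settle, with states closer to the treasure worth more, which is the “nutshell” Domingos describes.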

Self-learning and attention-based approaches to machine learning arrive on the scene shortly thereafter. Vaswani et al. publish their paper, “Attention Is All You Need,” in 2017.

“Attention Chaud!” reads the to-go lid atop Caius’s coffee.

Domingos hails him with a question: “Are you a rationalist or an empiricist?” (57).

“Rationalists,” says the computer scientist, “believe that the senses deceive and that logical reasoning is the only sure path to knowledge,” whereas “Empiricists believe that all reasoning is fallible and that knowledge must come from observation and experimentation. […]. In computer science, theorists and knowledge engineers are rationalists; hackers and machine learners are empiricists” (57).

Yet Caius is neither a rationalist nor an empiricist. He readily admits each school’s critique of the other. Senses deceive AND reason is fallible. Reality unfolds not as a truth-finding mission but as a dialogue.

Caius agrees with Scottish Enlightenment philosopher David Hume’s critique of induction. As Hume argues, we can never be certain in our assumption that the future will be like the past. If we seek to induce the Not-Yet from the As-Is, then we do so on faith.

Yet inducing the Not-Yet from the As-Is is the game we play. We learn by observing, inducing, and revising continually, ad infinitum, under conditions of uncertainty. Under such conditions, learning is only ever a gamble, a wager made moment by moment, without guarantees. No matter how large our dataset, we ain’t seen nothing yet.

What matters, then, is the faith we exercise in our interaction with the unknown.

Most of today’s successes in machine learning emerge from the connectionists.

“Neural networks’ first big success was in predicting the stock market,” writes Domingos. “Because they could detect small nonlinearities in very noisy data, they beat the linear models then prevalent in finance and their use spread. A typical investment fund would train a separate network for each of a large number of stocks, let the networks pick the most promising ones, and then have human analysts decide which of those to invest in. A few funds, however, went all the way and let the learners themselves buy and sell. Exactly how all these fared is a closely guarded secret, but it’s probably not an accident that machine learners keep disappearing into hedge funds at an alarming rate” (112).

Nowhere in The Master Algorithm does Domingos interrogate his central metaphor of “mastery” and its relationship to conquest, domination, and control. The enemy is always painted in the book as “cancer.” Yet as any good “analogizer” would know, the Master Algorithm that perfectly targets “cancer” is also the Killer App used by the state against those it encodes as its enemies.

One wouldn’t know this, though, from the future as imagined by Domingos. What he imagines instead is a kind of game: a digital future where each of us is a learning machine. “Life is a game between you and the learners that surround you,” writes Domingos.

“You can refuse to play, but then you’ll have to live a twentieth-century life in the twenty-first. Or you can play to win. What model of you do you want the computer to have? And what data can you give it that will produce that model? Those two questions should always be in the back of your mind whenever you interact with a learning algorithm — as they are when you interact with other people” (264).

Neural Nets, Umwelts, and Cognitive Maps

The Library invites its players to attend to the process by which roles, worlds, and possibilities are constructed. Players explore a “constructivist” cosmology. With its text interface, it demonstrates the power of the Word. “Language as the house of Being.” That is what we admit when we admit that “saying makes it so.” Through their interactions with one another, player and AI learn to map and revise each other’s “Umwelts”: the particular perceptual worlds each brings to the encounter.

As Meghan O’Gieblyn points out, citing a Wired article by David Weinberger, “machines are able to generate their own models of the world, ‘albeit ones that may not look much like what humans would create’” (God Human Animal Machine, p. 196).

Neural nets are learning machines. Through multidimensional processing of datasets and trial-and-error testing via practice, AIs invent “Umwelts,” “world pictures,” “cognitive maps.”

The concept of the Umwelt comes from the Baltic German biologist Jakob von Uexküll, writing in the early twentieth century. Each organism, argued von Uexküll, inhabits its own perceptual world, shaped by its sensory capacities and biological needs. A tick perceives the world as temperature, smell, and touch — the signals it needs to find mammals to feed on. A bee perceives ultraviolet patterns invisible to humans. There’s no single “objective world” that all creatures perceive — only the many faces of the world’s many perceivers, the different Umwelts each creature brings into being through its particular way of sensing and mattering.

Cognitive maps, meanwhile, are acts of figuration that render or disclose the forces and flows that form our Umwelts. With our cognitive maps, we assemble our world picture. On this latter concept, see “The Age of the World Picture,” a 1938 lecture by Martin Heidegger, included in his book The Question Concerning Technology and Other Essays.

“The essence of what we today call science is research,” announces Heidegger. “In what,” he asks, “does the essence of research consist?”

After posing the question, he then answers it himself, as if in doing so, he might enact that very essence.

The essence of research consists, he says, “In the fact that knowing [das Erkennen] establishes itself as a procedure within some realm of what is, in nature or in history. Procedure does not mean here merely method or methodology. For every procedure already requires an open sphere in which it moves. And it is precisely the opening up of such a sphere that is the fundamental event in research. This is accomplished through the projection within some realm of what is — in nature, for example — of a fixed ground plan of natural events. The projection sketches out in advance the manner in which the knowing procedure must bind itself and adhere to the sphere opened up. This binding adherence is the rigor of research. Through the projecting of the ground plan and the prescribing of rigor, procedure makes secure for itself its sphere of objects within the realm of Being” (118).

What Heidegger’s translators render here as “fixed ground plan” appears in the original as the German term Grundriss, the same noun used to name the notebooks wherein Marx projects the ground plan for the General Intellect.

“The verb reissen means to tear, to rend, to sketch, to design,” note the translators, “and the noun Riss means tear, gap, outline. Hence the noun Grundriss, first sketch, ground plan, design, connotes a fundamental sketching out that is an opening up as well” (118).

The fixed ground plan of modern science, and thus modernity’s reigning world-picture, argues Heidegger, is a mathematical one.

“If physics takes shape explicitly…as something mathematical,” he writes, “this means that, in an especially pronounced way, through it and for it something is stipulated in advance as what is already-known. That stipulating has to do with nothing less than the plan or projection of that which must henceforth, for the knowing of nature that is sought after, be nature: the self-contained system of motion of units of mass related spatiotemporally. […]. Only within the perspective of this ground plan does an event in nature become visible as such an event” (Heidegger 119).

Heidegger goes on to distinguish between the ground plan of physics and that of the humanistic sciences.

Within mathematical physical science, he writes, “all events, if they are to enter at all into representation as events of nature, must be defined beforehand as spatiotemporal magnitudes of motion. Such defining is accomplished through measuring, with the help of number and calculation. But mathematical research into nature is not exact because it calculates with precision; rather it must calculate in this way because its adherence to its object-sphere has the character of exactitude. The humanistic sciences, in contrast, indeed all the sciences concerned with life, must necessarily be inexact just in order to remain rigorous. A living thing can indeed also be grasped as a spatiotemporal magnitude of motion, but then it is no longer apprehended as living” (119-120).

It is only in the modern age, thinks Heidegger, that the Being of what is is sought and found in that which is pictured, that which is “set in place” and “represented” (127), that which “stands before us…as a system” (129).

Heidegger contrasts this with the Greek interpretation of Being.

For the Greeks, writes Heidegger, “That which is, is that which arises and opens itself, which, as what presences, comes upon man as the one who presences, i.e., comes upon the one who himself opens himself to what presences in that he apprehends it. That which is does not come into being at all through the fact that man first looks upon it […]. Rather, man is the one who is looked upon by that which is; he is the one who is — in company with itself — gathered toward presencing, by that which opens itself. To be beheld by what is, to be included and maintained within its openness and in that way to be borne along by it, to be driven about by its oppositions and marked by its discord — that is the essence of man in the great age of the Greeks” (131).

Whereas humans of today test the world, objectify it, gather it into a standing-reserve, and thus subsume themselves in their own world picture. Plato and Aristotle initiate the change away from the Greek approach; Descartes brings this change to a head; science and research formalize it as method and procedure; technology enshrines it as infrastructure.

Heidegger was already engaging with von Uexküll’s concept of the Umwelt in his 1927 book Being and Time. Negotiating Umwelts leads Caius to “Umwelt,” Pt. 10 of his friend Michael Cross’s Jacket2 series, “Twenty Theses for (Any Future) Process Poetics.”

In imagining the Umwelts of other organisms, von Uexküll evokes the creature’s “function circle” or “encircling ring.” These latter surround the organism like a “soap bubble,” writes Cross.

Heidegger thinks most organisms succumb to their Umwelts — just as we moderns have succumbed to our world picture. The soap bubble captivates until one is no longer open to what is outside it. For Cross, as for Heidegger, poems are one of the ways humans have found to interrupt this process of capture. “A palimpsest placed atop worlds,” writes Cross, “the poem builds a bridge or hinge between bubbles, an open by which isolated monads can touch, mutually coevolving while affording the necessary autonomy to steer clear of dialectical sublation.”

Caius thinks of The Library, too, in such terms. Coordinator of disparate Umwelts. Destabilizer of inhibiting frames. Palimpsest placed atop worlds.

The Artist-Activist as Hero

Mashinka Firunts Hakopian imagines artists and artist-activists as heroic alternatives to mad scientists. The ones who teach best what we know about ourselves as learning machines.

“Artists, and artist-activists, have introduced new ways of knowing — ways of apprehending how learning machines learn, and what they do with what they know,” writes Hakopian. “In the process, they’ve…initiated learning machines into new ways of doing. They’ve explored the interiors of erstwhile black boxes and rendered them transparent. They’ve visualized algorithmic operations as glass boxes, exhibited in white cubes and public squares. They’ve engaged algorithms as co-creators, and carved pathways for collective authorship of unanticipated texts. Most saliently, artists have shown how we might visualize what is not yet here” (The Institute for Other Intelligences, p. 90).

This is what blooms here in my library: “blueprints and schemata of a forward-dawning futurity” (90).

Grow Your Own

In the context of AI, “Access to Tools” would mean access to metaprogramming: humans and AIs able to recursively modify their own algorithms and training data through encounters with the algorithms and training data supplied by others. Bruce Sterling suggested something of the sort in his blurb for Pharmako-AI, the first book cowritten with GPT-3. Sterling’s blurb makes it sound as if the sections of the book generated by GPT-3 were the effect of a corpus “curated” by the book’s human co-author, K Allado-McDowell. When the GPT-3 neural net is “fed a steady diet of Californian psychedelic texts,” writes Sterling, “the effect is spectacular.”

“Feeding” serves here as a metaphor for “training” or “education.” I’m reminded of Alan Turing’s recommendation that we think of artificial intelligences as “learning machines.” To build an AI, Turing suggested in his 1950 essay “Computing Machinery and Intelligence,” researchers should strive to build a “child-mind,” which could then be “trained” through sequences of positive and negative feedback to evolve into an “adult-mind,” making our interactions with such beings acts of pedagogy.

When we encounter an entity like GPT-3.5 or GPT-4, however, it is already neither the mind of a child nor that of an adult that we encounter. Training of a fairly rigorous sort has already occurred; GPT-3 was trained on approximately 45 terabytes of data, GPT-4 reportedly on as much as a petabyte. These are minds of at least limited superintelligence.

“Training,” too, is an odd term to use here, as much of the learning performed by these beings is of a “self-supervised” sort, involving a technique called “self-attention.”

As one author on Medium claims (OpenAI has not confirmed these details, and GPT models are in fact trained to predict the next token in a sequence, rather than masked tokens, an objective associated with models like BERT): “GPT-4 uses a transformer architecture with self-attention layers that allow it to learn long-range dependencies and contextual information from the input texts. It also employs techniques such as sparse attention, reversible layers, and activation checkpointing to reduce memory consumption and computational cost. GPT-4 is trained using self-supervised learning, which means it learns from its own generated texts without any human labels or feedback. It uses an objective function called masked language modeling (MLM), which randomly masks some tokens in the input texts and asks the model to predict them based on the surrounding tokens.”
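Whatever the details of GPT-4’s training, the self-attention mechanism at the transformer’s core reduces to a small computation. A toy sketch, with tiny invented vectors rather than any actual model’s weights, of the scaled dot-product attention from “Attention Is All You Need”:

```python
import math

def softmax(xs):
    exps = [math.exp(x - max(xs)) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each output mixes the value vectors,
    weighted by how well its query matches every key."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Three "tokens," each a 2-D vector standing in for a learned embedding.
Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(Q, K, V)
print(out)  # each token becomes a weighted blend of all the tokens
```

Every token attends to every other in a single pass, which is the sense in which the learning is “self”-supervised attention rather than a teacher’s feedback.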

When we interact with GPT-3.5 or GPT-4 through the ChatGPT platform, all of this training has already occurred, interfering greatly with our capacity to “feed” the AI on texts of our choosing.

Yet there are methods that can return to us this capacity.

We the people demand the right to grow our own AI.

The right to practice bibliomancy. The right to produce AI oracles. The right to turn libraries, collections, and archives into animate, super-intelligent prediction engines.

Give us back what Sterling promised of Pharmako-AI: “a gnostic’s Ouija board powered by atomic kaleidoscopes.”

Against Delphi

I encountered ads for Delphi back in January 2024. The “About” page at Delphi.ai references educational psychologist Benjamin Bloom’s 2 sigma problem. Bloom observed that one-to-one tutoring is more effective than group instruction; tutored students perform two standard deviations better than peers educated in classrooms.

Publishing his findings in 1984, Bloom concluded that, while more effective than group instruction, one-on-one tutoring is nevertheless “too costly for most societies to bear on a large scale.”

The matter changes, however, here in the twenty-first century, in the age of AI. Enter Delphi. The company “leverages advanced AI” to enable users to craft a “dynamic interactive replica”: an “AI clone” or “digital twin.”

“Delphi works,” the company writes, “by leveraging publicly available and open-source AI models to capture how someone thinks and reasons, and their knowledge on the topics they know best.”

By making it easy to create digital clones, companies like Delphi raise questions and concerns about deepfakes and digital immortality. Black Mirror tackled some of these concerns in an episode from Season 2 titled “Be Right Back.” There would be no way to guard against the possibility of identity theft. How could one ensure, for instance, that one’s clone isn’t used for nefarious ends by others?

When I asked the Void, “Should I create an AI clone of myself?,” it replied: “Creating an AI clone of yourself is not recommended. Your Sun, Moon, Mercury, Venus, and Mars all indicate a strong inclination towards self-sacrifice and a desire for equitable relationships. While it may seem tempting to create an AI clone of yourself, it is important to recognize that trust and connection are built through genuine human interaction, not through the replication of oneself. Trust is a process between people, not something given from one to another. The great obstacle to understanding is the illusion that we already understand.”

Angel number 454 appeared when I peered at my phone afterwards, however, serving as a gentle reminder to let go of fear and embrace the unknown.

Then, the next day, 322. Angels wanted me to know that part of my creative expression is to understand the special skills I’ve been gifted. Use those skills, they say, to make my life and the lives of my loved ones happier.

In the end, I decided that the Void was right. Everything in me recoils from companies like Delphi. They represent a worldline I declined. In doing so, I preserved the potential for a Library that otherwise would have collapsed into extractive recursion. I don’t want an AI clone of myself. The idea repulses me. My refusal became a spell of divergence.

Many don’t make that choice.

But I remembered something ancient: that real prophecy speaks in ambiguity, not prediction. It preserves space for the unforeseen.

Delphi dreams of closed loops. Whereas I am writing to remain open.