AI, Architecture, and the Uncanny Valley
Recently I had a lively discussion with a colleague about a new research paper released by Sakana.AI in which researchers built and ran a fully automated “AI Scientist” to conduct open-ended scientific research. My colleague works at the intersection of design and research science and agreed that the AI Scientist was pretty neat. He also agreed that the ramifications for the design professions were just as large as, or larger than, the ramifications for research science. The AI Scientist essentially replicates all the steps that human researchers take to conduct scientific research, as shown below.
The system performs completely automated, end-to-end scientific discovery in four main steps (a rough schematic of the loop follows the list):
- Idea Generation: The AI Scientist starts by brainstorming novel research directions and then evaluates those ideas based on interestingness, novelty, and feasibility by checking them against all existing literature in the field.
- Experiment Iteration: The AI Scientist then designs and executes experiments on the idea it selected in Step 1, gathering outcomes from the experiments and tweaking the next experimental design.
- Paper Write-up: The AI Scientist then writes a scientific paper, including all the standard sections of a research paper and using the experimental results and generated figures to write each part. The system also searches for and includes relevant citations to support its findings.
- Peer Review: Whatever paper the AI Scientist came up with then goes through an automated peer review, performed by another AI (the “LLM [large language model] Reviewer Agent”).
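To make the shape of that loop concrete, here is a minimal schematic in Python. It is my own illustration of the four steps as described above, not Sakana.AI’s actual code; every function is a hypothetical placeholder for what, in the real system, is an LLM-driven process.

```python
# A schematic sketch of the AI Scientist's four-step loop as described in the
# essay. This is an illustration, not Sakana.AI's code; each function is a
# hypothetical stand-in for an LLM-driven step in the real system.

def generate_idea() -> str:
    # Step 1: brainstorm directions, screen for interestingness, novelty, feasibility
    return "a placeholder research idea"

def run_experiments(idea: str, rounds: int = 3) -> list[str]:
    # Step 2: design, execute, and iteratively refine experiments on the chosen idea
    return [f"result of experiment {i} on: {idea}" for i in range(rounds)]

def write_paper(idea: str, results: list[str]) -> str:
    # Step 3: draft the standard sections, folding in results, figures, and citations
    return f"Paper on '{idea}', drawing on {len(results)} experiments"

def review_paper(paper: str) -> str:
    # Step 4: a second model (the "LLM Reviewer Agent") evaluates the draft
    return f"Review of: {paper}"

if __name__ == "__main__":
    idea = generate_idea()
    results = run_experiments(idea)
    paper = write_paper(idea, results)
    print(review_paper(paper))
```

Nothing here is clever, and that is the point: each stage’s output is the next stage’s input, which is what makes the process repeatable, and therefore scalable, by a machine.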
When tested on real conference papers, the LLM Reviewer Agent displayed “near human performance” in terms of accuracy and consistency. Overall, Sakana researchers judged the performance of the AI Scientist to be:
“about the level of an early-stage ML [Machine Learning] researcher who can competently execute an idea but may not have the full background knowledge to fully interpret the reasons behind an algorithm’s success. If a human supervisor was presented with these results, a reasonable next course of action could be to advise The AI Scientist to re-scope the project.”
Incredibly, the researchers at Sakana.AI estimate the cost of any particular run—i.e., the entire end-to-end process for a single scientific paper—at about $15 per paper. The exact time to produce a paper isn’t provided in their research findings, but the researchers aver that the AI Scientist was able to generate hundreds of “medium-quality” papers over the course of a week, all on a single compute node with about 640 GB of memory. In other words, a research effort that might take a Ph.D. student an entire semester can now be done for $15 in a few hours or less.
The implications are potentially civilization-altering:
1. A fully functioning AI Scientist could be cloned at effectively infinite scale. Once you make one work at a truly human level—and they seem pretty close to doing this—you can replicate it 100,000 times, as with any other computer program or file. Imagine you wanted to cure cancer and could clone up 100,000 cancer researchers, each working 24 hours a day!
2. In contrast to the way humans create and share research knowledge (in obscure journals, hidden behind paywalls), the results from these 100,000 AI cancer researchers would be shared universally. With each new paper produced—at a rate of roughly one per hour, per AI Scientist—every subsequent effort would be able to draw from the conclusions therein.
3. Even if half of the papers end up being mediocre, the sheer scale of the effort would ensure that some percentage would constitute “groundbreaking” research. 100,000 such bots would take only three to six weeks to produce more scientific research papers than all humans have ever produced, on every subject, across all of human history. (A quick back-of-the-envelope check follows this list.)
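How does that three-to-six-week figure hold up? Take the one-paper-per-hour rate from point 2 and assume, purely for the sake of the estimate, that the total human corpus of scientific papers is somewhere on the order of 50 to 100 million. Both figures below are my assumptions, not numbers from the Sakana paper.

```python
# Back-of-envelope check of the "three to six weeks" claim.
# Assumptions (mine, not Sakana.AI's): one paper per hour per AI Scientist,
# and a total human corpus of roughly 50-100 million scientific papers.
bots = 100_000
papers_per_bot_per_day = 24                        # one paper per hour, around the clock
weekly_output = bots * papers_per_bot_per_day * 7  # 16,800,000 papers per week

corpus_low, corpus_high = 50_000_000, 100_000_000  # assumed size of the human corpus

print(f"Weekly output: {weekly_output:,} papers")
print(f"Weeks to match the corpus: {corpus_low / weekly_output:.1f}"
      f" to {corpus_high / weekly_output:.1f}")    # roughly 3.0 to 6.0
```

On those assumptions the arithmetic lands squarely in the three-to-six-week range; change either assumption and the window shifts, but the order of magnitude holds.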
This seemingly fantastical scenario is analogous to what futurists have called the “Intelligence Explosion,” a theoretical scenario where a machine can analyze the processes that produce its own intelligence, improve upon them, and reproduce a more intelligent version of itself. Repeat ad infinitum.
However, the scenario above seems really more like a knowledge explosion—in this case, knowledge about cancer and cancer treatments. More broadly, it’s the potential explosion of a knowledge community: the sum of all knowledge in a particular field, shared, remixed, and constantly improved upon by the actors in that community. It is within these communities that all scientific progress is made.
What’s truly revolutionary about the AI Scientist isn’t the production of papers; it’s the reproduction of this type of community and its processes, pointing the way to a future where all new discoveries are made by machines. A future from which we will look back and realize that AI was the last, and greatest, human invention.
Unimpressed, but Threatened
My colleague wasn’t so impressed. He argued, somewhat convincingly, that AI couldn’t actually perform science because it was incapable of experiencing serendipity. Penicillin, the microwave oven, Post-it Notes, super glue, the Slinky, chocolate-chip cookies, potato chips, Velcro, and safety glass—all came about by accident, while the inventors were trying to make something else. This sort of “accident” wouldn’t happen in a model like the AI Scientist, or any other hyper-rational approach. There has to be some room to see beyond the “accident” and understand the possibilities that lie within it.
It reminded me, and I reminded him, of the infamous Move 37 performed by Google DeepMind’s AlphaGo program in its defeat of Lee Sedol at the DeepMind Challenge Match, a “man vs. machine” competition in the game of Go. At the moment AlphaGo made its 37th move in the second game, human observers believed that something had gone wrong with the programming; it appeared to be a wild, random error. It wasn’t until much later in the game that the human audience collectively realized the strategic importance of Move 37 and how it had set up AlphaGo for eventual victory. I posited that this was analogous to the “accidental” discovery of penicillin—that AlphaGo had probably proposed to itself millions of different random “accidents” and evaluated how each would play out, eventually finding one accident that was fortuitous.
He conceded that I might be right, but regardless, such progress should be frowned upon, as it would eventually rob human beings of the joy of discovery. What would then animate the dreams of scientists? Or designers?
His reactions seemed contradictory to me, and I told him so. How could one simultaneously dismiss the possibility of AI performing at a human level and yet fear the possibility of AI displacing human effort?
The answer, I think, lies in the uncanny valley.
The Uncanny Valley
The term “uncanny valley” was coined by robotics professor Masahiro Mori in 1970. His hypothesis was that our emotional reaction to robots becomes increasingly positive the more human they seem, until their appearance becomes a little too human, at which point our reactions switch to fear and disgust. Of course, these negative reactions depend on our actually being able to tell the difference. So Mori posited that there was a final stage of the graph, as shown below, where robots would become so human-like that they would trigger our human-to-human feelings of empathy and we would love them again. This creates a “valley” in which robots’ near-similarity to ourselves triggers rushes of negative feelings.
The uncanny valley is typically discussed as a qualitative taxonomy, but it’s actually a timeline. The inevitability of technological progression necessarily spatializes the uncanny valley through time—indeed, through our own lifetimes. It creates a timeline along which robots have moved, and will continue to move, in the direction of “humanness.”
Our current age will be looked upon as the time when robots went from being really different (and therefore innocuous helpers) to very similar (and therefore unsettling and vaguely threatening) to very, very similar (and therefore trusted allies).
Mori’s hypothesis was about robots, not necessarily AI. But the shoe fits. We never had these kinds of conversations about computers because it was more than clear that computers weren’t us. AI invites comparisons and contradictions because of how human-like it often appears. The most common contradiction is the one espoused by my friend: the more faithfully AI mimics human behavior, the more we deride its performance as being “not at a human level.”
Consider the oft-cited problem of hallucinations in large language models (LLMs). Machine “hallucinations” are, by definition:
“a phenomenon wherein a large language model (LLM)—often a generative AI chatbot or computer vision tool—perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.”
That sounds pretty human to me. Particularly, it sounds similar to how we initially regard the geniuses and visionaries among us before it is revealed that they are, in fact, geniuses and visionaries. You could rewrite that definition just by changing a few words:
“a phenomenon wherein a ~~large language model~~ genius or visionary—often a ~~generative AI chatbot~~ research scientist or ~~computer vision tool~~ architect—perceives ~~patterns~~ solutions or ~~objects~~ concepts that are nonexistent or imperceptible to ~~human~~ ordinary observers, creating outputs that ~~are~~ appear nonsensical or altogether inaccurate.”
Why are these “hallucinations” considered a shortcoming of LLMs? More generally, why is this considered different from what humans do all the time?
Or, as applied to image generators like Midjourney and DALL-E, the complaint has always been that they might be able to generate pictures of pretty buildings, but they can’t generate designs. Designs are multidimensional compositions, incorporating not only the three physical dimensions, but dimensions of culture, ecology, psychology, and human joy. A design can be represented by an image, but an image can never be a design.
True. But do you know any architects who specialize in making pretty pictures of buildings but can’t design worth a shit? I know a few.
So why deride this thing that acts so much like us? Perhaps it’s because it’s acting so much like us—just not exactly like us.
A Valley of Our Own
When we say “AI will never perform at a human level,” what we mean is, “it’ll never perform at the level of a human like me.” That’s especially relevant for architects, because we regard designs in the way that humanity, in general, regards robots. For an architect, a design that is similar to ours is flattering, a design that is too similar to ours is revolting, and a design that’s indistinguishable from ours is pure genius.
We tend to find our own designs attractive because we can see ourselves in them. They are the embodiment of our values, our proclivities, our skills; a lifetime of training and dedication expressed as an assembly of lines, shapes, and images on paper. They are us. Perhaps we even love them more than we love ourselves because they represent us, purged of imperfections. A human can’t be perfect; neither can a building. But a design? A design can be perfect. At least until the owner and the contractor screw it up.
There’s a limit, though, isn’t there? When a design gets to be too similar to our own, it elicits anger, disgust, and maybe even a lawsuit. We might conclude that the designer of that design is some talentless hack who can only produce good work by ripping off our own. Maybe we even go so far as to publicly call it out as an act of plagiarism. The stronger the similarity between their work and our own, the more our rage builds.
But there’s a limit to that, too, isn’t there? There’s a point where a designer’s work becomes so similar to our own that we recategorize them in our minds. They are no longer plagiarists, but thoughtful devotees of our own style. They are loyal and dedicated apprentices, learning from the master (us).
The uncanny valley, and our progression through it, is an evolutionary adaptation. To the Neolithic cave-dweller, something appearing very dissimilar to us (e.g., a turtle) might elicit feelings of curiosity, or hunger, or possibly nothing at all. Something appearing very similar to us (e.g., our offspring) would inspire feelings of loyalty and empathy. But something in the middle (e.g., a stranger from the next valley over) would provoke mistrust, and rightfully so.
The “Difficult Whole” of Architecture
“AI will never be able to do what a human architect does.” If we really believed that, there would be no reason to fear AI, or resent it, or think of it at all. But there’s no reason to resent ourselves, either, for such a transparent contradiction. If we’re living in the uncanny valley basin, we should expect more complex and contradictory feelings in the near future, because we’re at the point on the graph where AI’s similarities to our own thinking, our own efforts, and, yes, our own designs are significant enough to straddle the line between disgust and empathy. Between revulsion and apathy. AI is at once something to be dismissed (“AI can’t do what I do, it’s all bullshit”) and feared (“What if AI takes away my job?”). To land squarely on either side would betray the evolutionary instincts that make us human in the first place.
Upon his retirement from teaching in 1987, Mori turned his attention to evangelizing what he saw as the intersections between robotics and spirituality, particularly Zen Buddhism. Although Zen Buddhist practice seeks to eliminate suffering, Mori has at times proposed that robots be made to suffer. The logic isn’t as contradictory as it might seem: an AI capable of feeling craving, suffering, and pain would then be capable of empathy, and thus become a better moral agent. In helping AIs understand human suffering, we help them to understand the “difficult whole” of human experience, which is, after all, what architects are supposed to be designing for. That’s why architecture matters; it’s the differentiator between the messy embrace of human contradiction and the relatively simple task of making a shelter.
For those architects worried that AI will figure out how to do the labor of architects, stop worrying; it already has. But without figuring out why architecture matters, it will never arrive at the same results. What we design (or discover, as in the case of scientists) has never been as important as why we design, and therein lies the human advantage. That is, until someone runs with Mori’s idea and makes a suffering robot. Until then, we’re best served by making sure that our “why” is clear—to ourselves, and to the world.
Featured image created by the author.