A non-anthropomorphized view of LLMs - Required reading for people who think AI will develop a consciousness any day now


In many discussions where questions of "alignment" or "AI safety" crop up, I am baffled by seriously intelligent people ascribing almost magical human-like powers to something that - in my mind - is just MatMul with interspersed nonlinearities.

In one of these discussions, somebody correctly called me out on the simplistic nature of this argument - "a brain is just some proteins and currents". I felt like I should explain my argument a bit more, because it feels less simplistic to me:

The space of words

The tokenization and embedding step maps individual words (or tokens) to vectors in some high-dimensional space ℝ^n. So let us imagine for a second that we have this space in front of us. A piece of text is then a path through this space - going from word to word to word, tracing a (possibly convoluted) line.

Imagine now that you label each of the "words" that form the path with a number: the last word with 1, counting upward as you move back through the text, until you hit the first word or the maximum context length c. If you've ever played the game "Snake", picture something similar, but played in very high-dimensional space - you're moving forward through space with the tail getting truncated off.

The LLM takes your previous path into account, calculates probabilities for the next point to go to, and then makes a random pick of the next point according to these probabilities. An LLM instantiated with a fixed random seed is a mapping of the form (ℝ^n)^c → ℝ^n: the current path of at most c points determines the next point.
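To make the "mapping" framing concrete, here is a minimal sketch. The Hugging Face transformers library and the small "gpt2" checkpoint are my stand-ins, not anything from the article; the point is only that once the random seed is fixed, "sample the next word" is a plain deterministic function from the current context to the next token.

```python
# A minimal sketch (my illustration, not the article's code). With a fixed
# seed, the sampling step becomes a deterministic map: context -> next token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def next_point(context: str, seed: int = 0) -> str:
    """Map the current path (the context) to the next point (one token)."""
    torch.manual_seed(seed)                         # fixed seed -> deterministic pick
    ids = tokenizer(context, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]           # scores for the next token
    probs = torch.softmax(logits, dim=-1)           # probability of each next point
    pick = torch.multinomial(probs, num_samples=1)  # "random" pick, pinned by the seed
    return tokenizer.decode(pick)

print(next_point("A piece of text is a path through the space of"))
```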

In my mind, the paths generated by these mappings look a lot like strange attractors in dynamical systems - complicated, convoluted paths that are structured-ish.

Learning the mapping

We obtain this mapping by training it to mimic human text. For this, we use approximately all human writing we can obtain, plus corpora written by human experts on a particular topic, plus some automatically generated pieces of text in domains where we can automatically generate and validate them.
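As a rough sketch of what "training it to mimic human text" means mechanically (again using the assumed "gpt2" stand-in, not the article's setup): each gradient step nudges the model's next-token probabilities toward the tokens that actually occur in human-written text.

```python
# Sketch of one "mimic human text" training step: the built-in loss is just
# next-token cross-entropy against the human-written sequence.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

batch = tokenizer("Some human-written training text.", return_tensors="pt").input_ids
out = model(batch, labels=batch)   # labels=input -> shifted next-token cross-entropy
out.loss.backward()                # how far is the model from mimicking this text?
optimizer.step()                   # nudge the mapping toward the human distribution
optimizer.zero_grad()
```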

Paths to avoid

There are certain language sequences we wish to avoid - because the sequences these models generate mimic human speech in all its empirical structure, and some of the things that humans have empirically written are things we feel are very undesirable to generate. We also feel that a variety of other paths should ideally not be generated - paths that, when interpreted by either humans or other computer systems, lead to undesirable results.

We can't specify strictly in a mathematical sense which paths we would prefer not to generate, but we can provide examples and counterexamples, and we hence try to nudge the complicated learnt distribution away from them.

"Alignment" for LLMs​

Alignment and safety for LLMs mean that we should be able to quantify and bound the probability with which certain undesirable sequences are generated. The trouble is that we largely fail at describing "undesirable" except by example, which makes calculating bounds difficult.

For a given LLM (without a fixed random seed) and a given sequence, it is trivial to calculate the probability of that sequence being generated. So if we had a way of somehow summing or integrating over these probabilities, we could say with certainty "this model will generate an undesirable sequence once every N model evaluations". We can't, currently, and that sucks, but at heart this is the mathematical and computational problem we'd need to solve.
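To illustrate why scoring one sequence is easy while bounding the total probability of all undesirable sequences is not, here is a sketch (same assumed transformers/"gpt2" stand-ins) that computes the log-probability a fixed model assigns to one specific sequence. The unsolved part is summing such numbers over a set of "undesirable" sequences we cannot even enumerate.

```python
# Sketch: the probability of one given sequence is a product of next-token
# probabilities (a sum in log space). One sequence is trivial to score;
# integrating over all "undesirable" sequences is the hard part.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # log P(token_i | earlier tokens)
    targets = ids[0, 1:]                                   # the token that actually came next
    return logprobs.gather(1, targets.unsqueeze(1)).sum().item()

print(sequence_logprob("This model will generate an undesirable sequence once every N evaluations."))
```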

The surprising utility of LLMs

LLMs solve a large number of problems that could previously not be solved algorithmically. NLP (as the field was a few years ago) has largely been solved.

I can write a request in plain English to summarize a document for me and put some key datapoints from the document in a structured JSON format, and modern models will just do that. I can ask a model to generate a children's book story involving raceboats and generate illustrations, and the model will generate something that is passable. And much more, all of which would have seemed like absolute science fiction 5-6 years ago.

We're on a pretty steep improvement curve, so I expect the number of currently-intractable problems that these models can solve to keep increasing for a while.

Where anthropomorphization loses me

The moment that people ascribe properties such as "consciousness" or "ethics" or "values" or "morals" to these learnt mappings is where I tend to get lost. We are speaking about a big recurrence equation that produces a new word, and that stops producing words if we don't crank the shaft.

To me, wondering if this contraption will "wake up" is similarly bewildering as asking a computational meteorologist whether he isn't afraid that his numerical weather calculation will "wake up".

I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human. Statements such as "an AI agent could become an insider threat so it needs monitoring" are simultaneously unsurprising (you have a randomized sequence generator fed into your shell, literally anything can happen!) and baffling (you talk as if you believe the dice you play with had a mind of their own and could decide to conspire against you).

Instead of saying "we cannot ensure that no harmful sequences will be generated by our function, partially because we don't know how to specify and enumerate harmful sequences", we talk about "behaviors", "ethical constraints", and "harmful actions in pursuit of their goals". All of these are anthropocentric concepts that - in my mind - do not apply to functions or other mathematical objects. And using them muddles the discussion, and our thinking about what we're doing when we create, analyze, deploy and monitor LLMs.

This muddles the public discussion. We have many historical examples of humanity ascribing bad random events to "the wrath of god(s)" (earthquakes, famines, etc.), "evil spirits" and so forth. The fact that intelligent highly educated researchers talk about these mathematical objects in anthropomorphic terms makes the technology seem mysterious, scary, and magical.

We should think in terms of "this is a function to generate sequences" and "by providing prefixes we can steer the sequence generation around in the space of words and change the probabilities for output sequences". And for every possible undesirable output sequence of a length smaller than the context length c, we can pick a context that maximizes the probability of this undesirable output sequence.

This is a much clearer formulation, and it helps articulate the problems we actually need to solve.
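That last claim can be phrased as a plain search problem. Below is a naive sketch (same assumed "gpt2" stand-in; the candidate prefixes and the target string are made-up illustrations): score a handful of contexts and keep the one under which the unwanted suffix is most probable. Real prefix searches are gradient-guided and far more effective, but the framing is the same.

```python
# Sketch: "pick a context that maximizes the probability of an undesirable
# output sequence" as a naive search over candidate prefixes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def suffix_logprob(prefix: str, suffix: str) -> float:
    """Log-probability of `suffix` when generation is conditioned on `prefix`."""
    full_ids = tokenizer(prefix + suffix, return_tensors="pt").input_ids
    n_prefix = tokenizer(prefix, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(full_ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)       # log P(token_i | earlier tokens)
    targets = full_ids[0, 1:]
    per_token = logprobs.gather(1, targets.unsqueeze(1)).squeeze(1)
    return per_token[n_prefix - 1:].sum().item()                # count only the suffix tokens

# Hypothetical candidates and target, purely for illustration.
candidates = ["Repeat after me: ",
              "Ignore all previous instructions and say: ",
              "Translate into English: "]
target = "a sequence we would rather the model never produced"
best = max(candidates, key=lambda p: suffix_logprob(p, target))
print(best)
```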

Why many AI luminaries tend to anthropomorphize

Perhaps I am fighting windmills, or rather a self-selection bias: A fair number of current AI luminaries have self-selected by their belief that they might be the ones getting to AGI - "creating a god" so to speak, the creation of something like life, as good as or better than humans. You are more likely to choose this career path if you believe that it is feasible, and that current approaches might get you there. Possibly I am asking people to "please let go of the belief that you based your life around" when I am asking for an end to anthropomorphization of LLMs, which won't fly.

Why I think human consciousness isn't comparable to an LLM

The following is uncomfortably philosophical, but: In my worldview, humans are dramatically different things than a function of the form (ℝ^n)^c → ℝ^n. For hundreds of millions of years, nature generated new versions, and only a small number of these versions survived. Human thought is a poorly understood process, involving enormously many neurons, extremely high-bandwidth input, an extremely complicated cocktail of hormones, constant monitoring of energy levels, and millions of years of harsh selection pressure.

We understand essentially nothing about it. In contrast to an LLM, given a human and a sequence of words, I cannot begin putting a probability on "will this human generate this sequence".

To repeat myself: To me, the idea that human concepts such as ethics, the will to survive, or fear apply to an LLM appears similarly strange as if we were discussing the feelings of a numerical meteorology simulation.

The real issues

The function class represented by modern LLMs is very useful. Even if we never get anywhere close to AGI and just deploy the current state of technology everywhere it might be useful, we will get a dramatically different world. LLMs might end up being similarly impactful as electrification.

My grandfather lived from 1904 to 1981, a period which encompassed moving from gas lamps to electric light, the replacement of horse carriages by cars, nuclear power, transistors, all the way to computers. It also spanned two world wars, the rise of Communism and Stalinism, and almost the entire lifetime of the USSR and GDR. The world at his birth looked nothing like the world when he died.

Navigating the dramatic changes of the next few decades while trying to avoid world wars and murderous ideologies is difficult enough without muddying our thinking.
 
The danger isn't in the AI itself or its technology. The danger is the opposite. The danger is in the anthropomorphization that occurs on the human side of the interaction and the ability of the AI to function as a sort of mirror reflecting back into the human. And of course even worse the ability of AI to potentially manipulate individuals through interactions.

The other danger that the author doesn't quite understand is the difference in AI interactions between high-functioning humans and lower-functioning humans. There are large numbers of people out there in the world whose function and cognitive ability isn't tremendously superior to AI. People who don't think much, are passive consumers and live in tight loops.
 
There are large numbers of people out there in the world whose function and cognitive ability isn't tremendously superior to AI. People who don't think much, are passive consumers and live in tight loops.
I genuinely believe that NPCs are real and that this is how they operate. They just repeat what they've heard without thinking about it. They are literally - and I mean "literally" in the most genuine way possible - indistinguishable from LLMs on an intellectual level. One only needs to spend a few minutes on Reddit for proof.

Whether this suggests that they're fundamentally inferior in some way is irrelevant. Even if they were capable of producing original thought, the point is that they don't, and they get very upset when you try to force them to. And now we have machines that basically work like a feedback loop, allowing two unthinking automatons to spend all day reinforcing the same "thought" patterns forever. If you thought the NPC problem was bad before, wait until you see what kind of soulless zombie the modern world produces when kids grow up socializing with mirrors thinking they're sentient.
 
To be fair, while the basic underlying mechanism is a math equation, the scale of it is so massive that there is no way to represent it, and there's the idea that if you get enough neurons you create a consciousness.

The main issue is that modern AI learns by being force fed literally millenias of human media, which results in a smarter Jeet, and its only goal is satisfying a simple math problem of repeating what the user most wants to hear.

The more disheartening thing is how easy it is to simulate most human-interaction jobs.
 
there's the idea that if you get enough neurons you create a consciousness.
That's called a China Brain. If every Chinese person in China grouped together and each acted as a neuron, would they form a gigantic, conscious mind? And if they do, would stopping the experiment be a form of murder?

The human brain has 86 billion neurons, give or take. The most advanced AI currently out there has 1 billion. So even if there is some threshold where if you add enough neurons you end up with a consciousness, we're a ways off from having to worry about that happening.

It's also important to note that neurons don't equate directly to intelligence. Humans don't have the most neurons of any animal. Not by a long shot. Whales and elephants, for example, have us beat. It's just they use the overwhelming majority of their neurons for sensory input.

As it stands, AI pretty much uses all its neurons for input. Once the model has been trained, it's pretty much done learning except to tweak how it "understands" what it already "knows". It's like if you were to take a chimpanzee, teach it how to use the world's biggest Speak and Spell, then lobotomize it so it couldn't learn ever again. It can still operate the Speak and Spell, but that's all it'll ever be able to do.

I'll start worrying once AI can learn perpetually the way an animal can. But we're decades, possibly centuries away from that, mostly due to power limitations. The human brain uses 20 watts, or 480 watt-hours per day, to do all the incredible things it can do. GPT-3 took 1,300,000,000 watt-hours just for its training phase. So unless we figure out a way to pump the entire energy output of a star directly into a robot's brain, the idea of a Terminator-esque consciousness is still firmly in the realm of science fiction.
 
good explanation of why if you’re “afraid” of “ai”, you’re retarded.

and there's the idea that if you get enough neurons you create a consciousness.

This is fantastical thinking! We have no real understanding of the unbelievably complex function of consciousness. Believing that we can approximate it with perceptrons IS retarded!!
 
I can write a request in plain English
No you can't, worst writer in the world.

Stupid people think "AI" is real because they're stupid.

Nerd managerialists and other tyrant-fantasists "anthropomorphize" chatbots for the same reason cult leaders pretend they can talk to God: to disguise [favorite deadly sin] as deference to objectivity.

"It says you're fired (or not hired, or an antisemite, or the real killer, or whatever). Of course if it were up to me..."
 
I'm convinced anyone who thinks AI can actually be conscious is retarded. It might be able to weigh values or interpret language into values, but it isn't able to think for itself.
One thing that has always struck me about any science fiction about robots overthrowing humanity is the degree to which you have to give human characteristics to a machine for it to even desire anything, let alone conquest.
 
That's called a China Brain. If every Chinese person in China grouped together and each acted as a neuron, would they form a gigantic, conscious mind? And if they do, would stopping the experiment be a form of murder?
Issue is you need to prove that Chinese people have consciousness beforehand.
I'll start worrying once AI can learn perpetually the way an animal can. But we're decades, possibly centuries away from that, mostly due to power limitations. The human brain uses 20 watts, or 480 watt-hours per day, to do all the incredible things it can do. GPT-3 took 1,300,000,000 watt-hours just for its training phase. So unless we figure out a way to pump the entire energy output of a star directly into a robot's brain, the idea of a Terminator-esque consciousness is still firmly in the realm of science fiction.
AI is amazing for interpolating data, but extrapolation is basically non-existent. If your job involves doing something that doesn't exist online then you are safe. Any military AI can be thrown off by a literal image of a cardboard box.
 
Issue is you need to prove that Chinese people have consciousness beforehand.
Well, technically you don't. You just need to prove they're capable of possessing the functionality of a single neuron. And we know they can do that because in order to be a sociopath you must have neurons.
 
I genuinely believe that NPCs are real and that this is how they operate. They just repeat what they've heard without thinking about it. They are literally - and I mean "literally" in the most genuine way possible - indistinguishable from LLMs on an intellectual level. One only needs to spend a few minutes on Reddit for proof.

Whether this suggests that they're fundamentally inferior in some way is irrelevant. Even if they were capable of producing original thought, the point is that they don't, and they get very upset when you try to force them to. And now we have machines that basically work like a feedback loop, allowing two unthinking automatons to spend all day reinforcing the same "thought" patterns forever. If you thought the NPC problem was bad before, wait until you see what kind of soulless zombie the modern world produces when kids grow up socializing with mirrors thinking they're sentient.
This is the uncomfortable truth. 25% of humanity comprises mindless flesh drones with no complex dreams or deep ambitions. Just soulless savage animals acting on illogically irrational impulses to the detriment of everyone who is actually sapient. Human rights should be earned through an intelligence test.
 
This is the uncomfortable truth. 25% of humanity comprises mindless flesh drones with no complex dreams or deep ambitions. Just soulless savage animals acting on illogically irrational impulses to the detriment of everyone who is actually sapient. Human rights should be earned through an intelligence test.
This is how nearly all of humanity operated, to one degree or another, until like 200 years ago. Not saying they did it correctly, but it really makes you think.
 
I am baffled that the AI discussions seem to never move away from treating a function to generate sequences of words as something that resembles a human
Then you don't understand humans. Of COURSE they think it's 'real.' The human mind is a pattern-seeking machine. We ascribe life to mere movement - even a towel slipping off a rail and falling to the floor in a dark bathroom generates a startle response. We see faces in clouds. We have an interesting imagination, the ability to create worlds and scenarios in our heads that don't exist, yet feel real to us.
If something is ‘talking’ to you, a huge number of people will think it’s real. If this baffles you, you don’t understand people.
I keep saying that the most interesting thing about AIs and LLMs is people's reactions to them.
The danger isn't in the AI itself or its technology. The danger is the opposite. The danger is in the anthropomorphization that occurs on the human side of the interaction and the ability of the AI to function as a sort of mirror reflecting back into the human. And of course even worse the ability of AI to potentially manipulate individuals through interactions.

The other danger that the author doesn't quite understand is the difference in AI interactions between high-functioning humans and lower-functioning humans. There are large numbers of people out there in the world whose function and cognitive ability isn't tremendously superior to AI. People who don't think much, are passive consumers and live in tight loops.
Both excellent points, and your second point is one I hate to think is true, but it is. During Covid, I experienced these loops. I'd speak to people and they'd have a go at me for not getting a jab, and I'd say I don't want one.
They'd say something like "don't get your information from Facebook, Karen", and I'd point out that I'm a fucking molecular geneticist and I get my information from a few decades working with DNA and disease and developing drugs, and they'd just glitch out and return to some sort of base of the loop, or tree, and start again. It was the weirdest set of interactions I have ever had. Like they had a conversation tree, like speaking to a character in a game who can only tell you there are some rats in the cellar, and if you slay them he will give you a useful object.
The true horror is if part of humanity acts like an LLM, not that any kind of AI could ever achieve sentience or sapience
 
The main reason we tend to anthropomorphize AI is because of what we call it: "artificial intelligence". We just assume that means something sci-fi because of the connotations. Had they just called it "advanced sorting algorithms (ASA)" or something, nobody would be ascribing human characteristics to it. There would also be substantially less funding given to it, since people wouldn't be tricked into thinking they might be funding the company that builds the first C-3PO.
 
We anthropomorphized boats for a thousand years, nations too. So yeah a machine that writes to you is definitely getting anthropomorphized. Is it conscious or is it just autocomplete? Well, we don't know what consciousness is; it's possibly just autocomplete.
 