April 24, 2023
Artificial Natural Language and Nature Continued
Some people have suggested that future-watchers have a
tendency to overestimate the immediate impacts of new
technologies and to underestimate their longer-term effects.
If this is true, how is it significant? Last
week I considered some basic objections to the
unregulated deployment of rapidly-evolving LLM text
generators - but we can get even more pessimistic.
Despite the obvious risks, some people will still think that
LLM text generators process language like actual human minds
do. Some will even think that LLM text generators are (or
will eventually be) "sentient". Others will think this is
"just another technology with both good and bad potential
uses". While such flawed thinking might not cause
individuals any obvious immediate personal harm, emerging
collective harm is another matter.
Ever-increasing numbers of humans on the planet result in
ever-increasing social and technological complexity.
Inevitably, the more people there are rushing around, the
more everything gets complicated. Meanwhile, the
sophistication in understanding, analysis and control needed
to match that growing socio-technological complexity clearly
isn't developing. In fact, dogmatic oversimplification seems
to be accumulating. Blanket rejection of "regulation" is
common - even though most people accept the need for traffic
lights. LLM text generators trained on this body of
oversimplification won't be much help, and humans trained on
their output could be further handicapped.
Humans are already trained to think as briefly and
superficially as possible. Anything more has been made to
seem like a waste of time and energy, or worse. The result
is a lot of junk food for the mind - infotainment designed
to be addictive. It doesn't have to be that way. We can do
our own evidence-based thinking. Nobody can stop us. We can
learn enough to spot the bias and spin of celebrity
commentators. If we gather information widely enough to test
claims continuously, we can think deeply enough and widely
enough to understand how conspiracy mongers and bullshit
artists try to exploit us. And how they get us to exploit
ourselves - which happens when "freedom" essentially means
the freedom to be ignorant.
The fact that LLM text generators - among other technologies
- have been released with no framework for regulation is
very significant. We wouldn't want to stifle innovation or
cause any competitive disadvantage, would we? Social
concerns face the same struggles against financial profit as
environmental concerns. Of course there will be some
"positive" uses of text and image generators. "Advantages"
of this technology will be highly hyped, while the dangers
will be largely overlooked or dismissed. Critics may be
disparaged as "Luddites", or at least "anti-progress".
Oversimplification works.
If continued "improvement" in these technologies mostly
means they will seem more convincing, will they just be
producing more convincing falsehoods? The more people become
mesmerised by text, chat, and image generators, and the more
fake texts and videos destroy trust in documentation, the
more people will lose touch with reality. This will likely
diminish their understanding of their place in nature. Could
they eventually come to think they don't need any nature at
all?
April 17, 2023
Artificial Natural Language and Nature
GPT text generators are currently much in the news.
So-called Large Language Model (LLM) based "AI" text
generators have been let loose on the internet and will
likely be used by corporations, agencies and political
actors to replace, or "augment", some human sources of
text. Considering our history, particularly with "social
media", these technologies may well be employed in ways
that greatly amplify the misuse of language. And if
pictures are still worth a thousand words, the
image-generating versions of these technologies will also
likely be misused.
"Artificial Intelligence" sure looks like an oxymoron.
LLMs know nothing of truth, or reality - they are just
designed to output the result of word association
probabilities calculated algorithmically in response to
a query. As text generators, they cannot actually "know"
anything. For input, they have been given an immense
body of training text which inevitably includes much
misinformation. Who knows what percentage of the
training text includes lies. LLMs seem both error-prone
and convincing, which is not a good combination if the
goal is to understand reality.
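To make "word-association probabilities" concrete, here is
a toy sketch in Python - a bigram model vastly simpler than
any real LLM, with a three-sentence corpus invented for
illustration - showing how a purely statistical text
generator can reproduce a falsehood from its training text
without any notion of truth:

import random
from collections import defaultdict

# Tiny invented training corpus. Note the false claim: the
# model cannot distinguish it from the true statements - it
# only counts which word follows which.
corpus = ("the sun is a star . "
          "the moon is a rock . "
          "the moon is made of cheese .").split()

# Record every observed continuation of each word.
follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def generate(start, max_words=8):
    """Emit words by sampling from observed continuations.
    Truth is never consulted - only frequency of association."""
    word, out = start, [start]
    for _ in range(max_words):
        word = random.choice(follows[word])
        out.append(word)
        if word == ".":
            break
    return " ".join(out)

print(generate("the"))  # may well print: the moon is made of cheese .

Run this a few times and it will sometimes assert that the
moon is made of cheese, with exactly the same mechanical
confidence it gives the true sentences. A real LLM
conditions on far more context using billions of
parameters, but it shares the essential property:
probability of association, not truth, determines the
output.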
Timber companies might call a clearcut in a forest a
"harvest", and companies hyping LLMs are calling errors
and misinformation from their products "hallucinations".
This is an example of a euphemism that is also an
anthropomorphism. Just as some people ascribe human-like
thoughts to other creatures, and even to trees in some
cases, LLMs are now described as "thinking" and even
"feeling" like humans. The reverse also crops up when
commentators imagine that human minds function like
LLMs. This tendency to "mechanopomorphise" ourselves is
revealing, and continues a tradition of comparing the
brain to a computer and cognition to computation.
If someone says an LLM text generator didn't give the
right answer and instead just "made stuff up", that
language implies agency where there is none. The model
simply output erroneous text as a consequence of the
probabilities encoded in its structure. It can't "make
things up"; it can only output recombinations of the
text it has been exposed to.
LLMs don't experience the real world. They can't touch a
tree. Any text that describes touching a tree is not at
all like actually touching a tree. Text describing the
smells of a forest is not at all like smelling a forest.
Human minds develop from embodied experiences with the
real world. LLMs should not be metaphorically compared
to a human mind, no matter how similar their generated
text might look to text created by human minds.
Thinking that LLMs function "like human minds" may seem
to add some power to the hype, but it is actually
circular anthropomorphism: LLM designers make guesses
about how humans "process" language, then try to code
something like those "processes" into their algorithms,
and then, confronted with the output, some people think
"yes, that's the same way humans do language!" But it
really only reflects how LLM designers have learned to
think humans use language.
At some point, perhaps, the internet text used to train
LLMs may consist largely of text generated by LLMs. At
that point the circle will have closed in on itself, and
much damage will have been done.