Early Days of AI


Cyning

AI technologies like GPTs and Stable Diffusion have made such exponential progress that content about them from just a few years ago is like a time capsule of a completely different era. What was a very limited and niche technology now pervades every facet of the internet. Here are Tom Scott and kliksphilip vids covering the ancestors of the AI slop generators we all know & love.



Post similar pre-boom content so we can all reminisce on better, more hopeful times.
 
I'm currently very lazily reading this:
The first systematic study of parallelism in computation by two pioneers in the field.

Reissue of the 1988 Expanded Edition with a new foreword by Léon Bottou

In 1969, ten years after the discovery of the perceptron—which showed that a machine could be taught to perform certain tasks using examples—Marvin Minsky and Seymour Papert published Perceptrons, their analysis of the computational capabilities of perceptrons for specific tasks. As Léon Bottou writes in his foreword to this edition, “Their rigorous work and brilliant technique does not make the perceptron look very good.” Perhaps as a result, research turned away from the perceptron. Then the pendulum swung back, and machine learning became the fastest-growing field in computer science. Minsky and Papert's insistence on its theoretical foundations is newly relevant.

Perceptrons—the first systematic study of parallelism in computation—marked a historic turn in artificial intelligence, returning to the idea that intelligence might emerge from the activity of networks of neuron-like entities. Minsky and Papert provided mathematical analysis that showed the limitations of a class of computing machines that could be considered as models of the brain. Minsky and Papert added a new chapter in 1987 in which they discuss the state of parallel computers, and note a central theoretical challenge: reaching a deeper understanding of how “objects” or “agents” with individuality can emerge in a network. Progress in this area would link connectionism with what the authors have called “society theories of mind.”
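For anyone who's never seen one, the perceptron itself fits in a dozen lines. Here's a toy Python sketch of the classic learning rule (my own illustration, not from the book), including the XOR case Minsky and Papert famously showed a single perceptron can't handle:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron rule: nudge the weights whenever a sample is misclassified."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            w += lr * (target - pred) * xi
            b += lr * (target - pred)
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# AND is linearly separable, so the perceptron learns it...
w, b = train_perceptron(X, np.array([0, 0, 0, 1]))
print([int(xi @ w + b > 0) for xi in X])  # [0, 0, 0, 1]

# ...but XOR is not, which is the kind of limitation the book formalizes.
w, b = train_perceptron(X, np.array([0, 1, 1, 0]))
print([int(xi @ w + b > 0) for xi in X])  # never matches [0, 1, 1, 0]
```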
I was reading this paper and didn't understand some of it:
Deep reinforcement learning, applied to vision-based problems like Atari games, maps pixels directly to actions; internally, the deep neural network bears the responsibility of both extracting useful information and making decisions based on it. By separating the image processing from decision-making, one could better understand the complexity of each task, as well as potentially find smaller policy representations that are easier for humans to understand and may generalize better. To this end, we propose a new method for learning policies and compact state representations separately but simultaneously for policy approximation in reinforcement learning. State representations are generated by an encoder based on two novel algorithms: Increasing Dictionary Vector Quantization makes the encoder capable of growing its dictionary size over time, to address new observations as they appear in an open-ended online-learning context; Direct Residuals Sparse Coding encodes observations by disregarding reconstruction error minimization, and aiming instead for highest information inclusion. The encoder autonomously selects observations online to train on, in order to maximize code sparsity. As the dictionary size increases, the encoder produces increasingly larger inputs for the neural network: this is addressed by a variation of the Exponential Natural Evolution Strategies algorithm which adapts its probability distribution dimensionality along the run. We test our system on a selection of Atari games using tiny neural networks of only 6 to 18 neurons (depending on the game's controls). These are still capable of achieving results comparable---and occasionally superior---to state-of-the-art techniques which use two orders of magnitude more neurons.
I bought the book because I figured starting from a few decades ago would be a good place to correct this ignorance. It's a pretty good book so far, although I don't believe it's going to help me understand the paper necessarily.
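My loose reading of the growing-dictionary part, as a toy Python sketch (definitely not the paper's actual algorithm; the threshold and names are made up): when an observation isn't covered well enough by the existing codebook, it simply gets added as a new entry.

```python
import numpy as np

class GrowingVQEncoder:
    """Toy vector quantizer that grows its dictionary when an observation is poorly covered."""

    def __init__(self, novelty_threshold=0.5):
        self.dictionary = []                    # list of 1-D codebook entries
        self.novelty_threshold = novelty_threshold

    def encode(self, obs):
        obs = np.asarray(obs, dtype=float)
        if not self.dictionary:
            self.dictionary.append(obs)
            return 0
        # distance from obs to every existing entry
        dists = [np.linalg.norm(obs - d) for d in self.dictionary]
        if min(dists) > self.novelty_threshold:
            # nothing in the codebook covers this observation: grow the dictionary
            self.dictionary.append(obs)
            return len(self.dictionary) - 1
        return int(np.argmin(dists))

enc = GrowingVQEncoder()
for frame in np.random.rand(10, 4):             # stand-in for preprocessed game frames
    enc.encode(frame)
print(f"dictionary grew to {len(enc.dictionary)} entries")
```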
 
Obligatory:
Grey got damn near everything spot on in this video from 2014 with regard to creative works being automated, from music to artistry.
 
I'd recommend searching for "AI winter", particularly in historical journal and conference papers, if you're interested in how the previous hype cycle played out.
 
The very first ML tool I was running locally was ESRGAN upscaling. It still has its uses today since many of the models made for it still hold up, like all the 1x models for removing lossy compression artifacts from JPEG images, DXT textures and so on, as well as the Foolhardy Remacri 4x upscaler that still gets used in Stable Diffusion workflows.

Though if we're talking about old school ML shit, even before AI became a buzzword:
Remember the granddaddies of LLM chatbots? This tech has been around for way longer than you think; it's just that it only got exponentially better at the turn of this decade.
 
Those weren't really machine learning. They just relied on massive databases of canned responses and had scripts to slightly edit them to fit the context of the conversation.
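Roughly the kind of thing those old bots did, as a tiny Python sketch (the patterns here are invented for illustration): match a keyword, grab a canned template, and splice some of the user's own words back in.

```python
import random
import re

# Tiny ELIZA-style rule table: keyword pattern -> canned response templates.
RULES = [
    (re.compile(r"\bi feel (.+)", re.I), ["Why do you feel {0}?", "How long have you felt {0}?"]),
    (re.compile(r"\bi am (.+)", re.I),   ["Why are you {0}?", "Do you enjoy being {0}?"]),
    (re.compile(r"\bbecause\b", re.I),   ["Is that the real reason?", "What other reasons come to mind?"]),
]
FALLBACKS = ["Tell me more.", "I see. Please go on.", "Interesting. Why do you say that?"]

def reply(user_text: str) -> str:
    """Pick the first matching rule and splice the captured text into a canned template."""
    for pattern, templates in RULES:
        match = pattern.search(user_text)
        if match:
            captured = match.groups()[0] if match.groups() else ""
            return random.choice(templates).format(captured.strip(" .!?"))
    return random.choice(FALLBACKS)

print(reply("I feel tired all the time"))  # e.g. "Why do you feel tired all the time?"
print(reply("Nice weather today"))         # falls through to a canned fallback
```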
 
Not that much different from LLMs if you think about it.
It's funny to me that some people think LLMs are some sort of general AI. A general intelligence would be able to learn similar to how we organic intelligences learn, but none of us had to be fed the entire corpus of human knowledge to learn how to communicate. Most people don't seem to know that even though it can talk sort of like a human, the silicon is doing something completely different from the neurons in our heads.
 
Some tracks at least partially composed by AI made in 2016:
From what I can tell, the lyrics are written and sung by a real human, but it's still impressive for AI to compose a passable tune in a specific style like that almost a decade ago. The commenters welcoming our new robot overlords didn't know how right they really were.
 
It's funny to me that some people think LLMs are some sort of general AI. A general intelligence would be able to learn similar to how we organic intelligences learn, but none of us had to be fed the entire corpus of human knowledge to learn how to communicate. Most people don't seem to know that even though it can talk sort of like a human, the silicon is doing something completely different from the neurons in our heads.
ChatGPT, Stable Diffusion, Claude, Midjourney, etc. are all downstream of a single paper called "Attention Is All You Need", which defined the modern transformer architecture with its tokenization and self-attention mechanism.
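If anyone wants the gist of self-attention from that paper, it mostly boils down to Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Rough NumPy sketch (single head, no learned projections, shapes simplified):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V -- the core operation of the transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # how much each token attends to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the key dimension
    return weights @ V                                   # weighted mix of the value vectors

# 4 tokens, 8-dimensional embeddings
tokens = np.random.randn(4, 8)
out = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape)  # (4, 8): each token becomes a context-aware mix of all tokens
```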

Combined with macroeconomic headwinds during the global COVID pandemic and a rapid flight of capital into companies like OpenAI, this is what birthed the current machine learning boom.

We are not anywhere close to artificial general intelligence. LLMs, while impressive, are not doing any actual reasoning. If you feed a transformer the entirety of the Internet and more, it's not actually that surprising that you can get cohesive responses. Likewise for text-to-image models: if you feed a transformer the entirety of Google Images with accurate text descriptions, it's not that surprising you can get cohesive image generation.

A more accurate way to understand LLMs is "grep on crack". The models do not have any understanding of the output they give; they are token predictors. When you type a sentence into ChatGPT, it is predicting the next word in a sequence, one token at a time. And that is the fundamental problem with LLMs: they are black boxes.
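That's literally the whole loop: predict a distribution over the next token, sample one, append it, and feed the longer sequence back in. Toy sketch below; `fake_model` is a stand-in for a trained transformer, not a real API.

```python
import numpy as np

def generate(model, prompt_tokens, steps=20, temperature=1.0):
    """Autoregressive decoding: the model only ever predicts the next token, one at a time."""
    tokens = list(prompt_tokens)
    for _ in range(steps):
        logits = model(tokens)                   # scores for every word in the vocabulary
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        next_token = int(np.random.choice(len(probs), p=probs))
        tokens.append(next_token)                # the prediction becomes part of the prompt
    return tokens

# Stand-in "model": random logits instead of a trained transformer.
VOCAB_SIZE = 50
rng = np.random.default_rng(0)
fake_model = lambda tokens: rng.normal(size=VOCAB_SIZE)

print(generate(fake_model, prompt_tokens=[3, 17, 8], steps=5))
```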

Anyone interested in understanding how LLM models work should watch this video btw.
 
The recent advancements in AI are pretty interesting and fun to mess with, but I still haven't found it to be useful for any large/ambitious projects, and it's going to be a long time before it starts replacing humans en masse. As I understand it, one of the biggest problems right now is context window size (which you can think of as analogous to "working memory" in humans) and the lack of better long-term memory solutions. RAG (Retrieval-Augmented Generation), while useful in some contexts, is insufficient for serious projects: it relies too much on imperfect semantic search algorithms, and still requires that info to be fed into context. Even Gemini 2.5's alleged 2M token context window (which is 10x the size of the more mainstream models) isn't enough. For one, AI suffers from recency bias (newer tokens are prioritized, and older tokens become "fuzzy" in memory even if they're still technically in context), and for another, context scales quadratically in terms of VRAM requirements, so we're very close to hitting a brick wall of what's feasible to maintain.
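To put rough numbers on the quadratic part: naive attention keeps one score per token pair, so doubling the context quadruples that matrix. Back-of-the-envelope sketch (fp16 scores and 32 heads per layer are my assumptions; optimizations like FlashAttention trade this memory for recomputation, but the pairwise cost is still there):

```python
# Naive attention stores one score per token pair, per head, per layer:
# memory grows with the square of the context length.
BYTES_PER_SCORE = 2   # fp16
NUM_HEADS = 32        # assumed, varies by model

for context in (8_000, 128_000, 1_000_000, 2_000_000):
    matrix_bytes = context ** 2 * BYTES_PER_SCORE * NUM_HEADS
    print(f"{context:>9,} tokens -> ~{matrix_bytes / 1e9:,.0f} GB of attention scores per layer")
```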

Until something revolutionary is invented to overcome this limitation, you can forget about using it for automating huge programming projects, complex 3D animations, videos longer than a minute, and so on. The reason being that you're inevitably going to end up with Escher-ification of the output. What I mean by that is: think of the famous "Impossible Trident," or other impossible optical illusions. With a limited context window, the AI can start out making something coherent (e.g. the right side of the fork) but as it moves left it "forgets" what's on the right side, and ends up hallucinating the left end, such that it becomes disjointed and unrealistic.

Thus, even if you can create small snippets of good, local detail, humans are still very necessary to clean up the mess and piece things together. When AI is capable of maintaining globally coherent mental models, this will surely change.
