ChatGPT - If Stack Overflow and Reddit had a child

Seethan Ralph · Jan 30, 2025

Just noticed this on shitter:

https://x.com/BrianRoemmele/status/1884829743893356616

Researchers recreated DeepSeek's core technology for just $30! A research team at the University of California, Berkeley, has reportedly recreated the core technology behind China’s advanced DeepSeek AI for just $30. This remarkably low-cost replication highlights a growing trend in AI development—while major tech companies produce impressive models, “garage” open source approaches significantly reduce the cost of building them. Under the leadership of Ph.D. candidate Jiayi Pan, the team successfully replicated DeepSeek R1-Zero’s reinforcement learning (RL) capabilities using a compact language model with just 3 billion parameters. Despite its smaller scale, the model demonstrated self-verification and search capabilities, allowing it to refine its own responses iteratively—key features of DeepSeek’s advanced AI.

Colon capital V · Jan 30, 2025

Seethan Ralph said:
Just noticed this on shitter:

https://x.com/BrianRoemmele/status/1884829743893356616

Link to Github page:

https://github.com/Jiayi-Pan/TinyZero

I'm not sure what most of this means or whether this 3B parameter is as smart or close to Deepseek's nearly 700B parameter model, but I guess as long as it's free and open to use it'll help actually give more people power and freedom choosing how to use AI in their lives.

Seethan Ralph · Jan 30, 2025

Colon capital V said:
I'm not sure what most of this means or whether this 3B parameter is as smart or close to Deepseek's nearly 700B parameter model, but I guess as long as it's free and open to use it'll help actually give more people power and freedom choosing how to use AI in their lives.

It means you don't need 500 gorillons of USD to train a reasoning model. 3B model is extremely scaled down, but if they have really managed to train it for 30 bucks, suddenly Deepseek's claims regarding their model's training cost aren't necessarily a mongoloid fantasies

Fae Supremacist · Jan 30, 2025

We really need a new OP for a general LLM thread, unless Altman gives a fat donation to keep the ChatGPT title on.

The Spoils of War · Jan 30, 2025

Colon capital V said:
I'm not sure what most of this means or whether this 3B parameter is as smart or close to Deepseek's nearly 700B parameter model, but I guess as long as it's free and open to use it'll help actually give more people power and freedom choosing how to use AI in their lives.

While larger models aren't necessarily better, any model with seven billion parameters or less is dumb. For comparison, 3B models can fit on a smartphone. You want at least 13 billion parameters.

Get the rope Macaulay! · Jan 30, 2025

Seethan Ralph said:
but if they have really managed to train it for 30 bucks, suddenly Deepseek's claims regarding their model's training cost aren't necessarily a mongoloid fantasies

at this point it really does make you wonder whats stopping multiple groups from having their own version of their 700B model and adjusting it a bit and having true free speech machines. Kiwifarms AI?

seriously whats stopping Null or someone else from just downloading a stupid 14b model and letting users on the farms chat with it?

Maurice Maine · Jan 30, 2025

Get the rope Macaulay! said:
at this point it really does make you wonder whats stopping multiple groups from having their own version of their 700B model and adjusting it a bit and having true free speech machines. Kiwifarms AI?

seriously whats stopping Null or someone else from just downloading a stupid 14b model and letting users on the farms chat with it?

It will get spammed to shit and they cost too much to run

Something's coming from OAI today. I wonder what it is.

AmpleApricots · Jan 30, 2025

o3/o3-mini most likely. If you watch this sphere this always happens, somebody releases a great model and then everyone rushes to dump whatever they have.

I'm honestly not sure what all this excitement is even really about. It's so early in the AI race, expect the guy at the front to change more often in the coming months and years, IMO. It doesn't mean much. If this is anything like the home/personal computer revolution (and there are really a lot of parallels) ten years down the road, most of these companies will be defunct and only the people who were really interested in all this will remember their names. Hell, there might be one day "retro AI enthusiasts" dissecting/manipulating/modernizing all these models with advanced tools we can't even imagine now. It's how the cookie crumbles.

OpenAI will probably still be utter dicks and not releasing anything into the public domain tho.

Get the rope Macaulay! · Jan 30, 2025

AmpleApricots said:
OpenAI will probably still be utter dicks and not releasing anything into the public domain tho.

someone brought it up on a subreddit, but its ironic that a non-profit only releases closed source and a fucking hedgefund gave us the biggest/most advanced open source model (so far, at this time)

i don't even know why someone would want to put AI on a computer the size of a credit card, but at minimum it keeps the rest of these fuckers on their toes. if Deepseek didn't do this, it would probably be even more years down the line before someone accomplished this. even the people downplaying this have to admit that because of this any dipshit with 6k to spent on a computer can put on their own computer a model that rivals what the fucks at OpenAI wouldn't let be released to the public last year.

It basically lights a fire under the asses of these companies and forces them to answer "why should i use your service when i can use Tommy dipshit's model for free without as much censorship and get an answer at least somewhat close to what you gave me" and just based on the download figures there's at least a half million people running this on their own shit and could very easily cut into private companies obnoxious $/per token madness. If this shit came out 2 months ago no way in hell would OpenAI have idiots paying $200/month for their bullshit. samething with Poe offering $100/month plans or Anthropic and their $20 a month only 5% of messages not refused madness.

I guarantee OpenAI wasn't going to release their o3 faggotry until December. this was their "in case of emergency" move so that they're still the big name in the technology and get the contracts.

Especially when it comes to AI this decade, people need to remember "this is the most expensive and slowest it will ever be"

We're not going to hit where we are with GPUs where even when we switch to a new generation the output per core/wattage is negligible. for a long fucking time.

Potatoherder · Jan 30, 2025

Mistral released a 'small' 24B model and it's pretty good. Overall it seems pretty damn good for 24B and seems like a prime target to finetune.

cybertoaster · Jan 31, 2025

Seethan Ralph said:
Just noticed this on shitter:

https://x.com/BrianRoemmele/status/1884829743893356616

Bubble about to implode.

ayanamirei said:
We really need a new OP for a general LLM thread, unless Altman gives a fat donation to keep the ChatGPT title on.

Already have one: https://kiwifarms.st/threads/ai-development-and-industry-general.176265/page-12

Don't know why everybody uses this thread for all AI news instead of just CGPT.

Maurice Maine · Jan 31, 2025

Deepseek claims CIA is taking their operations down.

Honestly I think Keffals could do it.

888Flux · Jan 31, 2025

hello trump, its altman. we need 5 billion dollars to count incest children. slava OpenAI

XL xQgg?QcQCaTYDMjqoDnYpG · Jan 31, 2025

888Flux said:
View attachment 6926726

Is the answer that Sally also has an disclosed amount of sons who have contributed a total of 1 granddaughter to Sally?

888Flux · Jan 31, 2025

XL xQgg?QcQCaTYDMjqoDnYpG said:
Is the answer that Sally also has an disclosed amount of sons who have contributed a total of 1 granddaughter to Sally?

Correct. I'm no LLM expert

but there has to be enough training data out there on riddles and the like that it should get this correct by now.

The Spoils of War · Jan 31, 2025

This is some damn good prose from DeepSeek v3.

GeorgeWashingtoff · Jan 31, 2025

It's been 2 years. How many jobs have been lost and how many planes crashed because of chatgpt?

Fae Supremacist · Feb 1, 2025

The absolute state.

It's not going to pass, but still funny to see.
https://www.hawley.senate.gov/hawle...american-ai-development-from-communist-china/

AmpleApricots · Feb 1, 2025

Men in Black said:
View attachment 6926933

This is some damn good prose from DeepSeek v3.

I've been wondering how to use these things in videogames in a reasonable way for about 1-2 years now.

For general reasoning about game logic, they do not perform well. I tried to daisychain older llama models into navigating a MUD like enviroment and they did not do a great job in general. It's much easier and more effective to do this with conventional programming, and it's a lot more efficent resource-wise to boot. They're also not being good at injecting random stuff into things because of the strong biases most models have. The smarter they are, the truer this actually seemed to get. R1 (which is sort of the parent of V3) is sort of the odd one out in that regard that it's both very smart and very "creative", able to come to wildly different conclusions in the same setting if you just prompt it to do so. I think deepseek actually trained it for roleplaying and creativity targetly, or at least didn't filter the datasets that go into those directions as hard.

Controling NPC dialog - kind of, some game mods already tried that in CK III and Skyrim and I think Fallout 4?! (at least the three I know of) I feel the problem here is that they'd quickly lose track and/or could easily talked by the player into immersion-breaking and weird outcomes.

I feel they would work really well in Solo RPG settings, as long as you keep track of the rules and just have the LLM write fluff or wordbuilding. I'd trust R1 and V3 to pull that off without getting too boring. But of course, that's something that needs a lot of your input and isn't for everyone.

Fae Supremacist · Feb 1, 2025

cybertoaster said:
Don't know why everybody uses this thread for all AI news instead of just CGPT.

Community Watch

Yeah it makes sense that it's unused, it's not cataloguing a demographic, it's discussing a topic.

ChatGPT - If Stack Overflow and Reddit had a child

Seethan Ralph

By Allah, you people are dogs!

Colon capital V

Loudest, biggest, most nuclear-size Brap above me

Seethan Ralph

By Allah, you people are dogs!

Fae Supremacist

Un jour je serai de retour près de toi

The Spoils of War

Get the rope Macaulay!

weird sperg

Maurice Maine

Sigh, Cry, Die

AmpleApricots

Get the rope Macaulay!

weird sperg

Potatoherder

cybertoaster

Chairman of the mammary regulation committee

Maurice Maine

Sigh, Cry, Die

888Flux

XL xQgg?QcQCaTYDMjqoDnYpG

lrhhtf oo uTinfiars oEs dto og

888Flux

The Spoils of War

GeorgeWashingtoff

Fae Supremacist

Un jour je serai de retour près de toi

AmpleApricots

Fae Supremacist

Un jour je serai de retour près de toi