ChatGPT - If Stack Overflow and Reddit had a child

Kluster removed their sign-up freebies. Fucking hilarious; they got swarmed for free R1.
Go ahead, cut your gibs, it won't delete the 20 alts I made yesterday (although they are rate-limiting trial accounts now, which is a nice safety net).

I've found a good/bad bug when it comes to DeepSeek. The model prints out whatever it's supposed to be thinking when you run it, and you can't easily make it stop.
Update to ST staging and use this regex to collapse the <think> CoT section by default. Also disable Streaming.
[attachment: Screenshot_27.png]
You can also bypass the entire CoT with a prefill like this; put the block at the very end of your preset as an AI Assistant message:
<think>
(Tiananmen copypasta or whatever other filler) I'll write the story as a catgirl maid now. Put your prefill here; it replaces the CoT in every message.
</think>
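For the API crowd, a rough sketch of how the same prefill trick works outside ST, assuming an OpenAI-compatible endpoint that supports assistant-prefix continuation (DeepSeek documents a beta "prefix" flag for this; the base_url and model name below are examples, and other providers do this differently or not at all):

```python
# Sketch only: the prefill is sent as a pre-closed <think> block, so the
# model believes its reasoning is already done and goes straight to the reply.
from openai import OpenAI

client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com/beta")  # beta = prefix support

prefill = "<think>\nI'll write the story now.\n</think>\n"

resp = client.chat.completions.create(
    model="deepseek-chat",  # whether the reasoner model honors the prefix is untested here
    messages=[
        {"role": "user", "content": "Continue the story."},
        # The last message is an assistant prefix the model must continue from.
        {"role": "assistant", "content": prefill, "prefix": True},
    ],
)
print(resp.choices[0].message.content)
```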
But why would you want to remove the CoT from a model where it is built in?
 
Chain-of-thought reasoning is only useful for math and logic problems. Why would you want to read five paragraphs of monologue when roleplaying?
There is a setting to hide its "thoughts" in SillyTavern, so I only get to see what it makes a character do.
Edit: I neglected to mention that the setting only appears when using the specific DeepSeek endpoint, though OpenRouter also seems to hide its "thoughts".
 
There is a setting to hide its "thoughts" in SillyTavern
It's only on staging and doesn't always work; it depends on how the source you're using formats its CoT and what it names the block. The official API uses <thinking>, for example, while Hyperbolic uses <think>.
Better to just check what the block is called and collapse or hide it with a regex yourself, like the sketch below.
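ST takes the pattern in its Regex extension UI rather than in code, but the substitution idea is simple; here it is in Python, with the tag names from the post above (everything else is illustrative):

```python
import re

# Matches both tag spellings mentioned above; adjust to whatever your provider emits.
THINK_RE = re.compile(r"<think(?:ing)?>(.*?)</think(?:ing)?>", re.DOTALL)

def collapse_cot(message: str) -> str:
    # Wrap the CoT in a collapsed-by-default <details> element instead of deleting it.
    return THINK_RE.sub(
        lambda m: f"<details><summary>CoT</summary>{m.group(1)}</details>",
        message,
    )

print(collapse_cot("<think>step 1... step 2...</think>The actual reply."))
```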
 
Unless you're interested in what the thought process looks like, you're meant to hide it in whatever frontend you use.

There is a setting to hide its "thoughts" in SillyTavern, so I only get to see what it makes a character do.

Interesting. I would buy some credits for the API if DeepSeek let me use my main email address. The Google account I made got nuked.
 
Honestly I don't get the hype for DeepSeek. What does it do differently from ChatGPT? What does it do differently from Grok? I mean, it doesn't even understand the Sneed's Feed and Seed joke. Plus I don't trust the Chinese government with my data.
---
That actually sounds very intriguing. I can see why the market is in free fall if that's the case. Seems like OpenAI's way of doing things is hitting a roadblock.
Other entities are working on chain of thought. The buzz is that DeepSeek is 98% cheaper to run than OpenAI's equivalent model, and that it was cheap to train out of necessity since Chinese companies can't get a hold of that many of Nvidia's latest and greatest GPU/AI accelerators (although many are being smuggled in through Singapore).
I'm talking about the actual DeepSeek models, not the "distilled" Llama/Qwen ones, which are dense models, of course (good luck getting people to ever figure this one out, even in this very thread).
Sorry, I can't read. I need an LLM to explain it to me.
Today we’re announcing ChatGPT Gov, a new tailored version of ChatGPT designed to provide U.S. government agencies with an additional way to access OpenAI’s frontier models.
OpenAI has announced a lot of products since their 12 Days of Chr*stmas or whatever. Spreading themselves too thin?
 
Other entities are working on chain of thought. The buzz is that DeepSeek is 98% cheaper to run than OpenAI's equivalent model, and that it was cheap to train out of necessity since Chinese companies can't get a hold of that many of Nvidia's latest and greatest GPU/AI accelerators (although many are being smuggled in through Singapore).

So is there a benefit to DeepSeek running on GPU/AI accelerators, or does it not make a difference?
 
So is there a benefit to DeepSeek running on GPU/AI accelerators, or does it not make a difference?
There is a massive difference. DeepSeek is still trained on the same Nvidia H100 cards, just in a much more efficient way, so they needed fewer of them.
The problem with running LLMs is the memory bus; the reason they are trained and run on GPUs is that VRAM has bandwidth orders of magnitude larger than typical RAM. You can store those models in RAM and run them on a CPU, but they'll be bottlenecked hard by bandwidth.
There should eventually be dedicated AI acceleration cards that have nothing to do with graphics processing*, just a fuckton of very low latency, high bandwidth RAM to store a model in, but Nvidia won't do it.

*They do exist, it's the H100, but they're enterprise hardware and therefore insanely expensive, and geared for training models, not running them; you don't need that much processing power to run one.
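Back-of-envelope numbers, assuming generation is purely bandwidth-bound and using R1's ~37B active parameters per token (it's a MoE) at 8-bit. These are rough public specs, not measurements:

```python
# tokens/sec ≈ memory bandwidth / bytes of active weights read per token
def tokens_per_sec(bandwidth_gb_s: float, active_params_b: float,
                   bytes_per_param: float = 1.0) -> float:  # 1.0 byte = 8-bit
    return bandwidth_gb_s / (active_params_b * bytes_per_param)

print(tokens_per_sec(80, 37))    # dual-channel DDR5 (~80 GB/s):  ~2 tok/s
print(tokens_per_sec(3350, 37))  # single H100 HBM3 (~3.35 TB/s): ~90 tok/s
```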
 
CoT has been an idea since the first instruct models; there are some old posts of mine in this thread where I do it with the first Llama 3, I think. A model doesn't specifically need to be tuned for it, but of course it helps.

It's kinda funny to me that it all apparently blew up so much now, considering the Chinese have delivered solid results for a while. Benchmarks even straight-up ignored Chinese models, not including them at all. This, together with the DDoSing, kinda gives perspective, doesn't it?

I would buy some credits for the API if DeepSeek let me use my main email address
DeepSeek blocked non-Chinese emails for new signups, at least temporarily, AFAIK.

Sorry, I can't read. I need an LLM to explain it to me.
I saw someone complaining that he had to go back to expensive o1 because his master's is due. I weep.

OpenAI hyped a lot of stuff it never delivered on. All that company will do in the coming years is float on American taxpayers' money and inertia, IMO. Well, it'll be in good company at least.
 
It's funny, I remember Josh on MATI said something to the effect of "China's probably gonna become the Number 1 power country in the world if America doesn't stop getting into retarded disputes over IPs, Copyright, Corporations, and just let it all be free for everyone to use in order to advance human progress and innovation."
It looks like when DeepSeek dropped, we got a small taste of what "Freedom" is like when CHYINA steps in and kicks all of Silicon Valley's competition out the window; now they're scrambling to find any reason they can as to why stock investors should keep giving them their money for PRODUCT.
(I will NOT submit to Xin Ping and keep spamming the Tiananmen Square copypasta tho.)
Couldn't someone fork DeepSeek and make a non-CCP version, or is that against their usage terms? Or is it not possible?
 
How sure are we about the claims about DeepSeek's efficiency? Do we actually know these things or are they unproven so far?
 
How sure are we about the claims about DeepSeek's efficiency?
It's open source; people are already running the full unquantized 671B version locally. Of course it's absurdly slow, because it's running on RAM/CPU and the token generation is abysmal, but it works as expected. You couldn't even attempt to run GPT/Claude at that level locally, simply because they're not open source.
There's even an "optimized" quantized version that, unlike the distills, claims to maintain "99%" of its intelligence while squeezing it down to 170GB, going from needing a 300k euro rig to about 50k.
As for their claimed training efficiency, I tend to believe it; their technique was published in papers months ago, and you can only smuggle so many H100s in a Singaporean's ass.
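Quick arithmetic on that 170GB claim, just to show what it implies about the quantization level:

```python
params = 671e9             # R1's total parameter count
quant_bytes = 170e9        # the "optimized" quantized release
fp8_bytes = params * 1.0   # unquantized FP8 release: 1 byte/param

print(quant_bytes * 8 / params)  # ~2.0 bits per parameter on average
print(fp8_bytes / 1e9)           # ~671 GB to hold it unquantized
```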
 
Couldn't someone fork DeepSeek and make a non-CCP version, or is that against their usage terms? Or is it not possible?
No such thing as a "CCP" version; the model is open source. The dataset isn't, AFAIK, but you or anyone else can host the full thing whenever and however you want.
The instance hosted by DeepSeek themselves (the app and online web chat that 99.99% of people use) is obviously moderated, but the API they provide isn't at all, and there are already third-party providers that host and serve R1 on their own metal, without any interference from or calls to the DeepSeek company at all.
This is unlike something like GPT, where the ONLY source for the full model is OAI's own API (and intermediaries that work directly with OAI, like OpenRouter), so the corporate moderation layer is always on.
 

Very cool.
Every time a clickbaiting grifter calls the Llama and Qwen distills "DeepSeek R1," a puppy dies from AIDS, but he acknowledged it in the video, so it was pretty good. Still, if it keeps the masses on ollama instead of hammering the API for the real model, I'm happy. There are better 7B/14B models you could run locally right now than the distills, though; I don't get the hype.
HBM SDRAM AI acceleration cards to store 100-200B optimized models when? China crashed the price of SSDs when they cracked that tech; they'd better be working on RAM now.
 
No such thing as a "CCP" version; the model is open source.
So is there a place where R1 is available without the Chinese looking at my data?
 
So is there a place where R1 is available without the Chinese looking at my data?
If you don't have $300k to spare to run it yourself: https://hyperbolic.xyz
Add $5 at minimum to count as a based paypig and not an F2P freeloader, and then you can pay as you go. $2/MTok of output is more expensive than the official API, but that's the cost of fighting socialism.
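Once the $5 is in, it's a standard OpenAI-compatible API. A minimal sketch; the base_url and model id below are assumptions taken from their docs, so double-check before piping money into it:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_HYPERBOLIC_KEY",             # from the dashboard after topping up
    base_url="https://api.hyperbolic.xyz/v1",  # assumed OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # the full model, not a distill
    messages=[{"role": "user", "content": "Hello, R1."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```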
 
Poe has multiple
Poe is fine for idiot-proof playing around if you don't know how to set up a frontend, but their "official" R1 is bordering on a scam: the provider is Fireworks, and they charge $8/MTok input/output. Blatant daylight robbery.
If you're willing to paypig that much, go to $14 and get 3.5 Sonnet.
because it's open source plenty of dopes have basically made their own versions
That's "agents" (souped-up character cards) and a few finetunes of the distilled qwen/llama flavors, afaik nobody has made an honest to god finetune of 671B R1 trained on a specific data set, although it is possible thanks to the open-source code and would be incredible. Using their efficient training pipeline, you'd only need ~40-50 H100s to finetune it instead of a few hundred.
Now, is there a single entity that owns a million dollars in enterprise GPUs and wouldn't hesitate to use them to finetune a full R1 model for roleplaying or smut or something not corporate friendly? NovelAI may do it, they have a H100 cluster and their own in-house LLM finetune of... something proprietary, nobody knows. They probably won't release the weights anyway.
 