> 3.1M views but only 84 Likes
Smells more like it got viewbotted. It has 55 dislikes as of right now.
> Smells more like it got viewbotted. It has 55 dislikes as of right now.
It might be used for ads on YouTube itself.
Lenovo touting AI image generation for non-binary girls who "first started using the Internet seven years ago [when I was 12]" in order to be their Authentic Selves. Including creepy uncanny-valley avatar with Annoying Orange style lips.
[Attached image 6144292]
Why Lenovo, a business-focused vendor, thought this was a good ad for them I don't know. 3.1M views but only 84 Likes. This is a textbook example of why YouTube took away downvotes.
Well, scratch Lenovo off the list of any future purchases! I hope the kid isn't too scarred by this in the future.
> Seems like almost every online AI tool or generation site has "sign in with Google" (never works), is nonfunctional, has limited free use, or some combination of such BS.
hotpot.ai has a low max res but allows virtually unlimited free use (50 per day, but simply clearing your cache resets it).
> I'm having problems running automatic1111 webui. I used to be able to run this thing but now I'm getting this error:
>
> File "/home/name/Documents/automatic/stable-diffusion-webui/modules/launch_utils.py", line 386, in prepare_environment
>     raise RuntimeError(
> RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
>
> I've installed the latest torch version, my GPU has the latest CUDA version, etc.
> Do you use ComfyUI or AUTOMATIC1111?
> And what's this about a config file? What do you need when training Stable Diffusion, or any AI like this, for say SillyTavern? I've got images that should have captions in .txt files and the Stable Diffusion 1.5 pruned safetensors file. What else do you need?
Still using AUTOMATIC1111; I'm on version v1.9.3. Did you do a clean installation?
> What's hilarious is how SD ate shit over the Taylor Swift deepfakes thing, when most of the images were generated using Microsoft's tools by abusing their shitty filtering
I don't recall anyone mentioning SD instead of Copilot. I do remember the MSFT CEO had to go out to apologize and promise they would fix it.
> And also apparently you can use an AI Queen Latifah
That fatso hasn't been relevant in like 20 years, why hire her?
> Looks like they did another with a fat Japanese woman:
I thought Japanese waters were whale-free...
> So what's the deal with those new AI accelerator chips? Are those good for all AI, or just local LLMs? I saw one dude running LLaMA on an RPi 5 with a Coral TPU, which is just 4 TOPS, and yet it ran decently enough.
LLMs are bound by memory speed. TPUs generally use system memory, which is slower than the memory modules on a GPU, so they're not good for high-quality LLMs. In addition, they can supposedly only do 8-bit ops. That makes them good for some inference, but not for training, which generally uses fp16 or fp32.
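To put rough numbers on the memory-bound point, here's a back-of-envelope sketch (the bandwidth figures are illustrative assumptions, not measurements):

```python
# Decode-speed ceiling for a memory-bound LLM: every generated token
# streams the full set of weights once, so tokens/s is capped at
# (memory bandwidth) / (model size in bytes).
cases = {
    "dual-channel DDR4 system RAM": 50,   # GB/s, rough assumption
    "GDDR/HBM on a high-end GPU": 1000,   # GB/s, rough assumption
}
model_gb = 4  # e.g. a 7B model quantised to ~4 bits per weight

for name, bandwidth_gbps in cases.items():
    print(f"{name}: ~{bandwidth_gbps / model_gb:.0f} tokens/s ceiling")
```

Which is why the same small model can feel "decent enough" on system RAM but is an order of magnitude faster next to GPU memory.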
> Both of these are really sad. A conversation is an exchange of emotions and ideas. When you remove yourself from that conversation and use a proxy to communicate, you are robbing the other person of the ability to affect you, to change your mind. All while expecting them to bow to your ideology or emotions.
In both cases it's someone who feels they have a "real" self inside that is truly them. Fantasy is an important part of growing up, and maybe an idealised self-image can help with that. But I think AI tools that assist in building that alternate life into some kind of facsimile of reality and perpetuating it, encouraging someone to try to live out that life and relate to their family members through it rather than their actual self, are unhealthy. The crux of it is that "actual self". The girl sees her "Spider" identity as her actual self and her biological reality as not her true identity. Very transhumanist, transcend the flesh and all that. But it can be a trap, the same way some dudes live out their fantasy female self and eventually troon out.
> So what's the deal with those new AI accelerator chips? Are those good for all AI, or just local LLMs? I saw one dude running LLaMA on an RPi 5 with a Coral TPU, which is just 4 TOPS, and yet it ran decently enough.
Like the NPUs built into new processors? They're adequate for small models. They use system RAM, and they're really for mobile devices that lack a discrete GPU, so that the OS can still do light AI tasks. Microsoft has set 40 TOPS as the minimum, and that's good enough for minor enhancements like handling visual filters on video calls, doing eye-position adjustment, and some natural language processing to work with Recall (when they re-add it). But they're not really up to the challenge of serious image generation. They also only accommodate low-bit maths, 8-bit I believe, whereas many models use higher precision such as 16- or 32-bit, so the models need to be heavily quantised down to 8-bit. So the NPUs are pretty cool and look like they'll be pretty good for their intended purpose, but not so much for running LLMs or image generation locally. At least for now. Obviously all this refers to inference; actually training models on one would be right out.
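To illustrate what quantising down to 8-bit looks like in practice, here's a minimal PyTorch sketch (a toy model and dynamic int8 quantisation, not any real NPU toolchain, which would do full static int8):

```python
import torch
import torch.nn as nn

# Toy fp32 model standing in for something much bigger.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Dynamic quantisation: weights are stored as int8 and dequantised on
# the fly; activations stay floating point. Roughly quarters the
# weight footprint versus fp32.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

print(quantized)  # Linear layers replaced by DynamicQuantizedLinear
```

The quality loss from squeezing a 16- or 32-bit model into 8 bits is exactly the trade-off being discussed above.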
> BTW are there any benchmark programs to see how many TOPS your GPU can get?
They're coming. I think Geekbench has added it.
> That fatso hasn't been relevant in like 20 years, why hire her?
Hey now. Is there any reason to be mean to her? She was great as Mama Morton in Chicago.
> I'm having problems running automatic1111 webui. I used to be able to run this thing but now I'm getting this error:
>
> File "/home/name/Documents/automatic/stable-diffusion-webui/modules/launch_utils.py", line 386, in prepare_environment
>     raise RuntimeError(
> RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
>
> I've installed the latest torch version, my GPU has the latest CUDA version, etc.
> Do you use ComfyUI or AUTOMATIC1111?
> And what's this about a config file? What do you need when training Stable Diffusion, or any AI like this, for say SillyTavern? I've got images that should have captions in .txt files and the Stable Diffusion 1.5 pruned safetensors file. What else do you need?
Step 1 for debugging is to do a "git pull" or download the latest version of the UI, then delete your venv and rebuild it.
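It's also worth checking whether torch can see the GPU at all from inside the webui's venv, something like this (assuming the default venv location inside the webui folder):

```python
import torch

# If this prints False, the webui error above is expected:
# torch is probably a CPU-only build, or the NVIDIA driver /
# CUDA runtime it needs isn't actually installed.
print(torch.cuda.is_available())
print(torch.version.cuda)  # None on CPU-only builds of torch

if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```

Run it with `venv/bin/python`, not the system Python, since the venv's interpreter is the one the webui actually uses. "Latest torch installed" system-wide doesn't help if the venv has its own broken copy, which is why nuking and rebuilding the venv is step 1.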
> Smells more like it got viewbotted. It has 55 dislikes as of right now.
If it's an autoplay video on another website or used as an advertisement, that's the reason why it has a bunch of views and barely any interaction. Nobody would viewbot a video and then not pay for engagement along with it. Dislike-viewer extensions aren't accurate and only reflect the dislikes of people who use the extension; there's no way of accessing the like/dislike ratio anymore unless you're the uploader.
> Like the NPUs built into new processors?
Nah, I mean these tiny boards you can put in an M.2 socket (or mPCIe in other versions):
> …some natural language processing to work with Recall (when they re-add it).
Recall is a neat feature, I just don't trust MSFT with it. There are open-source alternatives like OpenRecall, but I still need the horsepower to run those, hence the accelerators.
> They also only accommodate low-bit maths, 8-bit I believe, whereas many models use higher precision such as 16- or 32-bit.
Well that sucks. And I checked: none of these boards has built-in memory. No idea why, since there's space for a RAM chip there.
> But not so much for running LLMs
Which is what I wanted, oh well...
I'm quite surprised to hear of LLaMA working well on a Raspberry Pi. Really? Link at all? I'm presuming this is something heavily cut down.
> Hey now. Is there any reason to be mean to her?
Yeah I overdid it, my bad.
Oh well, in that case take everything I said with a pinch of salt. I've no experience with them, though I imagine much of what I said holds true. I'd be quite careful about driver support and making sure it will work with your choice of OS. But great if it does.
> Seems like a good way to add AI acceleration to "old" (as in current, since most CPUs don't come with NPUs yet) PCs...
Maybe. I'd want to know MS explicitly supported cards like this for that purpose, though, because unless they do, things like the Windows 11 AI features won't use it whether it can handle it or not.
> Well that sucks. And I checked: none of these boards has built-in memory. No idea why, since there's space for a RAM chip there.
I would imagine they're designed to use system memory, same as an onboard NPU is.
Raspberry Pi is selling a kit too; it comes with a Hailo accelerator, though I've heard the documentation is limited, and I don't like companies founded by ex-spooks, especially foreign ones. But on paper it's over 3x better than the Coral TPU, which for some reason Google hasn't updated in years.
Still, most of the demos are CV stuff, which quite frankly isn't that impressive anymore. Guess the only way to have a mostly usable local LLM on the go is a gaming laptop...
> I've used Runpod.io for some of my playing around, and for a fraction of the price of actual hardware I can play around with some pretty high-end stuff for hours at a time. Sure, I don't own it, but if I can pay £10 and be playing around with some 48GB VRAM card for a week or so, it would take me months and months of that just to reach the price of a medium discrete GPU. Might be something to consider, and I'd certainly recommend it if you're dipping your toe in the water. I mean, it's far better to spend £10-20 and see if you're actually interested in this stuff, or if you get bored after a month, before sinking hundreds into your own hardware.
Do these services only charge you while your AI is running? Like, if I have an LLM on there, is it only charging me when I actually use it? I'm asking because these cloud companies can be a real POS with pricing. I recall one that offered easy WordPress hosting, which a friend used once; they wanted to charge him like $15,000 because he forgot to cancel the service, even though he'd never had anything running there at all.
> Do these services only charge you while your AI is running? Like, if I have an LLM on there, is it only charging me when I actually use it?
So I have used Runpod.io, so I'll speak specifically about that. You front-load your balance with a set amount, paying at the time. It ticks down as you incur charges, and you can see your current balance. So it's not like AWS, where you give them your credit card and they tell you later that you incurred way more charges than you thought because you left some fucking service running or loaded it up more than you thought, and bang, there goes your money. If you only want to spend £10 with Runpod, then only put £10 on your account. This was the immediate and most appealing thing about them, and it's been good.
> For only being charged whilst you're running the AI - almost. That part is true, but they will also charge you (at a much lower rate) for persistent storage. Which is optional, but you may not want to start every session by downloading the same 7GB model to your instance. So you might pay for 20GB of storage, and that will continue to trickle down your balance over time for as long as you have it, using it or not. But the larger cost is actually running things, and that applies only to the time you are using it. Not by processor cycles, to be clear - if you fire up an instance with a 4090 and don't do anything with it, you're still paying for it for as long as you have it reserved. But why would you - just shut it down when done. It's quick to fire something up and attach your storage to it, and vice versa.
The right play is to spin up an Ubuntu VM with the lowest GPU available, and then wget all of your models to persistent storage. Then stop it, and start up your intended GPU VM when you have enough downloaded to start. If you are running anything that needs splicing, splice it locally, then SCP/SSH it to the server and stash it in persistent storage. They also have CPU VMs, but those are hard to procure in the right regions.
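For the download-once step, something like this run on the cheap instance is the whole idea (a sketch: the URL is a placeholder, and /workspace is where Runpod network volumes usually mount, so check yours):

```python
import os
import urllib.request

# Placeholder URL: substitute the real model download link.
MODEL_URL = "https://example.com/models/some-model.safetensors"
# Assumption: the persistent network volume is mounted at /workspace.
DEST = "/workspace/models/some-model.safetensors"

os.makedirs(os.path.dirname(DEST), exist_ok=True)
if not os.path.exists(DEST):  # don't re-download on later sessions
    urllib.request.urlretrieve(MODEL_URL, DEST)
    print(f"saved to {DEST}")
else:
    print("already cached on the persistent volume")
```

Then every GPU session just attaches the volume instead of paying GPU rates to watch a download bar.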
> The right play is to spin up a VM with the lowest GPU available, and then wget all of your models to persistent storage. If you are running anything that needs splicing, splice it locally, then SCP/SSH it to the server and stash it in persistent storage. They also have CPU VMs, but those are hard to procure in the right regions.
That... is very solid advice.
> If you only want to spend £10 with Runpod, then only put £10 on your account. This was the immediate and most appealing thing about them, and it's been good.
Good.
> So you might pay for 20GB of storage, and that will continue to trickle down your balance over time for as long as you have it, using it or not
I take it they don't let you use offsite storage.
I'm about halfway through your RPi 5 video, btw. I'm surprised how effective it is. I've noticed the uptime value in the top-right jumping out of step with the video, so he's editing the response times for his viewers, but still, I'm surprised it's this capable.
> Interesting video. A testament to how good Raspberry Pis are. That said, the editing on the video hides how long some of those earlier responses take. I saw the uptime value in the top right jump by nearly eight minutes on one of them. And it looks like I was right about the AI accelerator. Though I couldn't tell from the video if he was actually using it - I'm not wholly convinced he actually set it up to use it.
Yeah, it's sped up. He seems to be using the Coral, which is nearly obsolete, and the USB version too, so you have to add the delays from the interface. That company Hailo is going to offer a new M.2 module with 40 TOPS, the magic number MSFT wants. How well it will do on the RPi I've no idea. It would be interesting to see what these accelerator chips could do if combined in series with some fast RAM on a PCIe card. They seem to be way cheaper and more power-efficient than GPUs for AI, though it makes you wonder why there aren't more companies churning these out given how much money Nvidia is making.
> The right play is to spin up an Ubuntu VM with the lowest GPU available, and then wget all of your models to persistent storage. Then stop it, and start up your intended GPU VM when you have enough downloaded to start. If you are running anything that needs splicing, splice it locally, then SCP/SSH it to the server and stash it in persistent storage. They also have CPU VMs, but those are hard to procure in the right regions.
How expensive is it to train a model on these services? Because that's the kind of stuff where you need to spend a lot on hardware to get it done in non-glacial time. I remember some guy on Reddit said he trained his LLaMA model on Runpod, but he didn't specify which LLaMA version or size, what the model was, or, more importantly in this case, the cost - just that it took 3 days. Given all the tiers, that's meaningless unless he used the highest one, in which case it was (I guess) $352.
> Not for AI though, for other things.
With RPis you have to be more specific; I've seen any number of things made with those, though it's mostly retrogaming tbh.
> I take it they don't let you use offsite storage.
They don't block it (that I know of, but you have an SSH tunnel to the instance, so I don't think they could anyway), but would it be useful? Could you have some local drive, open up a connection, and do some sort of network mount of your local file system from the instance in Runpod? Sure. But when the GPU needs to load the model you have stored locally into its VRAM, that's going to be a looooong process. You need storage local to the GPU. And if you're talking about just loading it up to the included instance storage you get when you fire the instance up, that returns us to my initial point - do you want to be doing this every time you want to play around with this stuff? Storage is a cost, but it's a low cost compared to actually running instances. Either way, that's something you have to accept if you aren't doing things locally. Everything has upsides and downsides.
> Yeah, it's sped up. He seems to be using the Coral
I'm not sure he is. He shows it at one point later on, but I don't think he was using it. Also, my admittedly very cursory look suggested that you need to explicitly use their Python package for it, and I don't know that he tweaked anything to do so.
> …which is nearly obsolete, and the USB version too, so you have to add the delays from the interface. That company Hailo is going to offer a new M.2 module with 40 TOPS, the magic number MSFT wants. How well it will do on the RPi I've no idea. It would be interesting to see what these accelerator chips could do if combined in series with some fast RAM on a PCIe card. They seem to be way cheaper and more power-efficient than GPUs for AI, though it makes you wonder why there aren't more companies churning these out given how much money Nvidia is making.
If they're cheaper than GPUs, it'll be because they don't have the load of high-speed memory next to the chip that a discrete GPU card has. And other factors.
> How expensive is it to train a model on these services? Because that's the kind of stuff where you need to spend a lot on hardware to get it done in non-glacial time.
I'm afraid training models is outside my experience, though I can say it's something of a "how long is a piece of string" question. To answer a question like that, the question needs to contain specifics, really. What I can say is that any stats you find from someone else should translate 1:1 to something like Runpod. If someone says "this model and this amount of data took me X hours on my MI300X", then X hours should be the same on Runpod if you get an MI300X.
> You can buy a good used 3080 with 12GB of VRAM for that; not the same as that MI300X, but you own it.
Yeah, but they're not remotely comparable in terms of power. And if you're doing something where they were comparable, then you should just be using the 3080 in the first place. I can't see a 3080 available on there right now; the closest they have is a 3090 for $0.44/hr. So three days of that would be about $32.
> How expensive is it to train a model on these services? Because that's the kind of stuff where you need to spend a lot on hardware to get it done in non-glacial time. I remember some guy on Reddit said he trained his LLaMA model on Runpod, but he didn't specify which LLaMA version or size, what the model was, or, more importantly in this case, the cost - just that it took 3 days. Given all the tiers, that's meaningless unless he used the highest one, in which case it was (I guess) $352.
The memory required scales with the size of the model. Supposedly you used to be able to train Llama 2 13B with 24GB of VRAM with some hacks. There's no way to do that now with Llama 3, as the context is twice as large, even if it's an 8B now. That's not even getting into Llama 2 70B, which supposedly takes several hundred to a thousand GB of VRAM. It's unfathomable how much memory you would need for Llama 3 70B.
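Rough numbers for why full training blows up like that: with mixed-precision Adam you carry roughly 16 bytes per parameter before you even count activations or context (a back-of-envelope sketch, not a measurement):

```python
# fp16 weights (2 B) + fp16 grads (2 B) + fp32 master weights (4 B)
# + two fp32 Adam moments (4 B + 4 B) = ~16 bytes per parameter.
BYTES_PER_PARAM = 16

for name, params in [("Llama 3 8B", 8e9), ("Llama 2 70B", 70e9)]:
    print(f"{name}: ~{params * BYTES_PER_PARAM / 1e9:,.0f} GB of VRAM")
# 8B -> ~128 GB; 70B -> ~1,120 GB, which lands in the "several
# hundred to a thousand GB" ballpark above, activations not included.
```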
> But when the GPU needs to load the model you have stored locally into its VRAM, that's going to be a looooong process. You need storage local to the GPU. And if you're talking about just loading it up to the included instance storage you get when you fire the instance up, that returns us to my initial point - do you want to be doing this every time you want to play around with this stuff?
Good point.
> I'm not sure he is. He shows it at one point later on, but I don't think he was using it. Also, my admittedly very cursory look suggested that you need to explicitly use their Python package for it, and I don't know that he tweaked anything to do so.
On a side note, here's another guy running LLMs on a Rockchip, which does have an NPU:
> If they're cheaper than GPUs, it'll be because they don't have the load of high-speed memory next to the chip that a discrete GPU card has. And other factors.
The chips look way smaller too; no way those are as expensive to manufacture.
> Yeah, but they're not remotely comparable in terms of power. And if you're doing something where they were comparable, then you should just be using the 3080 in the first place. I can't see a 3080 available on there right now; the closest they have is a 3090 for $0.44/hr. So three days of that would be about $32.
I've seen people running SDXL on a 3080 just fine; no idea about training SD models, though. The main limiting factor is that gaming cards rarely go over 12GB of VRAM, with the new $1,000 4080 Super having only 16GB. I'm surprised nobody has been modding these cards with bigger RAM chips, like they did with some modded Radeons I've seen for sale.
> That's not even getting into Llama 2 70B, which supposedly takes several hundred to a thousand GB of VRAM. It's unfathomable how much memory you would need for Llama 3 70B.
Fuuuuuuuck, so the trillion-dollar datacenter wasn't a joke. Are LLMs about to hit a ceiling? How can this keep going?