Kiwifarms dataset or proposals for data-related projects - Some sort of restricted KF API or dataset to help create new forum features

cocozitu (kiwifarms.net, joined Oct 1, 2023)
Hey guys!

I searched the forum for this subject a few days ago to avoid reposting. So far, nothing.

For people interested in helping the forum with data-related projects (fine-tuning LLMs, text classifiers, improving indexing, search, automated knowledge graph creation from raw text): is there a data sample open to the public, so that we don't have to hit the forum with a batch of automated scrapers? Are there any ongoing projects on the topics I mentioned?
 
Ask Josh, and maybe he'll think of something. At best, you could get permission from him for how much scraping he wants to let you do. The userbase of the Farms seems generally uninterested in ML beyond looking at the pretty pictures and cool poems it spits out, sadly.
 
Thanks for the hint! I can see a bunch of email addresses on the various websites. Which one should I use? Or should I just DM our dear leader through the forum?
 
I'd DM him on the forum, but you could use any method. Hopefully not the legal addresses, though. The Programming Thread is probably the best place to talk about AI on the Farms if you want to ask questions or something.
 
Thank you! I will DM him through the forum.

Have a good night!
 
Multimodal input?
A single picture would work better than random selection if I had to guess. It would be a start. Now get to finetuning that ViT or whatever you're supposed to use for image classifiers!
 
Only one picture as input might not be enough. But it's a start. Is there any kind of dataset for that? Or a site where I can get that data without scavenging too much?
 
[Attached image: Screenshot_20231108-194700.png]

I made an off-topic post to get his attention. This was his answer. I guess I have my answer now.
 
What exactly do you plan on doing with an API? Your post sort of looks like it was Ctrl C + Ctrl V'd from the first Google search result for "What can I do with an API".
 
One very obvious thing is automating the creation of a big knowledge graph of people and events using an LLM. Having access to a knowledge graph can help A LOT with implementing A LOT of features. An obvious one that all of you know: when you search Google for a well-known topic, either a person or an event, you usually get a panel with the most relevant fields describing that person or event. Josh could use a KF graph like that to summarize things. This is one small example of the many new features that such tech can power (I know about them because I work with those systems).
 
(I know about them because I work with those systems).
You joined in Oct, made a couple of meh posts, then made this thread, and breached common forum etiquette by trying to pull another thread off track.

I guess my questions are: how did you come about the Farms and why the sudden insistence on doing this, to the point of trying to derail a thread?

Genuinely curious.
 
I have been watching Jersh's streams since 2019. Derailing was a bad choice on my part, motivated by not wanting to wait too long for an answer.
I watch his streams every single week either while (1) working, (2) dinner/lunch or ... (3) assembling Legos (😭😂). Seeing the shit Liz xing xong and his allies have been doing to this site has been bothering me. So...instead of complaining and doing nothing, I'm trying to lend a hand on shit I know about.
 
My money is on an LLM to specifically profile our posting patterns in conjunction with other large data sets like facebook, youtube, et. al. to attempt to dox sufficiently verbose users. The question is what kind of access this fucker has to those other data sets.
If I were trying to be malicious, I would just act without saying a word. I could automate Chrome or Firefox using Selenium to crawl and scrape what I want. But I don't want to add fuel to the fire by doing that.
 
Anyway, even if Jersh isn't open to this (can't blame him, the guy is surrounded by loonies), feel free to ask anything here about technical stuff in this domain.
 
I'm trying to lend a hand on shit I know about.
I mean, if anything, provide a solid outline. Everything within this thread just doesn't track right. What 777Flux said sounds more apt. Everything you're saying is just generic and almost ChatGPT'd.

The enthusiasm is cool, though.
 
Maybe you should spend some time silently lurking the website and reading in the background before posting regularly. You could even cherry-pick some posts manually to establish a small but high-quality custom dataset if you wanted to spend the effort. Also, don't derail random threads. There's no reason to do that.

Also, try not to triple post.
 