How does language design affect various programmer metrics? - Categorizing programming languages


MysticLord

I read this piece on ESR's blog, and this quote jumped out at me.
Another is project scale. Every language also has an expected rate of induced defects per thousand lines of code due to programmers tripping over leaks and flaws in its abstractions. This rate runs higher in machine-centric languages, much lower in programmer-centric ones with GC. As project scale goes up, therefore, languages with GC become more and more important as a strategy against unacceptable defect rates.

Has there been any attempt to categorize programming languages based on the results that their language features have on various metrics, and if so where can I read about it?

What effect do various decisions about programming language design have on overall productivity, broken down into various metrics? And what are those metrics?

edit

I see that I need to say everything as explicitly as possible for the autists (we have the best autists, we love them don't we folks) in the crowd. Let's try this again.

How do specific implementations of language features affect various programmer metrics? What are the drawbacks of these implementations?

Take for example strings and common operations (splitting, combining, finding, comparing, sorting)
  1. What is the most intuitive way to handle strings? For an experienced programmer coming from C? For someone new to programming?
  2. What is the least intuitive?
  3. What are the pros and cons (of all sorts) of various programming languages' implementations of strings and string handling?
Other features to consider with the same questions (and any you can think of that are related):
  • File I/O
  • Networking
  • Parallelism
  • Concurrency
  • IDEs (implied to be the ones with official support or most commonly used and recommended)
  • Ease of debugging and debuggers (again, implied to be those with official support or most commonly used/recommended) in general
  • GUIs and CLIs
  • Control structures
  • Syntactic sugar (compare Java to, well, anything)
  • Type systems
  • How well it maps to actual hardware.
  • Standard libraries
  • Database integration support
  • Memory management
  • Ease of understanding, coherence, and internal logic of the language structures (does the language lend itself to common mistakes because the same symbol is used for multiple operations, and how many hidden gotcha moments does it spring on you?)
  • Ease of cross-compiling, or writing and debugging code for platforms other than the one you're on.
  • Ease of porting the language itself to other platforms.
 
First off, ESR has been riding off his own fumes from writing Cathedral and Bazaar for the past like 3 decades, and has not really contributed anything else of note. I take what he says with a grain of salt.

Second off, the idea that languages have "expected defect rates" is an extremely, extremely 1990s - early 2000s idea, from the same era as MTBF/MTTF and KLOC counting, when I think software engineering was trying to get legitimate, and tested out borrowing random stuff from real physical engineering processes. In a qualitative sense, I think ESR is right: Rust and Go are more productive than PHP or Perl if you are programming in the large. If you're writing a one-off script (that's not going to somehow end up as part of your critical infrastructure) the opposite might be true. Can this be translated to a quantitative measurement of defects per KLOC? They thought so in the 90s, but now it's seen as a fool's errand.

tl;dr ESR is a product of his time. Not to say that all software writers from the 90s are dated -- I think Joel Spolsky is timeless.
 
One warning I'll make is that such studies need to control for the quality of the programmers, which is of course very hard. For one take on this, see the Blub paradox, which has indeed been cited by Joel Spolsky.
First off, ESR has been riding off his own fumes from writing Cathedral and Bazaar for the past like 3 decades, and has not really contributed anything else of note.
Except various interesting software in the GPS/time, NTP, and repository conversion to Git domains. Perhaps you're among the group of people who've canceled him from most FOSS work??

And it's obviously silly to say his insights about, for example, languages with and without GC (be it C/C++ or the unsafe regions of Rust vs. Python and Go, which he's been using for the repository work) are no longer true. For languages in between like PHP and Perl, all I'll say is that I've sworn never to write another program in the latter.
 
That's just a stupid idea though: what constitutes a powerful feature? Meet my new language, which assigns "name" metadata to every value; it makes debugging so easy! And also wildly bloats memory usage... What if we just abstract away all those messy machine concerns? But now we can't reason about how our program will actually perform without first understanding how the platform maps its abstractions onto the machine.

There is no free feature, and there is no objective "most powerful" language; the only sane grading is how well languages fulfill their intent.
 
One warning I'll make is that such studies need to control for the quality of the programmers, which is of course very hard. For one take on this, see the Blub paradox, which has indeed been cited by Joel Spolsky.
Worth noting that Paul Graham is notable for writing his '90s startup in Common Lisp, and that Hacker News (written in Arc, his very own Lisp) is constantly struggling and falling over during major news events.

Edit: confused Bel with Arc
 
What effect do various decisions about programming language design have on overall productivity, broken down into various metrics? And what are those metrics?
The #1 influence on overall productivity is the familiarity of the coder with the language / API / project, abstracting away the skill of the programmer himself. There are too many variables for metrics to be particularly useful. Change the context, everything changes, metrics become invalid.

the only sane grading is how well languages fulfill their intent.
Even this is a lousy grading, because top-quality programmers can do things with languages that language designers fail to imagine.
 
If you're writing a one-off script (that's not going to somehow end up as part of your critical infrastructure)
The longer I do software development, the more I start to think that this is an impossible scenario. For real, it's absurd how many times I've cranked out a short, shitty, write-only script that "couldn't possibly ever make its way into one of our work pipelines or production code", only to be dumbstruck when once again, exactly that happens.
 
Worth noting ... that Hacker News (written in Bel [Paul Graham's] very own Lisp) is constantly struggling and falling over during major news events.
You're saying Hacker News has been rewritten from Paul Graham's very own Lisp v1.0, Arc, into Bel?? And that he still has anything to do with running it?

Thought he'd generally checked out of YC, and for Hacker News 8 years ago turned it over to a cow named Daniel Gackle, user ID dang.

While I only check it a couple of times a day, I haven't noticed it falling over in years.
 
While I only check it a couple of times a day, I haven't noticed it falling over in years.
My mistake, I confused Arc with Bel. When the site gets under load, pagination and comment submission fail and dang asks people to log out to use the cached version. AFAIK PG is still involved and they still push YC companies on the front page. But I wouldn't listen to him for programming advice.
 
@Milkis
@Besachf Jhakut
@ConcernedAnon
@ditto
@dak
@Knight of the Rope

I've edited the original post for clarity, please reread.

I don't care if your preferred language feature has power levels greater than 9000; I want to hear you explain why that feature is objectively useful, in what circumstances it's useful, and what tradeoffs it makes, even if they are obvious or minor.

For an example of an obvious, minor, and worthwhile tradeoff, consider everything that you can do with Fortran compared to assembly. While there is a tradeoff in using the former over the latter, almost every sane programmer (and most of the insane ones too) agree that it's worthwhile.

The #1 influence on overall productivity is the familiarity of the coder with the language / API / project, abstracting away the skill of the programmer himself. There are too many variables for metrics to be particularly useful. Change the context, everything changes, metrics become invalid.
What language features make it easier for one to familiarize himself with a language, API, or project?
Even this is a lousy grading, because top-quality programmers can do things with languages that language designers fail to imagine.
What language features tend to create this sort of emergent phenomenon?
 
I've edited the original post for clarity, please reread.
In all honesty, this feels suspiciously like an undergrad homework assignment.

What language features make it easier for one to familiarize himself with a language, API, or project?
Familiarity is about the only thing I can conclusively point at.

What language features tend to create this sort of emergent phenomenon?
Features with orthogonal complexity, not to put too fine a point on it.

You're pushing hard for concrete examples. I'm not a good person to give good feedback on this, because I'm very biased, and what's more, I am very biased against empty complexity and contemporary novelty. My favorite project in 2021 used plain ol' C and even compiles in TCC.

That said, my beloved "complicated" language is Prolog. With Prolog, a lot of power comes from reflective metaprogramming. In Prolog, you can construct your code such that it reflects the mathematical realities/models underpinning whatever you're building. Your code can examine itself and modify itself in a very straightforward manner at run-time. The downside is that you need to be able to conceptualize these mathematical realities, and so Prolog selects against familiarity here, as the intersection of people with deep familiarity with CS and mathematical formalism is much smaller than it ought to be. For a concrete example of this, consider a DFA: https://www.cpp.edu/~jrfisher/www/prolog_tutorial/2_14.html
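For contrast, here's a rough sketch of the same sort of machine done without reflective metaprogramming, in Go rather than Prolog (this is not the tutorial's example; the states and alphabet are made up for illustration). The transition table here is inert data the program can only interpret, whereas in Prolog the transitions would be clauses the running program can query and even rewrite.

```go
package main

import "fmt"

// A toy DFA over the alphabet {a, b} that accepts strings containing "ab".
// The transition table is plain data the program interprets; it can't be
// inspected or rewritten at run time the way Prolog clauses can.
type state int

const (
	start  state = iota // nothing matched yet
	sawA                // just saw an 'a'
	accept              // "ab" has appeared somewhere
)

var transitions = map[state]map[rune]state{
	start:  {'a': sawA, 'b': start},
	sawA:   {'a': sawA, 'b': accept},
	accept: {'a': accept, 'b': accept},
}

func accepts(input string) bool {
	s := start
	for _, r := range input {
		next, ok := transitions[s][r]
		if !ok {
			return false // symbol outside the alphabet
		}
		s = next
	}
	return s == accept
}

func main() {
	for _, w := range []string{"ab", "ba", "aab", "bbb"} {
		fmt.Printf("%q -> %v\n", w, accepts(w))
	}
}
```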

Prolog's big downside is that it is not strongly typed. This isn't a big problem because you can add explicit type checking at the cost of a line of code per typecheck. The Mercury project (https://mercurylang.org/) resolves this, but the cost is additional complexity. Increasing complexity negatively affects programmer metrics.

As for optimizing for familiarity, my sweet spot is Ruby. Ruby's close enough to C/C++ that most of your intuition transfers over, it can be extended using C with very little pain, and yet it provides a lot of syntactic sugar and potential for composing higher-level abstractions. I do most of my prototyping in Ruby. Python's 3.x series has fixed most of the ways in which Ruby was superior, so I'm afraid Ruby is now destined strictly for the sidelines of history, given how aggressively Python is dominating this space, though Ruby 3's work towards better multithreading might give it an edge after a few more years of development.

As for ESR, he's known for being a Go fan and a bit of a Rust skeptic. This reflects the "programmer metrics" I'm describing here. Go was designed to be straightforward and easy for programmers to grasp. Rust was designed with a lot of high-level complexity so that the compiler can automate much of that complexity out of the codebase. Practically, this means that both Go and Rust should provide superior metrics to C/C++. The challenge comes in familiarity.

I see we have a fan of Duff's device and template metaprogramming in the house.
Duff's device would be a good example, but I am very much not a fan of template metaprogramming, at least the C++ style.
 
As for ESR, he's known for being a Go fan and a bit of a Rust skeptic.
Let me clarify ESR's biggest issue with Rust: it doesn't handle graphs at all well; its ownership model and abstractions don't work for them unless you turn off safety. And graphs are very important for a number of fields like chip design and verification, as well as his use case of grokking a source code control system's repository, where the connections between commits have to be handled, and quickly.

Python and Go are fine for safe graph usage because they're garbage collected, and his use case, like the others I mentioned, cares about performance but isn't real-time-ish like a lot of the systems programming Rust is aimed at (see for example device drivers in Linux). He moved to Go because he ran out of raw machine power even after acquiring a very beefy system and applying optimization tricks to Python; he had some huge, decades-old repos to convert, where Go's much greater efficiency helped it fit the bill.
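Here's a minimal sketch of that point (not ESR's actual code; the Commit type and IDs are made up for illustration): in a garbage-collected language like Go, nodes in a history graph can freely share parents through plain pointers, and the collector reclaims whatever becomes unreachable. In safe Rust the same shape typically pushes you toward Rc/RefCell, arena indices, or unsafe code.

```go
package main

import "fmt"

// Commit is a toy graph node: parents can be shared by many children,
// and the garbage collector cleans everything up once it's unreachable.
type Commit struct {
	ID      string
	Parents []*Commit
}

func main() {
	root := &Commit{ID: "a"}
	left := &Commit{ID: "b", Parents: []*Commit{root}}
	right := &Commit{ID: "c", Parents: []*Commit{root}}
	// A merge commit points at both branches; "root" is now shared by
	// several nodes, which safe Rust would typically express with
	// Rc/RefCell, arena indices, or unsafe pointers instead.
	merge := &Commit{ID: "d", Parents: []*Commit{left, right}}

	// Simple depth-first walk over the shared structure.
	seen := map[*Commit]bool{}
	var walk func(*Commit)
	walk = func(c *Commit) {
		if seen[c] {
			return
		}
		seen[c] = true
		fmt.Println(c.ID)
		for _, p := range c.Parents {
			walk(p)
		}
	}
	walk(merge)
}
```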
 
@dak my personal issue is that I know like 1 language that's relatively useful, I'm trying to forget another (Java) because Oracle is making it impossible for users to run shit with it, and whenever I try to learn another I last maybe a week before I get too irritated to continue. I don't fucking care about every language's special snowflake gotcha moments, I don't want to spend 10,000 hours playing in their sandbox only for them to shut it down and open up a new one next door with entirely different traps. I just don't care anymore. Thus I'm searching for a language where everything that is retarded is at least retarded in a consistent, predictable manner.

Go sounds really nice, but I'm sure it has plenty of nastiness for me as usual.
 
Thus I'm searching for a language where everything that is retarded is at least retarded in a consistent, predictable manner.
You owe it to yourself to try a Lisp, and out of Scheme, Common Lisp (CL), and Clojure, Scheme is the choice for "least retarded in a consistent, predictable manner": it has almost none of the historical warts CL carries from Lisp being the second-oldest surviving major computer language, nor the compromises Clojure makes as a hosted language on the JVM or JavaScript for its major versions.

If you'd also like to learn a lot of basic and useful computer science, Structure and Interpretation of Computer Programs is highly recommended. Short of that gem, which implicitly teaches a no-longer-entirely-idiomatic Scheme, there are many alternatives, and the basics of Lisp can be learned "in 15 minutes." The syntax looks weird and needs a smart editor to make it easy to use, but it's very regular. No Python lover should hate the syntax, but most people do, perhaps not allowing for how the syntax is part of what makes it one of the ultimate non-Blub languages, through extremely powerful macros.
Go sounds really nice, but I'm sure it has plenty of nastiness for me as usual.
It's an extremely opinionated language. Things I don't like about it include:
  • The standard implementation is a purely compile, link, and run paradigm; there's no REPL/interpreted environment, though there are some REPLs for Go out there.
  • Error reporting is done by return values; being able to throw exceptions saves a lot of boilerplate code, and I'd expect your rejection of Java is in part based on its issues with that (see the sketch after this list).
  • A huge issue for serious or would-be serious users is no generics until like this week or last (really).
  • If you don't like Java's basic syntax, which is also C derived, that would be a negative.
  • It's very tied to Google as of now; as I recall it was intended to fill a gap between Python and C++, and they don't allow very many full-fledged languages in their monorepo, last time I checked.
  • The C heritage makes it generally a bit old fashioned.
  • Extraordinarily bad name for the language in the age of search engines, inexplicable for a Google project no matter how much freedom its storied developers were granted.
  • The developers are not viewed as playing nice and fair with their community last time I checked, which was more than a year or two ago.
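To make the error-return point concrete, here's a minimal made-up sketch (the file name and helper are invented, not from any real project): every fallible call hands back an error the caller has to check and wrap by hand, which is exactly the boilerplate that exceptions would otherwise absorb.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// readPort shows the typical Go pattern: every fallible call returns an
// error value that the caller must check and wrap by hand, instead of
// letting an exception propagate up the stack.
func readPort(path string) (int, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return 0, fmt.Errorf("reading %s: %w", path, err)
	}
	port, err := strconv.Atoi(strings.TrimSpace(string(data)))
	if err != nil {
		return 0, fmt.Errorf("parsing %s: %w", path, err)
	}
	return port, nil
}

func main() {
	port, err := readPort("port.txt")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("port:", port)
}
```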
 
I just don't care anymore. Thus I'm searching for a language where everything that is retarded is at least retarded in a consistent, predictable manner.
What's your typical use case, bro? Are you programming for fun? Programming for work? Do you want to get into the open-source community? Are you studying computer science at college? Studying computer science autodidactically? Are you simply resume-padding (no judgement, in fact it's probably the best reason to be language-hopping)? Or perhaps some combination of any of those things?

Your question as to what the best language is (whether the metric is 'fun', 'productivity', 'defects per line of code', etc) will always depend on the context that you're using it in, as odd as that sounds. For instance, programming in Scheme as @Besachf Jhakut suggested is usually fun, as Lisps are. But it's only productive if:
  1. The scale of the program you're writing is relatively small and/or from 'first principles'. e.g. if you want to code a feed-forward neural network from scratch in Scheme as a toy to help you grasp ML ideas by building them up piece-by-piece on a blank slate (as you will absolutely have to in Scheme, since the core of the language is so streamlined), then that's fun and worthwhile. Doing the same work because you expect it to be a viable product and work for real-world data at even 1% of the speed/efficiency of TensorFlow sounds like a good way to minecraft yourself.
  2. You're expecting to be the only one who will ever use the codebase. Go to any congregation of software developers. Look to your left, then look to your right, then look behind you, then look in 20 other directions, then look in the mirror. Statistically, only one of those people is enough of an autist to have even heard of Scheme, much less code in it. And that guy is also positive that he could write your code better than you could, and since he's a Lisper, he'll just do that instead rather than bother trying to collaborate.
  3. You're not coding for work. Generally bosses hate their programmer plebs doing shit in meme languages (even when it's just 'code-that-writes-code', something Lisp and Scheme are actually pretty good for). Why do they hate it? Because when you pack your shit and leave the company, all of a sudden they've got a bunch of shitty, cryptic Scheme scripts stuck in their workflow that they can't find anyone able to read, much less use. ("Woohoo! Job security!" you might think, if you're a moron. Sure, your boss might not be able to easily fire you, but he'll fucking resent you for setting this shit up and you won't ever get promoted/recommended/discussed/offered professional development opportunities. I truly don't understand the Reddit faggots that brag about making themselves "irreplaceable" because they've written serious parts of their work pipeline in Elixir or Haskell or something.)
Conversely, Java and Python are fucking gay. Coding in them always feels like a chore, even with all of the bells-and-whistles that those languages have thanks to their huge libraries. But you know what? The Python and Java coders will literally never go hungry. If you can code decently in even one of them you'll always be able to feed yourself. And besides, if you want "consistently retarded in a consistent, predictable manner"? What better codifies that sentiment than having such a big and gay collection of library code that you have a specially-coded tool and development workflow just to combat dependency hell? "Productive"? From a point of view of work output, what could be more productive than knowing that if you take a sick day, there's statistically at least 10 other people in the room that could fill in for you and do a passable job? Etc.
 
What's your typical use case, bro? Are you programming for fun? Programming for work? Do you want to get into the open-source community? Are you studying computer science at college? Studying computer science autodidactically? Are you simply resume-padding (no judgement, in fact it's probably the best reason to be language-hopping)? Or perhaps some combination of any of those things?
What I use it for, in no particular order:
  • Server side web scripting.
  • Mobile app dev.
  • Making GUI tools to edit games in such a way that I distribute something that just works for my users, requiring no additional downloads on their part.
  • Something for which there are likely to be API bindings for an open source 2D or 3D game engine. My goal there is to create the perfect skinner box and data harvester for people I loathe (Redditors, Mainland Chinese people, Subcontinentals, leftists in general), and to extract as much money from them as possible.
I went to a community college for 2 years, got my associates, did one quarter at a university and gtfo because I'm not insane. Got:
  • Intro to CS
  • Java 1, 2, 3
  • Discrete Math 1
  • Data Structures 1
  • Assembly Language
  • A class that was an overview of networking protocols for IT students
  • Operating Systems 1
  • Client & Server Side Web Dev
  • Android App Dev
My previous job was freelance web/mobile app dev. I made enough money to almost buy a house during the shutdowns, making mobile apps and websites for stores to take orders. Most of it is copy and paste at this point, I spend more time trying to figure out IT/sysadmin shit so I can DIY that to save a lot of money. Though I do spend a lot of time trying to break what I make and reading up on the foibles of Java and Python.

I currently work for a guy who does consulting, wherein I unfuck and streamline their software dev processes. This is 90% bullyciding linux computer janitors you see on /g/, unfucking code made by Indians, unfucking code made by malevolent people who were replaced by Indians, and unfucking malicious spyware made by Chinese people who inexplicably disappeared one day. By "unfucking" I mean "divine the original intent, determine what security oversights were introduced, and get all their ducks in a row so the other guys on the consultant's payroll can get started coding". Most of it is reading other people's code, adding comments when it's usable or when there's a particularly illuminating piece of idiocy, writing specifications based on what it seems the code was supposed to do, researching, technical writing, effective communication, and being available and personable to answer questions and write FAQs. I'm essentially the coding team's mother.

I still do a bit of freelance when I have time, and that consists of updating stuff for my previous clients with whom I have an agreement. Security upgrades, bug reports, the occasional feature request. That work is being taken over and deskilled by companies that build web stores and mobile apps at scale though, I doubt it will be around in 10 years, except for high end bespoke web publishing and app dev.
Your question as to what the best language is (whether the metric is 'fun', 'productivity', 'defects per line of code', etc) will always depend on the context that you're using it in, as odd as that sounds. For instance, programming in Scheme as @Besachf Jhakut suggested is usually fun, as Lisps are. But it's only productive if:
  1. The scale of the program you're writing is relatively small and/or from 'first principles'. e.g. if you want to code a feed-forward neural network from scratch in Scheme as a toy to help you grasp ML ideas by building them up piece-by-piece on a blank slate (as you will absolutely have to in Scheme, since the core of the language is so streamlined), then that's fun and worthwhile. Doing the same work because you expect it to be a viable product and work for real-world data at even 1% of the speed/efficiency of TensorFlow sounds like a good way to minecraft yourself.
  2. You're expecting to be the only one who will ever use the codebase. Go to any congregation of software developers. Look to your left, then look to your right, then look behind you, then look in 20 other directions, then look in the mirror. Statistically, only one of those people is enough of an autist to have even heard of Scheme, much less code in it. And that guy is also positive that he could write your code better than you could, and since he's a Lisper, he'll just do that instead rather than bother trying to collaborate.
  3. You're not coding for work. Generally bosses hate their programmer plebs doing shit in meme languages (even when it's just 'code-that-writes-code', something Lisp and Scheme are actually pretty good for). Why do they hate it? Because when you pack your shit and leave the company, all of a sudden they've got a bunch of shitty, cryptic Scheme scripts stuck in their workflow that they can't find anyone able to read, much less use. ("Woohoo! Job security!" you might think, if you're a moron. Sure, your boss might not be able to easily fire you, but he'll fucking resent you for setting this shit up and you won't ever get promoted/recommended/discussed/offered professional development opportunities. I truly don't understand the Reddit faggots that brag about making themselves "irreplaceable" because they've written serious parts of their work pipeline in Elixir or Haskell or something.)
I'm aware that Lisps are meme languages for coddled academics, but I'd spend a season or two on it just so I can better grasp recursion and various functional language things of which I am currently clueless.

What I really need is an informational class on various programming language features and their tradeoffs as per my OP, so I can evaluate languages without wasting time learning them.
Conversely, Java and Python are fucking gay. Coding in them always feels like a chore, even with all of the bells-and-whistles that those languages have thanks to their huge libraries. But you know what? The Python and Java coders will literally never go hungry. If you can code decently in even one of them you'll always be able to feed yourself. And besides, if you want "consistently retarded in a consistent, predictable manner"? What better codifies that sentiment than having such a big and gay collection of library code that you have a specially-coded tool and development workflow just to combat dependency hell? "Productive"? From a point of view of work output, what could be more productive than knowing that if you take a sick day, there's statistically at least 10 other people in the room that could fill in for you and do a passable job? Etc.
RE Java:
[attached image: snakeoil_issues.png]
The existing Java GUI libraries are buggy pieces of shit, and in order to use them I need to be one of those "lifetime learner" idiots who constantly gargles Oracle's curry scented asshole. If I have to not just learn everything there is to know about a language to avoid its defects, but I must do this with every release of their virtual machine and make my users install a runtime environment that comes with an adware installer, then I'm just not gonna use your gay little language. Not only that, I need to force my users to install OpenJDK somehow, when almost all of them run Windows 10 and IIRC OpenJDK for the newest version of Java doesn't come with an installer.

I'm tired of people coming down from on high to "improve" things that worked just fine, or as fine as possible under the circumstances.
 
I'm aware that Lisps are meme languages for coddled academics....
While a lot of Knight of the Rope's comments about Lisps and especially Scheme are correct, this is only partly true for Scheme; there's even a Gambit-C based mobile app platform that's getting a fair amount of use. Aside from a few special-case oddballs, you might say, I get the impression that only a few programming language academics and of course Scheme lovers are still using it; for ABET-level Computer Science instruction it's been replaced by Java (which is not even wrong for CS) and Python. Not sure Common Lisp is at all a language for academics anymore, but I don't follow that community.

Clojure is very much the opposite: it's intended to be a very practical language, hence its being hosted on other language runtimes, including the JVM, which I assume you're trying to get away from. But when Clojure is "Java with a human face", i.e. when it's used to wrap existing Java libraries or to do string manipulation, you'd have a big advantage over myself, since I never went further than "Hello, World." with plain Java and never learned the ecosystem.

I don't recommend it as a first Lisp unless you really want to focus on functional programming or maybe lazy evaluation, or gain experience in something you can get paid to do. For GUIs I don't think it has a good story for the reasons you outline about the Java ecosystem, but there's always the JavaScript hosted version for the tar pit of browser front end development.

Your particular intentions for spending a little time with one or more Lisps are spot on; I'd only add learning Lisp macros to the list, and taking a gander at the Common Lisp metaobject protocol, for which the book might be entirely sufficient. (Since you didn't mention REPL-based development, I assume you've already done that with one of the currently more popular languages that have it.) It depends on how much you still may like object-oriented programming; I've decided it's a mistake for all but a few domains where simulation is a major factor, which does include GUIs.
 
I use the most verbose language because the more lines I write, the bigger my bonus is. I'm the most valuable developer in the company!
 