Programming thread

This reminds me of something Joe Armstrong has said. Don't design your system for 10 users and scale it up. Design it for 10M users then scale it down.

Nice quote, but I'd argue it's not really necessary all the time. Premature optimization and letting perfect be the enemy of good are real problems too.

One of the problems with typed languages is that, besides ossifying the code base over time, they don't really solve the problem of making illegal states impossible to represent - not without plenty of caveats and niche features.
Example: I have a type representing a segment. x1 should always be > x0.
I have a collection of segments which should always be in rising order.
Can your type system represent this? Can it generate example data for this?
I don't quite get what you're going for here. Both of those cases would be relatively simple to implement in a contemporary OOP-ish language where you can create custom classes/types. For the segment type, have its constructor take x0 and x1 as parameters and throw an exception if x0 is larger than x1 (or swap their values if that is acceptable). For the latter, have the collection type's "addSegment()" method throw an exception if the x0 of a new segment is smaller than the x1 of the previous segment (or just have "getAllSegments()" sort the segments when called, again depending on what is acceptable behavior). I don't see what this has to do with static typing, either.
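Rough sketch of what I mean, in Java-ish code (the names are mine, purely illustrative):
Code:
// Illustrative sketch only: a constructor that rejects illegal segments,
// and a collection whose addSegment() rejects out-of-order inserts.
final class Segment {
    private final double x0;
    private final double x1;

    Segment(double x0, double x1) {
        if (x0 > x1)
            throw new IllegalArgumentException("x0 must not exceed x1");
        this.x0 = x0;
        this.x1 = x1;
    }

    double getX0() { return x0; }
    double getX1() { return x1; }
}

final class SegmentCollection {
    private final java.util.List<Segment> segments = new java.util.ArrayList<>();

    void addSegment(Segment s) {
        // Enforce rising order against the last stored segment.
        if (!segments.isEmpty()
                && s.getX0() < segments.get(segments.size() - 1).getX1())
            throw new IllegalArgumentException("segments must be in rising order");
        segments.add(s);
    }
}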
 
  • Horrifying
Reactions: hundredpercent
I have a type representing line segments where x0 and x1 are both integer encodings of Turing machines that halt.
 
  • Horrifying
Reactions: Shoggoth
Nice quote, but I'd argue it's not really necessary all the time. Premature optimization and letting perfect be the enemy of good are real problems too.
Is it an optimization? The scaled-down system might perform worse than one designed for 10 users. The idea is that some tools make designing for scalability (almost) trivial, such as Erlang, as Armstrong will gladly tell you.
I don't quite get what you're going for here. Both of those cases would be relatively simple to implement in a contemporary OOP-ish language where you can create custom classes/types. For the segment type, have its constructor take x0 and x1 as parameters and throw an exception if x0 is larger than x1 (or swap their values if that is acceptable). For the latter, have the collection type's "addSegment()" method throw an exception if the x0 of a new segment is smaller than the x1 of the previous segment (or just have "getAllSegments()" sort the segments when called, again depending on what is acceptable behavior). I don't see what this has to do with static typing, either.
That's exactly the point of a good/strong type system.
I don't want a runtime exception. I want illegal state to be impossible in my system.
If your solution is to throw an exception on an illegal state it means you have holes in your type system and you can't have guarantees regarding them. Some knowledge about the types is guaranteed by the compiler, for example, that the fields of the struct are floats. Great. But I have no guarantee some struct I'm looking at is legal. I have no guarantee the collection of segments I illustrated previously is legal.
In very rough pseudo code:
Code:
Segment :: {x0 :: double, x1 :: double}
Better
Segment :: {x0 :: double, x1 :: double}, x1 > x0
Stronger if you can also specify an ordering semantic

Segments :: List[Segment] // Meh
Segments :: List[Segment], Monotonic increasing // Yeah
I hope this demonstrated the differences.
The fact that most static type systems don't have dependent types makes them almost useless for detecting bugs in anything but purely scalar code.
On the other hand, some dynamic languages and weirdos like F# come with type systems, optional for the dynamic langs ofc, which can represent these constraints, making them way more useful.
 
  • Informative
Reactions: albert chan
Nice quote, but I'd argue it's not really necessary all the time. Premature optimization and letting perfect be the enemy of good are real problems too.
Hah, I was just thinking about that Knuth quote.

Is it an optimization? The scaled-down system might perform worse than one designed for 10 users. The idea is that some tools make designing for scalability (almost) trivial, such as Erlang, as Armstrong will gladly tell you.
I don't think the point of the saying is that you shouldn't try to make things efficient, but that iteration will usually give you better results than trying to presuppose your future needs and structuring everything under that assumption from the get-go.
 
  • Agree
Reactions: Least Concern
Hah, I was just thinking about that Knuth quote.


I don't think the point of the saying is that you shouldn't try to make things efficient, but that iteration will usually give you better results than trying to presuppose your future needs and structuring everything under that assumption from the get-go.
You could argue that a good language, environment and tooling will support it from the get-go. Will the extra 5-10% of work early on be worth it down the line? Unless you're rushing to market, why not?
This reminds me of the graph about the programmer who spends 5 hours automating a 30-minute manual task.
 
Show me a good type system
Why?

I should've just left my reply at that one word, just like last time, because what you're writing and/or asking for has nothing to do with what you're replying (quoting) to. I know that you're trying to nudge the discussion into a more academic flavor and I respect your knowledge and experience on the matters at hand, but that's not what I'm really interested in here.

Look, I know that in a perfect world you can painstakingly prove some correctness properties if you perfectly "spec out" stuff (which is probably nontrivial and/or undecidable in most cases), interfaces and whatnot. That has absolutely nothing to do with the real, non-perfect world (where quite a lot of bugs due to type errors happen inside modules, not on the boundaries), nor with my observation that these "dynamic" "technologies" are admitting failure. Virtually every widely-used (yes, I'm hedging here a bit) dynamic language has already introduced or will introduce some kind of type hinting. It's inevitable and it does not take a genius to notice the pattern.

This is an admission of failure in the fundamental design.

Your asking for a good type system is such a non-sequitur that I don't even know where to start, so I'll just throw randomly ordered points:
  • You're moving the goalpost.
  • Define "good" type system, oh you can't because it's a quite subjective endeavour. Point in case - I don't consider the example for segments with x1 > x0 a desirable property of a type system, I'm not a big Prolog fan.
  • I don't believe type systems can be well-ordered on their "goodness".
  • The discussion at hand has nothing to do with type systems themselves, but with the design decision of whether a programming language's type system is static or dynamic. That's, like, one bit of information right there. I don't care that Python has a far stronger (it has) and superior (IMO it has) type system than C if it's dynamic. For this reason alone, I'ma choose C for development every single time when faced with a choice between the two.
[EDIT]
typo
 
Last edited:
  • Like
Reactions: Knight of the Rope
I don't want a runtime exception. I want illegal state to be impossible in my system.

It could be made impossible in the examples I gave. If an exception is thrown when you pass an x1 smaller than x0, the segment was not created and is thus not illegal. Same with the collection example if you do checking in the "addSegment()" method and throw an exception from there if it's wrong. What would you rather have instead? The program halts (which is what would happen if the exception isn't caught anyway)?

Code:
Segment :: {x0 :: double, x1 :: double}
Better
Segment :: {x0 :: double, x1 :: double}, x1 > x0
Stronger if you can also specify an ordering semantic

Segments :: List[Segment] // Meh
Segments :: List[Segment], Monotonic increasing // Yeah
So are you basically saying that you want this "legality checking" to be a function of the language itself rather than expressed in the language? If so, I can see that, but if you have rather complex requirements for what is "legal," I think that could get awkward.
 
So are you basically saying that you want this "legality checking" to be a function of the language itself rather than expressed in the language? If so, I can see that, but if you have rather complex requirements for what is "legal," I think that could get awkward.
Not only complex, but it's more trouble than it's worth.

What you'll see in practice is that more and more logic would be poured into such checks, and this solves fuck-all; the complexity would merely be transferred from imperative statements into a more functional or declarative approach. Which is precisely what this idea is all about - make the world Functional™.

And what should happen if the checks are violated, for example when reading input from a user, database, file or malicious actor? Crash and burn? Exception thrown? How are errors handled? You can't handwave the issue away by saying "the type system doesn't allow invalid segments to exist", unless you're planning on either not taking any input at all or handling it in some "unsafe" interface block/module boundary, in which case - once again - the complexity does not disappear, it is just transferred.

It's all so tiresome.
 
  • Agree
Reactions: Least Concern
It could be made impossible in the examples I gave. If an exception is thrown when you pass an x1 smaller than x0, the segment was not created and is thus not illegal. Same with the collection example if you do checking in the "addSegment()" method and throw an exception from there if it's wrong. What would you rather have instead? The program halts (which is what would happen if the exception isn't caught anyway)?


So are you basically saying that you want this "legality checking" to be a function of the language itself rather than expressed in the language? If so, I can see that, but if you have rather complex requirements for what is "legal," I think that could get awkward.
I've had some pretty complex requirements in the past. If you're careful, it doesn't get hairy, and you don't have to deal with a zoo of inheritance and interfaces.
Not only complex, but it's more trouble than it's worth.

What you'll see in practice is that more and more logic would be poured into such checks, and this solves fuck-all; the complexity would merely be transferred from imperative statements into a more functional or declarative approach. Which is precisely what this idea is all about - make the world Functional™.

And what should happen if the checks are violated, for example when reading input from a user, database, file or malicious actor? Crash and burn? Exception thrown? How are errors handled? You can't handwave the issue away by saying "the type system doesn't allow invalid segments to exist", unless you're planning on either not taking any input at all or handling it in some "unsafe" interface block/module boundary, in which case - once again - the complexity does not disappear, it is just transferred.

It's all so tiresome.
What will happen is the same thing that happens when you try to put a string where an int should go. It's an exceptional situation and should be represented as such. It will throw, probably. The difference is that relations and types are way better than your badly hand crafted ad-hoc validation.
It matters because you can have guarantees regarding data percolating inside the system. It lets you generate data and test your assumptions.
If you're tired, go back to sleep.
 
What will happen is the same thing that happens when you try to put a string where an int should go. It's an exceptional situation and should be represented as such. It will throw, probably. The difference is that relations and types are way better than your badly hand crafted ad-hoc validation.
It matters because you can have guarantees regarding data percolating inside the system. It lets you generate data and test your assumptions.
If you're tired, go back to sleep.
This is just more handwaving - "way better", says who? Why not the other way around: hand-crafted validation is way better than badly constructed ad-hoc relations? Two can play that game, because it's the same complexity, just transferred to a different semantic construct. It's broadly the same as making a checked setter in an OOP fashion, just as @Least Concern stated. You get the same guarantees as you would in this hypothetical Prologish type system.
 
This is just more handwaving - "way better", says who? Why not the other way around: hand-crafted validation is way better than badly constructed ad-hoc relations? Two can play that game, because it's the same complexity, just transferred to a different semantic construct. It's broadly the same as making a checked setter in an OOP fashion, just as @Least Concern stated. You get the same guarantees as you would in this hypothetical Prologish type system.
"Any sufficiently advanced type systems contains a slow bug ridden partial implementation of Prolog"
The good thing about having a relational type system is that the relations can propagate throughout your system. Finite domain constraints are unpleasant, but wouldn't it be nice if the type system "knew" not only that x > y for every x > y, but also that after a linear transform of both of them - given that it's the same transform - Tx > Ty?
It's a pretty minute example but it generalizes well.
Any ad-hoc validation is just a dynamic (i.e. runtime) extension of the type system. The problem is that static languages, besides the ML family and Haskell, don't come with good facilities for "interesting" types. So it's both ad-hoc and dynamic, which is the worst of both worlds, in a sense.
Do we disagree on ad-hoc imperative validation being worse than the concentrated efforts of library or language maintainer(s) on implementing an expressive and flexible validation framework?
 
Do we disagree on ad-hoc imperative validation being worse than the concentrated efforts of library or language maintainer(s) on implementing an expressive and flexible validation framework?
I'm not entirely sure what we actually agree or disagree about. And I mean that in an honest, good-faith manner.

I think you're speaking from the position of a person who had to deal with very specific types of data, and the validity thereof, in very specific types of problems, where you had to fight with validity issues of this data constantly, on many layers of the software system. This is the impression I'm getting; correct me if I'm wrong.

I'm not seeing the same issues in the same light as you do, maybe because I wasn't touched as hard in a no-no place by such problems (and therefore don't see some particular, potential bugs), or maybe because I consider the OOP approach sufficiently strong to enforce the same guarantees. And yes, those guarantees can and will propagate throughout the system - how could they not, if they're baked into the objects themselves?

Consider the segment example, where you modify the x0 or x1 coordinates using a setter function. Provided that you perform all modifications through the setter, that's literally the one and only place where you need to place an assert() or if () throw(); statement to enforce the guarantee.

Is it more cumbersome than marking an x0 < x1 relation in the type system? Sure! Is it more vulnerable to coding errors? Possibly - the programmer has to remember that every x0 or x1 modification has to be performed through the setter method, or else the guarantee doesn't hold anymore. So is it worse than a relation in the type system? No. Is it better? No.
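Concretely, the kind of checked setter I'm talking about - a minimal Java-ish sketch, names mine:
Code:
// The one and only enforcement point for the x0 < x1 guarantee.
public void setX1(double newX1) {
    if (newX1 <= this.x0)
        throw new IllegalArgumentException("x1 must stay greater than x0");
    this.x1 = newX1;
}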

OK, so how is it not worse even though I myself am admitting that there are vulnerabilities? Because of the following caveats, in no particular order of importance:
  • What if there are logic errors in the relations? How does that differ from making a coding mistake in a method?
  • How would one check that the relations are satisfied and non-contradictory when some compounding is added to the types? Would such checked types simply be verboten from being compounded into more complex types? Think: I have a Segment type, now I want a Polygon type that is a collection of Segments. Should each Segment in a Polygon obey the x0 < x1 relation? That's not very convenient. Should the programmer define another Segment type which does not enforce the relation? Then it's even worse, because now you have another similar type and you might use the "incorrect" Segment type in some place.
  • Validity checking is a lot more than just simple relations on numeric values. Back to the Polygon example: I want to ensure that the points are ordered CCW. How would such complex checks be handled? Either the type system must evolve to contain a Turing-complete language (and then you get C++) or be constrained to almost trivial cases and types in which its usefulness gets immediately exhausted and back to imperative algorithms you go.
  • A bit of goalpost-moving: what about the performance hit (branches, branches everywhere) of doing such complex checks at each and every object modification (and maybe also access)?
 
Nice quote, but I'd argue it's not really necessary all the time. Premature optimization and letting perfect be the enemy of good are real problems too.
Hah, I was just thinking about that Knuth quote.
I don't want to get too involved in this clash of autism, but the sentiment of "don't design your system for 10 users and scale it up; design it for 10M users then scale it down" is hardly a "premature optimization" when rebuilding an established web service that is constantly growing and subject to DDoS by hostile parties. I believe the full quote from Knuth is:
"The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming."
--------------
Python is "slow" in the sense that if you were to compare
Python is "slow" in the sense that it is fucking slow.
 
I don't want to get too involved in this clash of autism, but the sentiment of "don't design your system for 10 users and scale it up; design it for 10M users then scale it down" is hardly a "premature optimization" when rebuilding an established web service that is constantly growing and subject to DDoS by hostile parties

Choosing your language/framework based off of some hypothetical sky-high user count IS premature optimization. It's currently running off of PHP. There's an average of like 5k concurrent users. It's a forum. You don't NEED anything in particular to make that work effectively, and the main drivers for choice IMO should be ease of transitioning from PHP, support for any modules that might need to be replaced from losing Xenforo, and breadth of tutorials or stack overflow answers since he'll be going into anything other than PHP relatively blind.
 
Choosing your language/framework based off of some hypothetical sky-high user count IS premature optimization. It's currently running off of PHP. There's an average of like 5k concurrent users. It's a forum. You don't NEED anything in particular to make that work effectively
Making a forum run smoothly for 5k concurrent users is a very different load than making one run smoothly for 5k concurrent users while it's also under a significantly large DDoS attack.
 
I'm not entirely sure what we actually agree or disagree about. And I mean that in an honest, good-faith manner.

I think you're speaking from the position of a person who had to deal with very specific types of data, and the validity thereof, in very specific types of problems, where you had to fight with validity issues of this data constantly, on many layers of the software system. This is the impression I'm getting; correct me if I'm wrong.

I'm not seeing the same issues in the same light as you do, maybe because I wasn't touched as hard in a no-no place by such problems (and therefore don't see some particular, potential bugs), or maybe because I consider the OOP approach sufficiently strong to enforce the same guarantees. And yes, those guarantees can and will propagate throughout the system - how could they not, if they're baked into the objects themselves?

Consider the segment example, where you modify the x0 or x1 coordinates using a setter function. Provided that you perform all modifications through the setter, that's literally the one and only place where you need to place an assert() or if () throw(); statement to enforce the guarantee.

Is it more cumbersome than marking an x0 < x1 relation in the type system? Sure! Is it more vulnerable to coding errors? Possibly - the programmer has to remember that every x0 or x1 modification has to be performed through the setter method, or else the guarantee doesn't hold anymore. So is it worse than a relation in the type system? No. Is it better? No.

OK, so how is it not worse even though I myself am admitting that there are vulnerabilities? Because of the following caveats, in no particular order of importance:
  • What if there are logic errors in the relations? How does that differ from making a coding mistake in a method?
  • How would one check that the relations are satisfied and non-contradictory when some compounding is added to the types? Would such checked types simply be verboten from being compounded into more complex types? Think: I have a Segment type, now I want a Polygon type that is a collection of Segments. Should each Segment in a Polygon obey the x0 < x1 relation? That's not very convenient. Should the programmer define another Segment type which does not enforce the relation? Then it's even worse, because now you have another similar type and you might use the "incorrect" Segment type in some place.
  • Validity checking is a lot more than just simple relations on numeric values. Back to the Polygon example: I want to ensure that the points are ordered CCW. How would such complex checks be handled? Either the type system must evolve to contain a Turing-complete language (and then you get C++) or be constrained to almost trivial cases and types in which its usefulness gets immediately exhausted and back to imperative algorithms you go.
  • A bit of goalpost-moving: what about the performance hit (branches, branches everywhere) of doing such complex checks at each and every object modification (and maybe also access)?
tbh I have no idea why we were having a fight either. Pax.
The approach I'm suggesting can be useful for OwOP, imperative and functional programming alike. What's nice about optional/incremental type systems is the ability to turn them on, off, or partially on at different points. So, for example, at the edges of the system where you ingest data you'll probably want them on; the checks will probably be compiled as part of the constructors.
If those guarantees are satisfied, and the checks are on during testing, you can pretty calmly turn them off in production for the system "internals". Why? Because you know the constraints are satisfied for the input, and the type checker has validated that your program is correct, i.e. the constraints have been propagated throughout your code.
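(For a crude existing analogue of that on/off switch: Java's assert statements are skipped at run time unless the JVM is started with -ea, so the checks can be on for tests and off in production. A sketch of mine:)
Code:
final class Segment {
    final double x0;
    final double x1;

    Segment(double x0, double x1) {
        // Evaluated only when assertions are enabled (java -ea),
        // e.g. in the test environment; skipped in default production runs.
        assert x0 < x1 : "illegal segment: x1 must exceed x0";
        this.x0 = x0;
        this.x1 = x1;
    }
}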
To top it off, you can write "specs" for outputs as well as inputs. Then, use generative testing: generate a bunch of data based on the input specs, run it through a function, test against the output spec. If it fails, it means your implementation is wrong (and, if you had a type checker making these guarantees, that it has a bug). In a dynamic language you don't have a decent type checker, so this is how you get the same assurance. Still, this technique was developed in Haskell (QuickCheck), which is far from dynamic.
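A bare-bones version of that generate-and-check loop in plain Java (property-testing libraries automate this properly; normalize() here is just a stand-in function under test, my own invention):
Code:
import java.util.Random;

final class PropertyTest {
    // Stand-in for the code under test: order any pair into a legal segment.
    static double[] normalize(double a, double b) {
        return a <= b ? new double[] { a, b } : new double[] { b, a };
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int trial = 0; trial < 10_000; trial++) {
            // Input spec: any two doubles.
            double a = rng.nextDouble() * 200 - 100;
            double b = rng.nextDouble() * 200 - 100;
            double[] out = normalize(a, b);
            // Output spec: the result must be a legal segment (x0 <= x1).
            if (out[0] > out[1])
                throw new AssertionError("spec violated for " + a + ", " + b);
        }
        System.out.println("all trials passed");
    }
}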
Regarding types of data I had to deal with, it's mostly just the stuff I think about and bothers me. I see people having flame wars over dynamic/static type systems and it seems backwards to me. We have types because we want to make sure our shit is correct. They should work for us. Having Maybe<Thing> or Some[Shit] makes me work for the type system, not the other way around, and still doesn't solve the problem that correct data isn't just an aggregate of types, but lots of times a relationship between them.
Have you ever had an opportunity to use logic programming languages like Prolog? They make stuff like type checking almost trivial, and it will propagate across your code base and let you know if you fucked up, including contradictory facts. Imagine you add one type or function and then you get "error, fact T1 contradicts fact T2 via F".
You can see this example, which is verbose even relative to the Prolog version.
The segments example I provided was a pretty degenerate case. They were one-dimensional, so order had meaning; buckets might have been a less confusing term. In your example with a polygon, I would represent segments as a 4-tuple of Cartesian and polar coordinates. This makes writing assertions about direction very easy.
Prolog is Turing complete but tbh you won't need 5% of it for most use cases.
Performance, I believe, is addressed partially by the capability of turning some features off at certain points of your program. The rest can be compiled pretty efficiently and called by your data constructors. Code generation for this isn't too hard (speaking from experience). Why would there be plenty of branches, then? You're not going to run an entire Prolog engine at runtime; that would be insane.
 
Hello fellow kiwis who have Learned to Code™, allow me to join the 'sperging.

The fact that most static type systems don't have dependent types makes them almost useless for detecting bugs in anything but purely scalar code.

I strongly disagree with the "almost useless" part.

When I compare my experiences with Python on the one hand and C++/Java on the other, the static typing in the latter has been very helpful in detecting bugs at compile time that could have easily slipped into the finished software when using Python.

Sure, the static type system can't prevent all buggy code, but it can prevent a lot of it.

In very rough pseudo code:
Code:
Segment :: {x0 :: double, x1 :: double}
Better
Segment :: {x0 :: double, x1 :: double}, x1 > x0

You'd implement this as a custom type Segment with a public constructor that swaps the incoming values as needed to enforce the "x0 ≤ x1" invariant. Java-ish pseudo code:

Code:
class Segment {
    // Fields are private and final: the normalizing constructor below is the
    // only way to set them, so "from <= to" holds for every Segment ever made.
    private final double from;
    private final double to;

    public Segment(double from, double to)
    {
        if (from <= to)
        {
            this.from = from;
            this.to = to;
        }
        else
        {
            // Swap instead of throwing: every input pair yields a valid segment.
            this.from = to;
            this.to = from;
        }
    }

    // ...
}

This way, users of your type cannot accidentally construct an object of it that does not satisfy the desired invariant. No need for exceptions or other run-time error handling.

Code:
Stronger if you can also specify an ordering semantic

Segments :: List[Segment] // Meh
Segments :: List[Segment], Monotonic increasing // Yeah

Same deal: You'd define a custom type, say MonotonicList<Segment>, whose public interface does not provide a way to insert an element in an arbitrary position - instead, its public insert method would always automatically insert the element in the correct position. The "monotonic increasing" invariant is thus ensured, and again without the need for run-time error handling.
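A minimal generic sketch of that idea (getFrom() is my assumed accessor on Segment, hiding behind the "// ..." above):

Code:
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// The list itself picks the insertion position, so the
// "monotonic increasing" invariant cannot be broken from outside.
final class MonotonicList<T> {
    private final Comparator<? super T> order;
    private final List<T> items = new ArrayList<>();

    public MonotonicList(Comparator<? super T> order) { this.order = order; }

    public void insert(T element) {
        int idx = Collections.binarySearch(items, element, order);
        if (idx < 0) idx = -idx - 1;  // binarySearch encodes the insertion point
        items.add(idx, element);
    }

    public List<T> getAllSegments() {
        return Collections.unmodifiableList(items);  // read-only view
    }
}

// Usage, assuming the hypothetical getFrom() accessor:
//   new MonotonicList<Segment>(Comparator.comparingDouble(Segment::getFrom));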

Is there a real benefit to defining such invariants in the type system itself (as you endorse) rather than in the implementation of individual custom types (as I've shown above)?

I'm not a CompSci academic, so it's quite possible I'm missing something here.
 
Last edited:
  • Informative
Reactions: albert chan
Hello fellow kiwis who have Learned to Code™, allow me to join the 'sperging.



I strongly disagree with the "almost useless" part.

When I compare my experiences with Python on the one hand and C++/Java on the other, the static typing in the latter has been very helpful in detecting bugs at compile time that could have easily slipped into the finished software when using Python.

Sure, the static type system can't prevent all buggy code, but it can prevent a lot of it.



You'd implement this as a custom type Segment with a public constructor that swaps the incoming values as needed to enforce the "x0 ≤ x1" invariant. Java-ish pseudo code:

Code:
class Segment {
    private double from;
    private double to;

    public Segment(double from, double to)
    {
        if (from <= to)
        {
            this.from = from;
            this.to = to;
        }
        else
        {
            this.from = to;
            this.to = from;
        }
    }

    // ...
}

This way, users of your type cannot accidentally construct an object of it that does not satisfy the desired invariant. No need for exceptions or other run-time error handling.



Same deal: You'd define a custom type, say MonotonicList<Segment>, whose public interface does not provide a way to insert an element in an arbitrary position - instead, its public insert method would always automatically insert the element in the correct position. The "monotonic increasing" invariant is thus ensured, and again without the need for run-time error handling.

Is there a real benefit to defining such invariants in the type system itself (as you endorse) rather than in the implementation of individual custom types (as I've shown above)?

I'm not a CompSci academic, so it's quite possible I'm missing something here.
The big sell of having these constraints as part of the type system (you'll want to compile them into the constructors like you've hand-coded, too) is constraint propagation. You only know that to >= from at construction time. Imagine this knowledge went with the object's type everywhere you used it in the system. You could infer pretty complex and useful stuff regarding the data floating around in your system.
I'm not a CompSci academic either, but I think it's pretty cool, and I get the impression there are certain predilections towards OOP and the ALGOL family in industry and schools which I'm just not attached to. Makes for a weird perspective, but it works for me.
Edit: an example of how this could work IRL - imagine you knew, for some reason, that some list can't have more than 5 elements, and code in another place in your system tries to take 6 elements from it. Why should that be a runtime exception and not a compile-time error, if it's 100% knowable without running the program?
 
Choosing your language/framework based off of some hypothetical sky-high user count IS premature optimization. It's currently running off of PHP. There's an average of like 5k concurrent users. It's a forum. You don't NEED anything in particular to make that work effectively, and the main drivers for choice IMO should be ease of transitioning from PHP, support for any modules that might need to be replaced from losing Xenforo, and breadth of tutorials or stack overflow answers since he'll be going into anything other than PHP relatively blind.
I disagree that the main driver for language choice should be ease of transitioning from the previous language: the driver should be the benefits gained from transitioning weighed against the cost of the transition. I fail to see any benefits in Python given this scenario.

Your argument's presumption is that there must be a transition from PHP to something. This is false. The situation is that XenForo is at risk of being lost, and it is an opportune time to transition to a better foundation than PHP. I disagree that Python forms a better foundation.

As for tutorials and Stack Overflow answers for Python I will argue that the good information is drowned out by the contradictory postings of semi-literate retards. This is true for PHP as well, to a lesser degree, but the accessibility and popularity among non-programmer academics has had a detrimental effect. We must also account for the fact that the subject is already proficient in PHP.
 