Programming thread

Does anybody else see and hate these people with github profiles absolutely stuffed with "personal projects" that are just static github pages sites? They can't even bother to follow a tutorial for a bullshit CRUD app. Commit calendar fully greened out with unsolicited style tweaks or AI slop pull requests. It really just puts a sour taste in my mouth to see hordes of people that appear to have zero passion or interest in software, cynically filling a portfolio with copied homework. These motherfuckers always have their profile picture as their headshot with every social media link on earth too. Are you getting hired? Who's hiring you to deploy template sites???
When we wanted to hire a few mid-levels at my company this is the exact thing the two non-technical managers fawned over to no end. You're absolutely right, it's every Muhammed, Ranjeet and Xanjuai that stuffs their GitHub profiles like tissues in a bra. You scratch the surface and it's all trash. Gatekeep, gatekeep, gatekeep.
 
Does anybody else see and hate these people with github profiles absolutely stuffed with "personal projects" that are just static github pages sites? They can't even bother to follow a tutorial for a bullshit CRUD app. Commit calendar fully greened out with unsolicited style tweaks or AI slop pull requests. It really just puts a sour taste in my mouth to see hordes of people that appear to have zero passion or interest in software, cynically filling a portfolio with copied homework. These motherfuckers always have their profile picture as their headshot with every social media link on earth too. Are you getting hired? Who's hiring you to deploy template sites???
Eh, I mean, I enjoy programming and I enjoy it as a hobby as well as a day job.

But if it's much more of a professional interest to someone, I can't really blame them for that alone. You do need at least a bit of passion in it, or else you're almost certainly incompetent and terrible to work with. But if it's not your main hobby, that's fine too.

And a consequence of how the market works, is that to get ahead, it may be effective to glitz your professional persona up with dumb useless busywork. I'm terrible with that, but I can't blame someone for trying.

I've met competent people with that nonsense and I've met abject morons with that nonsense. Although I guess if someone has absolutely no technical interest on their profile and it's 100% slop, yeah, they're probably a dumbass.

I'm reminded, the jeetery I work at is saying they want us to get useless AWS certifications in the next few months. If I didn't have dumb esoteric nerd bullshit like Lisp or functional programming on my socials, I'd probably come off publicly like one of those morons too.

I guess that's sorta what you're getting at.
 
Interesting fallout from the march of 'AI'. Bug bounties are becoming nonviable because it's getting too easy to automate the submission of superficially passable but ultimately frivolous bug reports.

https://youtube.com/watch?v=PG5sv20Jiic
Not the way I expected this to end, tbh. I had assumed a lot of bounties would dry up due to code review becoming trivial with LLMs, but it makes sense that they're getting jugaad'd to death. This is essentially taking all the existing problems in most bug bounty platforms and multiplying them by 100.
 
So I think I found a use of AI in development that's delivered actual significant value to something I'm working on.

I'm working on writing a CapnProto implementation for Scheme, in Scheme. CapnProto is like Protobuf. It's a language and tooling for specifying binary formats/protocols. It's used a lot for internal infrastructure. Often backend microservices talking to each other will use protobuf.

If you're not familiar, the workflow is that you write up the various structures and types you need for your protocol in the language, run the CapnProto compiler, and it'll generate parsing and writing code for whichever programming language you're going to be using it with.

Repos I've worked on before have used protobuf or capnproto or similar things and the build process will often generate bindings for the same protocols in golang, python, node, etc.

The wire format it produces isn't usually self documenting. You do need the schema written in the original language to understand it. But that's fine in a lot of use cases. You save a bunch of resources and performance not wasting time with JSON parsing for internal services.

Anyway, I'm wanting to write a code generator plugin for CapnProto for Chicken Scheme. And part of that requires that I be able to parse the wire format.

Here's their documentation on that. It's pretty readable and I was able to do a lot of it without assistance. But still, I did hit a point where I had a sample binary file in a blob in the chicken interpreter, and I had that same file open in Bless, and I was going over and over and over the documentation and I had no clue why I wasn't able to read the header of a given struct.

I was able to ask chatgpt for some help, by posting the octets and asking about capnproto. And amazingly, it was able to get me past my issues.

(My first issue was that the documentation mentions data segments, but I had no clue that's what my code generation plugin would be receiving. My code started working immediately once I first parsed out the segment table header.)
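For anyone following along, that segment table is pretty simple once you see it laid out. Here's a rough sketch of parsing it, going off the Cap'n Proto encoding docs (the function name is mine, and this is Python rather than Scheme, just for illustration):

```python
import struct

# Sketch of parsing the Cap'n Proto stream framing header ("segment
# table"), per the encoding docs: a uint32 segment count minus one,
# then one uint32 size (in 8-byte words) per segment, padded so the
# header is a whole number of words.
def parse_segment_table(data):
    (count_minus_one,) = struct.unpack_from("<I", data, 0)
    count = count_minus_one + 1
    sizes = struct.unpack_from("<%dI" % count, data, 4)
    header_words = (4 + 4 * count + 7) // 8  # round up to a word
    return sizes, header_words * 8           # sizes, data offset

# One segment of 2 words: header is count-1 = 0, then size = 2.
sizes, offset = parse_segment_table(struct.pack("<II", 0, 2))
```

The padding rule is the part that bit me conceptually: with an even segment count the table ends mid-word, so there's an extra 4 bytes of zeros before the first segment's data.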

I'm still writing the code. I don't know if I'd trust AI to write anything significant. But just helping fill in gaps in my understanding, very useful.
 
i am totally ignorant on coroutines
can someone explain to me why i would want to use coroutines instead of just spawning and managing worker threads myself the regular way and handing them whatever task i want done concurrently in the form of some 'task' or 'executable' object?
It's been a while since I've thought about this so my reasoning is going to be less than airtight, but let me try. Keep in mind that I speak from the perspective of someone who's interested in designing concurrent systems. Coroutines have other uses besides enabling concurrency -- see for example "generators", as in Python's.

Coroutines and OS threads have one crucial thing in common: both store the execution context of a logical task.

For an OS thread, it's easy: the stack is where the execution context is stored. From the IP to the state of local variables, everything is stored in the stack.

For a coroutine, well, the particulars get really different depending on your programming language/framework of choice, but ultimately, it's still the same: just a chunk of memory where the code that implements the coroutine stores its state (its "locals", the "instruction pointer" -- the point at which the coroutine yielded last, -- etc.)

As an aside: this video is really really interesting if you want to understand what a "coroutine" really looks like, in terms of the data structure that represents its state, and in terms of how the coroutine's source code is lowered to a regular function (which is typically a state machine). Note that Rust's async functions are coroutines. https://www.youtube.com/watch?v=ZHP9sUqB3Qs
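To make the "coroutine lowered to a state machine" idea concrete in something other than Rust, here's a simplified sketch in Python: a generator next to a hand-lowered equivalent, where the "instruction pointer" and locals become explicit fields on an object. (Real lowerings track a discrete state variable per suspension point; this toy only has one.)

```python
# The coroutine, written the easy way:
def counter(n):
    i = 0
    while i < n:
        yield i
        i += 1

# The same coroutine hand-lowered to the kind of state machine a
# compiler emits: locals become fields, resume() picks up where the
# last yield left off.
class CounterStateMachine:
    def __init__(self, n):
        self.n = n   # captured argument
        self.i = 0   # the local variable, now a field

    def resume(self):
        # Returns the next value, or None once the coroutine is done.
        if self.i >= self.n:
            return None
        value = self.i
        self.i += 1
        return value

sm = CounterStateMachine(3)
```

The point being: a "coroutine" is ultimately just that object, a chunk of memory holding enough state to resume.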

The main difference, then, between coroutines and OS threads is in how and by whom that "logical task" is scheduled to actually execute, and most importantly: cost.

When you create an OS thread, you rely on the OS scheduler to dispatch the subroutine designated as its entry point. Simple as, it's completely outside of your control.

Coroutines however require you to build a scheduler within your program. And unlike threads, they are cooperatively scheduled -- a coroutine only yields control at predefined points in its execution body, which means that if you screw up and write a coroutine that ever blocks on anything (I/O, a mutex, etc) your whole program grinds to a halt.
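A toy illustration of that cooperative model, using Python generators as the coroutines and a hand-rolled round-robin scheduler (a sketch, not a real runtime):

```python
from collections import deque

# Each generator is a "coroutine" that hands control back only at
# its yield points. If a task blocked between yields, every other
# task would stall with it -- that's cooperative scheduling.
def task(name, steps):
    for i in range(steps):
        yield f"{name} step {i}"

def run(tasks):
    ready = deque(tasks)
    log = []
    while ready:
        t = ready.popleft()
        try:
            log.append(next(t))   # resume until the next yield
            ready.append(t)       # cooperatively reschedule
        except StopIteration:
            pass                  # task finished; drop it
    return log

print(run([task("a", 2), task("b", 1)]))
# → ['a step 0', 'b step 0', 'a step 1']
```

The whole "scheduler" is a deque and a loop, which is exactly why it's so much cheaper than asking the kernel.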

Why would you ever prefer coroutines, when OS threads do everything coroutines can, and do not require you to reimplement a scheduler within your program? The answer is cost.

>and handing them whatever task i want done concurrently in the form of some 'task' or 'executable' object?

Well here's the thing, when your "tasks" are utterly trivial but massive in volume (a perfect case study are instant messaging servers, which are saturated by volume of tasks, but each task is no more than storing and relaying a tiny message from A to B), the real-world overhead imposed by OS threads becomes the limiting factor.

In the IM example, you can distill your task down to "Sender, Recipient, Message body" -- let's say all together 1KiB in size.

A typical OS thread will get 1MiB of stack allocated to it, never mind the kernel-internal data structures to keep track of it and schedule it. And the scheduling is going to be much more heavy-weight, too. And you're going to have to call into the kernel who knows how many times (at the minimum you'll need to mmap a stack, clone() the thread, then glibc will probably do a bunch of syscalls in its prologue, then the thread needs to exit(), ....)

So there you have it. If your tasks are heavy-weight, then the benefits of using coroutines decrease -- the added complexity is not worth the returns. If your tasks are light-weight however, modelling them using OS threads wastes resources, unacceptably so depending on your requirements.
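The IM example, sketched with Python's asyncio (names like relay and inboxes are hypothetical): ten thousand concurrent store-and-relay tasks are a non-event as coroutines, whereas ten thousand OS threads would cost on the order of a 1MiB stack apiece.

```python
import asyncio

# Each "task" is tiny: store and forward one message from A to B.
async def relay(sender, recipient, body, inboxes):
    inboxes.setdefault(recipient, []).append((sender, body))

async def main(n):
    inboxes = {}
    # n coroutines cost a small heap object each; the equivalent
    # number of OS threads would need stacks, clone() calls, and
    # kernel bookkeeping for every one of them.
    await asyncio.gather(*(relay("alice", "bob", f"msg {i}", inboxes)
                           for i in range(n)))
    return len(inboxes["bob"])

print(asyncio.run(main(10_000)))
```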
 
Often backend microservices talking to each other will use protobuf.
Moderately off topic, but what’s the deal with microservices? I’ve heard people justify them by comparing them to the Unix shell and its utilities. However, when you use the shell to build a script, mostly you’re using utilities that you didn’t have to write, but in every project I’ve seen that uses microservices, they end up writing their own for each of the services. That’s kinda like writing your own shell just to write a single script.
Maybe they’re useful if you’re trying to build highly distributed services or something, but apart from that I don’t really get it.
 
Moderately off topic, but what’s the deal with microservices? I’ve heard people justify them by comparing them to the Unix shell and its utilities. However, when you use the shell to build a script, mostly you’re using utilities that you didn’t have to write, but in every project I’ve seen that uses microservices, they end up writing their own for each of the services. That’s kinda like writing your own shell just to write a single script.
Maybe they’re useful if you’re trying to build highly distributed services or something, but apart from that I don’t really get it.
The more common, legitimate use case is when you're working on a big corporate web app and you run into tasks that really deserve special attention because scaling will get messy. You can have a whole separate team handle it and only coordinate over a shared protocol, and not actually have to be balls deep in each other's code.

Like in the same way databases or storage services like S3 became their own thing.

Yeah, lots of dipshits started thinking they could just split the whole project up into a bajillion little services without any serious justification.
 
Moderately off topic, but what’s the deal with microservices? Maybe they’re useful if you’re trying to build highly distributed services or something
The more common, legitimate use case is when you're working on a big corporate web app and you run into tasks that really deserve special attention because scaling will get messy.
Yes, this, except from a slightly more insidious angle. Like in the corporate push for the object-oriented paradigm, the actual motivation behind divide-and-conquer microservice design is to arrive at a service backend which is composable and scalable by adding these paint-by-the-numbers lego blocks, which are at an idealized level of simplicity for enabling a cookie-cutter development process where you can just schedule N interchangeable developer hours to each as needed.
 
It’s also removed from POSIX and neither glibc nor musl implements it.
I will never stop being mad about this. They deprecated it with no fucking replacement whatsoever and said "just use threads lmao" as if that's the same thing. If I remember correctly, the only reason they deprecated ucontext.h is because of the type signature of makecontext(). They could have just fixed that instead.
 
i want this refactored by end of day today
Reminds me of my first programming job, where I was to rewrite a ~10 kloc VBA script (all in a single function) into C#. Not so fun* times lol, though still better than the development hell that followed...

* Looking back, it was actually pretty funny. A 2D point array was a float array, and you had to make sure to keep the odd/even alignment when reading it, for example. Also a loop, helpfully commented "planes", that was around 5 kloc long, that made me literally hallucinate while trying to understand and rewrite it, the only time a piece of code made me have such an experience.
 
Anyway, I'm wanting to write a code generator plugin for CapnProto for Chicken Scheme. And part of that requires that I be able to parse the wire format.
So I'm still working on this.

I have a handwritten, inefficient parser for encoded CapnProto objects, and specifically, the data structure their tool hands off to the language-specific code generator. And now I'm starting to generate code.

Scheme/Lisp is great for writing these kinds of code generation tools, because you're already used to this kind of code. It's just writing a macro where the input tree has a slightly clunkier API than normal sexprs.

So the low hanging fruit, enums. Here's what an enum looks like in CapnProto:
Code:
enum Status {
  unknown @0;
  submitted @1;
  pickedUp @2;
  succeeded @3;
  failed @4;
  canceledPrePickup @5;
  canceledPostPickup @6;
}
(My sample CapnProto schema is for a hypothetical job scheduling microservice. Like I want to be able to say "here's a video file, preprocess it through ffmpeg" and schedule that job and then step away.)

Here's the code my generator is producing right now:
Code:
(define (Job.Status->number value)
  (case value
    ((unknown) 0)
    ((submitted) 1)
    ((pickedUp) 2)
    ((succeeded) 3)
    ((failed) 4)
    ((canceledPrePickup) 5)
    ((canceledPostPickup) 6)
    (else (error "bad Job.Status value" value))))
(define (number->Job.Status value)
  (case value
    ((0) 'unknown)
    ((1) 'submitted)
    ((2) 'pickedUp)
    ((3) 'succeeded)
    ((4) 'failed)
    ((5) 'canceledPrePickup)
    ((6) 'canceledPostPickup)
    (else (error "bad numeric Job.Status value" value))))
(define +Job.Status-to-number-alist+
  (list (cons 'unknown 0)
        (cons 'submitted 1)
        (cons 'pickedUp 2)
        (cons 'succeeded 3)
        (cons 'failed 4)
        (cons 'canceledPrePickup 5)
        (cons 'canceledPostPickup 6)))
(define +number-to-Job.Status-alist+
  (list (cons 0 'unknown)
        (cons 1 'submitted)
        (cons 2 'pickedUp)
        (cons 3 'succeeded)
        (cons 4 'failed)
        (cons 5 'canceledPrePickup)
        (cons 6 'canceledPostPickup)))

Here's the generator portion that produced it:
Code:
(define (capnp:generate-enum req enum-node)
  (let ((type-sym (string->symbol
                   (capnp:node-trim-shared-prefix req enum-node))))
    `(begin
       (define (,(symbol-append type-sym '->number) value)
         (case value
           ,@(map (lambda (enumerant)
                    `((,(string->symbol (capnp:enumerant-name enumerant)))
                      ,(capnp:enumerant-code-order enumerant)))
                  (capnp:node-enumerants enum-node))
           (else (error ,(format #f "bad ~s value" type-sym) value))))
       (define (,(symbol-append 'number-> type-sym) value)
         (case value
           ,@(map (lambda (enumerant)
                    `((,(capnp:enumerant-code-order enumerant))
                      ',(string->symbol (capnp:enumerant-name enumerant))))
                  (capnp:node-enumerants enum-node))
           (else (error ,(format #f "bad numeric ~s value" type-sym) value))))
       (define ,(symbol-append '+ type-sym '-to-number-alist+)
         (list
          ,@(map (lambda (enumerant)
                   `(cons
                     ',(string->symbol (capnp:enumerant-name enumerant))
                     ,(capnp:enumerant-code-order enumerant)))
                 (capnp:node-enumerants enum-node))))
       (define ,(symbol-append '+number-to- type-sym '-alist+)
         (list
          ,@(map (lambda (enumerant)
                   `(cons
                     ,(capnp:enumerant-code-order enumerant)
                     ',(string->symbol (capnp:enumerant-name enumerant))))
                 (capnp:node-enumerants enum-node)))))))
It's a little messy and hard to read if you're not familiar.

Probably the most complicated syntax is the quasiquote syntax. Quasiquote lets you create a complicated list tree with individual elements substituted in with evaluated code. So `(a (list here) and 3 plus 3 is ,(+ 3 3)) evaluates to (a (list here) and 3 plus 3 is 6). The comma is "unquote" and it selectively evaluates the expression. The comma followed by the @ symbol (,@) in the enum code above is "unquote-splicing", which expects the expression to return a list, and it expands that list in place inside the parent quasiquote.
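If quasiquote is unfamiliar, a rough analog in Python is the splat operator in a list literal, which splices an iterable in place much like ,@ splices a list into a quasiquoted template (this is just an analogy, not a claim the two are equivalent):

```python
# Scheme: `(a and 3 plus 3 is ,(+ 3 3) with ,@(list 1 2) spliced)
# Python: * in a list literal plays the role of ,@ ; an ordinary
# expression plays the role of , (unquote).
items = [1, 2]
tree = ["a", "and 3 plus 3 is", 3 + 3, "with", *items, "spliced"]
```

The difference is that in Scheme the spliced result is itself code you can then evaluate or emit, which is the whole trick behind the generator above.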

A little confusing, but it makes the enum code (and any kind of macro code) a lot easier to write. As always, balancing terseness and readability. I might clean up my enum code later for better readability.

I'm just warming up with the enums though. This struct will be where things get tricky:
Code:
struct Job {
  struct Metadata {
    id @0 :UInt64;
    status @1 :Status;

    union {
      processVideo :group {
        inputKey @2 :Text;
      }
      archiveVideo :group {
        id @3 :UInt64;
      }
    }
  }
}
So this is where it gets tricky. The real struct is Metadata, Job is empty and is essentially just a namespace.

A struct in CapnProto has a section for flat data like ints and floats, and a separate section for pointers for nested structs (or arrays or anything requiring more allocation) after that.
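For reference, the struct pointer word that sits in front of those two sections decodes like this, going off the Cap'n Proto encoding spec (helper name is mine, Python for illustration):

```python
import struct

# A Cap'n Proto struct pointer is one 64-bit little-endian word:
#   bits 0-1   : pointer kind (0 = struct)
#   bits 2-31  : signed 30-bit offset, in words, to the content
#   bits 32-47 : data section size, in words
#   bits 48-63 : pointer section size, in words
def decode_struct_pointer(word_bytes):
    (word,) = struct.unpack("<Q", word_bytes)
    kind = word & 3
    offset = (word >> 2) & 0x3FFFFFFF
    if offset >= 0x20000000:   # sign-extend the 30-bit field
        offset -= 0x40000000
    data_words = (word >> 32) & 0xFFFF
    ptr_words = (word >> 48) & 0xFFFF
    return kind, offset, data_words, ptr_words
```

The data/pointer section sizes in the pointer are what let generated code compute every field's byte offset without consulting the schema at runtime.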

So the challenge will be to generate code that manages a blob and lays out all the space for dynamically sized fields.

I think I'll provide two functions for each struct. One function that takes, up front, all the fields you want to fill in, including any pointer fields (and therefore some additional storage tacked on after the toplevel struct). And then another function that simply takes a pointer to a blob (say, read over the network) and wraps it for read access.

I'll generate getters and setters for all the fields too. A setter for a flat value like an int is easy. That'll compile more or less down to the C equivalent of pointer->value = whatever. But if you want to set a text field, that's complicated. I can probably support setting pointer values if the new replacement value is the same size or smaller than whatever was in the struct when you originally parsed it. But if I want my setters to support completely arbitrary replacement values, that'll get complicated if I want to balance efficiency and usability of the generated API.
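What a generated flat-field accessor boils down to, sketched in Python over a bytearray (the offset-0 placement of the id field is a hypothetical, not CapnProto's actual layout for this schema):

```python
import struct

# Sketch: generated accessors for a flat UInt64 field reduce to
# fixed-offset reads/writes into the backing blob -- the moral
# equivalent of C's pointer->value = whatever.
def job_metadata_id(blob, byte_start=0):
    (value,) = struct.unpack_from("<Q", blob, byte_start + 0)
    return value

def job_metadata_id_set(blob, value, byte_start=0):
    struct.pack_into("<Q", blob, byte_start + 0, value)

buf = bytearray(16)
job_metadata_id_set(buf, 42)
```

Text and other pointer fields are where that simple picture breaks down, for exactly the resizing reasons described above.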

So! For now, for the above Job.Metadata struct, here's the empty code I'm generating:

Code:
(defstruct wrapped-Job.Metadata blob byte-start byte-length resize?)
(define (blob->Job.Metadata blob #!key (byte-start 0) byte-length resize?)
  (let ((byte-length (or byte-length (blob-length blob))))
    (make-wrapped-Job.Metadata
      #:blob
      blob
      #:byte-start
      byte-start
      #:byte-length
      byte-length
      #:resize?
      resize?)))
(define (make-Job.Metadata)
  (error "todo finish implementing make-Job.Metadata"))
(define (Job.Metadata-id obj)
  (error "todo finish implementing Job.Metadata-id"))
(define (Job.Metadata-id-set! obj val)
  (error "todo finish implementing Job.Metadata-id-set!"))
(define (Job.Metadata-status obj)
  (error "todo finish implementing Job.Metadata-status"))
(define (Job.Metadata-status-set! obj val)
  (error "todo finish implementing Job.Metadata-status-set!"))
(define (Job.Metadata-processVideo-inputKey obj)
  (error "todo finish implementing Job.Metadata-processVideo-inputKey"))
(define (Job.Metadata-processVideo-inputKey-set! obj val)
  (error "todo finish implementing Job.Metadata-processVideo-inputKey-set!"))
(define (Job.Metadata-archiveVideo-id obj)
  (error "todo finish implementing Job.Metadata-archiveVideo-id"))
(define (Job.Metadata-archiveVideo-id-set! obj val)
  (error "todo finish implementing Job.Metadata-archiveVideo-id-set!"))

Just throws errors, not done yet. But I can experiment by writing the parsing code by hand and getting an idea of what kind of generated code I ultimately want to produce.

Edit: I should probably also look at what Go or Python are doing for their CapnProto generators. Get an idea of what sort of features a typical generated API supports.
 