Programming thread

You want this to be accessible to non-programmers, but you also want features like inlines and callables?
We already have Python tbh

To elaborate on this:
Languages shouldn't cater to normies. There's another project like this led by some guy @ MIT, essentially Scratch for Javascript.
I think (not sure) I made the project lead mad by dismissing his project during the live presentation, but I had valid grounds to do so.
All of those projects are done from the perspective of an experienced programmer, NOT from an average person's point of view, let alone a K12 schooler.
Personally, as a zoomer my problem used to be that we were forced to deal with babyshit tools like Scratch, instead of the real-world stuff I use almost daily as an adult (Python, TS, C#/Java, C).
The problem is twofold:

1) Most teachers outside of academia are amazingly retarded.
This is a systemwide problem and it would require some kind of a network connecting professionals, academics and kids.

2) Brain matter doesn't fucking grow on trees. Sometimes you need to omit some stuff or oversimplify in order to get your point across. It's easy enough to understand basic concepts like recursion (or mathematics), provided you are smart enough.
Kids aren't there yet, this is why they're here, not at fucking Oxford, Stanford or Princeton.
The curriculum would have to be shaped appropriately in order to compensate.

Now, how does this apply to normies?
Well, they're kind of like mentally stunned children.
I recall watching some bad (yet popular!) YT videos on stuff like Game Theory that covered a 10-minute lecture topic so badly it was hilarious.


Speaking of Python, I have recently been doing some algorithms for my graph theory class.

https://en.wikipedia.org/wiki/Graph_coloring

I'll later post the algorithm from my GH.


Also, how the hell do you write a webserver? We only had some semblance of that in Java, where we used the Socket Listener to accept requests on port XYZ. I assume the only way to do this right would be to read through the RFCs (like an engineer would) and handle the standard precisely.
Someone in the thread mentioned it's every programmer's toy project, but considering that it's just insanely hard I doubt everyone wrote a web server by themselves.

Then again I kinda suck at programming. I've cut my procrastination to a near minimum, and I still sometimes find myself just writing test cases instead of the actual program.
I feel like sometimes my brain wants some off time.
 
Principles and Practice Using C++",
GOD NO.
No.
20260601_010122(1).jpg
Do NOT pick this book up if you have ADHD.
This book also has very infamous exercises such as "bake the blueberry pie" to understand the importance of natural language as opposed to fucking programming C++.

20260601_010645(1).jpg

It's okay if it's your first CS course book. Otherwise, I would consider something else.

idk in my college we started off first semester with some python then we had mfing MATLAB!!!! and only then we had c/c++ and after that we had verilog and embedded c
Seems like your uni is decent.
DM me, I'd like to ask you a personal question.

one of my college friends, total ee GOD (would pick electronics out of trash and repair them and such), had ABYSMAL coding practices
Verilog tends to do that to you.
 
This book also has very infamous exercises such as "bake the blueberry pie" to understand the importance of natural language as opposed to fucking programming C++.
i got a "program in unity" book (really shit idea i dont recommend it) and in it there was an apple pie recipe to compare natural language instructions to machine instructions
my mom and i made the recipe (with some slight adjustments) and it turned out pretty cool - she put it in the family heirloom ass recipe book as "C++ apple pie" and im nowhere near autistic enough to correct her
 
Languages shouldn't cater to normies.
Some languages should cater to normies. You communicate with normies in a language, and sometimes you have to choose or invent a domain-specific language for better communication (clean presentation, automation, and less bullshit). Chess notation, sportsball notation, timetables, math, accounting. I don't know about you, but where I am, cooking the books is traditionally a woman's job.

There's another project like this led by some guy @ MIT, essentially Scratch for Javascript.
I think (not sure) I made the project lead mad by dismissing his project during the live presentation, but I had valid grounds to do so.
Scratch is not a "normie" domain-specific language for normies, it's an initiative to get illiterate children ("visual learners") started on Real Programming:
Scratch, developed by the MIT Media Lab, is a free, web-based platform that introduces programming in a fun and accessible way.
The goal of Scratch is to let children have fun with their stupid throwaway pregnant elsa games and eventually graduate to Building (always this word) Killer Apps in Real Languages. It's a failure because
(1) they're fucking illiterate and no amount of pregnant elsa is going to teach them to read, and
(2) no one is interested in pregnant elsa games, not even the parents, otherwise they'd have taught them to read and write by 5.

My goal is to have a clean readable way to write interactive stories, as the end product. There *is* a market for stories. I don't care if the authors "graduate" to Real Programming, in fact I'd rather they didn't.
(A domain-specific language for stories is different from a traditional programming language. The "code" is not supposed to be reused, nor read (as code) significantly more often than written, nor collaborated on by more than a handful of people, and it is in fact good if a lot of it is interdependent.)

I am *not* targeting "visual learners", "visual learners" can go pound sand. I'm not interested in their creative output, and I'm not interested in running a commercial or "charitable" scam that promises to teach them Real Programming.
 
My goal is to have a clean readable way to write interactive stories, as the end product. There *is* a market for stories. I don't care if the authors "graduate" to Real Programming, in fact I'd rather they didn't.
You could have just told us you're doing a VN engine.
Nothing shameful in that.

Also, the kids are pretty much literate here in Poland by five or six.
I agree with your take on visual learners.
 
Alright, I gotta vent a bit.

I was introduced to a legacy codebase at work, for a website with user settings. The kind that every interactive website has; Nothing special, but a lot of them (around 70 settings).
It seems obvious how to handle this correctly, right?
Every date-valued setting is stored, read, displayed (date picker), validated, and modified the same way.
Every percentage-valued setting is stored, read, displayed (slider), validated, and modified the same way.
And so on.
So you create (or reuse) an abstraction layer in both the back-end and the front-end that can handle all setting data-types that you need, and then you merely have to define the list of relevant settings (and their data types / properties) in a single place, and pass this list around as data. Seems obvious, right?
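A rough sketch of what "define the list once and pass it around as data" could look like on the back-end, in Python. Every name here (`SettingType`, `SETTINGS`, `load_settings`, the two example settings) is made up for illustration and is not taken from the codebase in question; a real version would also carry storage and front-end rendering details per data type.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable


@dataclass(frozen=True)
class SettingType:
    """One data type: which widget to render, how to parse it, how to validate it."""
    widget: str
    parse: Callable[[str], Any]
    validate: Callable[[Any], bool]


# Each data type is described exactly once.
DATE = SettingType("date-picker", date.fromisoformat, lambda v: True)
PERCENT = SettingType("slider", int, lambda v: 0 <= v <= 100)

# The single place that lists the settings (hypothetical names).
SETTINGS = {
    "trial_ends": DATE,
    "jpeg_quality": PERCENT,
}


def load_settings(form):
    """Parse and validate every known setting from a dict of submitted form fields."""
    values, errors = {}, {}
    for name, kind in SETTINGS.items():
        try:
            value = kind.parse(form[name])
        except (KeyError, ValueError):
            errors[name] = "missing or malformed"
            continue
        if kind.validate(value):
            values[name] = value
        else:
            errors[name] = "out of range"
    return values, errors
```

Adding a 71st setting is then one new line in `SETTINGS`, and the same table can be serialized to the front-end so the form and its validation are generated from it as well.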

Well, not to the schmucks (employees before my time) who built this website.
Here's what they did:
  • The procedural back-end code spells out for each setting, how to read it from storage (which involves some data-type-specific parsing/conversion).
  • Then the back-end code builds up the settings HTML page step by step, spelling out for each setting what kind of HTML form element it should use, how to fill that element, and what custom ID to give each one.
  • On the client-side, a hand-written JavaScript function gets called whenever the focus leaves any of these form fields. It reads every single setting individually based on its HTML element ID, and explicitly calls the appropriate (data-type specific) validation function for each one.
  • When the user clicks the Save button, another JavaScript function builds up an Ajax request with a FormData request body - and you guessed it, it spells out, for every single setting, which HTML element ID it should be read from and which bespoke field name it should have in the request body.
  • Back on the server side, the HTTP path which accepts this Ajax request is defined using a web service framework that can perfectly well handle "accept all form fields as a map", but instead, they opted to define 70+ hard-coded form field parameters using the same field names that the aforementioned JavaScript sends.
  • It then spells out for each one of these incoming parameters, how it should be validated (again) and stored.
Fuuuck meeee.
Predictably, there are bugs from when settings were added or changed and it wasn't done consistently in all of these places. How could there not be!
Did it at no point occur to these people to ADD A FUCKING LAYER OF ABSTRACTION? Fuck!

I don't get programmers like this at all.
I got into programming *because* I wanted to do things the clever way, rather than the "mindless drone" way.
I originally learned regular expressions because it was cooler and more engaging than manually searching & replacing lots of similar lines of text, and from there I went on to console scripting and then to real programming languages. I thought this mindset was somewhat universal among people who gravitate to this hobby & profession, but apparently not?

If you're faced, over and over again, with a system that is positively *begging* you to add a simple layer of abstraction to get rid of error-prone, hard-to-maintain severalfold duplication of information, and your attitude is, over and over again: "Nah, I'll just keep extending it the rote, mindless way, thus long-term wasting my time and my employer's money, because doing it the right way would require me to put a little bit of up-front thought into how to structure things, and might even (shock! horror!) make me learn something." If that's your attitude, then how did you even end up as a computer programmer?

And you know what's most shocking about the codebase in question?
As far as I could find out, the culprits weren't even Pajeets.
 
Hello. Programmer of nearly 15 years here.

I never 'got' recursion as in doing it. And it was only in school, because it's foundational to CS itself and parts of compilers (ASTs). In work, I literally never, ever do non-trivial recursion. I don't do coding interview bench racing to get jobs.

How the fuck do you stop the mental infinite recursion loop when you do this shit anyway? Figured better late than never.
 
How the fuck do you stop the mental infinite recursion loop when you do this shit anyway? Figured better late than never.
Go grind DFS problems on leetcode.

Edit:
Generally you want to start with the conditions that end the function.
E.g. for Fibonacci:
Code:
f 0 = 0
f 1 = 1
f n = f (n - 1) + f (n - 2)
or
Code:
int fibonacci(int n) {
  if (n == 0) return 0;
  if (n == 1) return 1;
  return fibonacci(n-1) + fibonacci(n-2);
}
 
I never 'got' recursion as in doing it.
i mean as far as i know the current cpu architecture is pretty unfriendly to recursive functions
How the fuck do you stop the mental infinite recursion loop when you do this shit anyway?
uhh find a condition when the function should stop and then stop the function when it reaches that condition (i struggle with recursion as well, but some problems are easier for me to do recursively than with a loop)
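To make "find the stopping condition first" concrete, here is a small Python sketch of a recursive depth-first search. The function name `reachable` and the dict-of-lists graph format are assumptions chosen for the example. The base case ("already visited") is written before any recursive call, and it is the only thing that keeps a cyclic graph from recursing forever.

```python
def reachable(graph, node, seen=None):
    """Return the set of nodes reachable from `node` by depth-first search."""
    if seen is None:
        seen = set()
    # Base case first: a node we have already visited ends this branch.
    if node in seen:
        return seen
    seen.add(node)
    # Recursive case: each call handles one node and delegates the rest.
    for neighbour in graph.get(node, ()):
        reachable(graph, neighbour, seen)
    return seen
```

When reading it, trust that the call on a neighbour does its job and only check two things: the base case is reached eventually, and each call makes progress (here, `seen` grows by one node).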
 
Also, how the hell do you write a webserver? We only had some semblance of that in Java, where we used the Socket Listener to accept requests on port XYZ. I assume the only way to do this right would be to read through the RFCs (like an engineer would) and handle the standard precisely.
Someone in the thread mentioned it's every programmer's toy project, but considering that it's just insanely hard I doubt everyone wrote a web server by themselves.
It's not conceptually hard.
HTTP requests and responses are languages with a grammar. A really shitty, horrible grammar.

So first you're writing a parser and unparser for those languages. Either by hand or with parser generator tools or parser combinators or whatever works.

Strictly speaking, for a web server, depending on how you want to use it, I guess you can get by with just writing a request parser and a response unparser, but doing both ways for both is convenient for debugging. Or proxying requests to another backend server.

Once you have the parsers/unparsers, then you just hook them up to a TCP listener and loop. Start by accepting a connection. Then run the request parser, then read the body (length specified by a header in the request). Then handle the request and its optional body however you want. And reply by constructing a response and unparsing it out.
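The accept / parse / read body / reply loop described above could be sketched in Python roughly like this, using only the standard `socket` module. The function names, the port, and the trivial echo-style response are placeholders, and it deliberately handles one request per connection with no keepalive, so treat it as a toy under those assumptions rather than a conforming server.

```python
import socket


def read_request(conn):
    """Read one request: the head up to the blank line, then Content-Length body bytes."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = conn.recv(4096)
        if not chunk:
            break
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)  # ValueError if malformed
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    length = int(headers.get("content-length", "0"))
    while len(body) < length:
        chunk = conn.recv(4096)
        if not chunk:
            break
        body += chunk
    return method, target, headers, body[:length]


def handle_connection(conn):
    """Parse one request, send one response, then close the connection."""
    with conn:
        try:
            method, target, _headers, body = read_request(conn)
        except ValueError:
            conn.sendall(b"HTTP/1.1 400 Bad Request\r\nContent-Length: 0\r\n"
                         b"Connection: close\r\n\r\n")
            return
        payload = f"{method} {target} ({len(body)} body bytes)\n".encode()
        head = ("HTTP/1.1 200 OK\r\n"
                "Content-Type: text/plain\r\n"
                f"Content-Length: {len(payload)}\r\n"
                "Connection: close\r\n\r\n")
        conn.sendall(head.encode("ascii") + payload)


def serve(host="127.0.0.1", port=8080):
    """Accept connections forever, one at a time (call this to run the toy server)."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _addr = listener.accept()
            handle_connection(conn)
```

Running `serve()` and pointing a browser or curl at the chosen port should show the echoed request line.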

There's some complications, like connection keepalives and other nonsense like that.

Most of this stuff is specified in standards like you mentioned. There's occasionally common practices you'll need to google.

Hell, at this point, adding TLS isn't super hard. Pick an accessible TLS library and slap it in front of your TCP connection and voila.

I think it's recommended as a good lesson for beginners because you get the full engineering rollercoaster of "oh, great, there's standards" and "why the hell did they do it that way? I'm going to have to refactor everything" and then "wait, why are no actual web clients doing it the way the standard says??".

When you finally get it working, you get the programmer equivalent of the thousand yard stare.

And as I write all this nonsense up, I'll admit I've never done it directly myself. I've read simple HTTP implementations, but never written one myself. But I think I've done comparably involved things. Like I wrote a parser (and renderer) for one of the MPEG video standards.

It's a parser, except you're not dealing with textual characters, but bits. I wrote the parser by hand and I had to implement backtracking for sequences of bits if they didn't match, and there's all kinds of elaborate shit in the parse tree. And then I had to dive into the math that actually enables image and video compression. I am not particularly a mathematical guy. But now when I see video compression artifacts, I actually know why they exist.

This was pretty cool when I first got a frame to partially render:
1780359116727.png


All the gibberish at the bottom was because some buffer I was using didn't get cleared at the right time.

And this is when I actually got that first frame to render:
1780359160132.png


That's the first frame from the movie Super Troopers.

If I did it again, I'd start backwards and start with designing the APIs between the modules around the final result, a rendered frame, and then gradually get back to the obnoxious binary MPEG structures later.

I also learned the value of color spaces other than RGB. YCbCr is really interesting because it dedicates a whole component (the Y) to grayscale, and because the rod cells in our eyes are more sensitive and only see light/dark, you can still get a decent picture visually by compressing the Cb and Cr aggressively as long as you keep the Y high resolution.
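For anyone who wants to poke at this, here is a small Python sketch of the idea: convert RGB to YCbCr, keep Y at full resolution, and shrink the chroma planes. It uses the full-range BT.601 coefficients that JPEG/JFIF uses, which is an assumption on my part (MPEG video normally uses a limited range variant), and `subsample_2x2` is a made-up helper that averages each 2x2 block.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (the JFIF variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component into the 0..255 range.
    return tuple(min(255, max(0, round(c))) for c in (y, cb, cr))


def subsample_2x2(plane):
    """Halve a plane (list of rows) in both directions by averaging 2x2 blocks."""
    return [
        [
            (plane[r][c] + plane[r][c + 1] + plane[r + 1][c] + plane[r + 1][c + 1]) // 4
            for c in range(0, len(plane[0]) - 1, 2)
        ]
        for r in range(0, len(plane) - 1, 2)
    ]
```

Applying `subsample_2x2` to the Cb and Cr planes only leaves a quarter of the chroma samples while every pixel keeps its own Y value.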
 
It's been a while since I have done web stuff (anyone else remember Silverlight?) but I've been meaning to get back into it and I thought that this would be fun.

2026-06-01-001843_369x47_scrot.png
2026-06-01-001900_125x44_scrot.png
2026-06-01-001917_158x85_scrot.png
Some things never change, do they?

It (as much as there is an it) depends (loosely) on pandoc, python, and python-lxml, it's pretty shit, and I don't like it, but it's amazing how much you can do with so little when someone else has already done all of the hard work for you. I don't think that it would be too much effort to rewrite this in C with treesitter. If only I were any good at writing stories, or remotely interested in making visual novels...

The "game script" is written in markdown, and consists of a number of "scenes". Each scene has a header (which should be unique across the whole script), and whatever markdown you want (typically some text to display, some code to run, and a list of choices). The targets for the links are header names of other sections. The only "special" syntax is the code tags, these contain javascript which will be evaluated on each "scene transition".

Simple enough for simple tasks, complex logic is probably going to be absolutely hideous though.

Markdown (GitHub flavored):
work_left
---------

Oh good, the Farms is up!

> `player.name`, are you going to do any coding today?
>
> -- Boss

You have `player.work_left` work to do in `player.time_left` hours.

Go back to the cagie, wagie?

- [Yes. Whoever does not work, does not eat.](#do_work)
- [Nah screw this, let's laugh at some freshly butchered trannies.](#relax)
- [Shop](#shop)

Triple backticks (which will convert to <pre><code>) are not replaced with the "value" of the code. Since this is markdown, and parsed with a full markdown parser, HTML is also supported and can be used to get input from the user. The "start" scene is the entry point.

Markdown (GitHub flavored):
start
-----

```
player = new Player()
shop = new Shop()
```

<form method="dialog">
<input id="player_name">
<button onclick="player.name = player_name.value;">Set Name</button>
</form>

- [new game](#work)

Anything in <script> tags will be executed on page load.

Markdown (GitHub flavored):
setup
-------

<script>
class Player {
	name = "Nigger"
	shekels = 0

	work_speed = 1
	work_streak = 0
	wages = 1

	work_goal = 20
	time_goal = 20

	work_done = 0
	time_taken = 0

	time_left = 20
	work_left = 20
}

class Shop {
	coffee_price = 10
	shotgun_price = 100
}

work_speed_max = 3
work_streak_max = 20
</script>


start
-----

```
player = new Player()
shop = new Shop()
```

<form method="dialog">
<input id="player_name">
<button onclick="player.name = player_name.value;">Set Name</button>
</form>

- [new game](#work)


work
----

```
if (player.work_left <= 0) {
	replace_scene_by_id("#work_over")
} else if (player.time_left <= 0) {
	replace_scene_by_id("#out_of_time")
} else {
	replace_scene_by_id("#work_left")
}
```

work_left
---------

Oh good, the Farms is up!

> `player.name`, are you going to do any coding today?
>
> -- Boss

You have `player.work_left` work to do in `player.time_left` hours.

Go back to the cagie, wagie?

- [Yes. Whoever does not work, does not eat.](#do_work)
- [Nah screw this, let's laugh at some freshly butchered trannies.](#relax)
- [Shop](#shop)


work_over
---------

You did `player.work_done` work in `player.time_taken` hours.

`
if (player.time_left / player.time_goal >= 0.1) {
	player.wages += 0.005;
	"You got a raise! You now earn " + player.wages + " shekels an hour."
} else {
	"If you work harder, you might get a raise..."
}
`

```
player.work_goal *= 2
player.time_goal *= 1.5

player.work_done = 0
player.time_taken = 0

player.work_left = player.work_goal
player.time_left = player.time_goal
```

Your new goal is `player.work_goal` work in `player.time_goal` hours.


- [back to work](#work)


do_work
-------
```
work_done = player.work_speed

player.work_speed = Math.max(0, player.work_speed - 0.0125)
player.work_streak++
player.shekels += player.wages

player.work_done += work_done
player.time_taken += 1

player.time_left -= 1
player.work_left = Math.max(0, player.work_left - work_done)
```

You have done `work_done` work and earned `player.wages` shekels.

You now have `player.shekels` shekels.

You have `player.work_left` work to do in `player.time_left` hours.

```
if (player.work_streak > work_streak_max) {
	replace_scene_by_id("#overworked")
}
```

- [done](#work)

overworked
----------

```
player.time_left = Math.max(0, player.time_left - 10)
player.work_speed = 1
player.work_streak = 0
```

You have done `work_done` work.

You have earned `player.wages` shekels.

You now have `player.shekels` shekels.

You have `player.work_left` work left.

You worked too hard and collapsed from exhaustion.

You have `player.time_left` hours left.

- [done](#work)


relax
--------

```
player.time_left -= 1
player.work_streak = 0
if (player.work_speed < 1) {
	player.work_speed = Math.min(1, player.work_speed + 0.025)
}
```

There's one that looks like a rotisserie chicken. You throw up.

- [done](#work)


shop
----

You have `player.shekels` shekels.

- [buy coffee](#buy_coffee) (`shop.coffee_price` shekels)
- [buy shotgun](#buy_shotgun) (`shop.shotgun_price` shekels)
- [done](#work)


too_poor
--------

Work harder, wagie.

- [done](#work)


buy_coffee
----------

```
if (player.shekels < shop.coffee_price) {
	replace_scene_by_id("#too_poor")
} else {
	player.shekels -= shop.coffee_price
	player.work_speed += 1
}

if (player.work_speed >= work_speed_max) {
	replace_scene_by_id("#aneurysm")
}
```

You purchased some coffee!

- [done](#work)


buy_shotgun
----------

```
if (player.shekels < shop.shotgun_price) {
	replace_scene_by_id("#too_poor")
} else {
	player.shekels -= shop.shotgun_price
}
```

You have finally earned enough shekels to escape this hell.

- [You can check out any time you like, but you can never leave.](#start)


out_of_time
-----------

You have failed to complete your work in time.

- [You are fired!](#start)

aneurysm
--------

You drank too much coffee and had an aneurysm.

- [try again](#start)

The output from pandoc needs to be tweaked slightly, shoving each section into a div, and packing everything into a single HTML file.
Python:
#!/usr/bin/env python3
import argparse
import lxml.etree

parser = argparse.ArgumentParser()
parser.add_argument("infile")
parser.add_argument("-o", "--outfile", default="out.html")
args = parser.parse_args()

with open(args.infile) as f:
    root = lxml.etree.parse(f, lxml.etree.HTMLParser())

new_root = lxml.etree.Element("html")
head = lxml.etree.SubElement(new_root, "head")
body = lxml.etree.SubElement(new_root, "body")

# script in own file for syntax highlighting
script = lxml.etree.SubElement(head, "script")
with open("engine.js") as f:
    script.text = f.read()

# hidden div for data storage
data_div = lxml.etree.SubElement(body, "div")
data_div.attrib["id"] = "data"
data_div.attrib["style"] = "display: none;"

# working div to display
working_div = lxml.etree.SubElement(body, "div")
working_div.attrib["id"] = "current_scene"

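# each header is converted to a div to hold subsequent elements
div = None
for e in root.xpath("//body/*"):
    if e.tag == "h2":
        div = lxml.etree.SubElement(data_div, "div")
        div.attrib["id"] = e.text
        continue

    # content before the first header belongs to no scene; skip it
    # instead of crashing on div being None
    if div is None:
        continue

    div.append(e)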

with open(args.outfile, "wb") as f:
    f.write(lxml.etree.tostring(new_root, method="html", doctype="<!DOCTYPE html>"))

The javascript "engine" is surprisingly lightweight. Probably because most of the code is in the game script.
JavaScript:
/* loads initial state on page load */
function setup() {
	removeEventListener("load", setup)
	return replace_scene_by_id("#start")
}

/* navigation event listener callback */
function replace_scene(event) {
	if (event == undefined) {
		console.error("event undefined")
		return
	}
	return replace_scene_by_id(new URL(event.destination.url).hash)
}

/* copy a scene from the data storage to display */
function replace_scene_by_id(scene_id) {
	if (scene_id.charAt(0) != '#') {
		console.error("bad id")
		return
	}
	/* strip # */
	scene_id = scene_id.slice(1,)
	/* copy from data storage */
	current_scene.innerHTML = document.getElementById(scene_id).outerHTML
	/* delete id from working copy */
	current_scene.childNodes[0].id=""

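	/* eval code snippets and replace */
	/* getElementsByTagName returns a live HTMLCollection: replacing an
	   element drops it from the collection, so always take index 0 */
	const code_elements = current_scene.getElementsByTagName("code")

	while (code_elements.length > 0) {
		/* locals, not implicit globals: the snippet may call
		   replace_scene_by_id itself, and must not clobber our state */
		const e = code_elements[0]
		const p = e.parentElement
		const v = eval(e.innerText)

		/* HACK: don't display results for code tags in pre tag */
		if (p != null && p.tagName == "PRE") {
			e.outerHTML = ""
		} else {
			e.outerHTML = v
		}
	}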
}

/* need to fire on event to delay execution */
window.addEventListener("load", setup)
/* will trigger on every URL change */
window.navigation.addEventListener("navigate", replace_scene)

Very easy to "build", as you might imagine.
Code:
out.html: tmp.html
	./tweak.py $^ -o $@

tmp.html: in.md
	pandoc -f markdown-smart $^ -o $@
 
Hello. Programmer of nearly 15 years here.

I never 'got' recursion as in doing it. And it was only in school, because it's foundational to CS itself and parts of compilers (ASTs). In work, I literally never, ever do non-trivial recursion. I don't do coding interview bench racing to get jobs.

How the fuck do you stop the mental infinite recursion loop when you do this shit anyway? Figured better late than never.
Do SICP and it will become more intuitive than loops. After I did that and got fully into Lisp it clicked for me, and now I can't stand loops, because recursion seems way simpler: it's more declarative in nature, since you're not trying to fit what you want to do into a specific loop paradigm.
 
Fuuuck meeee.
Predictably, there are bugs from when settings were added or changed and it wasn't done consistently in all of these places. How could there not be!
Did it at no point occur to these people to ADD A FUCKING LAYER OF ABSTRACTION? Fuck!
probably started with like 2 or 3 settings at first with different types, then another setting got added with yet another type, establishing the precedent that you "need" to write each settings handler on your own and nobody thought deeper about it so that became the standard way.
 
Also, how the hell do you write a webserver? We only had some semblance of that in Java, where we used the Socket Listener to accept requests on port XYZ. I assume the only way to do this right would be to read through the RFCs (like an engineer would) and handle the standard precisely.
Someone in the thread mentioned it's every programmer's toy project, but considering that it's just insanely hard I doubt everyone wrote a web server by themselves.

Marvin's answer is good but let me give my own as a web dev fag and hopefully slightly redeem our not entirely unfounded reputation as talentless idiots to some degree. This is all 95% right off of my head so I'm sure there will be some people who um actually me on the finer points but hopefully it gets across how simple web servers can be.

HTTP is the protocol that web browsers use, and the HTTP 1.x version of the protocol is very simple. It's also ancient, and a lot of real-world traffic has moved on to HTTP 2 and 3, but those keep the same requests, responses, and headers and only change how they're packed onto the wire (binary framing instead of plain text). HTTP 1.1 is still understood by pretty much any web browser and server out there, so go ahead and just start with implementing that.

HTTP is a stateless protocol, which at least as far as we care about means that the client connects to the server and sends a request, the server sends a response back to the client, and then they disconnect. That's it. No need to keep the connection alive beyond that. (HTTP 1.1 actually keeps connections open by default so that multiple requests and responses can be sent on a single TCP connection, but a server can opt out by sending "Connection: close", so we're not worrying about that right now.)

Both requests and responses are basically text files with a header section and a body. Even in cases where the body is a binary file like an image, the header section is still just plain text followed by the blob of binary data. All lines in the header are separated by \r\n, and the header section and the body are separated by \r\n\r\n (of course these are line breaks, not literal backslashes and letters):

A request looks like this:

Code:
GET /index.html HTTP/1.1\r\n
Host: example.com\r\n
User-Agent: MyWebBrowser/1.0\r\n
\r\n

The first line has three parts. The first is the method (or, as some smartasses call it, the "verb"), of which there are several including GET, PUT, POST, PATCH etc, but start out by only worrying about GET, which a browser sends when it just wants to fetch a resource, and HEAD, which is basically the same as GET but just means the server should only send the headers of the response without the body part. Then you can implement POST, which is the most common way that data is sent/"uploaded" from the browser to the server, in which case there will be content in the body section of the request, but since the example above is just a GET request, there is no body (as far as I know a body on a GET or HEAD request isn't outright forbidden, but the standard gives it no defined meaning, so don't send one and don't expect one). In KF terms, think of a GET request as what gets made when you click on a thread to read it, and a POST request as what gets made when you make a shitpost into it, with the body of that POST request containing your carefully crafted slurs.

The next, "/index.html", is the resource you're requesting from the server. "Please send me the file at this path." Of course, modern web applications means that this usually doesn't point to a literal file anymore, but for now if you imagine a "my web files" folder on your computer filled with other files and folders, this part is the path to the file being requested, relative to that root level of the "my web files" folder.

The last part is self-explanatory; we're making this request with version 1.1 of the HTTP standard.

Finally, this request has two headers: a Host header and a User-Agent header. Headers are composed of a label and a value separated by a colon (conventionally followed by a space), and the label part can't have spaces and isn't case-sensitive, so its words are typically title-cased and separated with hyphens. The value can be pretty much anything of a reasonable length (I think maximum length is like 1K bytes but don't hold me to that; as far as I know the standard doesn't fix a number and each server picks its own limit). The "Host" header is probably the most ubiquitous as it's useful for telling the server from which exact site you are requesting the resource, since it's not uncommon for a server to host more than one web site; if this particular server hosts both example.org and example.com, it now knows which "/index.html" to send based on this header. The "User-Agent" header is basically a vanity header that browsers send to identify themselves to the server and is sometimes used by the server for statistics purposes (how many visitors are using Firefox versus Safari versus Chrome?) but can be easily faked so shouldn't be entirely trusted. (Note that while it's fairly uncommon, it's legal in both requests and responses for there to be more than one header line using the same header label, so in code structure terms the values should be properly represented as an array of values rather than a single one, but if you're just implementing this server for shits and giggles I wouldn't worry about that.)
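A minimal Python sketch of parsing the request head described above, assuming the whole head has already been read into a `bytes` object. `Request` and `parse_request_head` are names invented for this example. Header labels are lower-cased so lookups are case-insensitive, and each label maps to a list so repeated headers survive.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    method: str
    target: str
    version: str
    headers: dict = field(default_factory=dict)  # lower-cased label -> list of values

    def header(self, name, default=None):
        """Return the first value for a header label, ignoring case."""
        values = self.headers.get(name.lower())
        return values[0] if values else default


def parse_request_head(raw: bytes) -> Request:
    """Parse the request line and header lines that precede the blank line."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    # Exactly three space-separated parts; anything else raises ValueError.
    method, target, version = request_line.split(" ")
    request = Request(method, target, version)
    for line in header_lines:
        name, sep, value = line.partition(":")
        # No colon, empty label, or whitespace around the label is malformed.
        if not sep or not name or name != name.strip():
            raise ValueError(f"malformed header line: {line!r}")
        request.headers.setdefault(name.lower(), []).append(value.strip())
    return request
```

For the example request above, `parse_request_head(raw).header("Host")` gives back `example.com`.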

Once the server gets a request like this, it decides how to handle it. The simplest way is, as I mentioned above, to check in its "my web files" folder for the file indicated by the resource path, and either send it as the response or send an error if it can't be found. For things like PHP applications like XenForo, the server will instead send all of the connection details to the application which will then process and handle it, but don't worry about that now and just go for the simple file server approach.
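The simple file server approach has one classic trap: a path like `/../../etc/passwd` must not escape the "my web files" folder. A hedged Python sketch of mapping a request target to a file, where `resolve_target` and the `index.html` default for directories are my own choices for illustration:

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def resolve_target(root: Path, target: str):
    """Map a request target to a file under `root`, or None if there isn't one."""
    root = root.resolve()
    # Drop any query string and decode %xx escapes.
    path = unquote(urlsplit(target).path)
    candidate = (root / path.lstrip("/")).resolve()
    # Refuse anything that ends up outside the web root (e.g. via "..").
    if candidate != root and root not in candidate.parents:
        return None
    # A directory is served through its index.html, if it has one.
    if candidate.is_dir():
        candidate = candidate / "index.html"
    return candidate if candidate.is_file() else None
```

A `None` result is what the server would turn into a 404 response.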

An HTTP response looks like this:
Code:
HTTP/1.1 200 OK\r\n
Server: kiwiserve/0.1\r\n
Date: Sat, 02 May 2026 12:34:56 GMT\r\n
Content-Type: text/html\r\n
Content-Length: 12345\r\n
\r\n
<html>
  <head>
    <title>…

Once again there's some important stuff packed into the first line: The HTTP response version (usually a server will respond to a request with an HTTP response using the same version of the request, but that's not always the case), a status code, and a status message. Here's a full list of status codes but the ones you'll care about implementing first are "200 OK" for when a file is found and is being included in the response, and the classic "404 Not Found" for when the requested file is not found. The "Content-Type" header is the MIME (file) type of the response body, so the browser doesn't have to guess if it's a PNG or a JPEG or an MP4 or whatever, and the "Content-Length" is the length of the response body in bytes, so the browser knows when it's safe to assume the entire file has been fetched and it can cut the connection. "Date" is the date in this format and "Server," like "User-Agent," is an ego header bragging about what the server software is. Then, after the headers, a blank line (the \r\n\r\n sequence) separates the head part of the document from the body, and the body follows.
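Going the other way, a response like the one above can be assembled in a few lines of Python. This is only a sketch: `build_response` is a made-up name, the `Server` value is copied from the example, the content type is guessed from the file name with the standard `mimetypes` module, and `head_only` covers the HEAD case mentioned earlier.

```python
import mimetypes
from email.utils import formatdate
from http import HTTPStatus


def build_response(status, body=b"", filename=None, head_only=False):
    """Serialize a status line, a few headers, a blank line, and the body."""
    content_type = "application/octet-stream"  # fallback when the type is unknown
    if filename:
        content_type = mimetypes.guess_type(filename)[0] or content_type
    lines = [
        f"HTTP/1.1 {status} {HTTPStatus(status).phrase}",
        "Server: kiwiserve/0.1",
        f"Date: {formatdate(usegmt=True)}",  # e.g. "Sat, 02 May 2026 12:34:56 GMT"
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
        "",  # the two empty strings produce the closing \r\n\r\n
        "",
    ]
    head = "\r\n".join(lines).encode("ascii")
    return head if head_only else head + body
```

Content-Length counts bytes, not characters, which is why the body is passed in as `bytes` here.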

And that's about it. So yeah, once you've got the underlying socket stuff down, it's actually really simple to implement the basic HTTP functionality.
 
And that's about it. So yeah, once you've got the underlying socket stuff down, it's actually really simple to implement the basic HTTP functionality.
Very informative post, thank you.

I think maximum length is like 1K bytes but don't hold me to that
IIRC from my Cisco CCNA classes I think (again, might be wrong) it's because of packet fragmentation.
Anything over 1K can get fragmented.

Once you have the parsers/unparsers, then you just hook them up to a TCP listener and loop. Start by accepting a connection. Then run the request parser, then read the body (length specified by a header in the request). Then handle the request and its optional body however you want. And reply by constructing a response and unparsing it out.
How complicated of a parser are we talking about? ASTs, recursive descent, or something simple like an FSM?
 