Also, how the hell do you write a webserver? We only had some semblance of that in Java where we used the Socket Listener to accept requests on port XYZ. I assume the only way to do this right would be to read through the RFCs (like an engineer would) and handle the standard precisely.
Someone in the thread mentioned it's every programmer's toy project, but considering that it's just insanely hard, I doubt everyone wrote a web server by themselves.
Marvin's answer is good but let me give my own as a web dev fag and hopefully slightly redeem our not entirely unfounded reputation as talentless idiots to some degree. This is all 95% right off of my head so I'm sure there will be some people who um actually me on the finer points but hopefully it gets across how simple web servers can be.
HTTP is the protocol that web browsers use, and the HTTP 1.x version of the protocol is very simple. It's also ancient, and a lot of real-world traffic has moved on to HTTP 2 and 3, but those are a different (binary) wire format carrying the same kind of requests and responses, not a superset of 1.x. HTTP 1.1 is still in common use and is still understood by pretty much any web browser out there, so go ahead and just start with implementing that.
HTTP is a stateless protocol, which at least as far as we care about means that the client connects to the server and sends a request, the server sends a response back to the client, and then they disconnect. That's it. No need to keep the connection alive beyond that. (HTTP 1.1 actually keeps connections alive by default, which allows multiple requests and responses to be sent on a single TCP connection, but we're not worrying about that right now: put a "Connection: close" header in your response and hang up.)
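To make the connect / request / response / disconnect cycle concrete, here's a rough Python sketch using nothing but the standard socket module. The function names, the port 8080 default and the fixed "hello" body are all made up for illustration, and it does no parsing at all yet; it just reads until the blank line and answers every request the same way.

```python
import socket


def read_request_head(conn: socket.socket) -> bytes:
    """Read until the blank line that ends the header section."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:  # client hung up before finishing the request
            break
        data += chunk
    return data


def handle_connection(conn: socket.socket) -> bytes:
    """Answer one request on an already-accepted connection, then disconnect."""
    with conn:  # the socket is closed when this block exits
        request = read_request_head(conn)
        body = b"hello\n"  # placeholder body, every request gets the same answer
        response = b"\r\n".join([
            b"HTTP/1.1 200 OK",
            b"Content-Type: text/plain",
            b"Content-Length: " + str(len(body)).encode("ascii"),
            b"Connection: close",  # tell the client we will hang up afterwards
            b"",                   # blank line ends the headers
            body,
        ])
        conn.sendall(response)
    return request


def serve_forever(port: int = 8080) -> None:
    """Accept connections one at a time, forever (port number is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        listener.bind(("127.0.0.1", port))
        listener.listen()
        while True:
            conn, _addr = listener.accept()
            handle_connection(conn)
```

Call serve_forever() and point a browser at http://127.0.0.1:8080/ to see it answer. Handling one connection at a time is obviously not how a real server does it, but it's enough to watch the protocol work.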
Both requests and responses are basically text files with a header section and a body. Even in cases where the body is a binary file like an image, the header section is still just plain text followed by the blob of binary data. Every line in the header ends with \r\n, and the header section and the body are separated by a blank line, so \r\n\r\n in total (of course these are line breaks, not literal backslashes and letters).
A request looks like this:
Code:
GET /index.html HTTP/1.1\r\n
Host: example.com\r\n
User-Agent: MyWebBrowser/1.0\r\n
\r\n
The first line has three parts. The first is the method (or, as some smartasses call it, the "verb"), of which there are several including GET, PUT, POST, PATCH etc. Start out by only worrying about GET, which a browser sends when it just wants to fetch a resource, and HEAD, which is basically the same as GET except that the server should only send the headers of the response without the body part. Then you can implement POST, which is the most common way that data is sent/"uploaded" from the browser to the server, in which case there will be content in the body section of the request. Since the example above is just a GET request, there is no body (as far as I know the standard doesn't outright forbid a body on a GET or HEAD request, but it gives one no defined meaning and servers are free to reject it, so don't hold me to the details). In KF terms, think of a GET request as what gets made when you click on a thread to read it, and a POST request as what gets made when you make a shitpost into it, with the body of that POST request containing your carefully crafted slurs.
The next, "/index.html", is the resource you're requesting from the server. "Please send me the file at this path." Of course, modern web applications means that this usually doesn't point to a literal file anymore, but for now if you imagine a "my web files" folder on your computer filled with other files and folders, this part is the path to the file being requested, relative to that root level of the "my web files" folder.
The last part is self-explanatory; we're making this request with version 1.1 of the HTTP standard.
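Splitting that first line into its three parts is about as easy as it sounds. A rough Python sketch, where RequestLine and parse_request_line are names I made up and the error handling is the bare minimum:

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str   # e.g. "GET"
    target: str   # e.g. "/index.html"
    version: str  # e.g. "HTTP/1.1"


def parse_request_line(line: bytes) -> RequestLine:
    """Split 'METHOD target HTTP/x.y' into its three parts."""
    text = line.decode("ascii").rstrip("\r\n")
    parts = text.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {text!r}")
    method, target, version = parts
    if not version.startswith("HTTP/"):
        raise ValueError(f"not an HTTP version: {version!r}")
    return RequestLine(method, target, version)
```

For the example request, parse_request_line(b"GET /index.html HTTP/1.1\r\n") gives you method "GET", target "/index.html" and version "HTTP/1.1". Anything that doesn't split into exactly three parts raises ValueError, which a server would turn into an error response.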
Finally, this request has two headers: a Host header and a User-Agent header. Headers are composed of a label and a value separated by a colon (followed, by convention, by a space). The label part can't have spaces, so its words are typically title-cased and separated with hyphens, but labels are case-insensitive, so "Host" and "host" are the same header. The value can be pretty much anything of a reasonable length (as far as I know the standard itself sets no maximum and servers just enforce their own limit, typically a few kilobytes, but don't hold me to that). The "Host" header is probably the most ubiquitous, and HTTP 1.1 requires it: it tells the server from which exact site you are requesting the resource, since it's not uncommon for a server to host more than one web site. If this particular server hosts both example.org and example.com, it now knows which "/index.html" to send based on this header. The "User-Agent" header is basically a vanity header that browsers send to identify themselves to the server. It is sometimes used by the server for statistics purposes (how many visitors are using Firefox versus Safari versus Chrome?) but can be easily faked, so it shouldn't be entirely trusted. (Note that while it's fairly uncommon, it's legal in both requests and responses for there to be more than one header line using the same header label, so in code structure terms the values should properly be represented as an array of values rather than a single one, but if you're just implementing this server for shits and giggles I wouldn't worry about that.)
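Here's a rough Python sketch of pulling the headers out of a request, keeping a list of values per label since repeats are legal. parse_headers is a made-up name, and lowercasing the labels is just one way of dealing with the fact that they're case-insensitive.

```python
def parse_headers(head: bytes) -> dict[str, list[str]]:
    """Parse the header lines that follow the request line.

    `head` is everything up to and including the blank line.
    Labels are lowercased; each label maps to a list of values.
    """
    headers: dict[str, list[str]] = {}
    lines = head.decode("iso-8859-1").split("\r\n")
    for line in lines[1:]:  # lines[0] is the request line itself
        if line == "":      # blank line: end of the header section
            break
        name, sep, value = line.partition(":")
        if not sep or not name or any(ch.isspace() for ch in name):
            raise ValueError(f"malformed header line: {line!r}")
        headers.setdefault(name.lower(), []).append(value.strip())
    return headers
```

Feeding it the example request gives a "host" entry holding ["example.com"] and a "user-agent" entry holding ["MyWebBrowser/1.0"].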
Once the server gets a request like this, it decides how to handle it. The simplest way is, as I mentioned above, to check in its "my web files" folder for the file indicated by the resource path, and either send it as the response or send an error if it can't be found. (Just make sure a path like "/../secret.txt" can't climb out of that folder.) For PHP applications like XenForo, the server will instead send all of the connection details to the application which will then process and handle it, but don't worry about that now and just go for the simple file server approach.
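The file lookup could look something like this in Python. It's a sketch with assumptions baked in: resolve_path is a made-up name, serving index.html for a folder is my own choice, and the containment check is there so that a path full of ".." can't reach files outside the web folder.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote


def resolve_path(web_root: Path, target: str) -> Optional[Path]:
    """Map a request target to a file under web_root, or None if there is none."""
    root = web_root.resolve()
    # Drop any query string and decode %XX escapes such as %20.
    path = unquote(target.split("?", 1)[0])
    if "\x00" in path:
        return None
    candidate = (root / path.lstrip("/")).resolve()
    # Refuse anything that ended up outside the web folder (e.g. via "..").
    if candidate != root and root not in candidate.parents:
        return None
    if candidate.is_dir():
        candidate = candidate / "index.html"  # assumed default document
    return candidate if candidate.is_file() else None
```

If this returns a path, read the file and send it with a 200; if it returns None, send a 404.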
An HTTP response looks like this:
Code:
HTTP/1.1 200 OK\r\n
Server: kiwiserve/0.1\r\n
Date: Sat, 02 May 2026 12:34:56 GMT\r\n
Content-Type: text/html\r\n
Content-Length: 12345\r\n
\r\n
<html>
<head>
<title>…
Once again there's some important stuff packed into the first line: the HTTP version of the response (usually a server will respond with the same version the request used, but that's not always the case), a status code, and a status message.
The standard defines a full list of status codes, but the ones you'll care about implementing first are "200 OK" for when a file is found and is being included in the response, and the classic "404 Not Found" for when the requested file is not found. The "Content-Type" header is the MIME (file) type of the response body, so the browser doesn't have to guess if it's a PNG or a JPEG or an MP4 or whatever, and the "Content-Length" is the length of the response body in bytes, so the browser knows when the entire file has been fetched and it can cut the connection. "Date" is the current date and time in the fixed format shown in the example (always in GMT), and "Server," like "User-Agent," is an ego header bragging about what the server software is. Then, after the headers, two line breaks (\r\n\r\n) separate the head part of the response from the body, and the body follows.
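Putting a response together is mostly string formatting. A rough Python sketch: build_response, guess_content_type and the REASONS table are made-up names, "kiwiserve/0.1" is just the server name from the example above, and the "Connection: close" header is my addition so the browser knows the server hangs up after one response.

```python
import mimetypes
from email.utils import formatdate

# Only the two status codes discussed above; extend as needed.
REASONS = {200: "OK", 404: "Not Found"}


def guess_content_type(filename: str) -> str:
    """Guess a MIME type from the file extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


def build_response(status: int, body: bytes, content_type: str,
                   head_only: bool = False) -> bytes:
    """Build the status line, the headers and (unless head_only) the body."""
    lines = [
        f"HTTP/1.1 {status} {REASONS[status]}",
        "Server: kiwiserve/0.1",
        f"Date: {formatdate(usegmt=True)}",  # e.g. "Sat, 02 May 2026 12:34:56 GMT"
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",      # length in bytes, not characters
        "Connection: close",
    ]
    head = ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
    # A HEAD request gets the same headers as GET, just without the body.
    return head if head_only else head + body
```

So a found file would be sent as build_response(200, data, guess_content_type("index.html")), and a missing one as something like build_response(404, b"not found\n", "text/plain").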
And that's about it. So yeah, once you've got the underlying socket stuff down, it's actually really simple to implement the basic HTTP functionality.