Programming thread

Fuck I hate how much of a brainlet you Chads make me feel like sometimes. Fuck this thread, honestly.
 
Yeah, I'd say C is one of the simplest languages overall, and it's weird to me that so many people seem to be afraid of using it. Especially for small stuff where you need tight integration (and where performance still somewhat matters), C is really hard to beat and quite approachable. I feel a lot of the "simpler" languages that hide the tediousness cost just as much time. You might have the solution implemented quicker, but you end up spending much longer debugging why it didn't work exactly the way you imagined, only to find out it was some obscure language limitation, things "lost in translation" between the abstraction layers and/or a buggy library or two. C is much more straightforward.
C++ of course is a whole different story and deserves its reputation. Learning it is useful because you get to see where every other language in existence said "yeah wtf let's not do that." If you're working with generics in C++ and miss a parenthesis or semicolon you're going to get like 15,000 lines of error messages, it's hilarious.

Plus you've got shit like iostream to work with in the STL. I don't know which coked-out academic came up with that whole idea.
 
If you're working with generics in C++
THEY'RE CALLED TEMPLATES YOU C# (or Java?) WEENIE

REEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
and miss a parenthesis or semicolon you're going to get like 15,000 lines of error messages, it's hilarious.
You're not wrong tho.
Plus you've got shit like iostream to work with in the STL. I don't know which coked-out academic came up with that whole idea.
I'm actually tempted to give a talk at some local C++ users group meeting on how I learned to stop worrying and love <iostream>, after more than 10 years of fear and loathing of that monstrosity.

Turns out that writing your own iomanips and std::streambuf-derived classes is pretty easy and immensely useful. I think it's one of the best things I learned about C++ after using it for all that time.
 
I'd be interested in some examples and use cases. I think my biggest beef with iostream is that it's stateful though.
 
Yeah, the statefulness can be a bitch sometimes and is inconsistent: boolalpha and basefield iomanips stick, but setw() is only for the next token and so on.

As for use cases, first let's state the objective: to output various data structures and messages with as little boilerplate as possible. To facilitate that, we need to be able to call operator << (std::ostream &, ...) on various types. So what's the problem, you might say? Just implement the damn operator and be done with it! Sure, that mostly works, but it doesn't always reduce the boilerplate enough for my tastes.

I'll post the various examples in separate posts due to length.

Example 1.
I'm dealing with a legacy system in which I need to receive some input strings, convert them from UTF8 to ISO-8859 encoding, do some other validation checks and either write out some error message or continue data processing. Instead of doing inline something like:
Code:
if (!convertEncoding(name)) {
    os << "Encoding error in field \"name\"\n";
    return;
}

if (!convertEncoding(description)) {
    os << "Encoding error in field \"description\"\n";
    return;
}

//etc.
I'll transform the code first into something like this:
Code:
using FieldDescription = std::pair <std::string *, const char *>;
for (auto &p : std::initializer_list <FieldDescription>{{&name, "name"}, {&description, "description"}}) {
    if (!convertEncoding(*p.first)) {
        os << "Encoding error in field " << p.second << '\n';
        return;
    }
}
and then I'll add a helper iomanip class:
Code:
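namespace Error {
    // Stream-insertable helper: formats the error message for one field.
    class Encoding {
    public:
        Encoding(const std::string &s) : m_s{s} {}
    private:
        const std::string &m_s; // only valid until the end of the full expression

        friend std::ostream & operator << (std::ostream &os, const Encoding &obj)
        {
            // No trailing '\n' here: the caller appends it.
            return os << "Encoding error in field " << obj.m_s;
        }
    };
}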
thus changing the relevant line of code in the earlier snippet into:
Code:
os << Error::Encoding{p.second} << '\n';

Obviously I'll be having a number of such iomanip-classes in namespace Error so that I can do, for example:
Code:
std::string name = "FullRetard";
constexpr unsigned MaxLength = 8;

if (name.size() > MaxLength)
    os << Error::TooLong{"name", name, MaxLength} << '\n';
implemented as:
Code:
return os << std::quoted(obj.m_s) << " too long: " << obj.m_value.size() << " bytes where max allowed is " << obj.m_limit;
resulting in message:
Code:
"name" too long: 10 bytes where max allowed is 8
 
Example 2.

A bit of a stupid example: C++ to this very day doesn't have a dedicated utility for outputting a sequence of values joined by some token (*). For the sake of simplicity, assume that I want to output a std::vector <int> joined by some commas - the following code could have been generalized to use templates and iterators to support a wide array of data structures, but I want to keep the example simple.

So the naive implementation would look something like this:
Code:
std::vector <int> data;
for (auto i = 0u; i < data.size(); ++i)
   os << data[i] << ", ";
But now I have a dangling , which is no bueno.

OK, let's do it again:
Code:
for (auto i = 0u; i < data.size(); ++i) {
    if (i != 0)
        os << ", ";
    os << data[i];
}
But now I'm paying for the condition in each loop iteration (though most likely the compiler would move that out of the loop, resulting in my next snippet).

Let's refine:
Code:
os << data[0];
for (auto i = 1u; i < data.size(); ++i) {
    os << ", " << data[i];
}
Fuck me, this is getting ugly AND what if the vector is empty? Nasal daemons, that's what! This fancy-schmancy C++ is getting severely retarded and we haven't even got to the range-based for loops.

C++20 introduced the ranges library, which generalises a lot of the algorithms operating, well, on ranges. Another iomanip class incoming:
Code:
// Accepts any forward range, so containers (std::vector) and views both work.
template <typename T> requires std::ranges::forward_range <const T>
class Joiner {
public:
    Joiner(const T &data, std::string_view separator) : m_data{data}, m_separator{separator} {}
private:
    const T &m_data;
    const std::string_view m_separator;

    friend std::ostream & operator << (std::ostream &os, const Joiner &obj)
    {
        if (std::ranges::empty(obj.m_data))
            return os;

        os << *std::ranges::begin(obj.m_data);
        for (const auto &elem : obj.m_data | std::views::drop(1))
            os << obj.m_separator << elem;

        return os;
    }
};
The important part is the | std::views::drop(1) which basically "advances" the range given on the left side of the "pipe" by one element. Usage:
Code:
std::vector <int> data;
os << Joiner{data, ", "} << '\n';
Ranges can be pipelined, so let's for example print out the vector back to front and multiply all numbers by 2 on the fly:
Code:
auto mul2 = [](auto &&v) { return v * 2; };
os << Joiner{std::views::transform(data, mul2) | std::views::reverse, ", "} << '\n';
Fun fact: it is allowed for transformations to convert the range of type A into a range of type B, which is cool if you want - for example - convert some strings into ints on the fly. Or whatever.

(*) C++23 adds std::views::join_with.
 
Example 3.
I have an object of class QueryResult storing a result of a database query. I want to output, let's say, a JSON document with these results BUT I want to output the rows with some offset and some limit. Basically I make a SQL query without OFFSET and LIMIT parts in the SQL and then I want to partially output the result. Think - paging of results.

In such a case, basic operator overloading won't suffice, because you can't parametrise the operator call with (offset, limit): operator << takes exactly two operands, so there is no way to pass those arguments. You could store these values in the QueryResult object itself, but that's ugly, error-prone, violates OOP principles and is thread-unsafe (unless you plan to lock the whole object every time you output a page of results).

Let's augment the QueryResult class with a nested helper struct Serializer (with a friend operator <<) and a member function outputPage():
Code:
class QueryResult {
public: // class members are private by default; outputPage() must be callable
    // Lightweight handle: which rows of which result to print.
    struct Serializer {
        const QueryResult *qr;
        unsigned offset = 0, limit = std::numeric_limits <unsigned>::max();
    };

    Serializer outputPage(unsigned offset, unsigned limit) const
    {
        return Serializer{this, offset, limit};
    }

    friend std::ostream & operator << (std::ostream &os, const Serializer &s);
};

Implementation of that operator << is left as a rather obvious exercise for the reader. Usage:
Code:
QueryResult qr;
os << qr.outputPage(45, 15) << '\n';
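
For comparison, the same idea (a small helper object that carries the offset and limit next to a reference to the full result) can be sketched in Python, the language of the scripts later in this thread. This is only an illustration under assumptions: `Page`, its fields and the sample rows are hypothetical and not part of the QueryResult code above.

```python
import json
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Page:
    """Window over rows[offset:offset + limit]; the full row list is not copied."""
    rows: list
    offset: int = 0
    limit: int = 10

    def __str__(self) -> str:
        # Serialise only the requested window, the role operator << plays
        # for Serializer in the C++ version.
        window = islice(self.rows, self.offset, self.offset + self.limit)
        return json.dumps(list(window))


rows = [{"id": i} for i in range(100)]  # made-up query result
print(Page(rows, offset=45, limit=3))   # [{"id": 45}, {"id": 46}, {"id": 47}]
```

The paging parameters live in the throwaway `Page` object, so the result itself stays untouched and can be shared between threads that print different pages.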

Example 4.
When in doubt, create your own classes wrapping a std::ostream * and use that wrapper to write things out differently depending on the wrapping class. For example:

Code:
uint32_t v = 0x01020304;

BigEndianStream beStream{&os};
//assume an overload exists: BigEndianStream & operator << (BigEndianStream &, uint32_t);
beStream << v; // writes out: 01 02 03 04

LittleEndianStream leStream{&os};
//assume an overload exists: LittleEndianStream & operator << (LittleEndianStream &, uint32_t);
leStream << v; // writes out: 04 03 02 01
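
If you want to double-check the two byte orders outside C++, a minimal Python sketch does the job. It uses only the built-in `int.to_bytes` and `bytes.hex`, and it only shows the expected byte sequences; it is not an implementation of the stream wrappers above.

```python
v = 0x01020304

# Most significant byte first vs. least significant byte first.
big = v.to_bytes(4, byteorder="big")
little = v.to_bytes(4, byteorder="little")

print(big.hex(" "))     # 01 02 03 04
print(little.hex(" "))  # 04 03 02 01
```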

=======

I think that's enough for now. Keep in mind I've written the code directly in the browser, so there might be some errors. But I hope the concepts are clear enough. Feel free to poke me further.
 
Python:
#!/usr/bin/python3.9

"""
uncozy.py
By @Snigger
Just a simple script to scrape the entire history of a page from the wayback machine
"""

from bs4 import BeautifulSoup
import json
import os
import requests
from threading import Thread
from urllib import request
import time


# AHAHAHAHA FUCK YOU NICK, YOU'VE BEEN SNIGGERED
cozyURL = "https://api.cozy.tv/cache/homepage"


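def get_wayback_entries(target: str):
    query = f"https://web.archive.org/cdx/search/cdx?url={target}"
    # Get all entries of url on wayback
    req = requests.get(query)
    req.raise_for_status()
    # The default plain-text CDX output has no header row (only output=json
    # adds one), so keep every non-empty line instead of popping the first.
    entries = [line for line in req.text.splitlines() if line]
    return entries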


def get_url_by_timecode(timecode: int, target: str = cozyURL):
    query = f"https://web.archive.org/web/{timecode}if_/{target}"
    return query


def scrape(page: str):
    # Get page
    result = request.urlopen(page)
    content = result.read()
    # Get body text
    soup = BeautifulSoup(content, "html.parser")  # name the parser explicitly
    payload = soup.text
    return payload


def dump_data(data: str, timecode: int, directory: str = "./data"):
    with open(os.path.join(directory, f"{timecode}.json"), "w") as file:
        file.write(data)


def process_timecode(timecode: int):
    print(f"\tDownloading {timecode}")
    link = get_url_by_timecode(timecode)
    payload = scrape(link)
    dump_data(payload, timecode)
    print(f"\tFinished downloading {timecode}")


def scrape_all(target: str):
    threads = []
    for entry in get_wayback_entries(target):
        # Ignore weird end thing that kept fucking things up
        if entry == "":
            continue
        # Get UTC timecode
        entryData = entry.split(" ")
        timecode = entryData[1]
        # Thread stuff
        thread = Thread(target=process_timecode, args=(timecode,))
        threads.append(thread)
        thread.start()
        # This kind of negates the point of multithreading tbh
        time.sleep(3)
    # Cleanup
    for thread in threads:
        thread.join()
    print("Done")


def main():
    scrape_all(cozyURL)


if __name__ == "__main__":
    main()

Wrote this lil beauty to help us figure out how bad the botting is on cozy.tv
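
One thing to watch when working with the dumped files: the timecodes used as filenames are Wayback timestamps (14 digits, YYYYMMDDhhmmss in UTC), not Unix times. A rough sketch of pulling the timestamp out of a CDX row and converting it, assuming the default space-separated CDX layout where the first three fields are urlkey, timestamp and original URL. The helper names and the sample row are made up for illustration.

```python
from datetime import datetime, timezone


def parse_cdx_line(line: str) -> dict:
    """Split one plain-text CDX row and keep its first three fields."""
    urlkey, timestamp, original = line.split(" ")[:3]
    return {"urlkey": urlkey, "timestamp": timestamp, "original": original}


def wayback_to_epoch(timestamp: str) -> int:
    """Convert a 14-digit Wayback timestamp (UTC) to Unix seconds."""
    moment = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    return int(moment.replace(tzinfo=timezone.utc).timestamp())


# Made-up sample row, only here to show the shape of the input.
sample = ("tv,cozy,api)/cache/homepage 20220101000000 "
          "https://api.cozy.tv/cache/homepage application/json 200 ABCDEF 1234")
entry = parse_cdx_line(sample)
print(entry["timestamp"], wayback_to_epoch(entry["timestamp"]))
```

Converting to Unix seconds puts these captures on the same time axis as anything collected live with `time.time()`.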
 
Python:
#!/usr/bin/python3.9
import csv
import json


def check_stats_on_user(user: str):
    with open("./data/master.json", "r") as file:
        database: dict = json.load(file)
        for utc, dataThatDay in database.items():
            users = dataThatDay.get("users", list())
            for person in users:
                if person.get("name") != user:
                    continue
                viewerCount = person.get("viewers", -1)
                followerCount = person.get("followerCount", -1)
                yield utc, followerCount, viewerCount


def dump_to_csv(user: str):
    with open(f"./data/{user}.csv", "w", newline="") as file:  # newline="" as the csv docs recommend
        writer = csv.writer(file)
        writer.writerow(["Time", "Followers", "Viewers"])
        for utc, fCount, vCount in check_stats_on_user(user):
            writer.writerow([utc, fCount, vCount])


def main():
    dump_to_csv("nick")


if __name__ == "__main__":
    main()
Here's a script to check on your favorite cozy members from the master file!
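
The thread doesn't show how ./data/master.json gets built, so here is one possible way to do it. This is a guess at the layout, not the actual method: it assumes the directory holds one `<timecode>.json` file per capture (as dumped by uncozy.py) and that the master file maps each timecode to that capture's parsed JSON. `build_master` is a hypothetical helper name.

```python
import json
from pathlib import Path


def build_master(directory: str = "./data", master_name: str = "master.json") -> dict:
    """Merge every <timecode>.json in `directory` into one dict keyed by timecode."""
    folder = Path(directory)
    master = {}
    for path in sorted(folder.glob("*.json")):
        if path.name == master_name:
            continue  # do not fold an old master file into itself
        try:
            master[path.stem] = json.loads(path.read_text())
        except json.JSONDecodeError:
            continue  # skip captures that are not valid JSON
    (folder / master_name).write_text(json.dumps(master))
    return master
```

Captures that fail to parse (for example an archived error page) are skipped rather than aborting the whole merge.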
 
Python:
#!/usr/bin/python3.9

import csv
import matplotlib.pyplot as plt
import sys


def render(filename: str):
    times = list()
    viewers = list()
    followers = list()
    with open(filename, "r") as file:
        lines = csv.reader(file)
        next(lines, None)  # skip the header row
        for row in lines:
            utc, fCount, vCount = map(int, row)
            times.append(utc)
            followers.append(fCount)
            viewers.append(vCount)
    fig, ax = plt.subplots()
    ax.plot(times, viewers)
    ax.plot(times, followers)
    plt.xlabel("UTC Time")
    plt.legend(["Viewers", "Followers"], loc=0, frameon=True)
    plt.show()


def main():
    render(f"./data/{sys.argv[1]}.csv")


if __name__ == "__main__":
    main()
Here's my rendering script, not great but eh
 
Final script:
Python:
#!/usr/bin/python3.9
import datetime
import json
import math
import os.path
import random
import time
import urllib.request

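import re  # the standard library module is enough for this pattern


class Info:
    threshold = 30*60          # delete backups older than this many seconds
    outdir = "./data"
    url = "https://api.cozy.tv/cache/homepage"
    time = 3*60                # base polling interval in seconds
    tolerance = int(1.5*60)    # random jitter applied to the interval, in seconds
    outfile = "data.json"
    logfile = "log.txt"
    pattern = re.compile(r"backup-([0-9]{9,11})-data\.json")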


class Nanny:
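    def __init__(self):
        self.data = dict()
        # Make sure the output directory and file exist before the first backup()
        os.makedirs(Info.outdir, exist_ok=True)
        if not os.path.exists(Nanny.get_outfile()):
            with open(Nanny.get_outfile(), "w") as _:
                pass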

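    @staticmethod
    def print(text, file: str = Info.logfile):
        text = str(text)  # callers also pass exception objects
        print(text)
        # Honour the `file` argument instead of always writing to Info.logfile
        with open(file, "a") as log:
            log.write(f"{text}\n")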

    def write(self):
        Nanny.backup()
        with open(Nanny.get_outfile(), "r") as file:
            try:
                oldData = json.load(file)
            except json.decoder.JSONDecodeError as e:
                Nanny.print(e)
                oldData = dict()
            addition = {f'{Nanny.get_utc()}': self.data}
            oldData.update(addition)
        with open(Nanny.get_outfile(), "w") as file:
            json.dump(oldData, file)

    def grab(self):
        req = urllib.request.Request(Info.url,
                                     data=None,
                                     headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) '
                                                            'AppleWebKit/537.36 (KHTML, like Gecko) '
                                                            'Chrome/35.0.1916.47 Safari/537.36'})
        text = urllib.request.urlopen(req).read()
        # print(text)
        self.data = json.loads(text)

    def mainloop(self):
        try:
            while True:
                Nanny.print("Collecting data...")
                self.grab()
                # self.display_live()
                # Cozy doesn't seem to have this implemented yet
                self.write()
                self.clean()
                Nanny.display_time()
                Nanny.nap()
                Nanny.print("="*25)
        except KeyboardInterrupt:
            Nanny.print("Interrupted, saving and exiting")
        finally:
            # Runs on Ctrl-C and on errors alike, so write exactly once here
            self.write()

    @staticmethod
    def display_time():
        # Get time
        currentDateTime = datetime.datetime.now()
        current = Nanny.get_utc()
        Nanny.print(f"Data captured at {currentDateTime.strftime('%d/%m/%Y %H:%M:%S')} ({current})")

    @staticmethod
    def nap():
        # Sleep
        randomShift = random.randint(-Info.tolerance, Info.tolerance)
        sleepTime = Info.time + randomShift
        Nanny.print(f"Sleeping for {sleepTime} seconds")
        time.sleep(sleepTime)

    @staticmethod
    def backup():
        with open(f"{Info.outdir}/backup-{Nanny.get_utc()}-{Info.outfile}", "w") as outFile,\
                open(Nanny.get_outfile(), "r") as inFile:
            outFile.write(inFile.read())

    @staticmethod
    def get_utc():
        return math.floor(time.time())

    @staticmethod
    def get_outfile():
        return f"{Info.outdir}/{Info.outfile}"

    def display_live(self):
        Nanny.print("Currently the following are live: ")
        users = self.data.get("users", dict())
        for userData in users:
            if userData.get("live", None) is not None:
                user = userData.get("name", "ERROR")
                Nanny.print(f"\t{user}")

    @staticmethod
    def clean():
        # Only look at files directly inside outdir; os.remove below assumes that
        files = os.listdir(Info.outdir)
        for file in filter(lambda f: f.startswith("backup"), files):
            groups = Info.pattern.findall(file)
            if len(groups) < 1:
                continue
            utc = int(groups.pop(0))
            if (Nanny.get_utc() - utc) > Info.threshold:
                Nanny.print(f"Deleting {file}")
                os.remove(f"{Info.outdir}/{file}")


def main():
    n = Nanny()
    n.mainloop()


if __name__ == "__main__":
    main()
 
Do you ever feel like you keep writing the same thing over and over,
just in a different programming language each time, so you can't copy
what you wrote last time? This is the second time in one year I've
written a data view and pagination implementation.

How come Java thinks that storing ActionListeners in an array on a
class and then calling one of them from inside another ActionListener
is an "Invalid Statement"?

Java:
JButton btn = new JButton();
btn.addActionListener(event -> { this.thing = 22; this.actions[3](event); } );

Completely ridiculous that Java doesn't allow you to do this. Just more
evidence that James Gosling belongs up against a wall. What kind of
programming language does not allow you to do re-direct-able
indirection; that's what programming is all about? Absolutely
angering. James Gosling is not a true Albania he is an Albanian Bureau
of Land Managers. Fucking nigger aids!

9 out of 10 witch doctors say raping a virgin will cure you of the
AIDs virus.

9 out of 10 witch doctors say raping a virgin will cure you of the
covid virus.

But it's 1 for 1, so if you have both you have to rape 2 virgins to be
completely cured; accounting to 18 out of 20 witch doctors.

Use a non-Java JVM language, you say? Most JVM languages are just a
sugar over top of Java. So you're still stuck with all the bullshit of
Java, even if it's more kitsch to write. Anything that you want to use
in the Java ecosystem from your kitsch language will force you back into
writing with the retarded conventions of Java.

I want pointers back! Rust has pointers, and Rust variables aren't even
mutable by default. Java is written by sadists. Java didn't change
anything about how software is written, only now people re-implement
assignment with getters and setters instead of just assigning a
member of a struct like they used to do in C. It's absolutely
retarded to re-implement assignment for every data member of a
Class. You already get assignment for free with the language;
re-implementing it is the most stupid idea for a language idiom ever.

James Gosling is an evil man.
 
Wrote this lil beauty to help us figure out how bad the botting is on cozy.tv
As I mentioned in chat earlier, to get more precise numbers we should use the status API call that each individual streamer's page offers, not the homepage's cache link. The individual streamer's status call can tell us when they are live. That matters because we should then be able to poll the API second by second around the time Fuentes goes live and check whether the very first viewer count is already in the thousands, which is something we all see when he goes live but have no "proof" of.
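
A rough sketch of what that polling could look like. Everything specific here is an assumption: the per-streamer endpoint and its response format aren't shown in this thread, so the fetch is passed in as a function, and the `live` and `viewers` keys are borrowed from the homepage data used by the scripts above and may be named differently in the status call.

```python
import time
from typing import Callable, Optional


def wait_for_first_live_sample(
    fetch_status: Callable[[], dict],
    interval: float = 1.0,
    max_polls: int = 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the status reports live, then return that first live sample."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("live"):  # hypothetical field name
            return {
                "polled_at": int(time.time()),
                "viewers": status.get("viewers"),  # hypothetical field name
            }
        sleep(interval)
    return None  # never went live within max_polls
```

Passing `fetch_status` and `sleep` in as arguments keeps the loop testable without hitting any real API; in real use `fetch_status` would wrap whatever HTTP request the status endpoint needs.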

Here is the link to Nick's api status call as well as some others of interest:
 
I went back to the documentation for ActionListener in
Java. ActionListener, like all interfaces with a single abstract
method, can be constructed with a lambda like,

Java:
event -> { this.thing = 22; }

This expression is used in place of naming the sub-class that
implements the ActionListener interface. But the object received from
the lambda expression is still just an object with a normal interface
method, so you can't call it like a function: you have to invoke
actionPerformed() on it by name. In that sense there are no function
values in Java; the lambda is sugar that saves programmers from
writing out a whole interface implementation.

Java:
btn.addActionListener(event -> {
    System.out.println(100);
    this.actions[1].actionPerformed(event);
});

I did figure out how to get it to work, but Java could have been a lot
simpler a language and a lot more productive to write at the same
time. Forcing everything to be a Class is a dumb idea. Many things are
not classes and it doesn't make sense to force them to be
classes. Ironically, the fact that lambda construction exists in Java
is an admission that having to create a sub-class or implement an
interface (maybe on an inner class) compounds into excessive
inconvenience, leading any right-thinking person to believe that not
everything should be a Class and that functions are perfectly good for
writing programs.
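
For contrast, here is roughly what the same indirection looks like in a language where functions are plain values, using Python to match the scripts earlier in the thread. `Panel` and its method names are hypothetical, made up to mirror the `actions` array from the Java snippet.

```python
class Panel:
    def __init__(self):
        self.thing = 0
        # A plain list of callables; each takes the event as its only argument.
        self.actions = [self.on_open, self.on_save]

    def on_open(self, event):
        return f"open:{event}"

    def on_save(self, event):
        return f"save:{event}"

    def on_click(self, event):
        self.thing = 22
        # Indirect call through the list, no interface method name needed.
        return self.actions[1](event)


panel = Panel()
print(panel.on_click("click"))  # save:click
```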
 