In the best of all possible worlds, they would just leave us in peace. But they won't.
This site is great, and definitely more robust than it was two months ago[1]. However, I still think there is a value in decentralized forum software. There are two principal reasons:
1. Events may change again, and the Internet is hardly guaranteed to be free forever.
2. It would be good to have access to a 100% robust "bunker" that you can always rely on - not just for this site, but also for other communities.
For these reasons, I've decided to stop shitposting and to make a real life effort post. Thus far, it's not anywhere near ready for general use - I just wanted to make a thread and post about it for those who are interested and to keep myself accountable.
My basic design goal is to reimplement FMS[2][3][4], but using modern technology. That system is decentralized, spam-resistant, and well-tested. (Technical explanation follows, but those links are recommended reading).
I don't have an implementation that is usable yet, but here are my notes on the design. I appreciate all critique.
Design
Nodes
Each user runs a node. A node consists of:
- An ed25519 keypair.
- A manifest[5], which contains the following data:
  - A list of posts created by that user.
  - A list of threads that user has posted in.
    - The user need not have made those threads.
    - Those threads are digitally signed, again, by their author.
  - A list of transport endpoints: URLs at which new versions of this manifest may be found.
    - These will have authoritative=true set for own onions and authoritative=false otherwise. This is mainly to enable other nodes to know where to expect to find their manifests (and add as non-authoritative sources) if they are trusted by that node.
  - A list of trusts. This basically merges together the Ratings and Reports features of XenForo/phpBB/etc; see section "Web of trust" for more details.
    - It might be valuable to "freeze" users at a specific version. I have not thought about the details of this and it will not be an initial priority.
- A signature to the manifest[6], which contains the following data:
  - Ed25519 signature
  - A sequence ID (strictly increasing); a more recent manifest shall have precedence over an older one.
- An onion (optional), which hosts own manifests as well as those of the users to whom that user subscribes.
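To make the list above concrete, here is a minimal sketch of what a manifest could look like as Python dataclasses serialized to JSON. All class and field names (`Endpoint`, `ThreadStub`, `Post`, `Manifest`, `to_json`) are my own placeholders, as is the choice of a 0..1 float for trust ratings; none of this is a fixed format.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Endpoint:
    url: str             # where new versions of the manifest may be found
    authoritative: bool  # True for the node's own onions, False otherwise


@dataclass
class ThreadStub:
    thread_id: str
    author: str      # key of the thread's author (encoding is an assumption)
    subject: str
    time: int
    signature: str   # made by the thread's author, not by the posting node


@dataclass
class Post:
    post_id: str
    thread_id: str
    time: int
    body: str


@dataclass
class Manifest:
    posts: list = field(default_factory=list)
    threads: list = field(default_factory=list)
    endpoints: list = field(default_factory=list)
    trusts: dict = field(default_factory=dict)  # peer key -> rating (assumed 0..1)

    def to_json(self) -> str:
        # Sorted keys and compact separators give a stable string to sign.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))


manifest = Manifest(
    posts=[Post("p1", "t1", 1700000000, "hello")],
    threads=[ThreadStub("t1", "bob-key", "First thread", 1699999999, "sig-by-bob")],
    endpoints=[Endpoint("http://example.onion/manifest", True)],
    trusts={"alice-key": 0.8},
)
encoded = manifest.to_json()
```

The signature and sequence ID are deliberately left out of `Manifest` here, since they wrap the manifest rather than live inside it.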
On making a post, the following actions will happen:
- A local entry will be written to the `posts` database, but missing some content of a technical character.
- The missing content will be filled in (see routine `normalize_outbox()`).
- Assuming the contents of the manifest have changed from the last published version, or there is no last published version, a new manifest will be generated, serialized, and saved into the local database. Such manifests will contain all posts made by the user, in addition to all the threads[7] they are posted within.
- The node pushes the manifest, over onion, to every node that it trusts and that trusts it in return.
- The node exposes the new manifest on its own onion.
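The "only publish if something changed" step can be sketched with a hash comparison, in the spirit of a `(seq, hash)` record per published manifest. The function names and the envelope layout below are hypothetical, and the Ed25519 signature is omitted because it needs a third-party library; this only shows change detection and the strictly increasing sequence ID.

```python
import hashlib
import json
from typing import Optional


def canonical_bytes(manifest: dict) -> bytes:
    # Stable serialization so that equal manifests hash equally.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def maybe_publish(manifest: dict, last_published: Optional[dict]) -> Optional[dict]:
    """Return a new envelope if the manifest changed, otherwise None.

    last_published is the previous envelope, or None if nothing
    has been published yet.
    """
    digest = hashlib.sha256(canonical_bytes(manifest)).hexdigest()
    if last_published is not None and last_published["hash"] == digest:
        return None  # nothing new to publish
    seq = 1 if last_published is None else last_published["seq"] + 1
    # A real envelope would also carry a signature over the data and seq.
    return {"seq": seq, "hash": digest, "data": manifest}


first = maybe_publish({"posts": ["a"]}, None)
unchanged = maybe_publish({"posts": ["a"]}, first)
second = maybe_publish({"posts": ["a", "b"]}, first)
```

Here sequence IDs are plain counters; a timestamp-based scheme would work with the same comparison as long as it stays strictly increasing.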
Web of trust
All fora require moderation. This is not a problem; the problem arises when that moderation is not based on consent. If you want to see viagra spam, death threats, etc, that's your problem.
Web of trust-based systems handle this in a similar fashion to the Internet. I'll try to explain it with numbers schematically:
1. I trust Alice 80%
2. Alice trusts Bob 70%
3. Bob trusts Carol and David 30%
4. You trust me 50%.
You therefore transitively trust Alice, Bob, Carol, and David to some degree. If you disagree with this, then either stop trusting them or stop trusting the person who assigned trust to them. To be specific, your locally seen peer trust will be as follows:
1. You trust Alice 40%
2. You trust Bob 28%
3. You trust Carol and David 8.4%
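The percentages above are just the ratings multiplied along the path. A small sketch that reproduces them (the function name and graph layout are mine, purely for illustration of the schematic example):

```python
def naive_path_trust(direct, source):
    """Multiply ratings along paths from source, keeping the best product per peer.

    direct maps each node to a dict of {peer: rating}, with ratings in 0..1.
    """
    best = {source: 1.0}
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for peer, rating in direct.get(node, {}).items():
            candidate = best[node] * rating
            if candidate > best.get(peer, 0.0):
                best[peer] = candidate
                frontier.append(peer)
    del best[source]  # the viewer's trust in themselves is not interesting
    return best


direct = {
    "you": {"me": 0.5},
    "me": {"alice": 0.8},
    "alice": {"bob": 0.7},
    "bob": {"carol": 0.3, "david": 0.3},
}
seen = naive_path_trust(direct, "you")
percent = {peer: round(value * 100, 1) for peer, value in seen.items()}
```

`percent` then holds the same figures as the list: 0.5 × 0.8 for Alice, that times 0.7 for Bob, and that times 0.3 for Carol and David.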
It's worth noting that this can, with some effort, be seen basically to be the way the Farms operate today:
- Null trusts staff 80% or whatever
- Staff trusts new users 5% or so on joining
- If staff "stops trusting" a user, that user is banned
- If Null "stops trusting" a staff member, he is no longer a mod.
Technical note: Message trust isn't calculated by naively multiplying down the path, since that is trivially vulnerable to Sybil attacks. I intend to use personalized PageRank with self-trust, but I appreciate suggestions regarding this. (TK! There are two papers about this that I can't find)
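For reference, a rough pure-Python sketch of personalized PageRank by power iteration, with "self-trust" modelled as a self-loop on every node. That modelling choice, the parameter names, and the default damping of 0.85 are all assumptions on my part, not a settled design.

```python
def personalized_pagerank(edges, source, damping=0.85, self_trust=1.0, iterations=100):
    """Rank nodes from the point of view of source.

    edges maps node -> {peer: weight}. self_trust must be > 0 so that
    every node has at least one outgoing edge (its self-loop).
    """
    nodes = set(edges) | {p for out in edges.values() for p in out} | {source}
    # Row-normalise outgoing weights, including the self-loop.
    out = {}
    for n in nodes:
        weights = dict(edges.get(n, {}))
        weights[n] = weights.get(n, 0.0) + self_trust
        total = sum(weights.values())
        out[n] = {p: w / total for p, w in weights.items()}
    rank = {n: (1.0 if n == source else 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        nxt[source] += 1.0 - damping  # the restart always returns to the viewer
        for n, r in rank.items():
            for p, w in out[n].items():
                nxt[p] += damping * r * w
        rank = nxt
    return rank


edges = {
    "you": {"me": 0.5},
    "me": {"alice": 0.8},
    "alice": {"bob": 0.7},
    "sybil1": {"sybil2": 1.0},  # a cluster nobody reachable from "you" has rated
    "sybil2": {"sybil1": 1.0},
}
rank = personalized_pagerank(edges, "you")
```

The ranks always sum to 1, so trust is a fixed budget that a node splits among the peers it rates, and nodes unreachable from the viewer get exactly zero. One thing this sketch makes visible: a node with no outgoing ratings keeps all of its mass through the self-loop, so leaf nodes can end up ranked above the peers who vouched for them. Whether that is desirable is part of the open question.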
An optimization to this system is to treat (non-sage) replies to posts as a (weakly) positive signal. This makes the trust graph denser, and appears to work fine in current systems. (Compare KF, where ~everyone is within 2-3 reply/rating degrees of staff members)
There is also a Sybil resistance mechanism described in a paper (TK!) to penalize the TLT of nodes that incorrectly rate other nodes; this is worth looking into but not essential.
For more details regarding the operation of Web of Trust systems, see the literature on FMS linked in the footnotes.
Probably, the earliest MVP will just have a basic subscribe list.
Reading
You download manifests from all nodes whose peer trust is above a certain epsilon, and keep those that pass validation. You then insert into the database every manifest with non-negative message trust (or with message trust above a certain epsilon; that's up to you). The intent is to show posts from new users but to hide new threads made by them and to treat their posts as sage.
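A sketch of those three decisions: whom to pull from, whether to accept a pulled manifest, and how to show its posts. The function names, the "drop"/"sage"/"normal" labels, and the exact thresholds are placeholders.

```python
def select_peers(peer_trust, epsilon):
    """Peers worth downloading from: peer trust strictly above epsilon."""
    return sorted(p for p, t in peer_trust.items() if t > epsilon)


def accept_manifest(stored_seq, incoming_seq, signature_ok):
    """Accept only validated manifests that are strictly newer than what we hold."""
    return signature_ok and (stored_seq is None or incoming_seq > stored_seq)


def classify_post(message_trust, epsilon):
    """Decide how a post is treated locally, based on its author's message trust."""
    if message_trust < 0:
        return "drop"    # never inserted into the database
    if message_trust < epsilon:
        return "sage"    # shown in existing threads, but new threads stay hidden
    return "normal"


peers = select_peers({"alice": 0.4, "bob": 0.28, "stranger": 0.001}, epsilon=0.01)
```

Because everything is pulled by the reader, a peer that falls below epsilon simply stops being fetched.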
Technical note: All rows are fk'ed to the node that is responsible; inserting a new manifest is a simple matter of deleting all rows of that node, inserting, and then letting the COMMIT take care of FK based trigger deletions.
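A minimal sqlite3 sketch of that delete-and-reinsert pattern. The table and column names are made up for the example, and it uses `ON DELETE CASCADE` rather than hand-written triggers, which is one possible way to get the same effect.

```python
import sqlite3

SCHEMA = """
CREATE TABLE nodes (
    pubkey TEXT PRIMARY KEY,
    seq    INTEGER NOT NULL
);
CREATE TABLE posts (
    post_id TEXT PRIMARY KEY,
    node    TEXT NOT NULL REFERENCES nodes(pubkey) ON DELETE CASCADE,
    body    TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    # SQLite leaves foreign key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def replace_manifest(conn, pubkey, seq, posts):
    """Swap in a node's new manifest atomically; posts is a list of (id, body)."""
    with conn:  # one transaction: commit on success, roll back on error
        # Deleting the node row cascades to every row keyed to it.
        conn.execute("DELETE FROM nodes WHERE pubkey = ?", (pubkey,))
        conn.execute("INSERT INTO nodes (pubkey, seq) VALUES (?, ?)", (pubkey, seq))
        conn.executemany(
            "INSERT INTO posts (post_id, node, body) VALUES (?, ?, ?)",
            [(post_id, pubkey, body) for post_id, body in posts],
        )


conn = open_db()
replace_manifest(conn, "alice", 1, [("p1", "hi"), ("p2", "yo")])
replace_manifest(conn, "alice", 2, [("p3", "new")])
remaining = conn.execute("SELECT post_id FROM posts ORDER BY post_id").fetchall()
```

After the second call only the rows from the newer manifest remain, and a failure midway leaves the old manifest in place.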
This means that you only download content that you want to see. It's not possible to flood, since you are only ever pulling content, not pushing.[8]
Registration
Registration is done by finding any trusted node and getting them to assign you a non-zero trust.
As an implementation note: It would be easy to create bots that accepted a CAPTCHA and assigned a trivially low but not zero message trust, so that new users could join.
This basically mirrors, respectively, the invite system of KF and the registration system of KF. The difference is that anyone could run either service, and that users will not be dependent on their initial introduction point once they have made a few (non-spam) posts.
Other notes
I deliberately did not want to use too much complicated technology; decentralization is already hard enough as it is to get right, and using absurd Rube Goldberg machines hardly makes things less error-prone. There is no technical reason to use blockchains for this, which would only make everything far worse.
Manifests are kept in JSON, since that's pretty much a universal standard.
The code is in Python, since everyone knows it and it has a lot of libraries.
Things should be kept simple so that everyone can contribute. This is important:
- It's important to keep a big potential contributor base
- There are unlikely to be any salaried full-time devs
- There aren't that many technologically proficient KF users worldwide
- Mild refactoring:
  - Change table names to be unique
  - Merge all sqlite databases
- Sharing of peer manifests:
  - Add new table peer_manifests(id, seq, blob)
  - Add new table my_manifests(seq, hash)
  - Decide on serialization format (data + sig [+ seq + id])
  - Encode and sign manifests
- Expose known manifests on HTTP over Tor
  - Run a(nother) local HTTP server
    - Preferably over Unix socket
  - Integrate with the Stem controller library
- Subscribe to others' manifests over Tor
  - Strictly opt-in basis at first, like RSS
  - Onions only. I don't want there to be a possibility of a proxy leak.
- UI features for trust ratings
  - Neg/posrate
  - Automatic imputation by non-sage replies
  - Notes feature
- Implement calculation of trust rankings
  - Probably personalized PageRank with self-trust (probably using networkx library)
  - Automatically subscribe to users with positive trust
- Other work
  - Sybil resistance
  - DoS resistance
  - UI work
  - maybe public node support some day, if I get that far
- Web development/web design. I just have a very rudimentary table-based imitation of "classic forums" (i.e. phpBB) - see screenshots. If you can submit CSS or templating changes that make it look nicer, this is welcomed. Also CSRF/XSS protection and that general sort of thing.
- Interface design. In particular, constructive suggestions on how to make WoT user-friendly, by giving it a UI similar to negrating on XenForo.
- Packaging, in terms of distro compliance, creating systemd/init system files, and porting to Windows. In the short term, just running a .py file is Good Enough(TM) for the few autists (o7) who want to try it.
- Tor, in particular Stem/control port protocol and (later on) DDoS protection - think ClientAuthorization but only for mutuals.
- Abstract elements of computer science, in particular graph trust algorithms for TLT. Is there a theoretical justification for personalized PageRank *with nodes assigning non-zero trust to themselves*?
- Scaling.
  - How many O(n^2) algorithms are there, which ones are needed?
  - Three months' retention of 100 PPH (KF does ~700) averaging 1000 chars at 1:8 compression ratio is 25 MB of storage - how close can we get to this lower bound, and what optimizations would it take?
  - Probably user count is a bigger issue, but then again most users (at least on KF) are idle.
  - Scaling is a good problem to have, since it'd mean people are using it a lot, which I consider unlikely.
- Cryptography - how to sign manifests in an interoperable fashion? Is the signature mechanism used for threads workable, is there a cleaner way? Are there (serious) downsides to signing raw blobs without using something like HKDF first?
- Data interchange formats. How do you design something that's future-proof but not underspec'd?
- Database schema design (SQL)
- General programming/HTML - there are some endpoints that need to be fleshed out and templates written for them
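The storage figure in the scaling item above can be reproduced with a few lines of arithmetic. This assumes 90 days for "three months" and one byte per character; the function name is just for illustration.

```python
def retention_bytes(posts_per_hour, days, avg_chars, compression_ratio):
    """Lower bound on stored post text: total characters divided by the ratio."""
    posts = posts_per_hour * 24 * days
    return posts * avg_chars / compression_ratio


estimate = retention_bytes(100, 90, 1000, 8)    # 216,000 posts
estimate_mib = estimate / 2**20
kf_scale = retention_bytes(700, 90, 1000, 8)    # same maths at ~700 PPH
```

That comes to 27,000,000 bytes, i.e. roughly 25.7 MiB, which matches the "25 MB" ballpark. At ~700 PPH the same calculation gives 189,000,000 bytes. Signatures, indexes, and per-manifest overhead are not counted, so this really is only a lower bound.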
[1]: This is totally different from SneedForo AKA RuForo, which aims to be a XenForo replacement for the continued use of this website. My goal with this project is not to be schismatic.
[2]: https://blog.locut.us/2008/05/11/fms-spam-proof-anonymous-message-boards-on-freenet/
[3]: https://fms.fn.mk16.de/operation.htm
[4]: http://freesocial.draketo.de/fms_en.html
[5]: As a performance matter, it may turn out to be a better idea to break these up, but I haven't done the numbers yet.
[6]: While the main body of the manifest will be JSON, I haven't decided yet whether the signature should be appended onto or downloaded separately from it, or by what format that should be. The signature itself will be made according to BEP 44. I appreciate all pointers here. I'm also not sure whether sequence IDs should be sequential or based on the time.
[7]: Not a full copy of them, just time + OP + subject. Otherwise, if user A posts in a thread made by user B, and user C is receiving posts from A but not B, user C would not know the subject line of that thread.
[8]: Aside from the optimization mentioned earlier, but for any of that to happen you have to have assigned trust to them already.