The Linux Thread - The Autist's OS of Choice

If I ever figure out containers I might be interested in switching my home server from Debian to Azure Linux, if I can make it work for home use. I think Hyper-V Server is free, so that would be something interesting to play with.
You mean you want to run Azure Linux on bare metal, as a container host?
 
Well everything extracted fine. I made sure to carefully select which hidden folders to copy over and which ones to delete so nothing gets messed up.
Now seems like a bad time to promote xz, but IIRC it's about the best lossless compression you can get on Linux and works well with tar. Specifically, xz -9 -e will produce the best results. bzip2 will also produce smaller files than gzip in most cases.
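For anyone who wants to try it, a rough sketch with GNU tar (directory name is just a placeholder; -9 -e is slow and memory-hungry, so the default -6 is often the saner tradeoff):

# Pipe tar through xz at maximum settings
tar -cf - mydir/ | xz -9 -e > mydir.tar.xz

# Or let tar call xz itself; XZ_OPT passes the flags through
XZ_OPT="-9e" tar -cJf mydir.tar.xz mydir/

# Extract again
tar -xJf mydir.tar.xz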
 
You mean you want to run Azure Linux on bare metal, as a container host?
Looking into it more, it's optimized to run on top of Hyper-V, so it wouldn't really work well on bare metal. It's doable, but it would be inefficient for my needs and would need more expensive hardware. So realistically, unless my home setup ends up being really complicated, I'll just stick with Debian. Plus I still haven't worked with containers yet.
 
Now seems like a bad time to promote xz, but IIRC it's about the best lossless compression you can get on Linux and works well with tar. Specifically, xz -9 -e will produce the best results. bzip2 will also produce smaller files than gzip in most cases.
Zstandard is also fairly nice. It uses a newer form of entropy coding that gives it a better bang for your buck on the ratio/time tradeoff, while also being about as good as LZMA on higher compression settings.
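A quick sketch if anyone wants to play with it (filenames made up; tar's --zstd needs a reasonably recent GNU tar, and the default level is 3, with 19 being the slow-but-tight end):

# Compress a single file at a high level
zstd -19 bigfile -o bigfile.zst

# Tarball a directory; ZSTD_CLEVEL sets the level for the zstd that tar spawns
ZSTD_CLEVEL=19 tar --zstd -cf mydir.tar.zst mydir/

# Extract
tar --zstd -xf mydir.tar.zst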
 
Zstandard is also fairly nice. It uses a newer form of entropy coding that gives it a better bang for your buck on the ratio/time tradeoff, while also being about as good as LZMA on higher compression settings.
That's nice info, and I had no idea I even had it installed, but Atril doesn't open .pdf.zstd files automatically like it does with .pdf.xz files. I wonder how to change that.
 
Now seems like a bad time to promote xz, but IIRC it's about the best lossless compression you can get on Linux and works well with tar.
Con Kolivas' lrzip ( https://github.com/ckolivas/lrzip ) does a little better on some edge cases (it's also LZMA-based by default, like xz, and has ZPAQ). Where it really shines is repetition over long distances; I got a double-digit percent improvement over directories of uncompressed console ROMs, for example. It's an updated take on rzip.
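Roughly how I've used it, going from memory, so check the man page (filenames are placeholders; -z switches the backend to ZPAQ, and the lrztar wrapper that ships with it handles directories):

# Long-range matching + the default LZMA backend
lrzip -L9 roms.tar

# Long-range matching + ZPAQ, slower but tighter
lrzip -z roms.tar

# Directory convenience wrapper
lrztar -z roms/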
 
That's nice info, and I had no idea I even had it installed, but Atril doesn't open .pdf.zstd files automatically like it does with .pdf.xz files. I wonder how to change that.
It's becoming one of the main compression formats that a lot of programs use internally. Arch packages, for instance, switched from .tar.xz to .tar.zst some time ago. A somewhat recent version of Blender switched from gzip to Zstandard for internal .blend compression. It was a fairly bold move by its developers to call it "Zstandard", but it actually seems to be living up to that name.

Unrelated, but it has a neat little feature where you can "train" it on a bunch of files, which will save a dictionary. You can then use that to compress large numbers of small files in a more compact way, without having to resort to solid archives with bad random-access capabilities. This feature is probably better for certain types of API usage than manual usage, however. If you do end up with several thousand single-digit-KiB files that you need to compress individually, I guess you're in luck?
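The standalone CLI exposes the same thing, for the curious; something like this, with filenames made up for the example:

# Build a dictionary from a pile of sample files
zstd --train samples/*.xml -o xml.dict

# Compress each small file individually against that dictionary
zstd -D xml.dict record-0001.xml

# Decompression needs the same dictionary on hand
zstd -D xml.dict -d record-0001.xml.zst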
 
Unrelated, but it has a neat little feature where you can "train" it on a bunch of files, which will save a dictionary. You can then use that to compress large numbers of small files in a more compact way, without having to resort to solid archives with bad random-access capabilities. This feature is probably better for certain types of API usage than manual usage, however. If you do end up with several thousand single-digit-KiB files that you need to compress individually, I guess you're in luck?
I think that's mostly for when you have a lot of files with a lot of overlap. Say you have several XML files that all use the same template but have one or two lines that are unique to each file. The dictionary will contain the basic file structure, while the compressed file will contain the details unique to that file.
 
I think that's mostly for when you have a lot of files with a lot of overlap. Say you have several XML files that all use the same template but have one or two lines that are unique to each file. The dictionary will contain the basic file structure, while the compressed file will contain the details unique to that file.
That's underestimating it. It doesn't know or care about structure in any way whatsoever. It will incidentally represent structure if that happens to compress well, but that's as far as it goes. It will work well on data much more diverse than a specific low-entropy XML schema. If you ran it on random XML files in general, it would optimize away the common elements of XML itself, as well as some common tags and attributes, and the raw text included in the XML, which could be English; English is quite compressible. If you ran it on some specific variant of XML, like HTML (HTML isn't exactly XML in its current form, though older XHTML flavors are, but it can be treated as an XML variant for this example), it would specialize more on HTML and compress HTML better at the expense of other XML-like data.

If you ran it on data like you described, it would work exceptionally well. But it's not "mostly for" that kind of task. It's for making small pieces of data compress better without compressing it in blocks, so you can decompress 1KiB JSON block #47,573,245 without having to decompress at least several dozen of them at once, which can hurt performance for certain things.
 
As you can possibly tell from my pfp, I'm usually the one dispensing IT and programming advice in interactions where that comes up. It's nice to have things the other way around from time to time!
One of the things I like about this thread is that you assholes will tell me I'm wrong straight up and bring your receipts. It's not something that happens often in my world, and it's beyond valuable.
 
I think that's mostly for if you have a lot of files with a lot of overlap. Like say if you have several XML files that all use the same template but have one or two lines that are unique to each file. The dictionary will contain the basic file structure while the compressed file will contain the details unique to that file
What you're describing sounds quite similar to minimum description length, which is about evaluating which statistical model works best for given data, and which also uses a compression-based approach.
 
One of the things I like about this thread is that you assholes will tell me I'm wrong straight up and bring your receipts. It's not something that happens often in my world, and it's beyond valuable.
Funny what happens when you can finally take the corporate circle-jerking gloves off and speak one-on-one without the potential repercussions of someone getting their feelings hurt, huh? Nobody ever learns by being babied all day.

Also, remember to patch that xz vulnerability on your personal PCs if you haven't already, frens; most of the affected distros have released patches at this point, now that there's an official CVE for it. God help all of us who need to patch this shit at work this week.
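Quick way to check if you're not sure (the backdoored releases were 5.6.0 and 5.6.1; anything older is fine):

# Show the installed xz/liblzma version
xz --version

# On Debian/Ubuntu, check the packages directly
dpkg -l xz-utils liblzma5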
 
That's nice info, and I had no idea I even had it installed, but Atril doesn't open .pdf.zstd files automatically like it does with .pdf.xz files. I wonder how to change that.

I was curious and felt vaguely solicitous after some of the dialogue here. I use neither Atril nor zstd, but from what I can see, if the MIME type of zstd files is set correctly, zstd support requires very little additional code, supposing zstd uses the same invocation syntax as gzip/bzip2/xz.
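(If you want to verify the MIME side first, xdg-utils has a query command for it; the filename here is made up:)

# What the desktop database thinks a zstd-compressed PDF is
xdg-mime query filetype paper.pdf.zst

# What plain MIME sniffing says
file --mime-type paper.pdf.zst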

Add code here:
https://github.com/mate-desktop/atril/blob/master/libdocument/ev-file-helpers.h#L36 - one line, EV_COMPRESSION_ZSTD
https://github.com/mate-desktop/atril/blob/master/libdocument/ev-document-factory.c#L102 - one if statement pointing at that, with "zs" as the string.
https://github.com/mate-desktop/atril/blob/master/libdocument/ev-file-helpers.c#L441 - "zstd" in array

Surprisingly painless!

Edit:
Funny what happens when you can finally take the corporate circle-jerking gloves off and speak one-on-one without the potential repercussions of someone getting their feelings hurt, huh? Nobody ever learns by being babied all day.

Also, remember to patch that xz vulnerability on your personal PCs if you haven't already, frens; most of the affected distros have released patches at this point, now that there's an official CVE for it. God help all of us who need to patch this shit at work this week.

Incidentally, Debian Stable is unaffected by this (because of how slow-moving it is, it's still on xz 5.4.1).
 
Is anyone concerned about the fact that Microsoft going full ham on Linux invokes their old strategy of Embrace, Extend, Extinguish? Their end goal is to get everyone running Linux containers on a tiny stub of an OS that's running on top of Microsoft Hyper-V.
 
Is anyone concerned about the fact that Microsoft going full ham on Linux invokes their old strategy of Embrace, Extend, Extinguish? Their end goal is to get everyone running Linux containers on a tiny stub of an OS that's running on top of Microsoft Hyper-V.

We already have IBM and what they did with Red Hat, fostering ABI and binary-compatibility drama in the commercial sector: CentOS got fucked over for what amounts to a worse, commercial version of Fedora, and enterprise clones such as AlmaLinux and Rocky Linux are at the mercy of whatever leeway IBM/Red Hat are willing to afford.

Microsoft is just part of it.
 