That page is a lot of fluffery that makes it sound like an awful place to work. Sounds like the kind of place that makes everyone participate in pronoun circles and requires everyone in the group to talk about their innermost feelings during mandatory team building events.
It also doesn't really go into any sort of explanation of what the fuck observability is. I asked my husband, as I do with all tech questions, and he gave this explanation as an example.
If you have a collision domain* and the traffic all goes through a hub, you have observability over the traffic.
If the traffic goes through a switch*, then unless your switch has tools that let you see the traffic, you do not have observability.
In the context of a company, it depends. It could be network-traffic snooping tools, or it could be software that goes on computers to see if Bob in accounting is surfing porn.
* collision domain - a group of networked devices that share the same medium and can talk directly to each other without going through a router. Your home wifi network is likely a collision domain. That's why things like Microsoft File Share can work between two different computers.
* the difference between a hub and a switch. Both of them let an Ethernet network with multiple devices actually work. A hub is insecure because it takes all the traffic that comes into it and sends it to every other device plugged into it. If a malicious person gets control of the dumb secretary's computer, they can set up a program to read all the network traffic, like accounts payable, and steal money.
A switch is smarter: it takes in all the traffic and sends it only to the device the traffic is for. Think of it like the telephone system: I call you, and the phone network sends our voices only to each other. A malicious actor would need to compromise at least one computer in accounts payable in order to sniff the traffic and steal the money.
Hubs are hardly ever used these days. Switches are just better in almost every way.
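A toy sketch of that forwarding difference, purely illustrative (Python; the Device/Hub/Switch classes here are made up to show the logic, not real networking code):

```python
class Device:
    def __init__(self, name, mac):
        self.name, self.mac = name, mac

    def receive(self, frame):
        # On a hub, a device (or a sniffer) sees frames that were
        # never addressed to it.
        print(f"{self.name} sees: {frame}")

class Hub:
    def __init__(self, devices):
        self.devices = devices

    def send(self, frame, src):
        # A hub repeats every frame to every other port.
        for d in self.devices:
            if d is not src:
                d.receive(frame)

class Switch:
    def __init__(self, devices):
        self.devices = devices
        self.mac_table = {}  # learned mapping: MAC address -> device

    def send(self, frame, src):
        # A switch learns which port each MAC lives on and forwards
        # the frame only to its intended recipient.
        self.mac_table[frame["src"]] = src
        dst = self.mac_table.get(frame["dst"])
        if dst is not None:
            dst.receive(frame)
        else:
            # Unknown destination: flood like a hub, just this once.
            for d in self.devices:
                if d is not src:
                    d.receive(frame)

a, b, spy = Device("a", "aa"), Device("b", "bb"), Device("spy", "cc")
hub = Hub([a, b, spy])
hub.send({"src": "aa", "dst": "bb", "data": "payroll"}, a)  # spy sees it too
```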
Your husband is on the right track but still wrong. It is not about network tracing or tracking what apps are installed on computers.
It is observability into how your applications work and behave in a distributed environment.
By observability they mean logging, monitoring, and alerting for their applications.
So say you have a few hundred machines in the cloud that act as some kind of service, and they all talk to each other.
To monitor and manage this, you have a library that you link with your application, and then you add code to the application that says:
"every time you make a call to a different server, log what kind of service it was, how long it took to get a response, and whether there was an error."
And you add a LOT of these data collection points, to all the applications you have, and you collect and store all these samples in a database.
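A minimal sketch of what one such data collection point could look like (the `record_sample` helper and the in-memory `SAMPLES` list are made up for illustration; a real system would ship the samples to a time-series database):

```python
import time

# Stand-in for the metrics database; real systems ship samples
# to a time-series store instead of keeping them in memory.
SAMPLES = []

def record_sample(service, latency_ms, error):
    SAMPLES.append({
        "service": service,        # which backend we called
        "latency_ms": latency_ms,  # how long the call took
        "error": error,            # did the call fail?
        "timestamp": time.time(),
    })

def instrumented_call(service, call):
    # Wrap any outbound call: time it, note errors, record a sample.
    start = time.monotonic()
    error = False
    try:
        return call()
    except Exception:
        error = True
        raise
    finally:
        latency_ms = (time.monotonic() - start) * 1000
        record_sample(service, latency_ms, error)

# Hypothetical usage (service name and client are invented):
#   instrumented_call("auth-service", lambda: auth_client.check(token))
```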
Then, using this data, you can build dashboards like this:
[example dashboard image]
where you can create graphs that display things like "error rate in errors per million requests" or "average response time when service X calls out to service Y".
When you use the frameworks that are part of any common cloud service/provider, this is mostly added to your application automatically.
This is very useful because if your user-facing service starts to fail, these graphs might be able to tell you "the failures started at about the same time as backend service Y started showing slow response times", and you get a good clue about where to start troubleshooting the root cause.
Basically, it makes it easier to troubleshoot, or even to figure out where to start troubleshooting, in a complex distributed environment.
In addition to data collection and graphs like this, you can also set up alerts. For example: "if backend service Y becomes so slow that average response time is higher than some threshold, then page the engineer on duty to have a look and resolve the issue BEFORE it becomes an end-user-visible failure."
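A sketch of such an alert rule, again using the made-up sample format from above (the threshold and the `page_oncall` stand-in are invented for illustration):

```python
import time

PAGE_THRESHOLD_MS = 500  # invented threshold, purely for illustration

def page_oncall(message):
    # Stand-in for whatever paging system you actually use.
    print("PAGE:", message)

def check_latency_alert(samples, service, window_s=300):
    # Average the last five minutes of samples for this service and
    # page the engineer on duty if latency crosses the threshold,
    # ideally before end users notice anything.
    now = time.time()
    recent = [s for s in samples
              if s["service"] == service and now - s["timestamp"] < window_s]
    if not recent:
        return
    avg = sum(s["latency_ms"] for s in recent) / len(recent)
    if avg > PAGE_THRESHOLD_MS:
        page_oncall(f"{service}: avg latency {avg:.0f}ms > {PAGE_THRESHOLD_MS}ms")
```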
Now, this is not exactly rocket science. At Google they have many different such systems; the two most common ones are called Borgmon and Monarch. If you sign up for Google Cloud you get access to very similar systems. Same if you sign up for Azure.
At Twitter they have their own, very similar systems too (though a lot more primitive compared to Google's/Microsoft's/Amazon's/Facebook's systems).
If you build your application using the standard frameworks for these cloud services then all the data collection points are built into the application automatically and you just have to describe what graphs you want and what to alert on.
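In practice, "describing" the graphs and alerts is often just declarative config. A made-up Python rendition of the idea (real systems use their own config languages; every field name here is invented):

```python
# Invented, declarative-style config to show the shape of the idea.
DASHBOARD = {
    "graphs": [
        {"title": "error rate (errors per million requests)",
         "metric": "errors", "per": 1_000_000},
        {"title": "avg response time, service X -> service Y",
         "metric": "latency_ms", "aggregate": "avg"},
    ],
}

ALERTS = [
    {"service": "backend-y",   # hypothetical service name
     "metric": "latency_ms",
     "aggregate": "avg",
     "window_s": 300,
     "threshold": 500,         # same invented threshold as above
     "action": "page_oncall"},
]
```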
If you are an SRE at any of the big tech companies, 90% of your workday will be spent working on these exact things.
What I am trying to say is that this is super standard in every single distributed environment on the planet. And for the most advanced ones this is almost fully automatically added to your code.
From what I can tell, Honeycomb's product offers a similar but less sophisticated set of logging/monitoring/alerting subsystems for people who want to build this into their applications.
Which is a good thing. But, how big is the actual market?
I don't understand the market for this product though. Is it aimed at companies that want to run their apps in their own datacentres or on rented machines instead of a cloud? Why? If you need hundreds of machines or more for your service it is much easier and cheaper to just use Amazon/Google/Azure/... and then you have this all built in automatically as part of the framework that they provide for your applications.
It is an offering for people who want to build their own cloud service on their own machines. It makes little sense to me, but hey, they apparently have customers.