Disaster Cloudflare has admitted that one of its engineers "stepped beyond the bounds of its policies" and throttled traffic to a customer's website. - Website and API became unresponsive due to extensive throttling

UPDATED Cloudflare has admitted that one of its engineers stepped beyond the bounds of its policies and throttled traffic to a customer's website.

The internet-grooming outfit has 'fessed up to the incident and explained it started on February 2 when a network engineer "received an alert for a congesting interface" between an Equinix datacenter and a Cloudflare facility.

Cloudflare's post about the matter states that such alerts aren't unusual – but this one was due to a sudden and extreme spike of traffic and had occurred twice in successive days.

"The engineer in charge identified the customer's domain … as being responsible for this sudden spike of traffic between Cloudflare and their origin network, a storage provider," the post states. "Traffic from this customer went suddenly from an average of 1,500 requests per second, and a 0.5MB payload per request, to 3,000 requests per second (2x) and more than 12MB payload per request (25x)

As the spike created congestion on a physical interface, it impacted many Cloudflare customers and peers.
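
For a rough sense of scale, the request rates and payload sizes Cloudflare quoted can be converted into bandwidth. This is a back-of-the-envelope sketch and not something taken from Cloudflare's post: it assumes the quoted figures are sustained averages, treats 1MB as 10^6 bytes, uses 12MB as a lower bound for the spike, and the helper name `throughput_gbps` is purely illustrative.

```python
# Figures are the ones quoted from Cloudflare's post.
# Assumption: 1 MB = 10**6 bytes, and the rates are sustained averages.
BYTES_PER_MB = 10**6


def throughput_gbps(requests_per_second: float, payload_mb: float) -> float:
    """Approximate origin-pull bandwidth in gigabits per second."""
    bytes_per_second = requests_per_second * payload_mb * BYTES_PER_MB
    return bytes_per_second * 8 / 10**9


before = throughput_gbps(1_500, 0.5)  # normal traffic
after = throughput_gbps(3_000, 12)    # spike, using 12 MB as a lower bound

print(f"before: {before:.0f} Gbps, after: {after:.0f} Gbps, ratio: {after / before:.0f}x")
# prints "before: 6 Gbps, after: 288 Gbps, ratio: 48x"
```

On those assumptions, the pull from the origin went from roughly 6Gbps to at least 288Gbps. That is a 48x jump (2x the requests times 24x the payload; Cloudflare's rounded 25x figure would make it 50x).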

Cloudflare's automated remedies swung into action, but weren't sufficient to completely fix the problem.

An unidentified engineer "decided to apply a throttling mechanism to prevent the zone from pulling so much traffic from their origin."

A post to Hacker News that Cloudflare's post links to – and which The Register therefore assumes was posted by the throttled customer – states the throttle was applied without warning and caused the customer's site and API to become effectively unavailable due to slow responses leading to timeouts.

Cloudflare has issued a mea culpa for its decision to impose the throttle.

"Let's be very clear on this action: Cloudflare does not have an established process to throttle customers that consume large amounts of bandwidth, and does not intend to have one," wrote Cloudflare senior veep for production engineering Jeremy Hartman and veep for networking engineering Jérôme Fleury.

"This remediation was a mistake, it was not sanctioned, and we deeply regret it."

Cloudflare has promised to change its policies and procedures so this can't happen again – at least not without multiple execs signing off on it.

"To make sure a similar incident does not happen, we are establishing clear rules to mitigate issues like this one. Any action taken against a customer domain, paying or not, will require multiple levels of approval and clear communication to the customer," Hartman and Fleury state. "Our tooling will be improved to reflect this. We have many ways of traffic shaping in situations where a huge spike of traffic affects a link and could have applied a different mitigation in this instance."

The Hacker News post referenced above sparked a 300-plus comment conversation in which few authors have kind things to say about Cloudflare. Nor do various folks in some of the darker reaches of the web, where Cloudflare has often been accused of throttling traffic as a political act, given its track record of declining to serve sites that host hate speech.

Actually throttling a customer without warning will likely fuel theories that Cloudflare, like its Big Tech peers, is an activist organization that does not treat all types of speech fairly.

Hartman and Fleury promised that Cloudflare is re-writing its legalese to better explain what customers can expect. "We will follow up with a blog post dedicated to these changes later," the pair wrote.

The post does not mention what, if anything, happened to the engineer who applied the throttle. ®

Updated to add at 2350 UTC, February 9
Cloudflare contacted The Register with the following statement: "There were no punitive measures taken against anyone involved in this unfortunate incident. We have a blame-free culture at Cloudflare. People make mistakes. It's the responsibility of the organization to make sure that the damage from those mistakes is limited."

 
Cloudflare contacted The Register with the following statement: "There were no punitive measures taken against anyone involved in this unfortunate incident. We have a blame-free culture at Cloudflare. People make mistakes. It's the responsibility of the organization to make sure that the damage from those mistakes is limited."

So a rogue employee took advantage of a system that they don't have set up to attack a customer's web service, which is a "serious issue", then gets a pat on the back from a blame-free culture.


That'll learn him. Surely there aren't any other "rogue employees" weaponizing a horrifically powerful company repercussion-free.

"Let's be very clear on this action: Cloudflare does not have an established process to throttle customers that consume large amounts of bandwidth, and does not intend to have one,"
One employee sure managed to do it pretty quickly, completely on his own

"Unfortunate"
 
First the kiwi farms backtrack and now rogue employees taking action themselves. This is the future they chose.

Updated to add at 2350 UTC, February 9
Cloudflare contacted The Register with the following statement: "There were no punitive measures taken against anyone involved in this unfortunate incident. We have a blame-free culture at Cloudflare. People make mistakes. It's the responsibility of the organization to make sure that the damage from those mistakes is limited."
So nothing changed and it will happen again.
 
Looks like these fucks can literally do whatever they want without any repercussions. We already knew that though.
Whoa dude let's not use such hostile language it's not their fault, they have a blame-free culture what could they possibly do?!?!

Just because that tranny accidentally turned off their service at the same time 10 large scale DDOS attacks were pulled on the site, and accidentally directed a bunch of incredibly illegal material directly to the FBI that was "intercepted from bad website" doesn't mean people don't make honest mistakes ok? Repercussions and rules BAD
 
So nothing changed and it will happen again.
No, no, no, now that the company has officially and publicly come out saying their employees can literally do anything, intentional or not, and it's totally fine, means their very responsible, very intelligent employees will gain MUCHOS RESPECTO and surely be far more careful and considerate

This reads like basic triage on the fly, if the attack was large enough to start affecting other customer's shit then it only makes sense to throttle the offender down. Obviously this site has its issues with Cloudflare, but what this engineer did is just common sense network admin stuff.
Is it common sense when they claim it was an unsanctioned mistake that they're sorry this employee did? Doesn't sound like "common sense admin stuff"
 
This reads like basic triage on the fly, if the attack was large enough to start affecting other customer's shit then it only makes sense to throttle the offender down. Obviously this site has its issues with Cloudflare, but what this engineer did is just common sense network admin stuff.
The "offender" in this case being the website paying for Cloudflare's services and with the expectation that Cloudflare would be able to keep their access stable when they were under DDoS attack.
 
Is it common sense when they claim it was an unsanctioned mistake that they're sorry this employee did? Doesn't sound like "common sense admin stuff"
Weasel words, they're happy the guy didn't let the infection spread but have to do PR over the fact they let a customer eat shit.
The "offender" in this case being the website paying for Cloudflare's services and with the expectation that Cloudflare would be able to keep their access stable when they were under DDoS attack.
Welcome to reality. You think all of those websites promising five nines actually achieve such things? What is going to happen is that CF is going to rewrite the contracts with a statement that they can isolate or degrade you if they feel it's required to protect the entire business and its customers, if it's not already in the contract, which it almost certainly is.
 
This reads like basic triage on the fly, if the attack was large enough to start affecting other customer's shit then it only makes sense to throttle the offender down. Obviously this site has its issues with Cloudflare, but what this engineer did is just common sense network admin stuff.
I was about to say, from the description of the incident this seems like a reasonable course of action in the moment to get things back under control and to reduce load.

In the same situation this would probably be one of my first thoughts too, so I can’t really be too upset.

Is it common sense when they claim it was an unsanctioned mistake that they're sorry this employee did? Doesn't sound like "common sense admin stuff"
It’s called damage control and PR. Make the customer less angry and give the notoriously annoying HN users something to chew on.

Side note, The Register is such a garbage tech “news” site. The fake outrage, anger, and writing style get old quick.
 
they leaked a photo of the employee in question:

21291 - SoyBooru.png
 
I was about to say, from the description of the incident this seems like a reasonable course of action in the moment to get things back under control and to reduce load.

In the same situation this would probably be one of my first thoughts too, so I can’t really be too upset.


It’s called damage control and PR. Make the customer less angry and give the notoriously annoying HN users something to chew on.

Side note, The Register is such a garbage tech “news” site. The fake outrage, anger, and writing style get old quick.
"There's 0 punishment, we don't blame him, anyone can do whatever they want to our customers we don't care"

That makes what customer less angry? I know when I have a complaint about the job somebody did the boss saying "don't worry we don't blame anyone for anything they can do whatever they want" always settles my seethe
 
"There's 0 punishment, we don't blame him, anyone can do whatever they want to our customers we don't care"

That makes what customer less angry? I know when I have a complaint about the job somebody did the boss saying "don't worry we don't blame anyone for anything they can do whatever they want" always settles my seethe
Frankly, what the customer thinks probably doesn't matter. I'm sure the TOS and the like give CF the ability to do things like this, but this quote is clearly trying to calm down and reassure the customer(s):
To make sure a similar incident does not happen, we are establishing clear rules to mitigate issues like this one. Any action taken against a customer domain, paying or not, will require multiple levels of approval and clear communication to the customer

Why would there be punishment? The employee protected CF first and foremost during an active incident. It's not like the customer is going to go elsewhere in 2023, KF only did because Null was given zero choice in the matter.
 