Hello.. how’s your morning going?
I hope it’s been a
little better than mine.
We had a
teensy eensy weensy little billing error last night… my first clue
that something was up came when I saw this morning’s
daily billing report (so far):
$7,500,000.
It turns out that, due to my excessively fat fingers,
nearly every one of our customers has been seriously over-billed in the last 12 hours.
I bet when you read this part of the
last newsletter:
4. New Office!
Another important thing I’ve been doing instead of writing newsletters
is looking out the window of our NEW OFFICE:
http://blog.dreamhost.com/2007/12/21/were-so-high-right-now-you-dont-even-know
If your next web hosting bill from us is mysteriously tripled, now you
know why.
.. you thought it was a
joke!
Ha, the
joke is on you! I guess. Um, okay, no, not really, I’m sorry.
How on earth could something like this happen?
Let Me Explain
A couple of weeks ago, just around New Year’s, we started beefing up some of our internal “controller” servers. These are the machines that run all of our “behind-the-scenes” services; things from adding a user to registering a domain to configuring Apache to
rebilling customers.
I was on a little-bit-too-long vacation, but when I got back, I noticed our daily credit card payments seemed a tad
low in the new year.
So, late last week I tried re-running the billing services for all the days back three weeks or so. I
knew this was safe, because after
10 years, the
one thing you
DO get perfect is your billing system. Our biller is pretty bug-free and robust at this point, because we’d be
broke and eating bugs if it weren’t.
In fact, it’s
so robust you can just run it on
any day you want, and it’s safe. It
won’t double-charge people and it’ll
even automatically find any missing charges and catch everything up to the day you said.
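For the curious, here is roughly what "safe to run for any day" means. This is only a minimal sketch in Python, not our actual biller: the account layout, `periods_through`, and `run_biller` are all made up for illustration, and it assumes simple monthly billing.

```python
from datetime import date

def periods_through(start: date, as_of: date):
    """Yield (year, month) for every month from start through as_of, inclusive."""
    y, m = start.year, start.month
    while (y, m) <= (as_of.year, as_of.month):
        yield (y, m)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)

def run_biller(account: dict, as_of: date) -> list:
    """Charge every period not billed yet; return only the new charges."""
    new_charges = []
    for period in periods_through(account["start"], as_of):
        if period not in account["billed"]:      # already charged? skip it
            account["billed"].add(period)
            new_charges.append((period, account["monthly_fee"]))
    return new_charges

# Hypothetical account that missed its December and January charges.
account = {
    "start": date(2007, 10, 1),
    "billed": {(2007, 10), (2007, 11)},
    "monthly_fee": 9.95,
}
print(run_biller(account, date(2008, 1, 14)))  # catches up two missed months
print(run_biller(account, date(2008, 1, 14)))  # [] (a second run charges nothing)
```

Because each period is recorded once it is charged, running the same day twice is a no-op, and running a later day only picks up what is still missing.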
Anyway, I ran it, and things were
fine.. and sure enough, it caught a lot of missed payments. I didn’t have time to look into it right then, but I made a note to myself to check up on it on
Monday (yesterday) and see if things were fine or still messed up.
Come Monday
Monday came. I checked the reports and sure enough, things were still pretty low. So I looked at the logs for some of the biller services, and I noticed they were only failing on the
machines that had been recently upgraded!
That explained why we were getting
some money still (since not all the controllers have been upgraded yet), but not
all of it.
Anyway, it turned out there was no 64-bit version of the PFProAPI module we use to interface to the credit card transaction server. No big deal: there’s
a new module that interfaces with their new and preferred
HTTPS interface, and it was only a couple of lines of code to change to get us switched over!
So anyway, I made the change, and it worked, and
I even tested it, and things were
fine!
But then… late last night, I realized: when I re-ran those biller services last week,
they must not have fixed everybody
then either! It’s just that by running it
again, a different random set of people, who had been assigned an upgraded (and therefore broken) controller before, got charged on the working ones.
So why not just run it all one more time?
Sure, it should be
no problem! So I did,
manually running the biller (which is normally automatically scheduled) for
2008-01-14, 2008-01-13, 2008-01-12, 2008-01-11, 2008-01-10, 2008-01-09, 2008-01-08, 2008-01-07, 2008-01-06, 2008-01-05, 2008-01-04, 2008-01-03, 2008-01-02, and
2008-01-01.
I probably should have just stopped there. But then I thought
better. I thought to myself,
“When did we start upgrading these controllers anyway?”
I couldn’t remember. But, since the biller is
super-safe and robust anyway, I went ahead and ran it for
2008-12-31, 2008-12-30, 2008-12-29, 2008-12-28, 2008-12-27, 2008-12-26, and
2008-12-25, just for the hell of it.
Notice Anything?
Don’t feel bad if you didn’t. I kind of missed it
myself.
THOSE SHOULD HAVE BEEN 2007!!
Heh, uh.. um, er.. my bad?
So what happened?
Well, that
super-robust and stable biller did what it was
programmed to do: it ran as though
today was December 31st, 2008!
And what did it see? Well, it saw a
whole lot of accounts (essentially
all of them) who for some
unknown, mysterious reason hadn’t been charged
at all for
eleven and a half months!
So off it went, busily through the night,
“fixing” everything up for
“today”, December 31st, 2008.
Really, it’s sort of
amazing this never happened before in the last ten years.
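To put a rough number on it, here is a small Python sketch of the arithmetic. It assumes monthly billing and a hypothetical customer who is paid up through mid-January 2008; `months_unbilled` is an illustrative helper, not anything from our real code.

```python
from datetime import date

def months_unbilled(billed_through: date, as_of: date) -> int:
    """Whole calendar months the biller would think are owed as of `as_of`."""
    gap = (as_of.year - billed_through.year) * 12 + (as_of.month - billed_through.month)
    return max(0, gap)

billed_through = date(2008, 1, 15)   # assumed: customer is fully paid up

intended = date(2007, 12, 31)        # the date I meant to type
typed = date(2008, 12, 31)           # the date I actually typed

print(months_unbilled(billed_through, intended))  # 0
print(months_unbilled(billed_through, typed))     # 11
```

One wrong digit in the year turns "nothing owed" into most of a year of charges for every account at once.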
There IS a bug here.
I can imagine the
half second or so of thought that sprinted through the programmer’s mind when he was adding the option to pass in the day the biller should treat as today:
Hmm.. well, I could see us POSSIBLY wanting to be able to bill for a future date.
Well guess what…
NO! We will
NEVER want to rebill as though today were a day
that hasn’t happened yet! But instead, somebody along the line (Sage? Me? Somebody else?) figured,
“What’s the harm in keeping it flexible?”
About $7,500,000 in harm, that’s what!
The serious part.
The end to this story is that of course,
I’m very very sorry,
we’re very very sorry, and I’m sure
you’re very very sorry this happened. I really am. I understand the sort of problems that an unexpected large charge to your credit card (or worse yet, your debit card) can cause. If the tone of this blog post seemed a little light, I apologize; I don’t mean to offend, and I realize how serious an issue this is. I’ve been up since 3:50am trying to undo the damage, and maybe I’m a little shell-shocked.
A new service is running right now (in parallel on all the controllers) that fixes all those future charges, re-enables your account if it was erroneously suspended, and if your credit card was automatically rebilled, refunds the payment automatically. You
don’t have to contact us or your bank, and you’ll get an
email when your account has been fixed up. It’s going to take
several more hours to complete. There are (or were, after this incident)
a lot of you these days!
If, because of this billing mistake, you somehow incurred some fees from your bank or credit card company, please let us know
after tomorrow (
today we are just replying to
all 10,000+ billing messages with a generic explanation) and we’ll do our best to make it right for you.
And of course, the biller no longer allows dates in the future.
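The fix is about as small as fixes get. Something along these lines would do it; this is a hedged Python sketch, and `validate_run_date` is a hypothetical name rather than what is actually in our biller.

```python
from datetime import date
from typing import Optional

def validate_run_date(run_date: date, today: Optional[date] = None) -> date:
    """Return run_date unchanged, or raise if it has not happened yet."""
    today = today or date.today()
    if run_date > today:
        raise ValueError(
            f"refusing to bill as of {run_date}: that is after today ({today})"
        )
    return run_date

# A past date goes through; a fat-fingered future date is rejected up front.
validate_run_date(date(2007, 12, 31), today=date(2008, 1, 15))
try:
    validate_run_date(date(2008, 12, 31), today=date(2008, 1, 15))
except ValueError as err:
    print(err)
```

The `today` parameter exists only so the check can be exercised against a fixed date; the point is that the biller refuses before touching a single account.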
The moral of this story is that
“flexibility” is rarely desired in programming! The
less a program will accept/the
less a program will do/the
fewer options and preferences it has, the
more usable it is/the
more understandable it is/the
more stable it is.
Tough Love
When designing a program, you’ve got to make some
tough decisions .. and when you really can’t decide if this is something your users will need someday,
err on the side of leaving it out.
Otherwise, your users will someday
err on the side of your face.