Alternate title: I picked the wrong week to quit drinking.
Been having trouble with a vendor software system that stops its database... in the middle of the day... while people are entering patient data. You know, serious information.
At first the blame was a bad sector on a hard drive - sorry, but with a RAID array (I know that's redundant, but the whole idea is redundancy), it's kind of hard to have one bad sector and louse up the whole thing.
Next, it was "blame the memory" - ran offline diagnostics that revealed everything was fine. Also ran the diagnostics against the CPUs, hard drives, controllers.... NOTHING.
What sticks in my back-side is that this software company is wasting time blaming hardware instead of finding the problem. They can sit on the sidelines because they aren't the poor schmuck taking the servers off-line and running this crap in the middle of the night, ending up working a week in a couple of days.
The other issue I have is the "we know it all" factor. It seems they don't listen to people unless they are "experts" in their crappy software or their fringe database management system. I told them it wasn't hardware but they wouldn't listen and I had to put in extra hours (at night) to rule out what is obvious to anyone who has any sort of troubleshooting skills.
"It's good to rule it out" - Bull-loney.
From what I can determine, the code as written doesn't do a good job of error-trapping when executing a commit to the database. They let the database management system handle errors, and that system will shut down if a record is locked.
Good job, guys.
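For what it's worth, here is roughly what I mean by trapping the error on a commit. This is only a sketch, not their code: I don't have their source, so RecordLockedError and commit_with_retry are names I made up, and the real exception would be whatever their database driver raises on a locked record. The idea is just that the application catches the lock, waits, and tries again instead of letting the error fall through to the database.

```python
import time


class RecordLockedError(Exception):
    """Hypothetical stand-in for whatever the DBMS driver raises on a locked record."""


def commit_with_retry(commit, retries=3, delay=0.1):
    """Call commit(); if the record is locked, wait and retry.

    Returns True if the commit went through, False if every attempt
    hit a lock. The error never escapes to the caller.
    """
    for attempt in range(1, retries + 1):
        try:
            commit()
            return True
        except RecordLockedError:
            if attempt == retries:
                # Give up cleanly so the caller can ask the user to try again.
                return False
            # Wait a little longer after each failed attempt.
            time.sleep(delay * attempt)
    return False
```

If it returns False, the application can tell the user the record is busy and to try again in a minute. Nobody gets kicked out and the database stays up.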
Every diagnostic is showing hardware is running like a top. Next they'll blame the phase of the moon.
Is it waxing or waning? I don't know 'cause I haven't slept much this week.