Telling the last story reminded me of another one. This one also happened at Livingston, after I'd moved on to being Webmaster and such but was still helping out the support group.
There was a woman running an ISP out in the middle of Arizona who would call in periodically complaining that her PortMaster IRX router, which handled a T1, had lost its connection. If it was left alone for a while it would always recover the connection eventually. The problem only happened once in a blue moon, so it was almost impossible to troubleshoot. By the time it happened and she got through to support, odds were it had already recovered. This went on for literally several months - sometimes the line would go a month or more without a hiccup. The IRX was set to log as much debug output as possible all the time - which mostly produced wads of logs showing everything A-OK. But once in a while the V.35 serial port connected to the CSU/DSU that handled the T1 would show massive levels of errors, then the connection would drop.
Support tried lots of different things, but there really aren't many settings to change on a dedicated T1 leased line. It is either configured correctly or not. She got her telco, US Worst IIRC, to check out the line on our recommendation and they claimed it tested out fine. She was getting increasingly agitated and was starting to consider dumping the IRX for some other vendor's product. So it was passed to me.
I was convinced, based on my experience with the products, that it was *not* our problem. There was an outside chance of something freaky like a cold solder joint, but that didn't feel right to me. A fault like that should've shown up a lot more frequently. The more I looked at things the more I was convinced it was something outside the box. But what? The telco claimed it was a good line, and the test results backed them up. IIRC the CSU/DSU and cables had already been swapped with a known good set, to no avail.
This was a puzzler.
She was nice, despite being frustrated, and I spent some time on the phone with her just tossing ideas around. And we talked a bit about the area she was in, how she liked it. How it was really hot, but it was a dry heat since it rarely rained. Beautiful views, nice people... There's that brain itch again.
"How often does it rain?"
"Maybe once every month or so on average, if that."
"And how often does this problem happen?"
"Once a month or less."
"Have you noticed any correlation between the rain and the T1 going down?"
"Now that you mention it... it does seem to happen around the same time that we get rain. But not every time it rains."
"Just when it is a heavy rain, right?"
"What you have is a bad weather seal on the line somewhere between the telco's CO and your building. When you get a heavy enough rain the water seeps into the cable and shorts it out. That's what causes the errors on the line before it drops. When the water evaporates again after the sun comes out the line heals automagically and comes back. Demand the telco inspect the entire run of the line."
She was reluctant to do that because the telco said that if they inspected the line and *didn't* find any problems they'd bill her some $BIGNUM for the expense. If they did find the problem then they'd fix it for no charge.
I got my boss to agree that Livingston would pay the bill if they didn't find any problem, so she told them to do it.
Livingston never had to pay a bill. Her line stopped failing.
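For what it's worth, a hunch like the rain one is easy to sanity-check once the outage timestamps and some weather records are side by side. The sketch below is purely illustrative: the dates, rainfall amounts, the 10 mm "heavy rain" threshold, and the one-day matching window are all made-up assumptions, not data from this case.

```python
from datetime import date, timedelta

# Hypothetical sample data, invented for illustration only.
outages = [date(1997, 3, 14), date(1997, 5, 2), date(1997, 7, 21)]
rainfall_mm = {
    date(1997, 3, 14): 22.0,
    date(1997, 4, 9): 3.0,
    date(1997, 5, 1): 18.5,
    date(1997, 6, 11): 2.5,
    date(1997, 7, 21): 30.0,
}

HEAVY_MM = 10.0              # assumed cutoff for "heavy" rain
WINDOW = timedelta(days=1)   # assumed slack between rain and outage


def heavy_rain_days(rain, threshold=HEAVY_MM):
    """Return the sorted dates whose rainfall meets the threshold."""
    return sorted(d for d, mm in rain.items() if mm >= threshold)


def outages_near_rain(outage_days, rain_days, window=WINDOW):
    """Return the outages that fall within `window` of any heavy-rain day."""
    return [o for o in outage_days
            if any(abs(o - r) <= window for r in rain_days)]


heavy = heavy_rain_days(rainfall_mm)
matched = outages_near_rain(outages, heavy)
print(f"{len(matched)} of {len(outages)} outages fell within a day of heavy rain")
```

With this invented data every outage lines up with a heavy-rain day, while the light showers match nothing. That is the same "heavy rain, but not every rain" pattern that came out of the phone call.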