I was working for Livingston Enterprises. I remember I was no longer working support engineering full time, so it must've been sometime in 1997. I still helped out the support team as a kind of senior guru. I'd take on any really confusing issues, anything that had stumped the newer folks. Especially any complex RADIUS stuff - I loved coming up with complex RADIUS rule sets that did things no one thought RADIUS could do. :-)
But this case wasn't a RADIUS case. No, this was something a bit more mundane, yet bizarre enough to have driven the support engineers up the wall and through the ceiling. Seems there was a small ISP that had a ton of PortMaster2 serial network access units. They were wired into a ton of US Robotics Courier modems, handling all the dial-in for this ISP. Remember, this was just as '56K' modems were appearing, so most folks were still on old V.34, or slower, modems. So this setup handled things fine, and it was a fairly typical one for many small ISPs. Except for one thing: this ISP was having problems.
It seemed that every time he walked into the room with the PMs, all of his modems would drop their connections at the same time. He claimed he could watch the modem lights all cycle in unison. We had logs from the PMs, and sure enough all the serial ports were resetting, pretty much simultaneously. He was sure it must be happening regularly, but the logs seemed to indicate it wasn't - just around the time someone would come into the room. The support folks had been all over this, trying every debugging and logging trick they knew, and were no closer to solving the issue. So it got bumped to me. I think some of the folks were hoping that it'd stump me too, since I had an annoying habit of walking in on a call that was driving someone nuts, looking over their shoulder, and pointing at something - "That setting is wrong" - without bothering to get all the details. The annoying part was that I was right. :-) If you work with a system long enough you tend to develop intuitions, and a setting that doesn't mesh with the others on the screen leaps out at you.
Well, anyway, I went over the collected logs and all, and yeah, everything was resetting. But all the configurations looked fine. So it must be something outside of the units. Dirty power? Time to call the guy at the ISP. He recapped the story for me and I started asking my questions. No, all the PMs were on a conditioned, UPSed circuit - as were the modems. They weren't on the same circuit as the lights. See, I was thinking maybe there was some power issue and when the lights were switched, or something, it was causing a reset. But no...
I couldn't get that thought out of my head. What do you do when you first walk into a dark, windowless machine room? You turn on the lights. But the units weren't on the same circuit...
"Describe the room to me."
"Describe the room - how is it laid out? Where is everything?"
"Well, the PortMasters are all on one rack along one wall. All of the modems are on ISP racks on the other wall, which is where the telco drop comes in."
*itch in the back of my brain*
"So the PMs are on one side of the room and the modems on the other side?"
"How do the serial cables get from the PMs to the modems? Along one of the walls?"
"No, we ran those up and over."
"What kind of lights do you have in that room?"
"Are they the industrial kind that hang down from the ceiling?"
"Did you run the cables across the room by draping them across the lights?"
"When you turn on the lights, the ballasts in the fixtures are nailing your serial cables with an RF or magnetic pulse, causing a noise spike, which causes the serial ports to reset. Move the cables off the lights."
Yes. That fixed the problem.
Funny, not many people even know about old fluorescent lights and ballasts - new fixtures don't use them.