Technical fact 4: PSU redundancy problem

19 Mar 2021 10:59:31

Technology has made incredible progress in the last couple of years. For some, the future of technology is overwhelming, while for others it is exciting. We have put together some interesting and surprising facts about tech! You might be surprised and amused by what we've discovered.

Waut Bartels: "Having 2 PSUs in a redundant state, but in the meantime apparently not being redundant?"

 


What was the issue?

Having 2 PSUs in a redundant state that in the meantime turn out not to be redundant… "What the ? Is this really happening?" I can hear you say. Yes indeed, it happened. And the “fun” part is that the PSU goes into a lockdown state. To get it out of lockdown, you need to reseat the PSU, which means physically pulling it out and plugging it back in. You cannot do this remotely.

Technical fact!

Upgrading firmware to the latest stable version is a must. If it is unclear which mix of firmware versions your hardware devices need, contact your vendor support for a helping hand. This is why an IT partner with a solid Managed Operation, a monitoring system and good contacts with vendor support definitely brings added value.
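To make the firmware point a bit more concrete, here is a minimal Python sketch of a firmware compliance check. It assumes you can export an inventory of PSUs with their firmware versions as dotted version strings. The host names, version numbers and the `outdated` helper are all hypothetical and only serve as an illustration. The real minimum version has to come from your vendor.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Psu:
    host: str      # hypothetical server name
    slot: int      # PSU bay number
    firmware: str  # dotted version string, e.g. "1.10.2" (assumed format)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.10.2' into (1, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def outdated(psus: list[Psu], minimum: str) -> list[Psu]:
    """Return every PSU whose firmware is older than the minimum version."""
    floor = parse_version(minimum)
    return [p for p in psus if parse_version(p.firmware) < floor]


# Made-up inventory, for illustration only.
inventory = [
    Psu("srv-01", 1, "1.9.0"),
    Psu("srv-01", 2, "1.10.2"),
    Psu("srv-02", 1, "1.10.2"),
]

for psu in outdated(inventory, minimum="1.10.0"):
    print(f"{psu.host} PSU {psu.slot}: firmware {psu.firmware} is below 1.10.0")
```

Parsing the version into numbers matters here: compared as plain text, "1.9.0" would sort after "1.10.2" and the outdated PSU would go unnoticed.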

 

Waut's approach:

After we received multiple errors on our monitoring system, the real deal started.
 
While following the protocol that we agreed on with our customer for P-1 problems, we start the investigation, the root cause analysis, in parallel.

What happened? Is 1 machine affected, or multiple? Is the system still online? If not, how do we get it stable and online ASAP?
 
When reviewing the logs, it was clear that there was a problem with the PSU. 1 PSU was not online, so it needed to be reseated. After the reseat everything was fine. We had already created a ticket in our vendor's system. They told us that the firmware on the PSU wasn’t the newest and that this was a bug in the firmware.
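The lesson from the logs is that the configured redundancy mode says little about the actual state of each PSU. Below is a minimal Python sketch of a check that looks at the per-PSU state instead. The state names (including `LOCKED` for the lockdown state) and the `is_redundant` helper are assumptions for illustration, not the output of any specific vendor tool.

```python
from enum import Enum


class PsuState(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    LOCKED = "locked"  # hypothetical label for the lockdown state


def is_redundant(states: list, required: int = 1) -> bool:
    """True only if one online PSU may fail and enough online PSUs remain.

    `required` is the number of PSUs the server needs to stay powered
    (assumed to be 1 for a typical 1+1 setup).
    """
    online = sum(1 for state in states if state is PsuState.ONLINE)
    return online - 1 >= required


# Two healthy PSUs: losing one still leaves one online.
print(is_redundant([PsuState.ONLINE, PsuState.ONLINE]))

# One PSU in lockdown: the server runs, but there is no spare left.
print(is_redundant([PsuState.ONLINE, PsuState.LOCKED]))
```

A check like this can run alongside the existing monitoring, so that a server that is still online but running on a single PSU raises an alert of its own.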


We didn’t see any newer firmware on the download page, so we were puzzled. A PSU firmware bug that could turn a redundant setup into a non-redundant one? And we were not aware of it? OK, that is a discussion for the future. Let's make IT happen first!
 
The vendor had already warned us that this bug could show up again when upgrading the firmware. We received a few spare PSUs just in case it would happen. We first updated the impacted server that went down, and indeed, during the firmware update the PSU went into lockdown again. We took the faulty PSU out and installed the new one. This spare PSU already came with the correct firmware level.

We decided not to proceed with upgrading the firmware on the whole stack, but to go for replacements instead.

We got back to our vendor and told them what had happened and what we wanted to do. They agreed with our action plan and shipped all the PSUs to us. In the following weeks, we gradually replaced all the PSUs.

What you should definitely remember:

  • The Root Cause Analysis (RCA) of the problem happened really quickly.
  • We have been able to show that we are a solid IT partner that customers can rely on.
  • We have the necessary partnerships with vendors to communicate and resolve problems fast.

Do you need more information about this topic, or is there another IT problem you can't solve on your own? Please reach out, we will help you!

Contact Us
