Technical fact 2: LCM error while upgrading Skylake

26-feb-2021 13:47:23

Incredible progress has been made by technology in the last couple of years. The future of technology is overwhelming for some, while it can be exciting for others. We have put together some interesting and surprising facts about tech! You might be surprised and amused by what we've discovered.

Bert Cayers: "Did you know that the IPMI can change its MAC address by itself?"

 

What was the issue?

While upgrading BIOS firmware on a Lenovo server running ESXi, LCM got stuck on the Skylake upgrade. In addition, we weren't able to connect to the IPMI of the server anymore. As a result, we had to go to the datacenter itself to troubleshoot further directly on the server. 

Based on our own experience, we will divide the solution below into two parts: when the IPMI is reachable and when it is not. 


Bert's approach:

  • When the IPMI can be reached:
 

First, log in to the IPMI and open a remote session to the host. You will notice that the host has booted into Phoenix mode. Run the command "python /phoenix/reboot_to_host.py" to start the host with its normal ESXi OS.

 
  • When the IPMI can't be reached:
 Go to the datacenter where the host is located and connect a screen and keyboard directly to it. You will notice that the host is in Phoenix mode. Use the same command that is normally used to boot the host back into ESXi: "python /phoenix/reboot_to_host.py". Wait for the host and CVM to boot as they normally would.
 

If you still can't connect to the IPMI, you can send it a warm or cold reset from the host with the command "./ipmitool mc reset warm" or "./ipmitool mc reset cold". Then check again whether you can connect to the IPMI. If you still can't, check the IPMI MAC address with ipmitool on the host. We have noticed that, for some reason, the IPMI can change its MAC address by itself after the reboot. The correct address of the IPMI can be found on the back of the server.
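To make the MAC check less error-prone, here is a minimal sketch of how the comparison could be scripted. It assumes you paste in the text printed by "./ipmitool lan print" on the host and that it contains a "MAC Address : ..." line. The function names and the sample values are made up for illustration and are not part of any Nutanix or Lenovo tooling.

```python
import re


def parse_ipmi_mac(lan_print_output):
    """Return the MAC address found in `ipmitool lan print` text, lowercased, or None."""
    match = re.search(
        r"^MAC Address\s*:\s*([0-9A-Fa-f:]{17})\s*$",
        lan_print_output,
        re.MULTILINE,
    )
    return match.group(1).lower() if match else None


def mac_matches_label(lan_print_output, label_mac):
    """Compare the reported MAC with the one printed on the back of the server."""
    reported = parse_ipmi_mac(lan_print_output)
    # Labels are sometimes written with dashes; normalize to colons.
    normalized_label = label_mac.strip().lower().replace("-", ":")
    return reported is not None and reported == normalized_label


# Hypothetical sample output; the addresses are invented for this example.
SAMPLE_LAN = """\
IP Address Source       : Static Address
IP Address              : 10.0.0.50
MAC Address             : 0C:C4:7A:00:00:01
"""

if __name__ == "__main__":
    print("Reported MAC:", parse_ipmi_mac(SAMPLE_LAN))
    print("Matches label:", mac_matches_label(SAMPLE_LAN, "0C-C4-7A-00-00-01"))
```

If the two addresses differ, you know you are dealing with the MAC change described above rather than a plain network problem.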

It is possible that you still can't log in to the IPMI. This can easily be fixed by looking up the user ID of the admin user and setting its password back to what it was before. Sometimes the password itself gets changed by this bug. After this, you should be able to connect to everything as usual and run LCM to check whether the firmware has been upgraded.
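The user lookup can be sketched in the same way. The example below assumes the text comes from "./ipmitool user list 1" (channel 1 is an assumption) and that the admin account is called ADMIN, which may differ on your hardware. The helper names are hypothetical. The printed command leaves the password off on purpose, in which case ipmitool should prompt for it.

```python
def find_user_id(user_list_output, username):
    """Return the numeric ID of `username` in `ipmitool user list` text, or None."""
    for line in user_list_output.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID followed by the user name.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == username:
            return int(fields[0])
    return None


def password_reset_command(user_id):
    """Build the ipmitool command that sets a new password for the given user ID."""
    return f"./ipmitool user set password {user_id}"


# Hypothetical sample output; the user name ADMIN is an assumption.
SAMPLE_USERS = """\
ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
1                    true    false      false      NO ACCESS
2   ADMIN            true    false      true       ADMINISTRATOR
"""

if __name__ == "__main__":
    user_id = find_user_id(SAMPLE_USERS, "ADMIN")
    if user_id is None:
        print("Admin user not found, check the user name")
    else:
        print("Run on the host:", password_reset_command(user_id))
```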

 

The last bug that might occur is the LCM precheck issue. This can be solved in two steps. First, check which host the problem is still occurring on with the command: allssh "genesis status | grep foundation". Next, connect to the CVM of that host through SSH and stop the Foundation service with the command: "genesis stop foundation".
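On a larger cluster it can help to let a small script pick out the affected CVMs. The sketch below rests on assumptions about the output format: that allssh prints a "===== <CVM IP> =====" header per node, and that genesis shows the service as "foundation: []" when stopped and with process IDs between the brackets when running. Check this against your own output first. The function name and sample addresses are hypothetical.

```python
import re

# Assumed header format printed by allssh before each node's output.
HEADER = re.compile(r"^=+\s*(\S+)\s*=+$")
# Assumed service line format from `genesis status`.
FOUNDATION = re.compile(r"foundation:\s*\[(.*?)\]")


def cvms_running_foundation(allssh_output):
    """Return the CVM addresses whose foundation service lists at least one PID."""
    running = []
    current_cvm = None
    for raw_line in allssh_output.splitlines():
        line = raw_line.strip()
        header = HEADER.match(line)
        if header:
            current_cvm = header.group(1)
            continue
        service = FOUNDATION.search(line)
        # A non-empty PID list means foundation is still running on this CVM.
        if service and service.group(1).strip() and current_cvm is not None:
            running.append(current_cvm)
    return running


# Hypothetical sample output; addresses and PIDs are invented.
SAMPLE_GENESIS = """\
================== 10.0.0.11 =================
  foundation: []
================== 10.0.0.12 =================
  foundation: [12345, 12380]
"""

if __name__ == "__main__":
    for cvm in cvms_running_foundation(SAMPLE_GENESIS):
        print(f"SSH to {cvm} and run: genesis stop foundation")
```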

After all the steps mentioned above, the problem was solved, and everything was running again as it should be.

 

How to prevent it?

According to Nutanix support, upgrading LCM to a version higher than 2.3.2.1 should fix most of these issues. Whether the upgrade fixes all of the problems we ran into is still being investigated. To be continued.

 

Do you need more information about this topic, or is there another IT problem you can't solve on your own? Please reach out and we will help you!
