2024. 2. 19. 11:20 Oracle
A bug in Exadata image upgrades (22.1.17, 23.1.8)
Issue: During an OS image upgrade, the node gets into a boot loop.
Impacted releases: Direct upgrade from 21.2.10 or earlier to 22.1.17 / 22.1.18 / 22.1.19 / 23.1.8 / 23.1.9 / 23.1.10
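The impacted-path rule above can be written as a small check. This is only a sketch: `short_ver`, `ver_le` and `is_impacted_path` are hypothetical helper names, not Oracle tools. It assumes version strings look like 22.1.17.0.0.231109 and that GNU `sort -V` is available.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Exadata). Assumes GNU sort with -V.

# Reduce a full image version (e.g. 22.1.17.0.0.231109) to its first three fields.
short_ver() {
    echo "$1" | cut -d. -f1-3
}

# Succeed if version $1 is lower than or equal to version $2.
ver_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Succeed if upgrading from $1 directly to $2 is on the impacted path:
# source at 21.2.10 or earlier, target one of the listed releases.
is_impacted_path() {
    local src tgt
    src=$(short_ver "$1")
    tgt=$(short_ver "$2")
    ver_le "$src" "21.2.10" || return 1
    case "$tgt" in
        22.1.17|22.1.18|22.1.19|23.1.8|23.1.9|23.1.10) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_impacted_path 21.2.10.0.0.220317 22.1.17.0.0.231109` succeeds, while a target of 22.1.20 does not. On a node, the source version would typically come from `imageinfo -ver` (assumed to be available there).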
Root Cause:
The kernel kernel-ueknano-4.14.35-2047.511.5.5.3.el7uek.x86_64 has a known bug that causes a firmware (FW) update of the Whitney+ card to crash the kernel.
This kernel is included in the Nov-2023 releases (22.1.17, 23.1.8).
Bug 35844212/35848949 - bnxtnvm failed to update/downgrade whitneyplus firmware
The fix for this bug is in an uptrack-update that is also included in the Nov-2023 releases.
When a compute node is upgraded, it comes up with the new kernel during firstboot and the FW updates are applied.
There is no issue if the FW upgrade starts after the node has booted with the new kernel and the uptrack-update is already effective.
Because of a timing issue, if the FW upgrade starts after booting with the new kernel but before the uptrack-update is effective, the kernel crashes and the node reboots.
On the next boot (firstboot), the FW upgrade kicks in again and crashes the kernel again, so the node loops forever.
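To see which side of that race a running node is on, something like the following could be used. It is a rough sketch, not an Oracle procedure. `is_affected_kernel` and `ksplice_effective` are hypothetical names. It assumes that `uname -r` reports the kernel release without the `kernel-ueknano-` package prefix, and that the Ksplice `uptrack-uname` tool is installed. A difference between `uptrack-uname -r` and `uname -r` only shows that some uptrack update is effective, not that the specific fix for Bug 35844212/35848949 is among them.

```shell
#!/bin/bash
# Kernel release taken from the package name quoted above (assumed uname -r form).
AFFECTED_KERNEL="4.14.35-2047.511.5.5.3.el7uek.x86_64"

# Succeed if the given release (default: the running kernel) is the affected one.
is_affected_kernel() {
    [ "${1:-$(uname -r)}" = "$AFFECTED_KERNEL" ]
}

# Heuristic: succeed if the effective kernel differs from the booted kernel,
# which suggests that uptrack updates have been applied.
# Returns 2 if uptrack-uname is not installed.
ksplice_effective() {
    command -v uptrack-uname >/dev/null 2>&1 || return 2
    [ "$(uptrack-uname -r)" != "$(uname -r)" ]
}
```

If `is_affected_kernel` succeeds and `ksplice_effective` does not, a FW update started at that moment would fall into the window described above.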
Workaround:
• Disable FW updates on the impacted upgrade path (touch /opt/oracle.cellos/DISABLE_HARDWARE_FIRMWARE_CHECKS)
• After the upgrade, install the FW updates:
o /opt/oracle.cellos/CheckHWnFWProfile -action updatefw -mode exact
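The two steps above can be wrapped in a small script, sketched here. The path and both commands are the ones quoted above. The `CELLOS_DIR` and `DRY_RUN` variables and the function names are illustrative additions, not Oracle settings. The source does not say whether the marker file has to be removed before the firmware update is run, so this sketch leaves it in place.

```shell
#!/bin/bash
# Illustrative variables (assumptions): base directory and a dry-run switch.
CELLOS_DIR="${CELLOS_DIR:-/opt/oracle.cellos}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1, before the upgrade: create the marker file that disables FW checks.
disable_fw_checks() {
    run touch "$CELLOS_DIR/DISABLE_HARDWARE_FIRMWARE_CHECKS"
}

# Step 2, after the upgrade has completed: apply the firmware updates.
update_fw_after_upgrade() {
    run "$CELLOS_DIR/CheckHWnFWProfile" -action updatefw -mode exact
}
```

Call `disable_fw_checks` on each compute node before starting the upgrade and `update_fw_after_upgrade` once the node is up on the new image. Leaving `DRY_RUN` at 1 only prints what would be executed.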
Next steps:
A kernel with the fix is included in the Feb-2024 releases (22.1.20, 23.1.11).