Several days ago we successfully performed DRF SFTP Storage (drf.origin.gcdn.co) expansion. We added four new same-type disk enclosures and new disk drives. The problem started 2 days later with kernel panic of both cluster storage nodes simultaneously. Servers reboot didn't help. So we requested datacenter support to perform servers and new disk enclosures power drain. This also didn't help. After several tries, we booted servers into servicel OS and found version difference of the firmware for the new and old disk enclosures expanders (controllers). This seems caused some kind of SAS bus failure. We upgraded the firmwares and after disk enclosures power drain the system booted normally, service became available. Upon troubleshooting we also got memory module failure at one storage node and requested server repair. As a result we've added a firmware version check to storage expansion manual to prevent such outages in future.
We will need to perform additional maintenance in the near future. Please follow our Status page for details.