
All paths lost on HBA port

HP is a great company. I like the hardware design of HP ProLiant servers; it makes datacenter maintenance and operation pretty easy. Do you like it? Today I'll describe a storage issue on HP ProLiant BL460 and BL480 blades. The issue happens on QLogic HBAs with the VC-FC module. I have two dual-port QLogic HBAs on each ESXi 5.x host, and one port of each HBA is zoned together on the SAN switch.

For example, vmhba1 and vmhba3 are zoned for LUN allocation, and each LUN has two paths on each HBA port.

Sometimes I observed all LUNs disappearing from a random HBA port. It does not happen very often, but it can leave ALL VMs DEAD if you get a storage outage while the LUNs are missing! The problem becomes more frequent as your virtual infrastructure grows.

These are the symptoms when the issue happens:

If you log in to the SSH console, you can check the HBA status with:

less /proc/scsi/qla2xxx/[Device ID]

You will find the following differences between the two HBA ports:

See? All targets show Offline status on the problem HBA.
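To compare the ports without paging through each file by hand, a rough sketch like the one below could count the lines that mention Offline in every adapter entry. It assumes only what is described above: that each file under /proc/scsi/qla2xxx belongs to one HBA port and that a lost target shows the word Offline. The function names are my own and not part of any QLogic or VMware tool.

```shell
# Hypothetical helpers, not vendor tools. Assumption: a target that lost
# its paths appears on a line containing the word "Offline".

# Print how many lines in one adapter's proc file mention "Offline".
count_offline_targets() {
    # grep -c prints the match count; it exits non-zero when the count
    # is 0, so that exit status is deliberately ignored.
    grep -c "Offline" "$1" || true
}

# Print one summary line per adapter entry in the given directory.
report_offline_hbas() {
    proc_dir="${1:-/proc/scsi/qla2xxx}"
    for f in "$proc_dir"/*; do
        # Skip an unmatched glob and any subdirectories.
        [ -r "$f" ] || continue
        [ -d "$f" ] && continue
        n=$(count_offline_targets "$f")
        echo "$(basename "$f"): $n offline"
    done
}
```

Calling report_offline_hbas with no argument reads the default directory. A port with a non-zero count is the one to look at more closely with less.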


You have two options to fix it:

  1. Reseat the blade. This requires downtime and someone on site.

  2. Reset the HBA with the following steps:

Record the Device ID (it takes the place of adapter_id in the commands below), and force the HBA to rescan:

echo "scsi-qlascan" > /proc/scsi/qla2xxx/adapter_id

Wait a few seconds, then force a LIP login:

echo "scsi-qlalip" > /proc/scsi/qla2xxx/adapter_id

Wait a few minutes, and the LUNs come back online :) You can refer to KB 1031199 for more detail.
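The two steps can also be wrapped in a small function so the order and the pause are not left to memory. This is only a sketch: the function name, the 10-second default pause, and the optional directory argument are my own assumptions, and the two strings written are exactly the ones shown in the steps above.

```shell
# Hypothetical wrapper around the two manual steps; not a vendor tool.
# Usage: reset_qlogic_hba <adapter_id> [proc_dir] [wait_seconds]
reset_qlogic_hba() {
    adapter_id="$1"
    proc_dir="${2:-/proc/scsi/qla2xxx}"
    wait_seconds="${3:-10}"   # assumed pause; the post only says "a few seconds"
    target="$proc_dir/$adapter_id"

    # Refuse to write anywhere if the adapter entry does not exist.
    if [ ! -e "$target" ]; then
        echo "No such adapter entry: $target" >&2
        return 1
    fi

    # Step 1: force the HBA to rescan.
    echo "scsi-qlascan" > "$target"
    sleep "$wait_seconds"
    # Step 2: force a LIP login.
    echo "scsi-qlalip" > "$target"
}
```

For example, reset_qlogic_hba with the Device ID you recorded as its only argument runs both steps against the default directory. It still takes a few minutes afterwards for the LUNs to come back.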

This is a temporary remediation; the problem will recur. I'll show you some permanent solutions in the next blog post.
