How can InfiniBand be tested

IP-over-InfiniBand (IPoIB) packets do not reach a destination.

What are you seeing?

When the tcpdump command is run on the IB interfaces of both hosts, a unicast packet is sometimes seen leaving one computer but never arriving at the other. Other IPoIB packets sent by the same sender before and after the missing packet do arrive.
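
The comparison described above can be sketched in Python. This is a minimal illustration, not an Intel tool: it assumes both captures were saved as text from tcpdump run with the -v option, which prints the IPv4 identification field as "id N" in each packet's header summary. The function names are hypothetical. IP IDs are only 16 bits wide and can repeat, so treat any reported ID as a starting point for closer inspection rather than as proof of loss.

```python
import re
from typing import Iterable, List

# Matches the IPv4 identification field as printed by `tcpdump -v`,
# for example "... ttl 64, id 54321, offset 0, ...".
IP_ID_RE = re.compile(r"\bid (\d+), offset")


def ip_ids(lines: Iterable[str]) -> List[int]:
    """Return the IP IDs found in tcpdump text output, in capture order."""
    ids = []
    for line in lines:
        match = IP_ID_RE.search(line)
        if match:
            ids.append(int(match.group(1)))
    return ids


def missing_on_receiver(sender_lines: Iterable[str],
                        receiver_lines: Iterable[str]) -> List[int]:
    """Return IP IDs seen in the sender capture but not in the receiver capture."""
    received = set(ip_ids(receiver_lines))
    return [ip_id for ip_id in ip_ids(sender_lines) if ip_id not in received]
```

To use it, filter both captures to the same sender and destination addresses, read each text file into a list of lines, and pass the sender lines first. An empty result means every IP ID in the sender capture also appears in the receiver capture.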

Note
  • Usually between 0 and 5 nodes show the problem.
  • The Network File System (NFS) server may be temporarily unreachable over the InfiniBand (IB) fabric.

Environment

Intel® True Scale 7.4.0.0.21 or similar version
IPoIB connected mode
Hosts use the OpenFabrics Enterprise Distribution (OFED) provided by CentOS.
More than 200 CentOS NFS clients
Most of the client nodes run CentOS with kernel 2.6.32-431.23.3.el6
QLogic 7322 Host Channel Adapters (HCAs)
2x Intel® True Scale Fabric Director Switch 12800
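
When comparing a system against this environment, it can help to confirm which IPoIB mode an interface is using. An IPoIB interface runs in either datagram or connected mode, and on Linux the current mode is exposed in the sysfs file /sys/class/net/<interface>/mode. The sketch below reads that file. The interface name ib0 and the function name are assumptions for illustration; substitute the interface present on your host.

```python
from pathlib import Path


def ipoib_mode(interface: str = "ib0", sysfs_root: str = "/sys/class/net") -> str:
    """Return the IPoIB mode ("datagram" or "connected") reported by sysfs.

    `interface` defaults to ib0, which is an assumption; `sysfs_root` is
    a parameter only so the function can be exercised against a test directory.
    """
    mode_file = Path(sysfs_root) / interface / "mode"
    return mode_file.read_text().strip()


if __name__ == "__main__":
    try:
        print(ipoib_mode())
    except OSError:
        print("No mode file found for ib0; is this an IPoIB interface?")
```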


How can this be fixed?

Swap cables between top-of-rack edge switches and in-rack host channel adapters (HCAs).

If you need further assistance, contact support and be prepared to answer the following questions:

  1. How many nodes are on the fabric in question?
  2. How many of those nodes are affected by the reported problem?
  3. Which specific NFS (Network File System) are you using?
  4. Have there been any recent changes or updates to the fabric that might have triggered this behavior?
  5. Run the iba_capture command on one of the affected clients and on one of the NFS servers, and send us the output. For more information, see "Reporting an Intel® True Scale Fabric Issue on a Linux* Node".

Additional Information:

The information in this article has been used by our customers but has not been tested, fully replicated, or validated by Intel. Individual results may differ. All contributions and use of the content on this website are subject to the website's terms and conditions of use.