InfiniBand on Ubuntu Server 7.10

If for some obscure reason you ever want to use Ubuntu Server 7.10 in a HPC scenario where your compute nodes are interconnected through InfiniBand fabric bear this in mind — Ubuntu’s IB support is broken and it has been in such a state for at least an year!

Besides from not providing the full OFED package, Ubuntu also gives us errors in udev’s configuration, namely in /etc/udev/rules.d/20-names.rules. On a pristine system the Infiniband section reads:

...
# Infiniband devices
KERNEL=="umad[0-9]*",         NAME="infiniband/%k"
KERNEL=="issm[0-9]*,          NAME="infiniband/%k"
KERNEL=="uverbs[0-9]*,        NAME="infiniband/%k"
KERNEL=="ucm[0-9]*,           NAME="infiniband/%k"
KERNEL=="rdma_cm",            NAME="infiniband/%k"
...

Noticed it? The closing double quotes on the lines for issm, uverbs, and ucm are missing. Thus udev ignores those lines as errors and puts theirs device nodes in /dev instead of /dev/infiniband/ where most software expects them to be. For example, OpenMPI’s openib BTL fails to run with such a configuration.

Correct the section so that it reads:

# Infiniband devices
KERNEL=="umad[0-9]*",         NAME="infiniband/%k"
KERNEL=="issm[0-9]*",         NAME="infiniband/%k"
KERNEL=="uverbs[0-9]*",       NAME="infiniband/%k"
KERNEL=="ucm[0-9]*",          NAME="infiniband/%k"
KERNEL==“rdma_cm",            NAME="infiniband/%k"

Now reboot and your device nodes will be placed where they are meant to be.

Коментари