InfiniBand on Ubuntu Server 7.10
If for some obscure reason you ever want to use Ubuntu Server 7.10 in HPC[1] scenarios where your compute nodes are interconnected through an InfiniBand fabric, bear this in mind: Ubuntu’s IB support is broken, and it has been in such a state for at least a year!
Besides not providing the full OFED package, Ubuntu also gives us errors in udev’s configuration, namely in /etc/udev/rules.d/20-names.rules.
On a pristine system the Infiniband section reads:
...
# Infiniband devices
KERNEL=="umad[0-9]*", NAME="infiniband/%k"
KERNEL=="issm[0-9]*, NAME="infiniband/%k"
KERNEL=="uverbs[0-9]*, NAME="infiniband/%k"
KERNEL=="ucm[0-9]*, NAME="infiniband/%k"
KERNEL=="rdma_cm", NAME="infiniband/%k"
...
Noticed it?
The closing double quote of the KERNEL pattern is missing on the lines for issm, uverbs, and ucm. Thus udev ignores those lines as errors and puts their device nodes in /dev instead of /dev/infiniband/, where most software expects them to be. For example, OpenMPI’s openib BTL fails to run with such a configuration.
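This kind of typo is easy to miss by eye, so a crude automated check can help. The sketch below is only an illustration, not a tool shipped with Ubuntu or OFED: it assumes that a rule line with an odd number of double quotes is suspect, and the function name find_unbalanced_quote_lines is made up for this example.

```python
# Minimal sketch: flag udev rule lines whose double quotes do not pair up.
# Assumption: an odd number of '"' characters on a rule line means a quote
# was left unclosed. This is a heuristic, not a real udev rules parser.

def find_unbalanced_quote_lines(rules_text):
    """Return (line_number, line) pairs with an odd count of double quotes."""
    suspects = []
    for number, line in enumerate(rules_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comments; only rule lines are checked.
        if not stripped or stripped.startswith("#"):
            continue
        if stripped.count('"') % 2 != 0:
            suspects.append((number, stripped))
    return suspects


# The broken section as it appears on a pristine system.
SAMPLE = '''# Infiniband devices
KERNEL=="umad[0-9]*", NAME="infiniband/%k"
KERNEL=="issm[0-9]*, NAME="infiniband/%k"
KERNEL=="uverbs[0-9]*, NAME="infiniband/%k"
KERNEL=="ucm[0-9]*, NAME="infiniband/%k"
KERNEL=="rdma_cm", NAME="infiniband/%k"
'''

if __name__ == "__main__":
    for number, line in find_unbalanced_quote_lines(SAMPLE):
        print(f"line {number}: {line}")
```

To check the real file, read /etc/udev/rules.d/20-names.rules and pass its contents to the function in place of SAMPLE.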
Correct the section so that it reads:
# Infiniband devices
KERNEL=="umad[0-9]*", NAME="infiniband/%k"
KERNEL=="issm[0-9]*", NAME="infiniband/%k"
KERNEL=="uverbs[0-9]*", NAME="infiniband/%k"
KERNEL=="ucm[0-9]*", NAME="infiniband/%k"
KERNEL=="rdma_cm", NAME="infiniband/%k"
Now reboot and your device nodes will be placed where they are meant to be.
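After the reboot you may want to confirm that nothing was left behind in the top level of /dev. The following is a rough sketch under the assumption that the node names follow the same glob patterns as the rules above; the helper misplaced_ib_nodes is a hypothetical name used only for this illustration.

```python
# Minimal sketch: list InfiniBand device nodes that sit directly in /dev
# instead of /dev/infiniband/. The patterns are copied from the udev rules;
# the helper name is invented for this example.
import fnmatch
import os

IB_PATTERNS = ["umad[0-9]*", "issm[0-9]*", "uverbs[0-9]*", "ucm[0-9]*", "rdma_cm"]


def misplaced_ib_nodes(top_level_names):
    """Return the sorted names that match any InfiniBand node pattern."""
    return sorted(
        name
        for name in top_level_names
        if any(fnmatch.fnmatchcase(name, pattern) for pattern in IB_PATTERNS)
    )


if __name__ == "__main__":
    if os.path.isdir("/dev"):
        stray = misplaced_ib_nodes(os.listdir("/dev"))
        if stray:
            print("Still in /dev:", ", ".join(stray))
        else:
            print("No InfiniBand nodes found directly in /dev")
```

An empty result only says that no matching nodes sit directly in /dev; listing /dev/infiniband/ shows whether the nodes were actually created there.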
[1] High-Performance Computing