SYSFS FILES For each InfiniBand device, the InfiniBand drivers create the following files under /sys/class/infiniband/: node_type - Node type (CA, switch or router) node_guid - Node GUID sys_image_guid - System image GUID In addition, there is a "ports" subdirectory, with one subdirectory for each port. For example, if mthca0 is a 2-port HCA, there will be two directories: /sys/class/infiniband/mthca0/ports/1 /sys/class/infiniband/mthca0/ports/2 (A switch will only have a single "0" subdirectory for switch port 0; no subdirectory is created for normal switch ports) In each port subdirectory, the following files are created: cap_mask - Port capability mask lid - Port LID lid_mask_count - Port LID mask count rate - Port data rate (active width * active speed) sm_lid - Subnet manager LID for port's subnet sm_sl - Subnet manager SL for port's subnet state - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER) phys_state - Port physical state (Sleep, Polling, LinkUp, etc) There is also a "counters" subdirectory, with files VL15_dropped excessive_buffer_overrun_errors link_downed link_error_recovery local_link_integrity_errors port_rcv_constraint_errors port_rcv_data port_rcv_errors port_rcv_packets port_rcv_remote_physical_errors port_rcv_switch_relay_errors port_xmit_constraint_errors port_xmit_data port_xmit_discards port_xmit_packets symbol_error Each of these files contains the corresponding value from the port's Performance Management PortCounters attribute, as described in section 16.1.3.5 of the InfiniBand Architecture Specification. The "pkeys" and "gids" subdirectories contain one file for each entry in the port's P_Key or GID table respectively. For example, ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key table. There is an optional "hw_counters" subdirectory that may be under either the parent device or the port subdirectories or both. If present, there are a list of counters provided by the hardware. They may match some of the counters in the counters directory, but they often include many other counters. In addition to the various counters, there will be a file named "lifespan" that configures how frequently the core should update the counters when they are being accessed (counters are not updated if they are not being accessed). The lifespan is in milli- seconds and defaults to 10 unless set to something else by the driver. Users may echo a value between 0 - 10000 to the lifespan file to set the length of time between updates in milliseconds. MTHCA The Mellanox HCA driver also creates the files: hw_rev - Hardware revision number fw_ver - Firmware version hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)", or "MT25208" HFI1 The hfi1 driver also creates these additional files: hw_rev - hardware revision board_id - manufacturing board id tempsense - thermal sense information serial - board serial number nfreectxts - number of free user contexts nctxts - number of allowed contexts (PSM2) chip_reset - diagnostic (root only) boardversion - board version sdma/ - one directory per sdma engine (0 - 15) sdma/cpu_list - read-write, list of cpus for user-process to sdma engine assignment. sdma/vl - read-only, vl the sdma engine maps to. The new interface will give the user control on the affinity settings for the hfi1 device. As an example, to set an sdma engine irq affinity and thread affinity of a user processes to use the sdma engine, which is "near" in terms of NUMA configuration, or physical cpu location, the user will do: echo "3" > /proc/irq//smp_affinity_list echo "4-7" > /sys/devices/.../sdma3/cpu_list cat /sys/devices/.../sdma3/vl 0 echo "8" > /proc/irq//smp_affinity_list echo "9-12" > /sys/devices/.../sdma4/cpu_list cat /sys/devices/.../sdma4/vl 1 to make sure that when a process runs on cpus 4,5,6, or 7, and uses vl=0, then sdma engine 3 is selected by the driver, and also the interrupt of the sdma engine 3 is steered to cpu 3. Similarly, when a process runs on cpus 9,10,11, or 12 and sets vl=1, then engine 4 will be selected and the irq of the sdma engine 4 is steered to cpu 8. This assumes that in the above N is the irq number of "sdma3", and M is irq number of "sdma4" in the /proc/interrupts file. ports/1/ CCMgtA/ cc_settings_bin - CCA tables used by PSM2 cc_table_bin cc_prescan - enable prescaning for faster BECN response sc2v/ - 32 files (0 - 31) used to translate sl->vl sl2sc/ - 32 files (0 - 31) used to translate sl->sc vl2mtu/ - 16 (0 - 15) files used to determine MTU for vl