Getting the ibmvfc logs from the impacted clusters

If you are using the IBM Virtual Fibre Channel adapter with your OpenShift on Power installation, you can use these steps to get the log details.

Here are the steps to get the ibmvfc logs from the failing nodes:

Grabbing the ibmvfc logs

ibmvfc is the driver for the virtual fibre channel adapters.

To set up ibmvfc logging:

# export KUBECONFIG=/root/openstack-upi/auth/kubeconfig
# oc get MachineConfigPool -o=jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
master
worker

For each MachineConfigPool listed, create a file named 99-<mcp-name>-vfc.yaml. Applying these MachineConfigs reboots the nodes in the pool.

# cat << EOF > 99-worker-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "worker"
  name: 99-worker-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF

# cat << EOF > 99-master-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "master"
  name: 99-master-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
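If there are more than two pools, the files above can be generated rather than typed by hand. The following is a minimal sketch, assuming that the role label matches the pool name (true for the default master and worker pools; a custom pool may use a different selector). The function name render_vfc_mc is hypothetical and not part of any tool.

```shell
# Hypothetical helper: print the ibmvfc debug MachineConfig for one pool.
# Assumes the role label equals the pool name.
render_vfc_mc() {
  pool="$1"
  cat << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "${pool}"
  name: 99-${pool}-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
}

# Usage (writes one file per pool):
#   for p in master worker; do render_vfc_mc "$p" > "99-${p}-vfc.yaml"; done
```

The kernel arguments are the same three values used in the hand-written files; only the pool name varies.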

Apply the YAML files one at a time:

# oc apply -f 99-worker-vfc.yaml
machineconfig.machineconfiguration.openshift.io/99-worker-vfc created

Wait for the MachineConfigPool, such as worker, to finish updating:

# oc wait mcp/worker --for condition=Updated --timeout=30m

Verify that each MachineConfigPool is done updating.

The following shows the worker pool is still updating:

# oc get mcp worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   False     True       True       2              1                   1                     1                      8d

The following shows the worker pool has finished updating:

# oc get mcp worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   True      False      False      2              2                   2                     0                      8d
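Reading the three status columns by eye gets tedious with several pools. As a rough sketch, the check can be scripted against the table output. The helper name mcp_all_updated is hypothetical, and it assumes the default column order shown above (UPDATED, UPDATING, DEGRADED in columns 3 to 5).

```shell
# Hypothetical helper: read `oc get mcp` table output on stdin and succeed
# only if every pool row shows UPDATED=True, UPDATING=False, DEGRADED=False.
# Assumes the default column order; the header row is skipped.
mcp_all_updated() {
  awk '$1 != "NAME" && NF >= 5 {
         rows++
         if ($3 != "True" || $4 != "False" || $5 != "False") bad++
       }
       END { exit (rows > 0 && bad == 0) ? 0 : 1 }'
}

# Usage:
#   oc get mcp | mcp_all_updated && echo "all pools updated"
```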

Spot check the updates…

a. List the nodes:

# oc get nodes

b. Connect to one of the nodes:

# oc debug node/worker-0

c. Change context to /host:

# chroot /host

d. Verify that the kernel arguments contain the three values we set:

# rpm-ostree kargs
rw $ignition_firstboot  ostree=/ostree/boot.1/rhcos/d7d848ba24dcacb1aba663e9868d4bd131482d9b7fecfa33197f558c53ae5208/0 ignition.platform.id=powervs root=UUID=06207aa5-3386-4044-bcb6-750e509d7cf0 rw rootflags=prjquota boot=UUID=6c67b96e-4e01-4e01-b8e5-ffeb4041bee2 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1="all" psi=0 scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3 rd.multipath=default root=/dev/disk/by-label/dm-mpath-root
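The kernel command line is long, so the three values are easy to miss. A small sketch of an automated check follows; check_vfc_kargs is a hypothetical helper name, and it expects the command line as a single string such as the output of rpm-ostree kargs.

```shell
# Hypothetical helper: check that a kernel command line contains the three
# arguments set by the MachineConfig. Prints each missing argument and
# returns non-zero if any is absent.
check_vfc_kargs() {
  kargs="$1"
  missing=0
  for want in scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3; do
    # Pad with spaces so only whole arguments match.
    case " ${kargs} " in
      *" ${want} "*) ;;
      *) echo "missing: ${want}"; missing=1 ;;
    esac
  done
  return "$missing"
}

# Usage on the node, after `chroot /host`:
#   check_vfc_kargs "$(rpm-ostree kargs)" && echo "kernel arguments are set"
```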

Wait for the error to occur, then collect the console logs and the output of journalctl --dmesg from the node.

You’ll end up with a bunch of messages like:

[    2.333257] ibmvfc 30000004: Partner initialization complete
[    2.333308] ibmvfc 30000004: Sent NPIV login
[    2.333336] ibmvfc: Entering ibmvfc_alloc_mem
[    2.333340] ibmvfc: Entering ibmvfc_alloc_queue
[    2.333343] ibmvfc: Entering ibmvfc_init_event_pool
[    2.333402] ibmvfc: Leaving ibmvfc_alloc_mem
[    2.333439] ibmvfc: Entering ibmvfc_init_crq
[    2.333443] ibmvfc: Entering ibmvfc_alloc_queue
[    2.333446] ibmvfc: Entering ibmvfc_init_event_pool
[    2.333482] ibmvfc: Leaving ibmvfc_init_event_pool
[    2.333743] ibmvfc: Leaving ibmvfc_init_crq
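For a quick look before handing anything over, the driver lines can be pulled out of the kernel log. This is a sketch only: filter_ibmvfc is a hypothetical helper, the pattern is based on the sample messages above, and the node name worker-0 and the output file names in the usage comment are placeholders. Keep the unfiltered output as well, since other kernel messages may matter for diagnosis.

```shell
# Hypothetical helper: keep only ibmvfc driver lines from a kernel log stream.
# Matches both "ibmvfc: ..." and "ibmvfc <adapter-id>: ..." prefixes.
filter_ibmvfc() {
  grep -E 'ibmvfc( [0-9a-f]+)?: '
}

# Usage from a workstation with cluster access:
#   oc debug node/worker-0 -- chroot /host journalctl --dmesg --no-pager > worker-0-dmesg.log
#   filter_ibmvfc < worker-0-dmesg.log > worker-0-ibmvfc.log
```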

Once you’ve captured this level of detail, delete the MachineConfig. The nodes in the pool reboot and the kernel arguments are reset.

You can then share the logs with support.
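The cleanup can be sketched as follows, assuming the MachineConfigs were named 99-<mcp-name>-vfc as above. The run and remove_vfc_mc helpers are hypothetical. Setting DRY_RUN prints the commands instead of executing them, which is a cheap way to review what will happen before the nodes reboot.

```shell
# run: execute a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# remove_vfc_mc: for each pool given, delete its 99-<pool>-vfc MachineConfig,
# then wait for the pool to report the Updated condition.
remove_vfc_mc() {
  for pool in "$@"; do
    run oc delete machineconfig "99-${pool}-vfc"
    run oc wait "mcp/${pool}" --for=condition=Updated --timeout=30m
  done
}

# Usage:
#   DRY_RUN=1; remove_vfc_mc worker master   # print the commands only
#   unset DRY_RUN; remove_vfc_mc worker master
```

The wait may return early if the pool has not yet started rolling out the change, so it is worth confirming with oc get mcp afterwards.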

Please only use this under guidance.

Reference

https://www.ibm.com/docs/en/linux-on-systems?topic=commands-scsi-logging-level
