Getting the ibmvfc logs from the impacted clusters

If you are using the IBM Virtual Fibre Channel adapter with your OpenShift on Power installation, you can use these steps to get the log details.

Here are the steps to get the ibmvfc logs from the failing nodes:

Grabbing the ibmvfc logs

ibmvfc is the driver for the virtual fibre channel adapters.

To set up ibmvfc logging:

  1. Log in as a cluster-admin and list the MachineConfigPools:
# export KUBECONFIG=/root/openstack-upi/auth/kubeconfig
# oc get MachineConfigPool -o=jsonpath='{range.items[*]}{.metadata.name} {"\t"} {.status.nodeInfo.kubeletVersion}{"\n"}{end}'
master
worker
  1. For each of the listed MachineConfigPools, let’s create a file named 99-<mcp-name>-vfc.yaml. Applying these files reboots the nodes in the pool.
# cat << EOF > 99-worker-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "worker"
  name: 99-worker-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
# cat << EOF > 99-master-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "master"
  name: 99-master-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
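
The two files differ only in the pool name, so they can also be generated in a loop. This is a minimal sketch, assuming the pools are named master and worker as in the listing from the first step; write_vfc_mc is a hypothetical helper name, not part of oc.

```shell
# Hypothetical helper: writes 99-<pool>-vfc.yaml for one MachineConfigPool
# into the current directory.
write_vfc_mc() {
  local pool="$1"
  cat << EOF > "99-${pool}-vfc.yaml"
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "${pool}"
  name: 99-${pool}-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
}

# Assumed pool names, taken from the MachineConfigPool listing.
for pool in master worker; do
  write_vfc_mc "${pool}"
done
```

If your cluster has additional pools, add their names to the loop.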
  1. Let’s apply the YAML files, one at a time:
# oc apply -f 99-worker-vfc.yaml
machineconfig.machineconfiguration.openshift.io/99-worker-vfc created
  1. Wait for the MachineConfigPool, such as worker, to finish updating:
# oc wait mcp/worker --for condition=Updated --timeout=30m
  1. Verify each MachineConfigPool is done updating:

The following shows the worker pool is updating:

# oc get mcp worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   False     True       True       2              1                   1                     1                      8d

The following shows the worker pool has finished updating:

# oc get mcp worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   True      False      False      2              2                   2                     0                      8d
  1. Spot check the updates:

a. List the nodes:
# oc get nodes
b. Connect to one of the nodes:
# oc debug node/worker-0
c. Change context to /host:
# chroot /host
d. Verify that the kernel arguments contain the three values we set:

# rpm-ostree kargs
rw $ignition_firstboot  ostree=/ostree/boot.1/rhcos/d7d848ba24dcacb1aba663e9868d4bd131482d9b7fecfa33197f558c53ae5208/0 ignition.platform.id=powervs root=UUID=06207aa5-3386-4044-bcb6-750e509d7cf0 rw rootflags=prjquota boot=UUID=6c67b96e-4e01-4e01-b8e5-ffeb4041bee2 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1="all" psi=0 scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3 rd.multipath=default root=/dev/disk/by-label/dm-mpath-root
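
The kernel argument line is long, so the three values are easy to miss by eye. Here is a small sketch of a check that can be pasted into the node shell; check_vfc_kargs is a hypothetical helper name, and it assumes the arguments are printed on one line separated by spaces, as in the output above.

```shell
# Hypothetical helper: succeeds only if every expected argument appears
# as a whole word in the kernel argument string passed as $1.
check_vfc_kargs() {
  local kargs="$1" arg missing=0
  for arg in scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3; do
    case " ${kargs} " in
      *" ${arg} "*) echo "found:   ${arg}" ;;
      *)            echo "missing: ${arg}"; missing=1 ;;
    esac
  done
  return "${missing}"
}

# On the node, after chroot /host:
#   check_vfc_kargs "$(rpm-ostree kargs)"
```

The function returns a non-zero status if any of the three values is missing, so it can also be used in a script.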
  1. Wait for the error to occur, then get the console logs and the journalctl --dmesg output from the node.

You’ll end up with a bunch of messages like:

[    2.333257] ibmvfc 30000004: Partner initialization complete
[    2.333308] ibmvfc 30000004: Sent NPIV login
[    2.333336] ibmvfc: Entering ibmvfc_alloc_mem
[    2.333340] ibmvfc: Entering ibmvfc_alloc_queue
[    2.333343] ibmvfc: Entering ibmvfc_init_event_pool
[    2.333402] ibmvfc: Leaving ibmvfc_alloc_mem
[    2.333439] ibmvfc: Entering ibmvfc_init_crq
[    2.333443] ibmvfc: Entering ibmvfc_alloc_queue
[    2.333446] ibmvfc: Entering ibmvfc_init_event_pool
[    2.333482] ibmvfc: Leaving ibmvfc_init_event_pool
[    2.333743] ibmvfc: Leaving ibmvfc_init_crq
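
With this level of logging the kernel log gets noisy. For a quick look at only the driver messages, the output can be filtered as in the sketch below. filter_vfc and the file names are hypothetical; keep the full journalctl --dmesg output as well, since the SCSI logging lines do not all contain the string ibmvfc.

```shell
# Hypothetical helper: reads kernel log text on stdin and prints only
# the lines that mention the ibmvfc driver. No match is not an error.
filter_vfc() {
  grep -F 'ibmvfc' || true
}

# On the node, assuming journalctl is available:
#   journalctl --dmesg --no-pager > dmesg-full.log
#   filter_vfc < dmesg-full.log > dmesg-ibmvfc.log
```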

Once we’ve grabbed this level of detail, we can delete the MachineConfig; the nodes in the pool reboot and the kernel arguments are reset.
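
The cleanup can be sketched as follows, assuming the MachineConfig names used earlier (99-worker-vfc and 99-master-vfc). remove_vfc_mc and the OC variable are hypothetical conveniences, not part of oc; setting OC=echo prints the commands instead of running them.

```shell
# Hypothetical helper: removes the debug MachineConfig for one pool and
# waits for that pool to finish rolling back. Set OC=echo for a dry run.
remove_vfc_mc() {
  local pool="$1"
  "${OC:-oc}" delete machineconfig "99-${pool}-vfc"
  "${OC:-oc}" wait "mcp/${pool}" --for condition=Updated --timeout=30m
}

# Example, one pool at a time:
#   remove_vfc_mc worker
```

The pool may still report Updated for a short time right after the delete, so it is worth confirming with oc get mcp that the rollout has actually finished.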

You can then share the logs with support.

Please only use this under guidance.

Reference

https://www.ibm.com/docs/en/linux-on-systems?topic=commands-scsi-logging-level
