If you are using the IBM Virtual Fibre Channel adapter with your OpenShift on Power installation, you can use these steps to get the log details.
Here are the steps to collect the ibmvfc logs from the failing nodes:
Grabbing the ibmvfc logs
ibmvfc is the driver for the virtual fibre channel adapters.
To set up ibmvfc logging:
- Log in as a cluster-admin:
# export KUBECONFIG=/root/openstack-upi/auth/kubeconfig
# oc get MachineConfigPool -o=jsonpath='{range.items[*]}{.metadata.name} {"\t"} {.status.nodeInfo.kubeletVersion}{"\n"}{end}'
master
worker
- For each of the listed MachineConfigPools, create 99-<mcp-name>-vfc.yaml. These systems will reboot.
# cat << EOF > 99-worker-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: "worker"
name: 99-worker-vfc
spec:
kernelArguments:
- 'scsi_mod.scsi_logging_level=4096'
- 'ibmvfc.debug=1'
- 'ibmvfc.log_level=3'
EOF
# cat << EOF > 99-master-vfc.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
labels:
machineconfiguration.openshift.io/role: "master"
name: 99-master-vfc
spec:
kernelArguments:
- 'scsi_mod.scsi_logging_level=4096'
- 'ibmvfc.debug=1'
- 'ibmvfc.log_level=3'
EOF
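Since the two files differ only in the pool name, they can also be generated in one loop; this is just a sketch that produces the same YAMLs as above (the pool names "master" and "worker" come from the oc get MachineConfigPool output):

```shell
# Generate one 99-<mcp-name>-vfc.yaml per MachineConfigPool name.
for mcp in master worker; do
  cat << EOF > "99-${mcp}-vfc.yaml"
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "${mcp}"
  name: 99-${mcp}-vfc
spec:
  kernelArguments:
    - 'scsi_mod.scsi_logging_level=4096'
    - 'ibmvfc.debug=1'
    - 'ibmvfc.log_level=3'
EOF
done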
- Apply the YAMLs, one at a time:
# oc apply -f 99-worker-vfc.yaml
machineconfig.machineconfiguration.openshift.io/99-worker-vfc created
- Wait for the MachineConfigPool to come back up, such as worker:
# oc wait mcp/worker --for condition=Ready --timeout=30m
- Verify each Machine Config Pool is done updating:
The following shows the worker pool is updating:
# oc get mcp worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae False True True 2 1 1 1 8d
The following shows the worker pool is Ready:
# oc get mcp worker
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
worker rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae True False False 2 2 2 0 8d
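The readiness check above can be scripted against the status columns; in this sketch a saved sample line stands in for the output of oc get mcp worker --no-headers, which needs a live cluster:

```shell
# Sample line standing in for: oc get mcp worker --no-headers
line='worker rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae True False False 2 2 2 0 8d'
# Columns 3-5 are UPDATED, UPDATING, DEGRADED; Ready means True/False/False.
status=$(echo "$line" | awk '{ print ($3=="True" && $4=="False" && $5=="False") ? "Ready" : "NotReady" }')
echo "$status"
```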
- Spot check the updates:
a. List the nodes: oc get nodes
b. Connect to one of the nodes: oc debug node/worker-0
c. Change context to the host: chroot /host
d. Verify the kernel arguments contain the three values we set:
# rpm-ostree kargs
rw $ignition_firstboot ostree=/ostree/boot.1/rhcos/d7d848ba24dcacb1aba663e9868d4bd131482d9b7fecfa33197f558c53ae5208/0 ignition.platform.id=powervs root=UUID=06207aa5-3386-4044-bcb6-750e509d7cf0 rw rootflags=prjquota boot=UUID=6c67b96e-4e01-4e01-b8e5-ffeb4041bee2 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1="all" psi=0 scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3 rd.multipath=default root=/dev/disk/by-label/dm-mpath-root
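The spot check in step d can also be scripted. In this sketch the kargs variable holds an abridged copy of the rpm-ostree kargs output shown above; on a node you would capture it with kargs=$(rpm-ostree kargs) inside the debug shell instead:

```shell
# Abridged copy of the `rpm-ostree kargs` output from the node.
kargs='root=UUID=06207aa5-3386-4044-bcb6-750e509d7cf0 scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3'
# Confirm each of the three arguments we set is present.
missing=0
for arg in scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3; do
  case " $kargs " in
    *" $arg "*) echo "found: $arg" ;;
    *) echo "MISSING: $arg"; missing=1 ;;
  esac
done
```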
- Wait for the error to occur, then get the console logs and the journalctl --dmesg output from the node.
You’ll end up with a bunch of messages like:
[ 2.333257] ibmvfc 30000004: Partner initialization complete
[ 2.333308] ibmvfc 30000004: Sent NPIV login
[ 2.333336] ibmvfc: Entering ibmvfc_alloc_mem
[ 2.333340] ibmvfc: Entering ibmvfc_alloc_queue
[ 2.333343] ibmvfc: Entering ibmvfc_init_event_pool
[ 2.333402] ibmvfc: Leaving ibmvfc_alloc_mem
[ 2.333439] ibmvfc: Entering ibmvfc_init_crq
[ 2.333443] ibmvfc: Entering ibmvfc_alloc_queue
[ 2.333446] ibmvfc: Entering ibmvfc_init_event_pool
[ 2.333482] ibmvfc: Leaving ibmvfc_init_event_pool
[ 2.333743] ibmvfc: Leaving ibmvfc_init_crq
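When sharing logs with support, it can help to pull out just the ibmvfc entries. As a sketch, node-dmesg.log below is a hypothetical saved copy of the journalctl --dmesg output from the node:

```shell
# node-dmesg.log stands in for `journalctl --dmesg` output saved off the node.
cat << 'EOF' > node-dmesg.log
[    2.333257] ibmvfc 30000004: Partner initialization complete
[    2.333308] ibmvfc 30000004: Sent NPIV login
[    2.400000] sd 0:0:0:0: [sda] Attached SCSI disk
EOF
# Keep only the ibmvfc driver messages.
grep ibmvfc node-dmesg.log > node-ibmvfc.log
wc -l < node-ibmvfc.log
```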
Once we’ve captured this level of detail, we can delete the MachineConfigs; the nodes will reboot and the kernel arguments will be reset. You can then share the logs with support. Please only use this procedure under guidance from support.
Reference
https://www.ibm.com/docs/en/linux-on-systems?topic=commands-scsi-logging-level