If you suspect a kernel memory leak because slabtop --sort c shows high slab memory usage, you can use the following steps to identify the culprit with the slub_debug=U kernel argument, which enables user tracking of slab allocations and frees. (Thanks to ServerFault.)
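Before changing kernel arguments, it can help to see how much memory the slab allocator holds overall. The following is a minimal sketch, assuming a Linux host that exposes /proc/meminfo; slab_summary is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# the slab totals in kB on a single line.
slab_summary() {
    awk '
        $1 == "Slab:"         { slab = $2 }
        $1 == "SReclaimable:" { recl = $2 }
        $1 == "SUnreclaim:"   { unrecl = $2 }
        END {
            printf "slab_kb=%d reclaimable_kb=%d unreclaimable_kb=%d\n", slab, recl, unrecl
        }
    '
}

# Print the totals for the current host when /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    slab_summary < /proc/meminfo
fi

# To rank the individual caches by size afterwards (needs root):
#   slabtop --once --sort c | head -n 15
```

A large and growing unreclaimable figure is a hint, not proof, that a kernel cache is leaking; the remaining steps narrow it down to a call site.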

$ oc login

Check that a MachineConfig named 99-master-kargs-slub does not already exist.

$ oc get mc 99-master-kargs-slub

Create a MachineConfig file that adds the slub_debug=U kernel argument. Note that it is assigned to the master role.

$ cat << EOF > 99-master-kargs-slub.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-kargs-slub
spec:
  kernelArguments:
  - slub_debug=U
EOF

Apply the kernel arguments MachineConfig to the cluster.

$ oc apply -f 99-master-kargs-slub.yaml
machineconfig.machineconfiguration.openshift.io/99-master-kargs-slub created

Wait until the master nodes are updated.

$ oc wait mcp/master --for condition=updated --timeout=25m
machineconfigpool.machineconfiguration.openshift.io/master condition met

Once the update completes, list the master nodes and confirm that each one is Ready.

$ oc get nodes -l node-role.kubernetes.io/master
NAME                    STATUS   ROLES    AGE   VERSION
lon06-master-0.xip.io   Ready    master   30d   v1.23.5+3afdacb
lon06-master-1.xip.io   Ready    master   30d   v1.23.5+3afdacb
lon06-master-2.xip.io   Ready    master   30d   v1.23.5+3afdacb

Connect to a master node and switch to the root user.

$ ssh core@lon06-master-0.xip.io
$ sudo su -
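Before reading the allocation data, it may be worth confirming that the node actually booted with the new argument. This is a minimal sketch, assuming it is run on the node itself; has_karg is a hypothetical helper name, not an OpenShift or kernel tool.

```shell
# Hypothetical helper: succeed if the kernel command line on stdin
# contains exactly the argument given as $1 (no partial matches).
has_karg() {
    tr ' ' '\n' | grep -xF -- "$1" > /dev/null
}

# Check the running kernel's command line.
if [ -r /proc/cmdline ]; then
    if has_karg slub_debug=U < /proc/cmdline; then
        echo "slub_debug=U is active"
    else
        echo "slub_debug=U not found on the kernel command line"
    fi
fi
```

Splitting the command line into one argument per line lets grep -x match whole arguments only, so a value such as slub_debug=UZ would not count as a match.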

Check the kmalloc-32 allocations by call site, sorted so that the largest come last.

$ cat /sys/kernel/slab/kmalloc-32/alloc_calls | sort -n | tail -n 5
   4334 iomap_page_create+0x80/0x190 age=0/654342/2594020 pid=1-39569 cpus=0-7
   5655 selinux_sk_alloc_security+0x5c/0xd0 age=916/1870136/2594937 pid=0-39217 cpus=0-7
  41908 __kernfs_new_node+0x70/0x2d0 age=406911/2326294/2594938 pid=0-38398 cpus=0-7
9969728 memcg_update_all_list_lrus+0x1bc/0x550 age=2564414/2567167/2594607 pid=1 cpus=0-7
19861376 __list_lru_init+0x2b8/0x480 age=406870/2007921/2594449 pid=1-38406 cpus=0-7

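The first column is the number of currently allocated objects attributed to each call site. The output points to memcg_update_all_list_lrus, together with the related __list_lru_init, holding nearly all of the kmalloc-32 allocations. This issue has been fixed in a patch to the Linux kernel.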
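To make the imbalance easier to read, the counts can be turned into percentages of all tracked allocations in the cache. This is a minimal sketch; top_alloc_sites is a hypothetical helper name, and it assumes the alloc_calls layout shown above (count first, call site second).

```shell
# Hypothetical helper: read alloc_calls text on stdin and print the N
# largest call sites (default 5) with their share of all allocations.
top_alloc_sites() {
    sort -rn | awk -v limit="${1:-5}" '
        { count[NR] = $1; site[NR] = $2; total += $1 }
        END {
            if (total <= 0) exit
            for (i = 1; i <= NR && i <= limit; i++)
                printf "%5.1f%% %10d %s\n", 100 * count[i] / total, count[i], site[i]
        }
    '
}

# On the node, as root:
#   top_alloc_sites 3 < /sys/kernel/slab/kmalloc-32/alloc_calls
```

The percentage is computed over every line in the file, not just the lines printed, so a single dominant call site stands out even when only a few rows are shown.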

References

ServerFault: Debugging kmalloc-64 slab allocations / memory leak (https://serverfault.com/questions/1020241/debugging-kmalloc-64-slab-allocations-memory-leak)
Kmalloc Internals: Exploring Linux Kernel Memory Allocation (http://www.jikos.cz/jikos/Kmalloc_Internals.html)
StackOverflow: https://stackoverflow.com/questions/20079767/what-is-different-functions-malloc-and-kmalloc
How I investigated memory leaks in Go using pprof on a large codebase
Using Go 1.10 new trace features to debug an integration test
Kernel Memory Leak Detector
go-slab – slab allocator in go
Red Hat Customer Support Portal: Interpreting /proc/meminfo and free output for Red Hat Enterprise Linux
Red Hat Customer Support Portal: Determine how much memory is being used on the system
Red Hat Customer Support Portal: Determine how much memory and what kind of objects the kernel is allocating
