Blog

  • Multi-Arch Tuning Operator 1.1.0 Released

    The Red Hat team has released a new version of the Multi-Arch Tuning Operator.

    In Multi-Arch Compute clusters, the Multiarch Tuning Operator influences the scheduling of Pods so that applications run on a supported architecture.

    You can learn more about it at https://catalog.redhat.com/software/containers/multiarch-tuning/multiarch-tuning-operator-bundle/661659e9c5bced223a7f7244

    Addendum

    My colleague, Punith, worked with the Red Hat team to add NodeAffinityScoring and plugin support to the Multi-Arch Tuning Operator and ClusterPodPlacementConfig. This feature allows users to define cluster-wide preferences for specific architectures, influencing how the Kubernetes scheduler places pods. It helps optimize workload distribution based on preferred node architecture.

    spec:
      plugins:
        nodeAffinityScoring:
          enabled: true
          platforms:
          - architecture: ppc64le
            weight: 100
          - architecture: amd64
            weight: 50
  • FIPS support in Go 1.24

    Kudos to the Red Hat team.

    The benefits of native FIPS support in Go 1.24

    The introduction of the FIPS Cryptographic Module in Go 1.24 marks a watershed moment for the language’s security capabilities. This new module provides FIPS 140-3-compliant implementations of cryptographic algorithms, seamlessly integrated into the standard library. What makes this particularly noteworthy is its transparent implementation. Existing Go applications can leverage FIPS-compliant cryptography without requiring code changes.

    Build-time configuration is available through the GOFIPS140 environment variable, which lets developers select a specific version of the Go Cryptographic Module.

    GOFIPS140=latest go build

    Runtime control is available via the fips140 GODEBUG setting, enabling dynamic FIPS mode activation.

    GODEBUG=fips140=on

    Keep these in your toolbox along with GOARCH=ppc64le
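
    To sanity-check the mode at runtime, here is a minimal sketch (assuming Go 1.24+ and its standard crypto/fips140 package) that reports whether FIPS 140-3 mode is active:

    package main

    import (
        "crypto/fips140"
        "fmt"
    )

    func main() {
        // fips140.Enabled reports whether the FIPS 140-3 module is active, e.g.
        // when the binary was built with GOFIPS140=latest or started with
        // GODEBUG=fips140=on.
        fmt.Println("FIPS 140-3 mode enabled:", fips140.Enabled())
    }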

  • Updates to Open Source Container images for Power on IBM Container Registry

    The IBM Linux on Power team pushed new images to their public open source container image repositories in the IBM Container Registry (ICR). This should assure end users that IBM has authentically built these containers in a secure environment.

    The new container images are:

    Image Name: fluentd-kubernetes-daemonset
    Tag Name: v1.14.3-debian-forward-1.0
    Project Licenses: Apache-2.0
    Image Pull Command: podman pull icr.io/ppc64le-oss/fluentd-kubernetes-daemonset:v1.14.3-debian-forward-1.0
    Last Published: March 17, 2025

    Image Name: cloudnative-pg/pgbouncer
    Tag Name: 1.23.0
    Project Licenses: Apache-2.0
    Image Pull Command: podman pull icr.io/ppc64le-oss/cloudnative-pg/pgbouncer:1.23.0
    Last Published: March 17, 2025
  • Red Hat OpenShift Container Platform 4.18 Now Available on IBM Power

    Red Hat® OpenShift® 4.18 has been released and adds improvements and new capabilities to OpenShift Container Platform components. Based on Kubernetes 1.31 and CRI-O 1.31, Red Hat OpenShift 4.18 focuses on core improvements with enhanced network flexibility.

    You can download 4.18.1 from the mirror at https://mirror.openshift.com/pub/openshift-v4/multi/clients/ocp/4.18.1/ppc64le/
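
    As an example, the ppc64le client tarball can be fetched and unpacked like this (the exact file names on the mirror may differ by release, so check the directory listing first):

    curl -LO https://mirror.openshift.com/pub/openshift-v4/multi/clients/ocp/4.18.1/ppc64le/openshift-client-linux.tar.gz
    tar -xzf openshift-client-linux.tar.gz oc kubectl
    ./oc version --client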

  • Nest Accelerator and Urandom… I think

    The NX accelerator has random number generation capabilities.

    What happens if the random-number entropy pool runs out of numbers? If you are reading from the /dev/random device, your application will block, waiting for new numbers to be generated. Alternatively, the /dev/urandom device is non-blocking and will create random numbers on the fly, re-using some of the entropy in the pool. This can lead to numbers that are less random than required for some use cases.

    Well, Power9 and Power10 servers use the nest accelerator to generate the pseudo-random numbers and maintain the pool.

    Each processor chip in a Power9 and Power10 server has an on-chip “nest” accelerator called the NX unit that provides specialized functions for general data compression, gzip compression, encryption, and random number generation. These accelerators are used transparently across the systems software stack to speed up operations related to Live Partition Migration, IPSec, JFS2 Encrypted File Systems, PKCS11 encryption, and random number generation through /dev/random and /dev/urandom.

    Kind of cool, I’ll have to find some more details to verify it and use it.
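
    A quick way to poke at this from a Linux partition (these are the generic kernel interfaces; the hardware RNG driver name reported will vary by platform):

    # Which hardware RNG driver is feeding the kernel, if any
    cat /sys/class/misc/hw_random/rng_available /sys/class/misc/hw_random/rng_current
    # How much entropy the kernel currently has pooled
    cat /proc/sys/kernel/random/entropy_avail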

  • Kernel Stack Trace

    A quick hack to find a kernel stack trace.

    Look in /proc: find /proc -name stack

    You can see the last stack for a process, for example… /proc/479260/stack

    [<0>] hrtimer_nanosleep+0x89/0x120
    [<0>] __x64_sys_nanosleep+0x96/0xd0
    [<0>] do_syscall_64+0x5b/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x66/0xcb
    

    It's superb for figuring out a real-time hang or pattern.
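
    If you want to sweep every process at once (run as root, since unprivileged reads of these files are denied), a quick loop works:

    for s in /proc/[0-9]*/stack; do
        echo "== ${s}"
        cat "${s}" 2>/dev/null
    done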

  • Nice article on name, version, and references from Red Hat

    A reference can contain a domain (quay.io) pointing to the container registry, one or more repositories (also referred to as namespaces) on the registry (fedora), and an image (fedora-bootc) followed by a tag (41) and/or digest (sha256). Note that images can be referenced by tag, digest, or both at the same time.
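
    Putting those pieces together, a fully qualified reference looks like this (the digest value is a placeholder for illustration):

    quay.io/fedora/fedora-bootc:41@sha256:<digest>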

    [Image: container image versioning]

    Reference: How to name, version, and reference container images

  • OpenShift Container Platform and CGroups: Notes

    My notes from OCP/Cgroups debugging and usage.

    What is attaching the BPF program to my cgroup?

    When you create a Pod, the API Server stores the resource and the Kube Scheduler is triggered to assign it to a Node. On the Node, the Kubelet converts the Pod to the OCI specification, enriches the container with host-device-specific resources, and dispatches it to CRI-O. CRI-O, using the default container runtime launcher (runc or crun) and its runc/crun configuration, launches and manages the container with systemd, and attaches an eBPF program that controls device access.

    If you are seeing EPERM errors when accessing a device, you may not have the right access set at the Pod level; in that case you may be able to use a Device Plugin.

    Options for adding Devices

    You have a couple of things to look at:

    1. volumeDevices
    2. io.kubernetes.cri-o.Devices
    3. cri-o config drop-in
    4. crun or runc with DeviceAllow https://github.com/containers/crun https://github.com/containers/crun/blob/017b5fddcb0a29938295d9a28fdc901164c77d74/contrib/seccomp-notify-plugin-rust/src/mknod.rs#L9
    5. A custom device plugin like https://github.com/IBM/power-device-plugin

    Note: this gives R/W access to the full device.
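
    For option 2, here is a minimal sketch of the annotation approach. The pod name and device path are placeholders, and the annotation only takes effect if it is permitted via allowed_annotations in a CRI-O workload drop-in (option 3):

    apiVersion: v1
    kind: Pod
    metadata:
      name: device-annotation-demo
      annotations:
        # Placeholder device path: ask CRI-O to expose this host device to the container.
        io.kubernetes.cri-o.Devices: "/dev/fuse"
    spec:
      containers:
      - name: demo
        image: registry.access.redhat.com/ubi9/ubi-minimal
        command: ["sleep", "infinity"]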

    Requires selinux-relabeling to be disabled

    You may need to stop SELinux from relabeling the files when you run with randomized UIDs. The Cloud Pak documentation describes an excellent way to disable SELinux relabeling: https://www.ibm.com/docs/en/cloud-paks/cp-data/5.0.x?topic=1-disabling-selinux-relabeling

    You can confirm the file details using:

    sh-5.1$ ls -alZ /mnt/example/myfile.log
    -rw-r--r--. 1 xuser wheel system_u:object_r:container_file_t:s0 1053201 Dec 11 19:45 /mnt/example/myfile.log

    Switching Container Runtime Launchers

    You can switch your Container Runtime from runc to crun using:

    cat << EOF | oc apply -f -
    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
      name: container-crun
    spec:
      machineConfigPoolSelector:
        matchLabels:
          pools.operator.machineconfiguration.openshift.io/worker: ''
      containerRuntimeConfig:
        logLevel: debug
        overlaySize: 1G
        defaultRuntime: "crun"
    EOF
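
    The change rolls out through the MachineConfigPool, so you can watch the worker pool update with something like:

    oc get mcp worker -w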
    

    container_use_devices

    Allows containers to use any device volumes mounted into the container; see https://github.com/containers/container-selinux/blob/main/container.te#L39

    $ getsebool -a | grep container_use_devices
    container_use_devices --> off
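
    If you decide you need it, the boolean can be turned on persistently on the node (run as root; this loosens the SELinux policy, so weigh it carefully):

    setsebool -P container_use_devices on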
    

    More details on creating a MachineConfig are at https://docs.openshift.com/container-platform/4.16/networking/multiple_networks/configuring-additional-network.html

    blktrace

    blktrace is a superb tool. You'll just have to put the kernel in debug mode (it relies on debugfs being mounted at /sys/kernel/debug).

    blktrace -d /dev/sdf
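
    For a live, human-readable trace you can pipe it straight into blkparse (mounting debugfs first if it isn't already; the device name here is just an example):

    mount -t debugfs debugfs /sys/kernel/debug 2>/dev/null || true
    blktrace -d /dev/sdf -o - | blkparse -i -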

    We also built a crio config script.

    https://www.redhat.com/en/blog/open-container-initiative-hooks-admission-control-podman

    https://www.redhat.com/en/blog/extending-the-runtime-functionality

  • Regenerating OCP Certificates

    If your OpenShift Container Platform nodes must support FIPS and you previously generated the certificates on a non-FIPS node, you must execute these steps from a FIPS-compliant environment, such as a RHEL server booted in FIPS mode.

    Then you can follow the Red Hat Customer Portal document Regenerating Openshift Cluster Certificates, which shows you:

    1. Regenerate the Leaf Certificates using oc adm ocp-certificates regenerate-leaf
    2. Regenerate the Top-Level Certificates using oc adm ocp-certificates regenerate-top-level

    There is also a really cool command to restart the Kubelet: oc adm restart-kubelet nodes --all --directive=RemoveKubeletKubeconfig

    This document is tried and true, and it's the best one for regenerating your cluster's certificates.

    I’m blogging about this so I can find these key commands and the link when I need it again.

  • vim versus plain vi: One Compelling Reason

    My colleague, Michael Q, introduced me to a vim extension that left me saying… that’s awesome.

    set cuc enables Cursor Column, and when I use it with set number, it's awesome for seeing correct indenting.

    The commands are:

    1. Shift + :
    2. set cuc and enter
    3. Shift + :
    4. set number and enter

    Use set nocuc to disable
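
    To make it stick across sessions, the long-form options can go in ~/.vimrc:

    " ~/.vimrc
    set cursorcolumn   " same as :set cuc
    set number         " show line numbers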

    Good luck…

    Post Script

    • Install vim with dnf install -y vim

    Reference VimTrick: set cuc