Author: Paul

  • Red Hat OpenShift Container Platform 4.18 Now Available on IBM Power

    Red Hat® OpenShift® 4.18 has been released and adds improvements and new capabilities to OpenShift Container Platform components. Based on Kubernetes 1.31 and CRI-O 1.31, Red Hat OpenShift 4.18 focuses on core improvements with enhanced network flexibility.

    You can download 4.18.1 from the mirror at https://mirror.openshift.com/pub/openshift-v4/multi/clients/ocp/4.18.1/ppc64le/

  • Nest Accelerator and Urandom… I think

    The NX accelerator has random number generation capabilities.

    What happens if the random-number entropy pool runs out of numbers? If you are reading from the /dev/random device, your application will block waiting for new numbers to be generated. Alternatively, the urandom device is non-blocking and will create random numbers on the fly, re-using some of the entropy in the pool. This can lead to numbers that are less random than required for some use cases.

    Well, the Power9 and Power10 servers use the nest accelerator to generate the pseudo-random numbers and maintain the pool.

    Each processor chip in a Power9 and Power10 server has an on-chip “nest” accelerator called the NX unit that provides specialized functions for general data compression, gzip compression, encryption, and random number generation. These accelerators are used transparently across the systems software stack to speed up operations related to Live Partition Migration, IPSec, JFS2 Encrypted File Systems, PKCS11 encryption, and random number generation through /dev/random and /dev/urandom.
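
    To see the non-blocking device in action, here is a minimal sketch, assuming a Linux partition with the usual /dev/urandom and /proc/sys/kernel/random/entropy_avail paths. The helper name random_hex is made up for illustration, and nothing here proves the NX unit is involved; it only shows the read path an application would use.

    ```shell
    # Print N random bytes from /dev/urandom as a hex string (non-blocking read).
    # random_hex is a hypothetical helper name, not part of any tool.
    random_hex() {
        head -c "$1" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
        echo
    }

    # Show how much entropy the kernel currently reports, if the file is readable.
    cat /proc/sys/kernel/random/entropy_avail 2>/dev/null || echo "entropy_avail not readable"

    # Read 16 bytes, printed as 32 hex characters.
    random_hex 16
    ```

    Swapping /dev/urandom for /dev/random in the helper is the quickest way to compare the two devices on a given system.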

    Kind of cool, I’ll have to find some more details to verify it and use it.

  • Kernel Stack Trace

    Quick hack to find a kernel stack trace.

    Look in /proc, for example with find /proc -name stack

    You can see the last stack for a process, for example /proc/479260/stack

    [<0>] hrtimer_nanosleep+0x89/0x120
    [<0>] __x64_sys_nanosleep+0x96/0xd0
    [<0>] do_syscall_64+0x5b/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x66/0xcb
    

    It's superb for figuring out a real-time hang and its pattern.
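
    A minimal sketch of how this could be scripted, assuming the per-thread files under /proc/<pid>/task/<tid>/stack are present on your kernel and that you run it as root (the files are usually not readable otherwise). The function name dump_stacks is made up for illustration.

    ```shell
    # Print the kernel stack of every thread of a process.
    # dump_stacks is a hypothetical helper name.
    dump_stacks() {
        pid="$1"
        for f in /proc/"$pid"/task/*/stack; do
            # Skip when the glob did not match or the thread already exited.
            [ -e "$f" ] || continue
            echo "== $f"
            cat "$f" 2>/dev/null || echo "(unreadable, try running as root)"
        done
        return 0
    }

    # Take three samples one second apart; a frame that never changes
    # suggests where the process is stuck.
    # for i in 1 2 3; do dump_stacks 479260; sleep 1; done
    ```

    Sampling a few times rather than once is what makes the hang pattern visible.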

  • Nice article on naming, versioning, and referencing container images from Red Hat

    A reference can contain a domain (quay.io) pointing to the container registry, one or more repositories (also referred to as namespaces) on the registry (fedora), and an image (fedora-bootc) followed by a tag (41) and/or digest (sha256). Note that images can be referenced by tag, digest, or both at the same time.
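
    To make the parts concrete, here is a rough sketch that splits a reference into those pieces with plain shell parameter expansion. It assumes a fully qualified reference that starts with a domain; short names such as fedora:41 are not handled. The function name parse_image_ref is made up for illustration and is not part of podman or skopeo.

    ```shell
    # Split a fully qualified image reference into its parts.
    # parse_image_ref is a hypothetical helper; it assumes the reference
    # starts with a registry domain (for example quay.io/...).
    parse_image_ref() {
        ref="$1"
        digest=""
        tag=""
        # Everything after '@' is the digest.
        case "$ref" in
            *@*) digest="${ref##*@}"; ref="${ref%@*}" ;;
        esac
        # A ':' in the last path component separates the image from its tag.
        last="${ref##*/}"
        case "$last" in
            *:*) tag="${last##*:}"; ref="${ref%:*}" ;;
        esac
        domain="${ref%%/*}"
        image="${ref##*/}"
        path="${ref#*/}"
        repository="${path%/*}"
        # No '/' left in the path means there is no repository component.
        [ "$repository" = "$path" ] && repository=""
        printf 'domain=%s repository=%s image=%s tag=%s digest=%s\n' \
            "$domain" "$repository" "$image" "$tag" "$digest"
    }

    parse_image_ref quay.io/fedora/fedora-bootc:41
    ```

    The tag is only looked for in the last path component, so a registry port such as localhost:5000 is not mistaken for a tag.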

    container image versioning

    Reference: How to name, version, and reference container images

  • OpenShift Container Platform and CGroups: Notes

    My notes from OCP/Cgroups debugging and usage.

    What is attaching the BPF program to my cgroup?

    When you create a Pod, the API Server reconciles the resource, and the Kube Scheduler is triggered to assign it to a Node. On the Node, the Kubelet converts the Pod to the OCI specification, enriches the container with host-device specific resources, and dispatches it to cri-o. cri-o uses the default container runtime launcher (runc or crun) and its configuration to launch and manage the container with systemd, and attaches an eBPF program that controls device access.

    If you are seeing EPERM issues accessing a device, you may not have the right access set at the Pod level. You may be able to use a Device Plugin instead.
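
    To look at what is attached, one approach is to map a process to its cgroup directory and then ask bpftool. This is a sketch under assumptions: the node uses cgroup v2 mounted at /sys/fs/cgroup, bpftool is installed, and you are root. The helper name cgroup_v2_dir is made up for illustration.

    ```shell
    # Convert a cgroup v2 line from /proc/<pid>/cgroup ("0::/some/path")
    # into the matching directory under /sys/fs/cgroup.
    # cgroup_v2_dir is a hypothetical helper name.
    cgroup_v2_dir() {
        line="$1"
        echo "/sys/fs/cgroup${line#0::}"
    }

    # Example with the current shell's own cgroup (cgroup v2 hosts only).
    line="$(grep '^0::' /proc/self/cgroup 2>/dev/null || true)"
    if [ -n "$line" ]; then
        dir="$(cgroup_v2_dir "$line")"
        echo "cgroup directory: $dir"
        # bpftool needs root; it lists the eBPF programs attached to the cgroup.
        if command -v bpftool >/dev/null 2>&1; then
            bpftool cgroup show "$dir" || true
        fi
    fi
    ```

    For a container, replace /proc/self/cgroup with /proc/<pid>/cgroup of a process inside it; a program with a device attach type in the output is the one gating device access.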

    Options for adding Devices

    You have a couple of things to look at:

    1. volumeDevices
    2. io.kubernetes.cri-o.Devices
    3. cri-o config drop-in
    4. crun or runc with DeviceAllow https://github.com/containers/crun https://github.com/containers/crun/blob/017b5fddcb0a29938295d9a28fdc901164c77d74/contrib/seccomp-notify-plugin-rust/src/mknod.rs#L9
    5. A custom device plugin like https://github.com/IBM/power-device-plugin

    Note that it gives R/W access to the full device.

    Requires selinux-relabeling to be disabled

    You may need to stop SELinux from relabeling the files when you run with randomized IDs. The Cloud Pak for Data documentation describes an excellent way to disable SELinux relabeling: https://www.ibm.com/docs/en/cloud-paks/cp-data/5.0.x?topic=1-disabling-selinux-relabeling

    You can confirm the file details using:

    sh-5.1$ ls -alZ /mnt/example/myfile.log
    -rw-r--r--. 1 xuser wheel system_u:object_r:container_file_t:s0 1053201 Dec 11 19:45 /mnt/example/myfile.log

    Switching Container Runtime Launchers

    You can switch your Container Runtime from runc to crun using:

    cat << EOF | oc apply -f -
    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
     name: container-crun
    spec:
     machineConfigPoolSelector:
       matchLabels:
         pools.operator.machineconfiguration.openshift.io/worker: '' 
     containerRuntimeConfig:
       logLevel: debug 
       overlaySize: 1G 
       defaultRuntime: "crun"
    EOF
    

    container_use_devices

    This SELinux boolean allows containers to use any device volume mounted into the container; see https://github.com/containers/container-selinux/blob/main/container.te#L39

    $ getsebool -a | grep container_use_devices
    container_use_devices --> off
    

    More details on creating a MachineConfig are at https://docs.openshift.com/container-platform/4.16/networking/multiple_networks/configuring-additional-network.html

    blktrace

    blktrace is a superb tool. You’ll just have to put the kernel in debug mode.

    blktrace -d /dev/sdf

    We also built a crio config script.

    https://www.redhat.com/en/blog/open-container-initiative-hooks-admission-control-podman

    https://www.redhat.com/en/blog/extending-the-runtime-functionality

  • Regenerating OCP Certificates

    If your OpenShift Container Platform nodes must support FIPS and you previously generated the certificates on a non-FIPS node, you must execute these steps from a FIPS-compliant environment, such as a RHEL server booted in FIPS mode.

    Then you can follow the Red Hat Customer Portal document Regenerating Openshift Cluster Certificates, which shows you:

    1. Regenerate the Leaf Certificates using oc adm ocp-certificates regenerate-leaf
    2. Regenerate the Top-Level Certificates using oc adm ocp-certificates regenerate-top-level

    There is also a really cool command to restart the Kubelet: oc adm restart-kubelet nodes --all --directive=RemoveKubeletKubeconfig

    This document is tried and true, and the best one to regenerate your certificates for your cluster.

    I’m blogging about this so I can find these key commands and the link when I need it again.

  • vim versus plain vi: One Compelling Reason

    My colleague, Michael Q, introduced me to a vim option that left me saying… that’s awesome.

    It’s set cuc, which enables Cursor Column and highlights the column under the cursor. When I use it with set number, it’s awesome to see correct indenting.

    The commands are:

    1. Shift + :
    2. set cuc and enter
    3. Shift + :
    4. set number and enter

    Use set nocuc to disable
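
    To keep the settings across sessions, they can go into a vimrc file. Here is a small sketch, assuming the per-user config lives at ~/.vimrc; the helper name add_vim_option is made up for illustration. It uses the long option name cursorcolumn, which set cuc abbreviates.

    ```shell
    # Append an option line to a vimrc file unless it is already there.
    # add_vim_option is a hypothetical helper name.
    add_vim_option() {
        vimrc="$1"
        option="$2"
        touch "$vimrc"
        # -x matches the whole line, -F treats the option as a fixed string.
        grep -qxF "$option" "$vimrc" || echo "$option" >> "$vimrc"
    }

    # Usage (left commented so running this block does not touch your config):
    # add_vim_option "$HOME/.vimrc" "set cursorcolumn"
    # add_vim_option "$HOME/.vimrc" "set number"
    ```

    Running it twice is harmless because an existing line is not appended again.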

    Good luck…

    Post Script

    • Install vim with dnf install -y vim

    Reference VimTrick: set cuc

  • Cool Plugin… kube-health

    kube-health has a new release, v0.3.0. I’ve been following this tool for a while.

    Here’s why:

    1. It allows you to poll a single resource and see if it’s OK… in the aggregate. You can see the status of subresources at the same time.
    2. It’s super simple to watch the resource until it exits cleanly or fails…

    Kudos to iNecas for a wonderful tool.

    The following is an image from the GitHub site: demo.svg

  • Custom nftable firewall rules in OpenShift

    Here is a good reference for OpenShift:

    Custom nftable firewall rules in OpenShift: https://access.redhat.com/articles/7090422

    It’s a supported method for implementing custom nftables firewall rules in OpenShift clusters. It is intended for cluster administrators who are responsible for managing network security policies within their OpenShift environments.

  • k8s-etcd-decryptor

    I’m making a mental note that this tool from @simonkrenger, k8s-etcd-decryptor, is a life saver; I used it once during development when I needed to get data out of etcd.

    The tool decrypts the AES-CBC-encrypted objects from etcd. Note, AES-CBC is one of two encryption types; the other, AES-GCM, is not covered by the tool.
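
    Before reaching for the tool, it helps to know which encryption type a stored value uses. Kubernetes writes encrypted values to etcd with a k8s:enc:<provider>:v1:<key name>: prefix, so a rough check could look like the sketch below. The function name etcd_enc_type is made up for illustration and is not part of k8s-etcd-decryptor.

    ```shell
    # Classify a raw etcd value by its Kubernetes encryption prefix.
    # etcd_enc_type is a hypothetical helper name.
    etcd_enc_type() {
        case "$1" in
            k8s:enc:aescbc:v1:*) echo "aescbc" ;;
            k8s:enc:aesgcm:v1:*) echo "aesgcm" ;;
            k8s:enc:*)           echo "other" ;;
            *)                   echo "unencrypted" ;;
        esac
    }

    # The payload after the prefix is a placeholder, not real data.
    etcd_enc_type 'k8s:enc:aescbc:v1:1:payload'
    ```

    A value reported as aescbc is the case the tool handles; aesgcm is the one it does not.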

    You can read more about encryption in OpenShift at Chapter 15. Encrypting etcd data