Category: IBM Power Systems

  • Outrigger: Rethinking Kubernetes Scheduling for a Smarter Future

    At DevConf.CZ 2025, a standout session from Alessandro Di Stefano and Prashanth Sundararaman introduced the Outrigger project, a forward-thinking initiative aimed at transforming Kubernetes scheduling into a dynamic, collaborative ecosystem. Building on the success of the Multiarch Tuning Operator for OpenShift, Outrigger leverages Kubernetes’ scheduling gates to go beyond traditional multi-architecture scheduling.
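
    For context on the primitive involved: a Pod created with scheduling gates stays in the SchedulingGated state until a controller removes the gates, which is what lets an external component weigh in on placement decisions. A minimal sketch of a gated Pod spec (the gate name and image are illustrative, not Outrigger's actual API):

    apiVersion: v1
    kind: Pod
    metadata:
      name: gated-pod
    spec:
      schedulingGates:
        - name: example.com/arch-placement   # illustrative gate name; a controller removes it once placement is decided
      containers:
        - name: app
          image: registry.example.com/app:latest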

    👉 Watch the full session here:

    Excellent work by that team.

  • CP4D 5.2 release – IBM Knowledge Catalog (IKC) and DataStage are both now available on OpenShift on Power through Cloud Pak for Data

    Per the CP4D Leader, with the CP4D 5.2 release, IBM Knowledge Catalog (IKC) and DataStage are both now available on OpenShift on Power through Cloud Pak for Data!

    – IBM Knowledge Catalog provides the methods that your enterprise needs to automate data governance, so you can ensure data accessibility, trust, protection, security, and compliance.

    – With DataStage, you can design and run data flows that move and transform data. You can compose data flows with speed and accuracy using an intuitive graphical design interface that lets you connect to a wide range of data sources, integrate and transform data, and deliver it to your target system in batch or real time.

    Read more about it at:

    1. https://www.ibm.com/docs/en/software-hub/5.2.x?topic=requirements-ppc64le-hardware#services

    2. https://community.ibm.com/community/user/blogs/jay-carman/2025/06/12/introducing-ibm-knowledge-catalog-on-ibm-power

  • prometheus hack to reduce disk pressure in non-prod environments

    Here is a small configuration change to reduce monitoring disk pressure. It drops the Prometheus retention to one day, so the time-series database is pruned much sooner than the default.

    cat << EOF > cluster-monitoring-config.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        prometheusK8s:
          retention: 1d
    EOF
    oc apply -f cluster-monitoring-config.yaml
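
    To confirm the change took effect, you can check that the cluster-monitoring-operator re-rendered the Prometheus pods with the new retention setting (the pod name may vary in your cluster):

    # Look for the retention value in the re-rendered Prometheus pod
    oc -n openshift-monitoring get pod prometheus-k8s-0 -o yaml | grep -i retention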
    
  • Using nx-gzip in your Red Hat OpenShift Container Platform on IBM Power to accelerate GZip performance

    Cross post from https://community.ibm.com/community/user/blogs/paul-bastide/2025/06/09/using-nx-gzip-in-your-red-hat-openshift-container

    The Power10 processor features an on-chip accelerator that is called the nest accelerator unit (NX unit). The coprocessor features that are available on the Power10 processor are similar to the features of the Power9 processor. These coprocessors provide specialized functions, such as industry-standard GZIP compression and decompression, random number generation, and AES and Secure Hash Algorithm (SHA) cryptography.

    [Figure: Block diagram of the NX unit]

    This article outlines how to use nx-gzip in a non-privileged container in Red Hat OpenShift Container Platform on IBM Power. You must have deployed a cluster whose workers run with a processor compatibility mode of IBM Power10 or higher, and the Active Memory Expansion feature must be licensed.

    Build the power-gzip selftest binary

    The test binaries show that the feature is working, and you can use the selftest and sample code as a starting point for integrating nx-gzip into your environment.

    1. Log in to the PowerVM instance running Red Hat Enterprise Linux 9
    2. Install the required build packages
    dnf install make git gcc zlib-devel vim util-linux-2.37.4-11.el9.ppc64le -y
    
    3. Clone the repository
    git clone https://github.com/libnxz/power-gzip
    cd power-gzip/
    
    4. Configure and build the selftests
    ./configure 
    cd selftests
    make
    
    5. Find the created test files
    # ls g*test -al
    -rwxr-xr-x. 1 root root 74992 Jun  9 08:24 gunz_test
    -rwxr-xr-x. 1 root root 74888 Jun  9 08:24 gzfht_test
    

    You are ready to test it.

    Setup the NX-GZip test deployment

    Download the examples repository, set up the kustomization, and configure CRI-O so you can deploy and use /dev/crypto/nx-gzip in a container.

    1. Install the Kustomize tool for the deployment
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | bash
    sudo mv kustomize /usr/local/bin
    kustomize -h
    
    2. Clone the ocp4-power-workload-tools repository
    git clone https://github.com/IBM/ocp4-power-workload-tools
    cd ocp4-power-workload-tools
    
    3. Configure the worker nodes to use /dev/crypto/nx-gzip as an allowed device.
    oc apply -f manifests/nx-gzip/99-worker-crio-nx-gzip.yaml
    
    4. Export the kubeconfig using export KUBECONFIG=~/.kube/config
    5. Set up the nx-gzip test Pod as below
    cd manifests/nx-gzip
    kustomize build . | oc apply -f - 
    
    6. Verify the resulting Pod is running, as below
    # oc get pod -n nx-gzip-demo
    NAME               READY   STATUS    RESTARTS   AGE
    nx-gzip-ds-2mlmh   1/1     Running   0          3s
    

    You are ready to test nx-gzip.

    To test with Privileged mode, you can use nx-gzip-privileged.

    Copy the test artifact into the running Pod and run it

    1. Copy the executable files created above to the running Pod
    # oc cp gzfht_test nx-gzip-ds-2mlmh:/nx-test/
    
    2. Access the Pod shell and confirm the model name is POWER10 or higher.
    # oc rsh nx-gzip-ds-2mlmh
    sh-5.1# lscpu | grep Model
    Model name:                           POWER10 (architected), altivec supported
    Model:                                2.0 (pvr 0080 0200)
    
    3. Create a test file
    sh-5.1# dd if=/dev/random of=/nx-test/test bs=1M count=1
    1+0 records in
    1+0 records out
    1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00431494 s, 243 MB/s
    sh-5.1#
    
    
    4. Run the test in the Pod
    sh-5.1# /nx-test/gzfht_test /nx-test/test
    file /nx-test/test read, 1048576 bytes
    compressed 1048576 to 1105994 bytes total, crc32 checksum = a094fbab
    sh-5.1# echo $?
    0
    

    If the file is compressed and the return code is 0, as shown above, the test is considered a PASS.
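
    Beyond the selftest, the libnxz project provides a zlib-compatible library, so an application that links against zlib can be pointed at the accelerator without code changes. A minimal sketch, assuming libnxz has been built and installed into the image (the library path and application name are placeholders):

    # Preload the zlib-compatible libnxz library so a zlib-linked application offloads
    # deflate/inflate to the NX unit
    LD_PRELOAD=/usr/local/lib/libnxz.so /usr/bin/your-zlib-application /nx-test/test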

    You’ve seen how nx-gzip works in a Pod. You can also combine it with Node Feature Discovery to label each Node resource with cpu-coprocessor.nx_gzip=true, as sketched below.
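
    For example, once Node Feature Discovery has applied that label, a workload can be steered onto capable nodes with a nodeSelector. A minimal sketch, assuming the label key carries the usual feature.node.kubernetes.io prefix (verify the exact key with oc get node <node> --show-labels):

    apiVersion: v1
    kind: Pod
    metadata:
      name: nx-gzip-workload
    spec:
      nodeSelector:
        # assumed label key; confirm what NFD publishes on your cluster
        feature.node.kubernetes.io/cpu-coprocessor.nx_gzip: "true"
      containers:
        - name: app
          image: registry.example.com/nx-gzip-app:latest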

    Thank you for your time and good luck.

    Reference

    1. IBM Power10 Scale Out Servers Technical Overview S1012, S1014, S1022s, S1022 and S1024
    2. Exploitation of In-Core Acceleration of POWER Processors for AIX
    3. POWER NX zlib compliant library
    4. Db2: Hardware accelerated backup and log file compression
  • Getting the ibmvfc logs from the impacted clusters

    If you are using the IBM Virtual Fibre Channel adapter with your OpenShift on Power installation, you can use these steps to get the log details.

    Here are the steps to get the ibmvfc logs from the nodes that are failing:

    Grabbing the ibmvfc logs

    ibmvfc is the driver for the virtual fibre channel adapters.

    To setup ibmvfc logging:

    1. Log in as a cluster-admin
    # export KUBECONFIG=/root/openstack-upi/auth/kubeconfig
    # oc get MachineConfigPool -o=jsonpath='{range.items[*]}{.metadata.name} {"\t"} {.status.nodeInfo.kubeletVersion}{"\n"}{end}'
    master
    worker
    
    2. For each of the listed MachineConfigPools, create 99-<mcp-name>-vfc.yaml. The nodes in each pool will reboot when the config is applied.
    # cat << EOF > 99-worker-vfc.yaml
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: "worker"
      name: 99-worker-vfc
    spec:
      kernelArguments:
        - 'scsi_mod.scsi_logging_level=4096'
        - 'ibmvfc.debug=1'
        - 'ibmvfc.log_level=3'
    EOF
    
    # cat << EOF > 99-master-vfc.yaml
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: "master"
      name: 99-master-vfc
    spec:
      kernelArguments:
        - 'scsi_mod.scsi_logging_level=4096'
        - 'ibmvfc.debug=1'
        - 'ibmvfc.log_level=3'
    EOF
    
    3. Apply the YAML files, one at a time:
    # oc apply -f 99-worker-vfc.yaml
    machineconfig.machineconfiguration.openshift.io/99-worker-vfc created
    
    4. Wait for the MachineConfigPool to finish updating, such as worker:
    # oc wait mcp/worker --for condition=Updated --timeout=30m
    
    5. Verify each MachineConfigPool is done updating:

    The following shows the worker pool is updating:

    # oc get mcp worker
    NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   False     True       True       2              1                   1                     1                      8d
    

    The following shows the worker pool is Ready:

    # oc get mcp worker
    NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
    worker   rendered-worker-b93fdaee39cd7d38a53382d3c259c8ae   True      False      False      2              2                   2                     0                      8d
    
    6. Spot-check the updates:

    a. List the nodes: oc get nodes
    b. Connect to one of the nodes: oc debug node/worker-0
    c. Change context to the host: chroot /host
    d. Verify the kernel arguments contain the three values we set.

    # rpm-ostree kargs
    rw $ignition_firstboot  ostree=/ostree/boot.1/rhcos/d7d848ba24dcacb1aba663e9868d4bd131482d9b7fecfa33197f558c53ae5208/0 ignition.platform.id=powervs root=UUID=06207aa5-3386-4044-bcb6-750e509d7cf0 rw rootflags=prjquota boot=UUID=6c67b96e-4e01-4e01-b8e5-ffeb4041bee2 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1="all" psi=0 scsi_mod.scsi_logging_level=4096 ibmvfc.debug=1 ibmvfc.log_level=3 rd.multipath=default root=/dev/disk/by-label/dm-mpath-root
    
    7. Wait for the error to occur, then get the console logs and the journalctl --dmesg output from the node, as sketched below.
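
    A minimal sketch of pulling that output off a node with oc debug (the node name is an example):

    # Capture the kernel ring buffer from the affected node
    oc debug node/worker-0 -- chroot /host journalctl --dmesg --no-pager > worker-0-dmesg.log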

    You’ll end up with a bunch of messages like:

    [    2.333257] ibmvfc 30000004: Partner initialization complete
    [    2.333308] ibmvfc 30000004: Sent NPIV login
    [    2.333336] ibmvfc: Entering ibmvfc_alloc_mem
    [    2.333340] ibmvfc: Entering ibmvfc_alloc_queue
    [    2.333343] ibmvfc: Entering ibmvfc_init_event_pool
    [    2.333402] ibmvfc: Leaving ibmvfc_alloc_mem
    [    2.333439] ibmvfc: Entering ibmvfc_init_crq
    [    2.333443] ibmvfc: Entering ibmvfc_alloc_queue
    [    2.333446] ibmvfc: Entering ibmvfc_init_event_pool
    [    2.333482] ibmvfc: Leaving ibmvfc_init_event_pool
    [    2.333743] ibmvfc: Leaving ibmvfc_init_crq
    

    Once we’ve captured this level of detail, we can delete the MachineConfigs; each pool will reboot and reset the kernel arguments.
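
    For example, a minimal sketch of the cleanup, reusing the file names from above:

    # Deleting the MachineConfigs triggers another rolling reboot that restores the original kernel arguments
    oc delete machineconfig 99-worker-vfc 99-master-vfc
    oc wait mcp/worker --for condition=Updated --timeout=30m
    oc wait mcp/master --for condition=Updated --timeout=30m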

    And you can share the logs with support.

    Please only use this under guidance.

    Reference

    https://www.ibm.com/docs/en/linux-on-systems?topic=commands-scsi-logging-level

  • Great work from the IBM Power10 Private Cloud Rack for Db2 Warehouse team

    The IBM Power10 Private Cloud Rack for Db2 Warehouse team posted an article on their offering, the next generation of the IBM Integrated Analytics System (IIAS), modernized to operate on the Red Hat OpenShift Container Platform. As the team notes, this architecture shift enables a more modular and scalable deployment model, aligning with modern cloud-native practices.

    In their article, they outline the stringent performance and scalability requirements and the use of OpenShift Container Platform on Power10 with Storage Scale. For more detailed information, you can visit the IBM Data Management Community blog.

  • Multi-Arch Compute and the Red Hat OpenShift Container Platform on IBM Power

    Red Hat OpenShift Container Platform supports multi-arch compute, which allows you to mix supported compute architectures so you can build your optimal solution. With multi-architecture compute, you run pairs of architectures in the compute plane: a Power (ppc64le) control plane supports running Power and Intel workers (p-px), and an Intel (amd64) control plane supports Power and Intel workers (x-px). This setup uses the multi-architecture (multi) release payload, which is manifest-listed so you can run IBM Power (ppc64le) alongside Intel (amd64).

    In this document, you will find a series of steps to set up a Multi-Arch Compute cluster.

    After you install your cluster, enabling Multi-Arch Compute is a post-installation task that follows this process:

    1. Prepare
    • Networking: ensure ports, DHCP, DNS (if you require it), and the load balancer are configured
    • Prepare cluster services: create a MachineConfigPool if you need different kernel parameters, add MachineConfigs, and isolate the ingress on one architecture type
    • Prepare Ignition: download the latest Ignition file
    2. Image
    • Download the architecture-specific image
    • Load the image on the target platform
    3. Ignite Workers
    • Start them up
    • Approve the node bootstrapper certificate
    • Issue the kubelet certificate
    4. Post Startup
    • Add labels to the nodes (see the sketch after this list)
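
    As a rough sketch of the approve, issue, and label steps above (the node name and label are examples, not required values):

    # Approve pending CSRs; run this twice, once for the bootstrapper CSR and once for the kubelet serving CSR
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

    # Add whatever labels your workloads expect on the new worker; this label is only an example
    oc label node worker-amd64-0 example.com/arch-pool=intel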

    By following these steps, you can successfully install Intel and Power workers in an OpenShift Cluster on IBM Power. This setup allows you to leverage the strengths of both architectures, providing a robust and flexible environment for your applications.

    Feel free to reach out if you have any questions or need further assistance with the installation process. Happy deploying!

    Reference

    1. https://community.ibm.com/community/user/blogs/paul-bastide/2024/02/20/multi-arch-compute-getting-started
  • Entering into Kubernetes Network Policies

    Kubernetes Network Policies (NetworkPolicy resources) declaratively manage network access (ingress and egress) within a Kubernetes cluster. Network Policies identify Pods by label, namespace, or IP block, define the direction of traffic flow (Ingress, Egress), and specify the protocols, ports, and IPs involved, thus controlling which communication is allowed and which is disallowed.

    There are good examples on the kubernetes website https://kubernetes.io/docs/concepts/services-networking/network-policies/#networkpolicy-resource

    1. Identify the Pod to secure, such as the Pod with label role=db. The selector should be as precise as possible, and you may want more than one policy per namespace.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: test-network-policy
      namespace: xyz
    spec:
      podSelector:
        matchLabels:
          role: db
    
    2. Set a default deny policy, then add your allow policies per https://spacelift.io/blog/kubernetes-network-policy (a default-deny sketch follows this list)
    3. Add DNS over UDP port 53 to the policy so you can dynamically look up services in your cluster per https://snyk.io/blog/kubernetes-network-policy-best-practices/:
      egress: 
        - to:
            - namespaceSelector: {}
              podSelector:
                matchLabels:
                  dns.operator.openshift.io/daemonset-dns: default
          ports:
            - port: 53
              protocol: UDP
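
    A minimal default-deny sketch for step 2 (the namespace is an example); with this in place, only traffic explicitly allowed by additional policies can flow:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: xyz
    spec:
      podSelector: {}          # selects every Pod in the namespace
      policyTypes:
        - Ingress
        - Egress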

    Be sure to capture all of your anticipated traffic. If you get really advanced, you’ll want to use the Editor Network Policy (see the references).

    Good luck…

    Reference

    1. NetworkPolicy v1 networking.k8s.io
    2. Editor Network Policy
  • Extending PCI-DSS v4 Support on Red Hat OpenShift Container Platform on IBM Power with the Compliance Operator

    The Compliance Operator is an optional tool within the OpenShift Container Platform that allows administrators to run compliance scans and recommend remediations to bring the cluster into compliance. It utilizes OpenSCAP, a NIST-certified tool, to describe and enforce security policies. The operator is configured to run a set of Platform and Node profiles that check the cluster and associate the checks with PCI-DSS controls, ensuring comprehensive security and compliance.

    To support PCI-DSS v4, administrators can follow the detailed guide provided in the document “Supporting PCI-DSS v4 with the Compliance Operator on the OpenShift Container Platform”. The Power Developer Exchange article walks through the setup, running compliance scans, auto-remediation, and the manual fixes required to configure the environment and facilitate compliance.

    Note: the security-profiles-operator-exists rule will be removed in future Compliance Operator releases. Until then, you can disable it with a TailoredProfile, as shown below:

    apiVersion: compliance.openshift.io/v1alpha1
    kind: TailoredProfile
    metadata:
      name: ocp4-pci-dss-custom
    spec:
      extends: ocp4-pci-dss
      title: PCI-DSS v4 Customized
      disableRules:
        - name: ocp4-pci-dss-security-profiles-operator-exists
          rationale: security profiles operator is not used in the control.
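
    To run scans against the customized profile, you would typically bind it to a ScanSetting; a minimal sketch, assuming the default ScanSetting created by the Compliance Operator install (the binding name is an example):

    apiVersion: compliance.openshift.io/v1alpha1
    kind: ScanSettingBinding
    metadata:
      name: pci-dss-custom-binding
      namespace: openshift-compliance
    profiles:
      - name: ocp4-pci-dss-custom
        kind: TailoredProfile
        apiGroup: compliance.openshift.io/v1alpha1
    settingsRef:
      name: default
      kind: ScanSetting
      apiGroup: compliance.openshift.io/v1alpha1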
    

    You can see the details in CMP-3278: Misleading rule associated with PCI-DSS 6.4.2 and BSI.

    Summary

    With the addition of PCI-DSS v4 support, the OpenShift Container Platform on IBM Power continues to enhance its security capabilities, making it an excellent choice for organizations processing credit card payments. By leveraging the Compliance Operator, administrators can ensure their clusters meet the necessary security standards, protecting sensitive payment card data effectively.

    Explore these resources for more detailed information on the Compliance Operator and its supported profiles.

    References

    1. Release notes
    2. Compliance Profiles
    3. Supporting PCI-DSS v4 with the Compliance Operator on the OpenShift Container Platform
  • Adding DISA-STIG Compliance Profiles for Red Hat OpenShift Container Platform on IBM Power

    With the release of Compliance Operator v1.7.0, Red Hat OpenShift Container Platform now supports DISA-STIG profiles for IBM Power. This update includes the rhcos4-disa-stig and ocp4-disa-stig profiles, adhering to the OSCAL format for version v2r2. These profiles ensure that your systems meet the stringent security requirements set by the Defense Information Systems Agency (DISA).

    Key Features

    1. Added Compliance Profiles for IBM Power: The ocp4-stig, ocp4-stig-node, and rhcos4-stig profiles are continuously updated to reflect the latest DISA-STIG benchmarks. This ensures that your systems remain compliant with the most current Defense Information Systems Agency Security Technical Implementation Guide.
    2. Version-Specific Profiles: For those needing to adhere to specific versions, such as DISA-STIG V2R1, the ocp4-stig-v2r1 and ocp4-stig-node-v2r1 profiles are available.
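
    To confirm these profiles are present on your cluster, you can list them; a quick check, assuming the operator is installed in the default openshift-compliance namespace:

    # List the STIG-related profiles shipped with the Compliance Operator
    oc get profiles.compliance.openshift.io -n openshift-compliance | grep stig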

    For more detailed information, you can refer to the following resources:

    1. Release notes
    2. Compliance Profiles
    3. IBM Power Developer Exchange: Supporting DISA-STIG v2r2 with the Compliance Operator on the Red Hat OpenShift Container Platform with IBM Power
    4. IBM Power Developer Exchange: Supporting DISA-STIG with the Compliance Operator on the Red Hat OpenShift Container Platform

    Stay compliant and secure your cluster with the latest updates from Compliance Operator v1.7.0 and IBM Power Systems!