Author: Paul

  • Red Hat Article: Building multi-architecture container images on OpenShift Container Platform clusters

    Our colleague at Red Hat, Dylan Orzel, posted an article on Building multi-architecture container images on OpenShift Container Platform clusters:

    In this article we’ll explore how to make use of the built-in build capabilities available in Red Hat OpenShift 4 in a multi-arch compute environment, and how to make use of nodeSelectors to schedule builds on nodes of the architecture of our choosing.
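
    The article covers the details end to end; as a hedged illustration of the core idea, pinning a build to Power nodes with a nodeSelector might look like this (bc/myapp is a placeholder BuildConfig name):

    oc patch bc/myapp --type=merge -p '{"spec":{"nodeSelector":{"kubernetes.io/arch":"ppc64le"}}}'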

  • Things to Know in July 2024

    Here are some things around IBM Power Systems and Red Hat OpenShift you should know about:

    Newly Supported Open Source Containers on IBM Power

    The IBM Power team has updated the list of containers they build with support for ppc64le. The list is kept at https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr

    The updates are:

    system-logger       v1.19.0       podman pull icr.io/ppc64le-oss/system-logger-ppc64le:v1.14.0     July 18, 2024
    postgres-operator   v15.7         podman pull icr.io/ppc64le-oss/postgres-operator-ppc64le:v15.7   July 18, 2024
    postgresql          v14.12.0-bv   podman pull icr.io/ppc64le-oss/postgresql:v14.12.0-bv            July 9, 2024
    mongodb             5.0.26        podman pull icr.io/ppc64le-oss/mongodb-ppc64le:5.0.26            April 9, 2024
    mongodb             6.0.13        podman pull icr.io/ppc64le-oss/mongodb-ppc64le:6.0.13            April 9, 2024

    Aqua Trivy and Starboard for scanning GitLab on IBM Power

    Trivy and Starboard are now available per https://community.ibm.com/community/user/powerdeveloper/blogs/gerrit-huizenga/2024/07/17/aqua-trivy-and-starboard-for-scanning-gitlab-on-ib

    You can download the Trivy RPM using:

    rpm -ivh https://github.com/aquasecurity/trivy/releases/download/v0.19.2/trivy_0.19.2_Linux-PPC64LE.rpm

    Or you can use Starboard’s successor, the Trivy Operator, directly from https://github.com/aquasecurity/trivy-operator/releases/tag/v0.22.0

    These provide some nice security features and tools for IBM Power containers.
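
    As a quick sanity check, here is a hedged example of scanning one of the ICR Power images with Trivy (the image is just an example from the list above):

    trivy image icr.io/ppc64le-oss/postgresql:v14.12.0-bv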

    OpenShift Routes for cert-manager

    The OpenShift Routes project supports automatically getting a certificate for OpenShift routes from any cert-manager Issuer, similar to annotating an Ingress or Gateway resource in vanilla Kubernetes.

    You can download the helm chart from https://github.com/cert-manager/openshift-routes/releases

    Or you can use:

    helm install openshift-routes -n cert-manager oci://ghcr.io/cert-manager/charts/openshift-routes
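
    Once the controller is installed, the project’s pattern (per its README, as I read it) is to annotate a Route with the Issuer to use. A sketch, where my-route and my-issuer are placeholders and the annotation name may vary by release:

    oc annotate route my-route cert-manager.io/issuer-name=my-issuer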

    OpenBao

    OpenBao exists to provide a software solution to manage, store, and distribute sensitive data including secrets, certificates, and keys.

    OpenBao has released v2.0.0.

    You can use Helm to install it on IBM Power; use the values.openshift.yaml (link) for OpenShift:

    helm repo add openbao https://openbao.github.io/openbao-helm
    helm install openbao openbao/openbao
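
    If you are installing on OpenShift on IBM Power, a hedged sketch of passing the OpenShift values file mentioned above (assuming you have downloaded values.openshift.yaml from the openbao-helm repository):

    helm install openbao openbao/openbao -f values.openshift.yaml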

    The Containers are at https://quay.io/repository/openbao/openbao?tab=tags&tag=latest

  • Red Hat OpenShift 4.16

    Red Hat OpenShift 4.16 has been announced and is generally available for upgrades and new installations. It is based on Kubernetes 1.29 with the CRI-O 1.29 runtime and RHEL CoreOS 9.4. You can read the release notes at https://docs.openshift.com/container-platform/4.16/release_notes/ocp-4-16-release-notes.html

    Some cool features you can use are:

    – oc adm upgrade status command, which decouples status information from the existing oc adm upgrade command and provides specific information regarding a cluster update, including the status of the control plane and worker node updates. https://docs.openshift.com/container-platform/4.16/updating/updating_a_cluster/updating-cluster-cli.html#update-upgrading-oc-adm-upgrade-status_updating-cluster-cli

    – Tech Preview and Generally Available Table – https://docs.openshift.com/container-platform/4.16/release_notes/ocp-4-16-release-notes.html#ocp-4-16-technology-preview-tables_release-notes
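
    For example, during a cluster update you can watch progress with the new subcommand (a hedged sketch; in 4.16 the status subcommand is a Technology Preview in oc and may need to be enabled first):

    # enabling the gated subcommand may be required, depending on your oc build
    export OC_ENABLE_CMD_UPGRADE_STATUS=true
    oc adm upgrade status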

  • Crane has new features for container experts

    FYI: google/go-containerregistry has a new release v0.19.2. This adds a new feature we care about:

    crane mutate myimage --set-platform linux/arm64
    

    This release also supports using Podman’s authfile via the REGISTRY_AUTH_FILE environment variable.
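
    A hedged example of combining the two: pointing crane at Podman’s usual default auth file location (the path may differ on your system) while reading an image digest:

    REGISTRY_AUTH_FILE="${XDG_RUNTIME_DIR}/containers/auth.json" crane digest icr.io/ppc64le-oss/mongodb-ppc64le:6.0.13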

  • Cert Manager on Multi-Architectures

    I found a cool article on Cert Manager with IPI PowerVS

    Simplify certificate management on OpenShift across multiple architectures

    Chirag Kyal, a Software Engineer at Red Hat, has authored an article about deploying IPI PowerVS and cert-manager on IBM Cloud.

    Check out the article about efficient certificate management techniques on Red Hat OpenShift using the cert-manager Operator for OpenShift’s multi-architecture support.

    https://developers.redhat.com/learning/learn:openshift:simplify-certificate-management-openshift-across-multiple-architectures/resource/resources:automate-tls-certificate-management-using-cert-manager-operator-openshift

  • End of April Information 2024

    The new information for the end of April is:

    The IBM Linux on Power team released more images to the IBM Container Registry (ICR). Here are the new ones:

    milvus       v2.3.3    docker pull icr.io/ppc64le-oss/milvus-ppc64le:v2.3.3       April 2, 2024
    rust         1.66.1    docker pull icr.io/ppc64le-oss/rust-ppc64le:1.66.1         April 2, 2024
    opensearch   2.12.0    docker pull icr.io/ppc64le-oss/opensearch-ppc64le:2.12.0   April 16, 2024
    https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr
  • Getting Started with a Sock-Shop – a sample multi-arch compute application

    Original post was at https://community.ibm.com/community/user/powerdeveloper/blogs/paul-bastide/2024/04/26/getting-started-with-a-sock-shop-a-sample-multi-ar?CommunityKey=daf9dca2-95e4-4b2c-8722-03cd2275ab63

    I’ve developed the following script to help you get started deploying multi-architecture applications and to elaborate on the techniques for controlling Multi-Arch Compute. This script uses the sock-shop application, which is available at https://github.com/ocp-power-demos/sock-shop-demo . The sock-shop-demo instructions require kustomize, and you should follow the README.md in the repository to set up the username and password for MongoDB.

    You do not need to do every step that follows; feel free to install and use what you’d like. I recommend the kustomize install with multi-no-ns, and then playing with the features you find interesting. Note that multi-no-ns requires no namespace.

    The layout of the application is described in this diagram:

    demo application layout

    Deploying a non-multiarch Intel App

    This deployment shows the exec errors and pod scheduling errors that are encountered when scheduling Intel-only pods on Power.

    For these steps, you are going to clone ocp-power-demos’ sock-shop-demo and then experiment to resolve errors so the application is up and running.

    I’d recommend running this from a bastion.

    1. Clone the repository
    git clone https://github.com/ocp-power-demos/sock-shop-demo
    
    2. Switch to the sock-shop-demo folder
    3. Download kustomize – this tool enables an ordered layout of the resources. You’ll also need oc installed.
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    

    Ref: https://kubectl.docs.kubernetes.io/installation/kustomize/binaries/

    The reason kustomize is used is its resource sort-order feature.

    4. Update the manifests/overlays/single/env.secret file with a username and password for MongoDB. openssl rand -hex 10 is a good way to generate a random password. You’ll need to copy this env.secret into each overlays/ folder used in the demo.
    5. Create the sock-shop application:
    ❯ kustomize build manifests/overlays/single | oc apply -f -
    

    This creates the full application within the OpenShift project.

    To see the layout of the application, refer to the third diagram in the README (except these are Intel-only images): https://github.com/ocp-power-demos/sock-shop-demo/blob/main/README.md#diagrams

    6. Run oc get pods -owide
    ❯ oc get pods -owide
    NAME                            READY   STATUS             RESTARTS        AGE     IP            NODE                                   NOMINATED NODE   READINESS GATES
    carts-585dc6c878-wq6jg          0/1     Error              6 (2m56s ago)   6m21s   10.129.2.24   mac-01a7-worker-0                      <none>           <none>
    carts-db-78f756b87c-r4pl9       1/1     Running            0               6m19s   10.131.0.32   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    catalogue-77d7c444bb-wnltt      0/1     CrashLoopBackOff   6 (8s ago)      6m17s   10.130.2.21   mac-01a7-worker-1                      <none>           <none>
    catalogue-db-5bc97c6b98-v9rdp   1/1     Running            0               6m16s   10.131.0.33   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    front-end-648fdf6957-bjk9m      0/1     CrashLoopBackOff   5 (2m44s ago)   6m14s   10.129.2.25   mac-01a7-worker-0                      <none>           <none>
    orders-5dbf8994df-whb9r         0/1     CrashLoopBackOff   5 (2m47s ago)   6m13s   10.130.2.22   mac-01a7-worker-1                      <none>           <none>
    orders-db-7544dc7fd9-w9zh7      1/1     Running            0               6m11s   10.128.3.83   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
    payment-6cdff467b9-n2dql        0/1     Error              6 (2m53s ago)   6m10s   10.130.2.23   mac-01a7-worker-1                      <none>           <none>
    queue-master-c9dcf8f87-c8drl    0/1     CrashLoopBackOff   5 (2m41s ago)   6m8s    10.129.2.26   mac-01a7-worker-0                      <none>           <none>
    rabbitmq-54689956b9-rt5fb       2/2     Running            0               6m7s    10.131.0.34   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    session-db-7d4cc56465-dcx9f     1/1     Running            0               6m5s    10.130.2.24   mac-01a7-worker-1                      <none>           <none>
    shipping-5ff5f44465-tbjv7       0/1     Error              6 (2m51s ago)   6m4s    10.130.2.25   mac-01a7-worker-1                      <none>           <none>
    user-64dd65b5b7-49cbd           0/1     CrashLoopBackOff   5 (2m25s ago)   6m3s    10.129.2.27   mac-01a7-worker-0                      <none>           <none>
    user-db-7f864c9f5f-jchf6        1/1     Running            0               6m1s    10.131.0.35   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    

    You might be lucky enough for the scheduler to assign these to Intel-only nodes.

    If all of the pods are Running with no restarts at this point, the application is up and running.

    7. Grab the external URL
    ❯ oc get routes                                            
    NAME        HOST/PORT                                                      PATH   SERVICES    PORT   TERMINATION     WILDCARD
    sock-shop   sock-shop-test-user-4.apps.rdr-mac-cust-d.rdr-xyz.net          front-end   8079   edge/Redirect   None
    
    8. Open a browser and navigate around. Try registering a user.

    It failed for me.
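
    The Error/CrashLoopBackOff pods above are the exec format errors you hit when an amd64-only image lands on a Power node. One way to confirm is to check which architectures an image manifest actually provides; a hedged sketch, assuming skopeo and jq are available on the bastion (replace <image> with one of the images from the deployment):

    skopeo inspect --raw docker://<image> | jq -r '.manifests[]?.platform.architecture'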

    Cordon Power nodes

    The purpose is to cordon the Power nodes and delete the existing pods so the pods get rescheduled onto the architecture you want. This is only recommended on a dev/test system, and only on the worker nodes.

    1. Find the Power workers
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
    2. For each of the Power workers, cordon the node (see the loop sketch at the end of this section)

    oc adm cordon node/<worker>

    3. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS             RESTARTS       AGE
    front-end-648fdf6957-bjk9m   0/1     CrashLoopBackOff   13 (26s ago)   42m
    
    4. Delete the front-end pod.
    oc delete pod/front-end-648fdf6957-bjk9m
    

    The app should be running correctly at this point.
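
    If you have several Power workers, here is a hedged one-liner to cordon them all at once (it assumes the standard worker role label; again, dev/test only):

    for n in $(oc get nodes -l kubernetes.io/arch=ppc64le,node-role.kubernetes.io/worker -o name); do oc adm cordon "$n"; done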

    Use a Node Selector for the Application

    This demonstrates how to use a nodeSelector to put the workload on the right nodes.

    These microservices use Deployments. We can modify the deployment to use NodeSelectors.

    1. Edit the manifests/overlays/single/09-front-end-dep.yaml or oc edit deployment/front-end
    2. Find the nodeSelector field and add an architecture limitation using a Node label:
    nodeSelector:
      node.openshift.io/os_id: rhcos
      kubernetes.io/arch: amd64
    
    3. If you edited the file, run oc apply -f manifests/overlays/single/09-front-end-dep.yaml
    4. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS              RESTARTS         AGE
    front-end-648fdf6957-bjk9m   0/1     CrashLoopBackOff    14 (2m49s ago)   50m
    front-end-7bd476764-t974g    0/1     ContainerCreating   0                40s
    
    5. It may not be Ready, and you may need to delete the front-end pod on the Power node.
    oc delete pod/front-end-648fdf6957-bjk9m
    

    Note: you can run the following to redeploy the application with nodeSelectors.

    ❯ kustomize build manifests/overlays/single-node-selector | oc delete -f - 
    ❯ kustomize build manifests/overlays/single-node-selector | oc apply -f - 
    

    Are the pods running on the Intel node?
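
    To double-check, a hedged pair of commands that show which node the front-end pod landed on and that node’s architecture label (<node-name> is the node from the first command’s output):

    oc get pods -l name=front-end -o wide
    oc get node <node-name> -o jsonpath='{.metadata.labels.kubernetes\.io/arch}{"\n"}'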

    Uncordon the Power nodes

    With the nodeSelector in place, you can uncordon the Power nodes. Again, this is only recommended on a dev/test system, and only on the worker nodes.

    1. Find the Power workers
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
    2. For each of the Power workers, uncordon the node

    oc adm uncordon node/<worker>

    3. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS             RESTARTS       AGE
    front-end-6944957cd6-qmhhg      1/1     Running            0           19s
    

    The application should be running. If not, please use:

    ❯ kustomize build manifests/overlays/single-node-selector | oc delete -f - 
    ❯ kustomize build manifests/overlays/single-node-selector | oc apply -f - 
    

    The workload should now all be on the Intel side.

    Deploying a multiarch Intel/Power App

    With many of these applications, there are architecture-specific alternatives. You can run without nodeSelectors to get the workload scheduled wherever there is support.

    To switch to running the application across Power and Intel:

    1. Switch to the project: oc project sock-shop
    2. Delete the pods and recreate them (this overlay uses a manifest-listed set of images)
    ❯ kustomize build manifests/overlays/multi-no-ns | oc apply -f -
    
    3. List the app pods
    ❯ oc get pods -owide
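
    A hedged one-liner to see how the pods spread across the nodes (it counts pods per node using the pod’s .spec.nodeName):

    oc get pods -o custom-columns=NODE:.spec.nodeName --no-headers | sort | uniq -c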
    

    Update the app deployment to use a manifest-listed image and remove the node selector

    We’re going to move one of the application’s dependencies, the RabbitMQ exporter. The IBM team has created a ppc64le port of the rabbitmq-exporter image (link).

    1. Edit git/sock-shop-demo/manifests/overlays/multi-no-ns/19-rabbitmq-dep.yaml
    2. Replace image: kbudde/rabbitmq-exporter on line 32 with icr.io/ppc64le-oss/rabbitmq-exporter-ppc64le:1.0.0-RC19
    3. Remove the nodeSelector label kubernetes.io/arch: amd64 limitation on line 39
    4. Build and apply: kustomize build manifests/overlays/multi-no-ns | oc apply -f -
    5. Check the Pod is starting/running on Power
    ❯ oc get pod -l name=rabbitmq -owide
    NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE                NOMINATED NODE   READINESS GATES
    rabbitmq-65c75db8db-9jqbd   2/2     Running   0          96s   10.130.2.31   mac-01a7-worker-1   <none>           <none>
    

    The pod should now start on the Power node.

    You’ve taken advantage of these containers, and you can take advantage of other open source container images for Power as well: https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr

    Using a Taint/Toleration

    Taints and tolerations provide a way to keep workloads off of a set of nodes unless the workloads explicitly tolerate the taint.

    1. Find the secondary workers. For instance, if the primary architecture is Intel, you’ll find the Power workers and taint them:
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
    2. Taint the Power workers
    oc adm taint nodes node1 kubernetes.io/arch=ppc64le:NoSchedule
    

    Also note that the taints are flipped (Intel is tainted with the Power taint).
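
    For reference, the toleration you will update in the steps below might look like this in the front-end Deployment’s pod spec (a sketch; match the key, value, and effect to the taint you applied):

    tolerations:
    - key: "kubernetes.io/arch"
      operator: "Equal"
      value: "ppc64le"
      effect: "NoSchedule"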

    3. Edit manifests/overlays/multi-taint-front-end/09-front-end-dep.yaml
    4. Update the toleration to match your architecture taint (see the sketch above).
    5. Save the file
    6. Run the oc apply command: oc apply -f manifests/overlays/multi-taint-front-end/09-front-end-dep.yaml
    7. Check the location of the workers
    ❯ oc get pods -o wide -l name=front-end                                      
    NAME                         READY   STATUS    RESTARTS   AGE    IP            NODE                                   NOMINATED NODE   READINESS GATES
    front-end-69c64bf86f-98nkc   0/1     Running   0          9s     10.128.3.99   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
    front-end-7f4f4844c8-x79zn   1/1     Running   0          103s   10.130.2.33   mac-01a7-worker-1                      <none>           <none>
    

    You might have to give it a few minutes before the workload shifts.

    8. Check again to see that the workload has moved to an Intel node:
    ❯ oc get pods -o wide -l name=front-end
    NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE                                   NOMINATED NODE   READINESS GATES
    front-end-69c64bf86f-98nkc   1/1     Running   0          35s   10.128.3.99   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
    

    Ref: OpenShift 4.14: Understanding taints and tolerations

    Summary

    These are different techniques to help schedule/control workload placement and help you explore Multi-Arch Compute.

    My colleague, Punith, and I have also posted two documents on further controlling workload placement:

    1. Multi-Arch Compute: Node Selector
    2. Controlling Pod placement based on weighted node-affinity with your Multi-Arch Compute cluster
  • Setting up nfs-provisioner on OpenShift on Power Systems with a template

    Here are my notes for setting up the SIG’s nfs-provisioner. You should follow these directions to set up the nfs-provisioner from kubernetes-sigs/nfs-subdir-external-provisioner.

    1. If you haven’t already, you need to create the nfs-provisioner namespace.

    a. Create the namespace

    oc new-project nfs-provisioner

    b. Label the namespace with elevated privileges so we can create NFS mounts

    # oc label namespace/nfs-provisioner security.openshift.io/scc.podSecurityLabelSync=false --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/enforce=privileged --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/enforce-version=v1.24 --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/audit=privileged --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/warn=privileged --overwrite=true
    namespace/nfs-provisioner labeled
    2. Download the storage-class-nfs-template
    # curl -O -L https://github.com/IBM/ocp4-power-workload-tools/manifests/storage/storage-class-nfs-template.yaml
    3. Set up authorization
    oc adm policy add-scc-to-user hostmount-anyuid system:serviceaccount:nfs-provisioner:nfs-client-provisioner
    4. Process the template with the NFS_PATH and NFS_SERVER
    # oc process -f storage-class-nfs-template.yaml -p NFS_PATH=/data -p NFS_SERVER=10.17.2.138 | oc apply -f -
    
    deployment.apps/nfs-client-provisioner created
    serviceaccount/nfs-client-provisioner created
    clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
    clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
    role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
    rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
    storageclass.storage.k8s.io/nfs-client created
    5. Get the pods
    oc get pods
    NAME                                     READY   STATUS    RESTARTS   AGE
    nfs-client-provisioner-b8764c6bb-mjnq9   1/1     Running   0          36s
    6. Check the storage class. You should see nfs-client listed; this is the default.

    ❯ oc get sc
    NAME         PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
    nfs-client   k8s-sigs.io/nfs-subdir-external-provisioner   Delete          Immediate           false                  3m27s

    If you see more than nfs-client listed, you may have to change which class is the default.

    oc patch storageclass storageclass-name -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
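
    As an end-to-end check, you could create a small test claim against the nfs-client storage class (the claim name below is a placeholder). Save it to a file, run oc apply -f on it, confirm it reaches Bound with oc get pvc -n nfs-provisioner, and delete it when done.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-nfs-claim
      namespace: nfs-provisioner
    spec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Gi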

  • April 2024 Updates

    Here are some updates for April 2024.

    FYI: I was made aware of kubernetes-sigs/kube-scheduler-simulator and the release simulator/v0.2.0.

    That’s why we are developing a simulator for kube-scheduler — you can try out the behavior of the scheduler while checking which plugin made what decision for which Node.

    https://github.com/kubernetes-sigs/kube-scheduler-simulator/tree/simulator/v0.2.0

    The Linux on Power team added new Power-supported containers:

    cassandra   4.1.3     docker pull icr.io/ppc64le-oss/cassandra-ppc64le:4.1.3     April 2, 2024
    milvus      v2.3.3    docker pull icr.io/ppc64le-oss/milvus-ppc64le:v2.3.3       April 2, 2024
    rust        1.66.1    docker pull icr.io/ppc64le-oss/rust-ppc64le:1.66.1         April 2, 2024
    mongodb     5.0.26    docker pull icr.io/ppc64le-oss/mongodb-ppc64le:5.0.26      April 9, 2024
    mongodb     6.0.13    docker pull icr.io/ppc64le-oss/mongodb-ppc64le:6.0.13      April 9, 2024
    logstash    8.11.3    docker pull icr.io/ppc64le-oss/logstash-ppc64le:8.11.3     April 9, 2024
    
    https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr

    Added a new fix for setting an imagestream’s import schedule:

    https://gist.github.com/prb112/838d8c2ae908b496f5d5480411a7d692
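
    I have not reproduced the gist here, but the general shape of scheduling imagestream imports is something like the following (a hedged example; the source image and imagestream names are placeholders):

    oc tag --source=docker icr.io/ppc64le-oss/rust-ppc64le:1.66.1 rust:1.66.1 --scheduled=true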

    An article worth rekindling in our memories…

    Optimal LPAR placement for a Red Hat OpenShift cluster within IBM PowerVM

    Optimal logical partition (LPAR) placement can be important to improve the performance of workloads as this can favor efficient use of the memory and CPU resources on the system. However, for certain configuration and settings such as I/O devices allocation to the partition, amount of memory allocation, CPU entitlement to the partition, and so on we might not get a desired LPAR placement. In such situations, the technique described in this blog can enable you to place the LPAR in a desired optimal configuration.

    https://community.ibm.com/community/user/powerdeveloper/blogs/mel-bakhshi/2022/08/11/openshift-lpar-placement-powervm

    There is an updated list of Red Hat products supporting IBM Power.

    https://community.ibm.com/community/user/powerdeveloper/blogs/ashwini-sule/2024/04/05/red-hat-products-mar-2024

    Enhancing container security with Aqua Trivy on IBM Power

    … IBM Power development team found that Trivy is as effective as other open source scanners in detecting vulnerabilities. Not only does Trivy prove to be suitable for container security in IBM Power clients’ DevSecOps pipelines, but the scanning process is simple. IBM Power’s support for Aqua Trivy underscores its industry recognition for its efficacy as an open source scanner.

    https://community.ibm.com/community/user/powerdeveloper/blogs/jenna-murillo/2024/04/08/enhanced-container-security-with-trivy-on-power

    Podman 5.0 is released

    https://blog.podman.io/2024/03/podman-5-0-has-been-released/
  • Replay: Getting started with Multi-Arch Compute workloads with your Red Hat OpenShift cluster

    I presented on:

    The Red Hat OpenShift Container Platform runs on IBM Power systems, offering a secure and reliable foundation for modernizing applications and running containerized workloads.

    Multi-Arch Compute for OpenShift Container Platform lets you use a pair of compute architectures, such as ppc64le and amd64, within a single cluster. This exciting feature opens new possibilities for versatility and optimization for composite solutions that span multiple architectures.

    Join Paul Bastide, IBM Senior Software Engineer, as he introduces the background behind Multi-Arch Compute and then gets you started setting up, configuring, and scheduling workloads. After, Paul will take you through a brief demonstration showing common problems and solutions for running multiple architectures in the same cluster.

    Go here to see the replay: https://ibm.webcasts.com/starthere.jsp?ei=1660167&tp_key=ddb6b00dbd&_gl=11snjgp3_gaMjk3MzQzNDU1LjE3MTI4NTQ3NzA._ga_FYECCCS21D*MTcxMjg1NDc2OS4xLjAuMTcxMjg1NDc2OS4wLjAuMA..&_ga=2.141469425.2128302208.1712854770-297343455.1712854770