Category: IBM Power Systems

  • Cert Manager on Multi-Architectures

    I found a cool article on Cert Manager with IPI PowerVS

    Simplify certificate management on OpenShift across multiple architectures

Chirag Kyal, a Software Engineer at Red Hat, has authored an article about deploying IPI PowerVS and Cert Manager on IBM Cloud.

    Check out the article about efficient certificate management techniques on Red Hat OpenShift using the cert-manager Operator for OpenShift’s multi-architecture support.

    https://developers.redhat.com/learning/learn:openshift:simplify-certificate-management-openshift-across-multiple-architectures/resource/resources:automate-tls-certificate-management-using-cert-manager-operator-openshift

  • End of April Information 2024

    The new information for the end of April is:

The IBM Linux on Power team released more images to their IBM Container Registry (ICR); here are the new ones:

milvus 	v2.3.3 	docker pull icr.io/ppc64le-oss/milvus-ppc64le:v2.3.3 	April 2, 2024
rust 	1.66.1 	docker pull icr.io/ppc64le-oss/rust-ppc64le:1.66.1 	April 2, 2024
opensearch 	2.12.0 	docker pull icr.io/ppc64le-oss/opensearch-ppc64le:2.12.0 	April 16, 2024
    https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr
  • Getting Started with a Sock-Shop – a sample multi-arch compute application

    Original post was at https://community.ibm.com/community/user/powerdeveloper/blogs/paul-bastide/2024/04/26/getting-started-with-a-sock-shop-a-sample-multi-ar?CommunityKey=daf9dca2-95e4-4b2c-8722-03cd2275ab63

I’ve developed the following script to help you get started deploying multi-architecture applications and to elaborate on the techniques for controlling multi-arch compute. This script uses the sock-shop application, which is available at https://github.com/ocp-power-demos/sock-shop-demo . This series of instructions for sock-shop-demo requires kustomize and following the README.md in the repository to set up the username and password for mongodb.

You do not need to do every step that follows; feel free to install and use what you’d like. I recommend the kustomize install with multi-no-ns, and then playing with the features you find interesting. Note: multi-no-ns requires no namespace.

    The layout of the application is described in this diagram:

[Diagram: demo application layout]

    Deploying a non-multiarch Intel App

    This deployment shows the Exec errors and pod scheduling errors that are encountered when scheduling Intel only Pods on Power.

For these steps, you are going to clone the ocp-power-demos sock-shop-demo repository and then experiment to resolve errors so the application is up and running.

    I’d recommend running this from a bastion.

    1. Clone the repository
    git clone https://github.com/ocp-power-demos/sock-shop-demo
    
2. Switch to the sock-shop-demo folder
3. Download kustomize – this tool enables an ordered layout of the resources. You’ll also need oc installed.
    curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
    

    Ref: https://kubectl.docs.kubernetes.io/installation/kustomize/binaries/

    The reason kustomize is used is due to the sort order feature in the binary.
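Note that the install script drops the kustomize binary into the directory you ran it from; if you want it on your PATH, a quick follow-up looks like this (a sketch, assuming /usr/local/bin and sudo access):

    sudo mv ./kustomize /usr/local/bin/kustomize
    kustomize version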

4. Update the manifests/overlays/single/env.secret file with a username and password for mongodb. openssl rand -hex 10 is a good way to generate a random password. You’ll need to copy this env.secret into each overlays/ folder that is used in the demo (see the sketch after the next step).
5. Create the sock-shop application.
    ❯ kustomize build manifests/overlays/single | oc apply -f -
    

This creates a full application within the OpenShift project.
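Here is the sketch referenced in step 4 for generating the password and distributing env.secret (a rough sketch; it assumes every directory under manifests/overlays/ is an overlay you plan to use):

    # generate a random password for mongodb (use it inside env.secret)
    openssl rand -hex 10
    # copy the completed env.secret from the single overlay into the other overlays used in the demo
    for d in manifests/overlays/*/; do
      cp manifests/overlays/single/env.secret "$d"
    done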

To see the layout of the application, refer to the third diagram in the repository README (except these are Intel-only images): https://github.com/ocp-power-demos/sock-shop-demo/blob/main/README.md#diagrams

6. Run oc get pods -owide
    ❯ oc get pods -owide
    NAME                            READY   STATUS             RESTARTS        AGE     IP            NODE                                   NOMINATED NODE   READINESS GATES
    carts-585dc6c878-wq6jg          0/1     Error              6 (2m56s ago)   6m21s   10.129.2.24   mac-01a7-worker-0                      <none>           <none>
    carts-db-78f756b87c-r4pl9       1/1     Running            0               6m19s   10.131.0.32   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    catalogue-77d7c444bb-wnltt      0/1     CrashLoopBackOff   6 (8s ago)      6m17s   10.130.2.21   mac-01a7-worker-1                      <none>           <none>
    catalogue-db-5bc97c6b98-v9rdp   1/1     Running            0               6m16s   10.131.0.33   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    front-end-648fdf6957-bjk9m      0/1     CrashLoopBackOff   5 (2m44s ago)   6m14s   10.129.2.25   mac-01a7-worker-0                      <none>           <none>
    orders-5dbf8994df-whb9r         0/1     CrashLoopBackOff   5 (2m47s ago)   6m13s   10.130.2.22   mac-01a7-worker-1                      <none>           <none>
    orders-db-7544dc7fd9-w9zh7      1/1     Running            0               6m11s   10.128.3.83   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
    payment-6cdff467b9-n2dql        0/1     Error              6 (2m53s ago)   6m10s   10.130.2.23   mac-01a7-worker-1                      <none>           <none>
    queue-master-c9dcf8f87-c8drl    0/1     CrashLoopBackOff   5 (2m41s ago)   6m8s    10.129.2.26   mac-01a7-worker-0                      <none>           <none>
    rabbitmq-54689956b9-rt5fb       2/2     Running            0               6m7s    10.131.0.34   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    session-db-7d4cc56465-dcx9f     1/1     Running            0               6m5s    10.130.2.24   mac-01a7-worker-1                      <none>           <none>
    shipping-5ff5f44465-tbjv7       0/1     Error              6 (2m51s ago)   6m4s    10.130.2.25   mac-01a7-worker-1                      <none>           <none>
    user-64dd65b5b7-49cbd           0/1     CrashLoopBackOff   5 (2m25s ago)   6m3s    10.129.2.27   mac-01a7-worker-0                      <none>           <none>
    user-db-7f864c9f5f-jchf6        1/1     Running            0               6m1s    10.131.0.35   rdr-mac-cust-el-tmwmg-worker-1-6g97b   <none>           <none>
    

    You might be lucky enough for the scheduler to assign these to Intel only nodes.

At this point, if they are all Running with no restarts, the application is up.

7. Grab the external URL
    ❯ oc get routes                                            
    NAME        HOST/PORT                                                      PATH   SERVICES    PORT   TERMINATION     WILDCARD
    sock-shop   sock-shop-test-user-4.apps.rdr-mac-cust-d.rdr-xyz.net          front-end   8079   edge/Redirect   None
    
8. Open a browser and navigate around. Try registering a user.

    It failed for me.
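If you’d rather smoke-test the route from the bastion instead of a browser, something like this works (a sketch; adjust the route name if yours differs):

    HOST=$(oc get route sock-shop -o jsonpath='{.spec.host}')
    curl -k -s -o /dev/null -w "%{http_code}\n" "https://${HOST}/"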

    Cordon Power nodes

    The purpose is to cordon the Power Nodes and delete the existing pod so you get the Pod running on the architecture you want. This is only recommended on a dev/test system and on the worker nodes.

    1. Find the Power workers
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
2. For each of the Power workers, cordon the node (a loop sketch follows at the end of this procedure)

    oc adm cordon node/<worker>

3. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS             RESTARTS       AGE
    front-end-648fdf6957-bjk9m   0/1     CrashLoopBackOff   13 (26s ago)   42m
    
4. Delete the front-end pod.
    oc delete pod/front-end-648fdf6957-bjk9m
    

    The app should be running correctly at this point.
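Here is the loop sketch referenced in step 2 of the cordon procedure; it assumes the Power workers carry the standard worker role label:

    # cordon every Power worker
    for n in $(oc get nodes -l 'kubernetes.io/arch=ppc64le,node-role.kubernetes.io/worker=' -o name); do
      oc adm cordon "$n"
    done

Swap cordon for uncordon in the same loop when you reach the uncordon section below.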

    Use a Node Selector for the Application

This demonstrates how to use a node selector to put the workload on the right nodes.

    These microservices use Deployments. We can modify the deployment to use NodeSelectors.

    1. Edit the manifests/overlays/single/09-front-end-dep.yaml or oc edit deployment/front-end
    2. Find the nodeSelector field and add an architecture limitation using a Node label:
    nodeSelector:
      node.openshift.io/os_id: rhcos
      kubernetes.io/arch: amd64
    
3. If you edited the file, run oc apply -f manifests/overlays/single/09-front-end-dep.yaml. Alternatively, you can patch the running deployment; see the sketch at the end of this section.
4. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS              RESTARTS         AGE
    front-end-648fdf6957-bjk9m   0/1     CrashLoopBackOff    14 (2m49s ago)   50m
    front-end-7bd476764-t974g    0/1     ContainerCreating   0                40s
    
5. It may not be 'Ready', and you may need to delete the front-end pod on the Power node.
    oc delete pod/front-end-648fdf6957-bjk9m
    

Note: you can run the following to redeploy using the single-node-selector overlay.

    ❯ kustomize build manifests/overlays/single-node-selector | oc delete -f - 
    ❯ kustomize build manifests/overlays/single-node-selector | oc apply -f - 
    

    Are the pods running on the Intel node?
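If you prefer patching the live deployment over editing the overlay file, a rough equivalent of the nodeSelector edit is (a sketch):

    oc patch deployment/front-end --type merge -p \
      '{"spec":{"template":{"spec":{"nodeSelector":{"node.openshift.io/os_id":"rhcos","kubernetes.io/arch":"amd64"}}}}}'

Keep in mind that a later kustomize build | oc apply will overwrite this patch with whatever is in the overlay.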

    Uncordon the Power nodes

With the nodeSelector now in place, you can uncordon the Power nodes. This is only recommended on a dev/test system and on the worker nodes.

    1. Find the Power workers
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
2. For each of the Power workers, uncordon the node

    oc adm uncordon node/<worker>

3. List the front-end app pods
    ❯ oc get pods -l name=front-end
    NAME                         READY   STATUS             RESTARTS       AGE
    front-end-6944957cd6-qmhhg      1/1     Running            0           19s
    

    The application should be running. If not, please use:

    ❯ kustomize build manifests/overlays/single-node-selector | oc delete -f - 
    ❯ kustomize build manifests/overlays/single-node-selector | oc apply -f - 
    

The workload should now be entirely on the Intel side.

    Deploying a multiarch Intel/Power App

    With many of these applications, there are architecture specific alternatives. You can run without NodeSelectors to get the workload scheduled where there is support.

To switch to using node selectors across Power/Intel:

    1. Switch to oc project sock-shop
    2. Delete the Pods and Recreate (this is a manifest-listed set of images)
    ❯ kustomize build manifests/overlays/multi-no-ns | oc apply -f -
    
3. List the app pods
    ❯ oc get pods -owide
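To confirm that an image used by the multi-no-ns overlay is really manifest-listed, you can inspect its manifest; a sketch assuming skopeo and jq are available on the bastion:

    skopeo inspect --raw docker://<image> | jq -r '.manifests[].platform | "\(.os)/\(.architecture)"'

A manifest-listed image reports multiple os/architecture pairs (for example, linux/amd64 and linux/ppc64le).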
    

Update the app deployment to use a manifest-listed image and remove the node selector

We’re going to move one of the application’s dependencies, the rabbitmq exporter, to a Power-built image. The IBM team has created a port of the rabbitmq-exporter to ppc64le (see the ICR catalog linked below).

    1. Edit git/sock-shop-demo/manifests/overlays/multi-no-ns/19-rabbitmq-dep.yaml
2. Replace image: kbudde/rabbitmq-exporter on line 32 with icr.io/ppc64le-oss/rabbitmq-exporter-ppc64le:1.0.0-RC19
    3. Remove the nodeSelector label kubernetes.io/arch: amd64 limitation on line 39
    4. Build and replace kustomize build manifests/overlays/multi-no-ns | oc apply -f -
    5. Check the Pod is starting/running on Power
    ❯ oc get pod -l name=rabbitmq -owide
    NAME                        READY   STATUS    RESTARTS   AGE   IP            NODE                NOMINATED NODE   READINESS GATES
    rabbitmq-65c75db8db-9jqbd   2/2     Running   0          96s   10.130.2.31   mac-01a7-worker-1   <none>           <none>
    

    The pod should now start on the Power node.
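A quick way to double-check which architecture the pod actually landed on (a sketch):

    NODE=$(oc get pod -l name=rabbitmq -o jsonpath='{.items[0].spec.nodeName}')
    oc get node "$NODE" -L kubernetes.io/arch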

You’ve taken advantage of a Power-built container, and you can take advantage of other open source container images from IBM’s registry: https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr

    Using a Taint/Toleration

Taints and tolerations provide a way to keep workloads off nodes that should not run them: a pod is only scheduled onto a tainted node if it tolerates that taint.

1. Find the secondary workers. For instance, if the primary architecture is Intel, you’ll get the Power workers and taint them:
    oc get nodes -l kubernetes.io/arch=ppc64le | grep worker
    
2. Taint the Power workers
    oc adm taint nodes node1 kubernetes.io/arch=ppc64le:NoSchedule
    

Also note, the taints are flipped (Intel is tainted with the Power taint).
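For the “Update the toleration” step below, a toleration matching the taint above would look roughly like this in the deployment’s pod spec (a sketch):

    tolerations:
      - key: kubernetes.io/arch
        operator: Equal
        value: ppc64le
        effect: NoSchedule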

3. Edit manifests/overlays/multi-taint-front-end/09-front-end-dep.yaml
4. Update the toleration to match your architecture taint.
5. Save the file
6. Run the oc apply command oc apply -f manifests/overlays/multi-taint-front-end/09-front-end-dep.yaml
7. Check which nodes the front-end pods landed on
    ❯ oc get pods -o wide -l name=front-end                                      
    NAME                         READY   STATUS    RESTARTS   AGE    IP            NODE                                   NOMINATED NODE   READINESS GATES
    front-end-69c64bf86f-98nkc   0/1     Running   0          9s     10.128.3.99   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
    front-end-7f4f4844c8-x79zn   1/1     Running   0          103s   10.130.2.33   mac-01a7-worker-1                      <none>           <none>
    

You might have to give it a few minutes before the workload shifts.

8. Check again to see that the workload has moved to an Intel node:
    ❯ oc get pods -o wide -l name=front-end
    NAME                         READY   STATUS    RESTARTS   AGE   IP            NODE                                   NOMINATED NODE   READINESS GATES
    front-end-69c64bf86f-98nkc   1/1     Running   0          35s   10.128.3.99   rdr-mac-cust-el-tmwmg-worker-2-5hbxg   <none>           <none>
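When you’re done experimenting, the taint can be removed by re-running the taint command with a trailing dash:

    oc adm taint nodes node1 kubernetes.io/arch=ppc64le:NoSchedule-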
    

    Ref: OpenShift 4.14: Understanding taints and tolerations

    Summary

    These are different techniques to help schedule/control workload placement and help you explore Multi-Arch Compute.

    My colleague, Punith, and I have also posted two documents on further controlling workload placement:

    1. Multi-Arch Compute: Node Selector
2. Controlling Pod placement based on weighted node-affinity with your Multi-Arch Compute cluster
  • Setting up nfs-provisioner on OpenShift on Power Systems with a template

Here are my notes for setting up the SIG’s nfs-provisioner. You should follow the directions from kubernetes-sigs/nfs-subdir-external-provisioner to set up the nfs-provisioner.

    1. If you haven’t already, you need to create the nfs-provisioner namespace.

    a. Create the namespace

    oc new-project nfs-provisioner

b. Label the namespace for elevated privileges so we can create NFS mounts

    # oc label namespace/nfs-provisioner security.openshift.io/scc.podSecurityLabelSync=false --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/enforce=privileged --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/enforce-version=v1.24 --overwrite=true
    namespace/nfs-provisioner labeled
# oc label namespace/nfs-provisioner pod-security.kubernetes.io/audit=privileged --overwrite=true
    namespace/nfs-provisioner labeled
    # oc label namespace/nfs-provisioner pod-security.kubernetes.io/warn=privileged --overwrite=true
    namespace/nfs-provisioner labeled
2. Download the storage-class-nfs-template
    # curl -O -L https://github.com/IBM/ocp4-power-workload-tools/manifests/storage/storage-class-nfs-template.yaml
3. Set up authorization
    oc adm policy add-scc-to-user hostmount-anyuid system:serviceaccount:nfs-provisioner:nfs-client-provisioner
4. Process the template with the NFS_PATH and NFS_SERVER
# oc process -f storage-class-nfs-template.yaml -p NFS_PATH=/data -p NFS_SERVER=10.17.2.138 | oc apply -f -
    
    deployment.apps/nfs-client-provisioner created
    serviceaccount/nfs-client-provisioner created
    clusterrole.rbac.authorization.k8s.io/nfs-client-provisioner-runner created
    clusterrolebinding.rbac.authorization.k8s.io/run-nfs-client-provisioner created
    role.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
    rolebinding.rbac.authorization.k8s.io/leader-locking-nfs-client-provisioner created
    storageclass.storage.k8s.io/nfs-client created
5. Get the pods
    oc get pods
    NAME                                     READY   STATUS    RESTARTS   AGE
    nfs-client-provisioner-b8764c6bb-mjnq9   1/1     Running   0          36s
6. Check the storage class. You should see nfs-client listed; this is the default.

    ❯ oc get sc

    NAME         PROVISIONER                                   RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE

    nfs-client   k8s-sigs.io/nfs-subdir-external-provisioner   Delete          Immediate           false                  3m27s

If you see more storage classes than nfs-client listed, you may have to change the default:

oc patch storageclass storageclass-name -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "false"}}}'
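To verify the provisioner end to end, you can create a small test claim against the nfs-client storage class and confirm it binds (a sketch; the file and claim names are arbitrary):

    # test-nfs-claim.yaml
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: test-nfs-claim
    spec:
      storageClassName: nfs-client
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 1Mi

    oc apply -f test-nfs-claim.yaml
    oc get pvc test-nfs-claim
    oc delete -f test-nfs-claim.yaml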

  • April 2024 Updates

    Here are some updates for April 2024.

    FYI: I was made aware of kubernetes-sigs/kube-scheduler-simulator and the release simulator/v0.2.0.

    That’s why we are developing a simulator for kube-scheduler — you can try out the behavior of the scheduler while checking which plugin made what decision for which Node.

    https://github.com/kubernetes-sigs/kube-scheduler-simulator/tree/simulator/v0.2.0

The Linux on Power team added these new Power-supported containers:

    cassandra 	4.1.3 	docker pull icr.io/ppc64le-oss/cassandra-ppc64le:4.1.3 	April 2, 2024
    milvus 	v2.3.3 	docker pull icr.io/ppc64le-oss/milvus-ppc64le:v2.3.3 	April 2, 2024
    rust 	1.66.1 	docker pull icr.io/ppc64le-oss/rust-ppc64le:1.66.1 	April 2, 2024
mongodb 	5.0.26 	docker pull icr.io/ppc64le-oss/mongodb-ppc64le:5.0.26 	April 9, 2024
mongodb 	6.0.13 	docker pull icr.io/ppc64le-oss/mongodb-ppc64le:6.0.13 	April 9, 2024
logstash 	8.11.3 	docker pull icr.io/ppc64le-oss/logstash-ppc64le:8.11.3 	April 9, 2024
    
    https://community.ibm.com/community/user/powerdeveloper/blogs/priya-seth/2023/04/05/open-source-containers-for-power-in-icr

    Added a new fix for imagestream set schedule

    https://gist.github.com/prb112/838d8c2ae908b496f5d5480411a7d692

    An article worth rekindling in our memories…

    Optimal LPAR placement for a Red Hat OpenShift cluster within IBM PowerVM

    Optimal logical partition (LPAR) placement can be important to improve the performance of workloads as this can favor efficient use of the memory and CPU resources on the system. However, for certain configuration and settings such as I/O devices allocation to the partition, amount of memory allocation, CPU entitlement to the partition, and so on we might not get a desired LPAR placement. In such situations, the technique described in this blog can enable you to place the LPAR in a desired optimal configuration.

    https://community.ibm.com/community/user/powerdeveloper/blogs/mel-bakhshi/2022/08/11/openshift-lpar-placement-powervm

There is an updated list of Red Hat products supporting IBM Power.

    https://community.ibm.com/community/user/powerdeveloper/blogs/ashwini-sule/2024/04/05/red-hat-products-mar-2024

    Enhancing container security with Aqua Trivy on IBM Power

    … IBM Power development team found that Trivy is as effective as other open source scanners in detecting vulnerabilities. Not only does Trivy prove to be suitable for container security in IBM Power clients’ DevSecOps pipelines, but the scanning process is simple. IBM Power’s support for Aqua Trivy underscores its industry recognition for its efficacy as an open source scanner.

    https://community.ibm.com/community/user/powerdeveloper/blogs/jenna-murillo/2024/04/08/enhanced-container-security-with-trivy-on-power

    Podman 5.0 is released

    https://blog.podman.io/2024/03/podman-5-0-has-been-released/
  • Replay: Getting started with Multi-Arch Compute workloads with your Red Hat OpenShift cluster

    I presented on:

    The Red Hat OpenShift Container Platform runs on IBM Power systems, offering a secure and reliable foundation for modernizing applications and running containerized workloads.

    Multi-Arch Compute for OpenShift Container Platform lets you use a pair of compute architectures, such as ppc64le and amd64, within a single cluster. This exciting feature opens new possibilities for versatility and optimization for composite solutions that span multiple architectures.

    Join Paul Bastide, IBM Senior Software Engineer, as he introduces the background behind Multi-Arch Compute and then gets you started setting up, configuring, and scheduling workloads. After, Paul will take you through a brief demonstration showing common problems and solutions for running multiple architectures in the same cluster.

    Go here to see the download https://ibm.webcasts.com/starthere.jsp?ei=1660167&tp_key=ddb6b00dbd&_gl=11snjgp3_gaMjk3MzQzNDU1LjE3MTI4NTQ3NzA._ga_FYECCCS21D*MTcxMjg1NDc2OS4xLjAuMTcxMjg1NDc2OS4wLjAuMA..&_ga=2.141469425.2128302208.1712854770-297343455.1712854770

  • Red Hat OpenShift Multi-Architecture Compute – Demo MicroServices on IBM Power Systems

This shows a microservices application running with a Red Hat OpenShift control plane on IBM Power Systems and an Intel worker.

  • Updates for End of March 2024

Here are some great updates for the end of March 2024.

    Sizing and configuring an LPAR for AI workloads

    Sebastian Lehrig has a great introduction into CPU/AI/NUMA on Power10.

    https://community.ibm.com/community/user/powerdeveloper/blogs/sebastian-lehrig/2024/03/26/sizing-for-ai

FYI: a new article has been published – Improving the User Experience for Multi-Architecture Compute on IBM Power

    More and more IBM® Power® clients are modernizing securely with lower risk and faster time to value with cloud-native microservices on Red Hat® OpenShift® running alongside their existing banking and industry applications on AIX, IBM i, and Linux. With the availability of Red Hat OpenShift 4.15 on March 19th, Red Hat and IBM introduced a long-awaited innovation called Multi-Architecture Compute that enables clients to mix Power and x86 worker nodes in a single Red Hat OpenShift cluster. With the release of Red Hat OpenShift 4.15, clients can now run the control plane for a Multi-Architecture Compute cluster natively on Power.

    Some tips for setting up a Multi-Arch Compute Cluster

If you’re setting up a multi-arch compute cluster manually (not using automation), you’ll want to follow this process:

1. Set up the initial cluster with the multi payload on Intel or Power for the control plane.
    2. Open the network ports between the two environments

    ICMP/TCP/UDP flowing in both directions

3. Configure the Cluster

    a. Change any MTU between the networks

    oc patch Network.operator.openshift.io cluster --type=merge --patch \
        '{"spec": { "migration": { "mtu": { "network": { "from": 1400, "to": 1350 } , "machine": { "to" : 9100} } } } }'
    

    b. Limit CSI drivers to a single Arch

    oc annotate --kubeconfig /root/.kube/config ns openshift-cluster-csi-drivers \
      scheduler.alpha.kubernetes.io/node-selector=kubernetes.io/arch=amd64
    

c. Disable offloading (I do this in the ignition; the decoded dispatcher script appears after the ignition example below)

    d. Move the imagepruner jobs to the architecture that makes the most sense

    oc patch imagepruner/cluster -p '{ "spec" : {"nodeSelector": {"kubernetes.io/arch" : "amd64"}}}' --type merge
    

e. Move the ingress operator pods to the arch that makes the most sense. If you want the ingress pods to be on Intel, then patch the cluster.

    oc edit IngressController default -n openshift-ingress-operator
    

Change ingresscontroller.spec.nodePlacement.nodeSelector to use kubernetes.io/arch: amd64 to move the workload to Intel only.

f. Use routing via host

    oc patch network.operator/cluster --type merge -p \
      '{"spec":{"defaultNetwork":{"ovnKubernetesConfig":{"gatewayConfig":{"routingViaHost":true}}}}}'
    

    Wait until the MCP is finished updating and has the latest MTU

g. Download the ignition file and host it on the local network via HTTP.

4. Create a new VSI worker and point to the ignition in the userdata:
    {
        "ignition": {
            "version": "3.4.0",
            "config": {
                "merge": [
                    {
                        "source": "http://${ignition_ip}:8080/ignition/worker.ign"
                    }
                ]
            }
        },
        "storage": {
            "files": [
                {
                    "group": {},
                    "path": "/etc/hostname",
                    "user": {},
                    "contents": {
                        "source": "data:text/plain;base64,${name}",
                        "verification": {}
                    },
                    "mode": 420
                },
                {
                    "group": {},
                    "path": "/etc/NetworkManager/dispatcher.d/20-ethtool",
                    "user": {},
                    "contents": {
                        "source": "data:text/plain;base64,aWYgWyAiJDEiID0gImVudjIiIF0gJiYgWyAiJDIiID0gInVwIiBdCnRoZW4KICBlY2hvICJUdXJuaW5nIG9mZiB0eC1jaGVja3N1bW1pbmciCiAgL3NiaW4vZXRodG9vbCAtLW9mZmxvYWQgZW52MiB0eC1jaGVja3N1bW1pbmcgb2ZmCmVsc2UgCiAgZWNobyAibm90IHJ1bm5pbmcgdHgtY2hlY2tzdW1taW5nIG9mZiIKZmkKaWYgc3lzdGVtY3RsIGlzLWZhaWxlZCBOZXR3b3JrTWFuYWdlci13YWl0LW9ubGluZQp0aGVuCnN5c3RlbWN0bCByZXN0YXJ0IE5ldHdvcmtNYW5hZ2VyLXdhaXQtb25saW5lCmZpCg==",
                        "verification": {}
                    },
                    "mode": 420
                }
            ]
        }
    }
    

    ${name} is base64 encoded.
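For reference, the base64 payload for /etc/NetworkManager/dispatcher.d/20-ethtool decodes to the script below; this is where the offloading change from step 3c lives (env2 is the interface name in this environment):

    if [ "$1" = "env2" ] && [ "$2" = "up" ]
    then
      echo "Turning off tx-checksumming"
      /sbin/ethtool --offload env2 tx-checksumming off
    else
      echo "not running tx-checksumming off"
    fi
    if systemctl is-failed NetworkManager-wait-online
    then
    systemctl restart NetworkManager-wait-online
    fi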

5. Post-configuration tasks

a. Configure shared storage using the nfs provisioner and limit it to running on the architecture that is hosting the NFS shared volumes.

b. Approve the CSRs for the workers (a sketch follows at the end of this list). Do this carefully, as it’s possible to lose count since the list may include Machine updates/CSRs.

6. Check the cluster operators and nodes; everything should be up and working.
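Here is the sketch referenced in the CSR approval step; review the pending list before approving anything in bulk:

    # list pending CSRs
    oc get csr | grep -i pending
    # approve all pending CSRs once you are satisfied they belong to your new workers
    oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' \
      | xargs --no-run-if-empty oc adm certificate approve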
  • Multi-Architecture Compute: Managing User Provisioned Infrastructure Load Balancers with Post-Installation workers

    From https://community.ibm.com/community/user/powerdeveloper/blogs/paul-bastide/2024/03/21/multi-architecture-compute-managing-user-provision?CommunityKey=daf9dca2-95e4-4b2c-8722-03cd2275ab63

Multi-Arch Compute for Red Hat OpenShift Container Platform on IBM Power systems lets one use a pair of compute architectures, such as ppc64le and amd64, within a single cluster. This feature opens new possibilities for versatility and optimization for composite solutions that span multiple architectures. The cluster owner is able to add an additional worker post installation.

    With User Provisioned Infrastructure (UPI), the cluster owner may have used automation or manual setup of front-end load balancers. The IBM team provides PowerVS ocp4-upi-powervs, PowerVM ocp4-upi-powervm and HMC ocp4-upi-powervm-hmc automation.

When installing a cluster, the cluster is set up with an external load balancer, such as haproxy. The external load balancer routes traffic to pools for the Ingress pods, the API server, and the MachineConfig server. The haproxy configuration is stored at /etc/haproxy/haproxy.cfg.

    For instance, the configuration for ingress-https load balancer would look like the following:

    frontend ingress-https
            bind *:443
            default_backend ingress-https
            mode tcp
            option tcplog
    
    backend ingress-https
            balance source
            mode tcp
            server master0 10.17.15.11:443 check
            server master1 10.17.19.70:443 check
            server master2 10.17.22.204:443 check
            server worker0 10.17.26.89:443 check
            server worker1 10.17.30.71:443 check
            server worker2 10.17.30.225:443 check
    

When adding a post-installation worker to a UPI cluster, one must update the ingress-http and ingress-https backends.

    1. Get the IP and hostname
    # oc get nodes -lkubernetes.io/arch=amd64 --no-headers=true -ojson | jq  -c '.items[].status.addresses'
    [{"address":"10.17.15.11","type":"InternalIP"},{"address":"worker-amd64-0","type":"Hostname"}]
    [{"address":"10.17.19.70","type":"InternalIP"},{"address":"worker-amd64-1","type":"Hostname"}]
    
2. Edit the /etc/haproxy/haproxy.cfg

a. Find backend ingress-http, then before the first server entry, add the worker hostnames and IPs.

            server worker-amd64-0 10.17.15.11:80 check
            server worker-amd64-1 10.17.19.70:80 check
    

b. Find backend ingress-https, then before the first server entry, add the worker hostnames and IPs.

            server worker-amd64-0 10.17.15.11:443 check
            server worker-amd64-1 10.17.19.70:443 check
    

    c. Save the config file.

3. Restart haproxy
    # systemctl restart haproxy
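Before or right after the restart, it’s worth letting haproxy validate the edited configuration; the binary has a built-in syntax check:

    haproxy -c -f /etc/haproxy/haproxy.cfg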
    

You now have the additional workers incorporated into haproxy, and as the ingress pods move from Power to Intel and back, you have a fully functional environment.

    Best wishes.

    Paul

P.S. You can learn more about scaling up the ingress controller at Scaling an Ingress Controller

    $ oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 3}}' --type=merge
    

P.P.S. If you are running very advanced scenarios, you can change the ingresscontroller spec.nodePlacement.nodeSelector to put the workload on specific architectures. See Configuring an Ingress Controller

nodePlacement:
  nodeSelector:
    matchLabels:
      kubernetes.io/arch: ppc64le