Category: Application Development

  • Notes on qcow2 on centos

    I recently had to run a CentOS 9 qcow2 image on a CentOS 7 machine. I ran into a few problems, but these steps helped me work through and resolve them. I’ve recorded them here for posterity.

    Steps

    1. Navigate to https://cloud.centos.org/centos/9-stream/x86_64/images/
    2. Click Last Modified twice to sort the images from most recent to oldest
    3. Find the latest qcow2 image – CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2
    4. Right Click and Copy Link
    https://cloud.centos.org/centos/9-stream/x86_64/images/
    1. Connect to your host
    ❯ curl -O -L https://cloud.centos.org/centos/9-stream/x86_64/images/CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100  930M  100  930M    0     0  63.2M      0  0:00:14  0:00:14 --:--:--  104M
    1. Install the dependencies
    ❯ dnf install libguestfs-tools qemu-kvm.x86_64 libvirt virt-install libguestfs-xfs.x86_64
    CentOS-7 - Base         0.0  B/s |   0  B     00:00    
    CentOS-7 - Updates      0.0  B/s |   0  B     00:00    
    CentOS-7 - Extras       0.0  B/s |   0  B     00:00    
    Package libguestfs-tools-1:1.40.2-10.el7.noarch is already installed.
    Package qemu-kvm-10:1.5.3-175.el7_9.6.x86_64 is already installed.
    Package libvirt-4.5.0-36.el7_9.5.x86_64 is already installed.
    Package virt-install-1.5.0-7.el7.noarch is already installed.
    Dependencies resolved.
    Nothing to do.
    Complete!
    1. Move the qcow over to images
    ❯ mv CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2 /var/lib/libvirt/images/
    1. Generate a password
    ❯ openssl rand -hex 10
    037c94bb31a9b9870178-example
    1. Set the password based on the previous step’s output
    ❯ LIBGUESTFS_BACKEND=direct virt-customize --format qcow2 -a /var/lib/libvirt/images/CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2 --root-password password:037c94bb31a9b9870178-example

    Note: if it fails, add -v -x to see verbose logging. Also make sure your host OS can process the guest filesystem and run the qcow2 image, e.g. RHEL 8 or CentOS 8 or higher.

    1. Start up the VM
    ❯ sudo virt-install \
        --name ocp-bastion-server \
        --ram 4096 \
        --vcpus 2 \
        --disk path=/var/lib/libvirt/images/CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2 \
        --os-type linux \
        --os-variant rhel9.0 \
        --network bridge=virbr0 \
        --graphics none \
        --serial pty \
        --console pty \
        --boot hd \
        --import
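
    When virt-customize fails because the host cannot read the image's filesystem, the guest kernel log carries the XFS superblock message shown under Debugging the FileSystem in this post. A minimal sketch that scans captured log text for that message; the helper name has_xfs_ro_compat_error is hypothetical and not part of libguestfs, and the sample line is copied from the dmesg output in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the given log text contains the
# XFS "unknown read-only compatible features" superblock message.
has_xfs_ro_compat_error() {
  # $1: text captured from dmesg (for example, inside guestfish)
  printf '%s\n' "$1" | grep -q 'Superblock has unknown read-only compatible features'
}

# Sample line taken from the dmesg output in this post.
sample='[   76.493455] XFS (sda1): Superblock has unknown read-only compatible features (0x4) enabled.'

if has_xfs_ro_compat_error "$sample"; then
  echo "the kernel mounting this image does not support one of its XFS features"
fi
```

    If the check matches, moving to a newer host OS is likely the fix, rather than retrying with different mount options.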

    References

    1. https://kubevirt.io/2020/Customizing-images-for-containerized-vms.html#building-standard-centos-8-image
    2. https://forums.centos.org/viewtopic.php?t=78770
    3. https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/latest/
    4. https://www.reddit.com/r/CentOS/comments/k5sz8h/centos_8_image_editing_withing_centos7_host/

    To Grab RHCOS 4.12.

    1. Download from the mirror
    ❯ curl -O -L https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/4.12/4.12.2/rhcos-qemu.x86_64.qcow2.gz
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
    100 1149M  100 1149M    0     0  25.5M      0  0:00:45  0:00:45 --:--:-- 32.1M
    1. Unzip
    ❯ gunzip rhcos-qemu.x86_64.qcow2.gz

    Debugging the FileSystem

    If your host OS is too old for the image, mounting the filesystem can fail with a superblock error.

    guestfish -a /var/lib/libvirt/images/CentOS-Stream-GenericCloud-9-20230207.0.x86_64.qcow2 
    run 
    list-filesystems
    mount /dev/sda1 /
    dmesg | tail
    ><fs> run
    ><fs> list-filesystems
    /dev/sda1: xfs
    ><fs> mount /dev/sda1 /
    libguestfs: error: mount: mount exited with status 32: mount: wrong fs type, bad option, bad superblock on /dev/sda1, missing codepage or helper program, or other error In some cases useful info is found in syslog - try
           dmesg | tail or so.
    ><fs> dmesg | tail
    [   19.169691]  sda: sda1
    [   19.191795]  sda: sda1
    [   19.211130]  sda: sda1
    [   19.232340]  sda: sda1
    [   76.488398] SGI XFS with ACLs, security attributes, no debug enabled
    [   76.493455] XFS (sda1): Superblock has unknown read-only compatible features (0x4) enabled.
    [   76.504604] XFS (sda1): Attempted to mount read-only compatible filesystem read-write.
    [   76.505325] XFS (sda1): Filesystem can only be safely mounted read only.
    [   76.505362] XFS (sda1): SB validate failed with error -22.
    ><fs> 
    ><fs> quit
  • Cool Things I learned last week

    For those following along with my work, I’ve compiled a list of interesting items I’ve run across in the last week:

    Install minikube on an IBM PowerVM running RHEL 8.6 or 8.7

    Want to learn how to install minikube on an IBM Power system running RHEL? Check out this new blog on the IBM Power Developer eXchange, which provides step-by-step instructions on how to identify the software dependencies needed to download, build, and install minikube on Power

    https://community.ibm.com/community/user/powerdeveloper/blogs/vijay-puliyala/2023/01/23/install-minikube-on-ibm-powervm

    Learn the Compliance Operator

    There is a nice self-paced lab to learn the compliance-operator

    https://github.com/JAORMX/lab-compliance-operator
  • Downloading oc-compliance on ppc64le

    My team is working with the OpenShift Container Platform's optional Compliance Operator. The Compliance Operator has a supporting tool, oc-compliance.

    One tricky element was downloading the oc-compliance plugin, so I’ve documented the steps here to help.

    Steps

    1. Navigate to https://console.redhat.com/openshift/downloads#tool-pull-secret

    If prompted, log in with your Red Hat Network ID.

    1. Under Tokens, select Pull secret, then click Download

    2. Copy the pull-secret to your working directory

    3. Make the .local/bin directory to drop the plugin.

    $ mkdir -p ~/.local/bin
    
    1. Run the oc-compliance-rhel8 container image.
    $ podman run --authfile pull-secret --rm -v ~/.local/bin:/mnt/out:Z --arch ppc64le registry.redhat.io/compliance/oc-compliance-rhel8:stable /bin/cp /usr/bin/oc-compliance /mnt/out/
    Trying to pull registry.redhat.io/compliance/oc-compliance-rhel8:stable...
    Getting image source signatures
    Checking if image destination supports signatures
    Copying blob 847f634e7f1e done  
    Copying blob 7643f185b5d8 done  
    Copying blob d6050ae37df3 done  
    Copying config 2f0afdf522 done  
    Writing manifest to image destination
    Storing signatures
    
    1. Check the file is ppc64le
    $ file ~/.local/bin/oc-compliance 
    /root/.local/bin/oc-compliance: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), dynamically linked, interpreter /lib64/ld64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=d5bff511ee48b6cbc6afce6420e780da2f0eacdc, not stripped
    

    If it doesn’t work, you can always verify the architecture of the machine podman is running on:

    $ arch
    ppc64le
    

    It should say ppc64le.
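
    The check on the file output above can also be scripted. A minimal sketch, assuming you pass in one line of file output; the helper name is_ppc64le_elf is hypothetical. It relies on ppc64le binaries being reported as little-endian (LSB) 64-bit PowerPC.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a line of `file` output describes a
# little-endian (LSB) 64-bit PowerPC ELF binary, i.e. a ppc64le build.
is_ppc64le_elf() {
  case "$1" in
    *"ELF 64-bit LSB"*"64-bit PowerPC"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Shortened sample based on the output shown above.
sample='/root/.local/bin/oc-compliance: ELF 64-bit LSB executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), dynamically linked'

if is_ppc64le_elf "$sample"; then
  echo "ppc64le binary"
else
  echo "not a ppc64le binary"
fi
```

    On a real host you would pass in "$(file ~/.local/bin/oc-compliance)" instead of the sample string.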

    You’ve seen how to download the ppc64le build.

  • Tweak for GoLang PowerPC Build

    As many know, Go is designed to build architecture- and operating-system-specific binaries. Each architecture and operating system combination is called a target. For example, GOARCH=ppc64le GOOS=linux go build builds for Linux on ppc64le. There is a nice little tweak that takes the architecture's version into account and optimizes the selection of the ASM (assembler code) used when building the code.

    To target a specific Power Architecture level on ppc64le, you can use GOPPC64:

    1. power10 – runs on Power 10 only.
    2. power9 – runs on Power 9 and Power 10.
    3. power8 (the default) – runs on Power 8, 9, and 10.

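    For example, the command is GOARCH=ppc64le GOOS=linux GOPPC64=power9 go build

    Targeting a newer level lets the compiler use newer instructions, which may improve performance, but the binary no longer runs on older processors.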
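
    One way to keep the GOPPC64 choice explicit in a build script is to derive it from the oldest Power generation you need to support. A minimal sketch; the helper name goppc64_for is hypothetical, and the script prints the build command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map the oldest Power generation you must support
# (8, 9, or 10) to the matching GOPPC64 value.
goppc64_for() {
  case "$1" in
    8)  echo "power8" ;;
    9)  echo "power9" ;;
    10) echo "power10" ;;
    *)  echo "unsupported Power generation: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the build command for a binary that must run on Power 9 and later.
level="$(goppc64_for 9)"
echo "GOARCH=ppc64le GOOS=linux GOPPC64=${level} go build"
```

    Remove the echo to run the build for real on a machine with Go installed.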

  • Linking Quay to OpenShift and you hit `x509: certificate signed by unknown authority`

    If you see the following error when you link OpenShift to a self-signed Quay registry, I’ve got the steps for you.

    Events:
      Type     Reason          Age                From               Message
      ----     ------          ----               ----               -------
      Normal   Scheduled       38s                default-scheduler  Successfully assigned openshift-marketplace/my-operator-catalog-29vl8 to worker.output.xyz
      Normal   AddedInterface  36s                multus             Add eth0 [10.131.1.5/23] from openshift-sdn
      Normal   Pulling         23s (x2 over 36s)  kubelet            Pulling image "quay-demo.host.xyz:8443/repository/ocp/openshift4_12_ppc64le"
      Warning  Failed          22s (x2 over 35s)  kubelet            Failed to pull image "quay-demo.host.xyz:8443/repository/ocp/openshift4_12_ppc64le": rpc error: code = Unknown desc = pinging container registry quay-demo.host.xyz:8443: Get "https://quay-demo.host.xyz:8443/v2/": x509: certificate signed by unknown authority
      Warning  Failed          22s (x2 over 35s)  kubelet            Error: ErrImagePull
      Normal   BackOff         8s (x2 over 35s)   kubelet            Back-off pulling image "quay-demo.host.xyz:8443/repository/ocp/openshift4_12_ppc64le"
      Warning  Failed          8s (x2 over 35s)   kubelet            Error: ImagePullBackOff
    

    Steps

    1. Set the hostname and port of your registry
    export REGISTRY_HOSTNAME=quay-demo.host.xyz
    export REGISTRY_PORT=8443
    
    1. Extract all the ca certs
    echo "" | openssl s_client -showcerts -prexit -connect "${REGISTRY_HOSTNAME}:${REGISTRY_PORT}" 2> /dev/null | sed -n -e '/BEGIN CERTIFICATE/,/END CERTIFICATE/ p' > tmp.crt
    
    1. Display the cert to verify you see the Issuer
    # openssl x509 -in tmp.crt -text | grep Issuer
            Issuer: C = US, ST = VA, L = New York, O = Quay, OU = Division, CN = quay-demo.host.xyz
    
    1. Create the configmap in the openshift-config namespace
    # oc create configmap registry-quay -n openshift-config --from-file="${REGISTRY_HOSTNAME}..${REGISTRY_PORT}=$(pwd)/tmp.crt"
    configmap/registry-quay created
    
    1. Add an additionalTrustedCA to the cluster image config.
    # oc patch image.config.openshift.io/cluster --patch '{"spec":{"additionalTrustedCA":{"name":"registry-quay"}}}' --type=merge
    image.config.openshift.io/cluster patched
    
    1. Verify your config is updated
    # oc get image.config.openshift.io/cluster -o yaml
    apiVersion: config.openshift.io/v1
    kind: Image
    metadata:
      annotations:
        include.release.openshift.io/ibm-cloud-managed: "true"
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
        release.openshift.io/create-only: "true"
      creationTimestamp: "2022-10-20T15:35:08Z"
      generation: 2
      name: cluster
      ownerReferences:
      - apiVersion: config.openshift.io/v1
        kind: ClusterVersion
        name: version
        uid: a3df97ca-73ff-4a72-93b1-f3ef7d51e329
      resourceVersion: "6299552"
      uid: f7e56517-486d-4530-8e14-16ef0deed462
    spec:
      additionalTrustedCA:
        name: registry-quay
    status:
      internalRegistryHostname: image-registry.openshift-image-registry.svc:5000
    
    1. Check your pod that failed to connect, and you should see that it now succeeds.
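
    The configmap key in the steps above joins the hostname and port with two dots, because a ConfigMap key cannot contain a colon. A minimal sketch of that naming rule; the helper name registry_ca_key is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the ConfigMap key for a registry CA.
# With a port, the key is "<hostname>..<port>"; without one, just the hostname.
registry_ca_key() {
  local host="$1" port="${2:-}"
  if [ -n "$port" ]; then
    echo "${host}..${port}"
  else
    echo "${host}"
  fi
}

# Same values as REGISTRY_HOSTNAME and REGISTRY_PORT in the steps above.
registry_ca_key quay-demo.host.xyz 8443
```

    The printed key is what goes on the left-hand side of --from-file when creating the configmap.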

  • Use Qemu to Build S390x images

    Tips to build Qemu S390x images

    1. Connect to a build machine

    ssh root@ip

    1. Clone the operator

    git clone https://github.com/prb112/operator.git

    1. Install qemu and buildah and podman-docker

    yum install -y qemu-kvm buildah podman-docker

    /usr/bin/docker run --rm --privileged tonistiigi/binfmt:latest --install all
    Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
    ✔ docker.io/tonistiigi/binfmt:latest
    Trying to pull docker.io/tonistiigi/binfmt:latest...
    Getting image source signatures
    Copying blob e9c608ddc3cb done  
    Copying blob 8d4d64c318a5 done  
    Copying config 354472a378 done  
    Writing manifest to image destination
    Storing signatures
    installing: arm64 OK
    installing: arm OK
    installing: ppc64le OK
    installing: mips64 OK
    installing: riscv64 OK
    installing: mips64le OK
    installing: s390x OK
    {
      "supported": [
        "linux/amd64",
        "linux/arm64",
        "linux/riscv64",
        "linux/ppc64le",
        "linux/s390x",
        "linux/386",
        "linux/mips64le",
        "linux/mips64",
        "linux/arm/v7",
        "linux/arm/v6"
      ],
      "emulators": [
        "kshcomp",
        "qemu-aarch64",
        "qemu-arm",
        "qemu-mips64",
        "qemu-mips64el",
        "qemu-ppc64le",
        "qemu-riscv64",
        "qemu-s390x"
      ]
    }
    

    /usr/bin/buildah bud --arch s390x -f $(pwd)/build/Dockerfile --format docker --tls-verify=true -t op:v0.1.1-linux-s390x $(pwd)/
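
    The tag in the buildah command above encodes the version and target architecture. A minimal sketch that derives such tags and prints one build command per architecture; the helper name arch_image_tag is hypothetical, and ppc64le is included only because the binfmt output above lists it as supported.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compose "<name>:<version>-linux-<arch>", the tag
# pattern used in the buildah command above.
arch_image_tag() {
  local name="$1" version="$2" arch="$3"
  echo "${name}:${version}-linux-${arch}"
}

# Dry run: print the build command for each architecture instead of running it.
for arch in s390x ppc64le; do
  tag="$(arch_image_tag op v0.1.1 "$arch")"
  echo "buildah bud --arch ${arch} -f build/Dockerfile --format docker -t ${tag} ."
done
```

    Emulated builds are slow, so printing the commands first is a cheap way to confirm the tags before starting them.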

  • openshift-install-power – quick notes

    FYI: openshift-install-power – this is a small recipe for deploying the latest UPI code from the master branch of my repo.

    git clone https://github.com/ocp-power-automation/openshift-install-power.git
    cd openshift-install-power
    chmod +x openshift-install-powervs
    export IBMCLOUD_API_KEY="<<redacted>>"
    export RELEASE_VER=latest
    export ARTIFACTS_VERSION="master"
    export ARTIFACTS_REPO="<<MY REPO>>"
    ./openshift-install-powervs setup
    ./openshift-install-powervs create -var-file mon01-20220930.tfvars -flavor small -trace
    

    This also recovers from errors in ocp4-upi-powervs/terraform.

  • Topology Manager and OpenShift/Kubernetes

    I recently had to work with the Kubernetes Topology Manager and OpenShift. Here is a braindump on Topology Manager:

    If the Topology Manager Feature Gate is enabled, then any active HintProviders are registered to the TopologyManager.

    If the CPU Manager and its feature gate are enabled, then the CPU Manager can be used to help workloads that are sensitive to CPU throttling, context switches, or cache misses, that require hyperthreads on the same physical CPU core or low latency, or that benefit from shared processor resources. The manager has two policies, none and static, which register a NOP provider or statically lock the container to a set of CPUs, respectively.

    If the Memory Manager and its feature gate are enabled, then the Memory Manager can operate independently of the CPU Manager, e.g. to allocate HugePages or guaranteed memory.

    If Device Plugins are enabled, then they can allocate devices next to NUMA node resources (e.g., SR-IOV NICs). This may be used independently of the typical CPU/Memory management, for GPUs and other machine devices.

    Generally, these are all used together to generate a BitMask that admits a pod using a best-effort, restricted, or single-numa-node policy.

    An important limitation is that the maximum number of NUMA nodes is hard-coded to 8. When there are more than eight NUMA nodes, it’ll error out when assigning to the topology. The reason for this is related to state explosion and computational complexity.

    1. Check the worker node's CPU topology: if NUMA node(s) returns 1, it’s a single NUMA node. If it returns 2 or more, it’s multiple NUMA nodes.
    sh-4.4# lscpu | grep 'NUMA node(s)'
    NUMA node(s):        1
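
    The NUMA check above, together with the eight-node limit, can be scripted. A minimal sketch that classifies captured lscpu output; the function names numa_node_count and numa_topology are hypothetical, and the sample text mirrors the output shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the NUMA node count from lscpu output.
numa_node_count() {
  # $1: full or partial output of lscpu
  printf '%s\n' "$1" | awk '/^NUMA node\(s\):/ { print $NF }'
}

# Hypothetical helper: classify the node as single, multi, or too-many
# (more than the Topology Manager's hard-coded limit of 8 NUMA nodes).
numa_topology() {
  local count
  count="$(numa_node_count "$1")"
  if [ -z "$count" ]; then
    echo "unknown"
    return 1
  fi
  if [ "$count" -gt 8 ]; then
    echo "too-many"
  elif [ "$count" -gt 1 ]; then
    echo "multi"
  else
    echo "single"
  fi
}

numa_topology "NUMA node(s):        1"   # prints: single
```

    On a worker node you would call numa_topology "$(lscpu)" instead of passing a literal line.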
    

    The kubernetes/enhancements repo contains great detail on the flows and weaknesses of the TopologyManager.

    To enable the Topology Manager, one uses Feature Gates, and OpenShift prefers the FeatureSet LatencySensitive.

    1. Via FeatureGate
    $ oc patch featuregate cluster -p '{"spec": {"featureSet": "LatencySensitive"}}' --type merge
    

    This turns on the basic TopologyManager in /etc/kubernetes/kubelet.conf:

      "featureGates": {
        "APIPriorityAndFairness": true,
        "CSIMigrationAzureFile": false,
        "CSIMigrationvSphere": false,
        "DownwardAPIHugePages": true,
        "RotateKubeletServerCertificate": true,
        "TopologyManager": true
      },
    
    1. Create a custom KubeletConfig; this allows targeted TopologyManager feature enablement.

    file: cpumanager-kubeletconfig.yaml

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: cpumanager-enabled
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: cpumanager-enabled
      kubeletConfig:
        cpuManagerPolicy: static
        cpuManagerReconcilePeriod: 5s
    
    $ oc create -f cpumanager-kubeletconfig.yaml
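
    The machineConfigPoolSelector in the KubeletConfig above only matches pools that carry the custom-kubelet label. A minimal sketch that prints the labeling command as a dry run; the helper name pool_label_cmd is hypothetical, and the worker pool is an assumption about which pool you want to target.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the oc command that labels a MachineConfigPool
# so the KubeletConfig selector (custom-kubelet: <value>) matches it.
pool_label_cmd() {
  local pool="$1" value="$2"
  echo "oc label machineconfigpool ${pool} custom-kubelet=${value}"
}

# Assumption: the "worker" pool is the one to configure.
pool_label_cmd worker cpumanager-enabled
```

    Run the printed command against the cluster once you have confirmed the pool name with oc get machineconfigpool.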
    

    In short: they can be used independently of each other, but they should be turned on at the same time to maximize the benefits.

    There are some examples and test cases out there for Kubernetes and OpenShift

    1. Red Hat Sys Engineering Team Test cases for Performance Addon Operator which is now the Cluster Node Tuning Operator– These are the clearest tests, which apply directly to the Topology Manager.
    2. Kube Test Cases

    This is one of the best examples k8stopologyawareschedwg/sample-device-plugin.

    Tools to know about

    1. GitHub: numalign (amd64) – you can download this in the releases. In this fork prb112/numalign I added ppc64le to the build
    2. numactl and numastat are superbly helpful to see the topology spread on a node (link to a handy pdf on numa). I’ve been starting up a fedora container with numactl and numastat installed.

    Final note, I had written down that fedora is a great combination with taskset and numactl if you copy in the binaries. I think I used Fedora 35/36 as a container.

    I also built a hugepages-hungry container (Hugepages), and looked at hugepages_tests.go and the test plan.

    When it came down to it, I used my hungry container with the example.

    I hope this helps others as they start to work with Topology Manager.

    References

    Red Hat

    1. Red Hat Topology Aware Scheduling in Kubernetes Part 1: The High Level Business Case
    2. Red Hat Topology Awareness in Kubernetes Part 2: Don’t we already have a Topology Manager?

    OpenShift

    1. OpenShift 4.11: Using the Topology Manager
    2. OpenShift 4.11: Using device plug-ins to access external resources with pods
    3. OpenShift 4.11: Using Device Manager to make devices available to nodes Device Manager
    4. OpenShift 4.11: About Single Root I/O Virtualization (SR-IOV) hardware networks – Device Manager
    5. OpenShift 4.11: Adding a pod to an SR-IOV additional network
    6. OpenShift 4.11: Using CPU Manager CPU Manager

    Kubernetes

    1. Kubernetes: Topology Manager Blog
    2. Feature Highlight: CPU Manager
    3. Feature: Utilizing the NUMA-aware Memory Manager

    Kubernetes Enhancement

    1. KEP-693: Node Topology Manager e2e tests: Link
    2. KEP-2625: CPU Manager e2e tests: Link
    3. KEP-1769: Memory Manager Source: Link PR: Link
  • Kube 1.25.2 on RHEL9 P10

    1. Update Hosts
    9.0.90.0 ocp4daily70.ibm.com
    9.0.90.1 ocp4daily98.ibm.com
    
    1. Setup the Subscription Manager
    set +o history
    export rhel_subscription_username="rhn-ee-xxxxx"
    export rhel_subscription_password="xxxxx"
    set -o history
    subscription-manager register --username="${rhel_subscription_username}" --password="${rhel_subscription_password}"
    subscription-manager refresh
    
    1. Disable the swap
    sudo swapoff -a
    
    1. Install the libraries
    yum install -y podman podman-remote socat runc
    
    1. Install the cri-o package
    rpm -ivh https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.25:/1.25.0/Fedora_36/ppc64le/cri-o-1.25.0-2.1.fc36.ppc64le.rpm
    
    1. Enable podman socket
    systemctl enable --now podman.socket
    
    1. Enable crio service
    sudo systemctl enable crio
    sudo systemctl start crio
    
    1. Set SELinux to permissive mode
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    
    1. Download Release
    export RELEASE=1.25
    sudo curl -L --remote-name-all https://dl.k8s.io/v1.25.2/bin/linux/ppc64le/{kubeadm,kubelet,kubectl}
    sudo chmod +x {kubeadm,kubelet,kubectl}
    
    1. Move files to /bin
    mv kube* /bin/
    
    1. Add kubelet.service
    RELEASE_VERSION="v0.14.0"
    DOWNLOAD_DIR="/bin"  # the binaries were moved to /bin in the previous step
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
    sudo mkdir -p /etc/systemd/system/kubelet.service.d
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    
    1. Enable and start service
    systemctl enable --now kubelet
    systemctl start kubelet
    
    1. Configure the kernel modules to load at boot
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    1. Load the modules
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    1. sysctl params required by setup, params persist across reboots
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    1. Apply sysctl params without reboot
    sudo sysctl --system
    
    1. Install libnetfilter and conntrack-tools
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_queue-1.0.5-1.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cttimeout-1.0.0-19.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cthelper-1.0.0-22.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/conntrack-tools-1.4.5-15.el9.ppc64le.rpm
    
    1. Copy Kubelet
    cp /bin/kubelet /kubelet
    
    1. Edit crio.conf
    /etc/crio/crio.conf
    
    conmon_cgroup = "pod"
    cgroup_manager = "systemd"
    
    1. Add the plugins:
    curl -O https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-ppc64le-v1.1.1.tgz -L
    cp cni-plugins-linux-ppc64le-v1.1.1.tgz /opt/cni/bin
    cd /opt/cni/bin
    tar xvfz cni-plugins-linux-ppc64le-v1.1.1.tgz 
    chmod +x /opt/cni/bin/*
    cd ~
    systemctl restart crio kubelet
    
    1. Download crictl
    curl -L --remote-name-all https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.25.0/crictl-v1.25.0-linux-ppc64le.tar.gz
    tar xvfz crictl-v1.25.0-linux-ppc64le.tar.gz
    chmod +x crictl
    mv crictl /bin
    
    1. Initialize the control plane with kubeadm
    kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    
    1. Setup the configuration
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    1. Manually copy the .kube/config over to the worker node and do a kubeadm reset

    2. Download https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

    3. Edit the container images to point to the ppc64le manifests, per the notes in the yaml

    4. Update net-conf.json

      net-conf.json: |
        {
          "Network": "192.168.0.0/16",
          "Backend": {
            "Type": "vxlan"
          }
        }
    
    1. Join the Cluster
    kubeadm join 9.0.90.1:6443 --token xbp7gy.9eem3bta75v0ccw8 \
            --discovery-token-ca-cert-hash sha256:a822342f231db2e730559b4962325a2c2c685d7fc440ae41987e123da47f9118
    
    1. Add role to the workers
    kubectl label node ocp4daily70.ibm.com node-role.kubernetes.io/worker=worker
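
    Before running kubeadm init, it can help to confirm that the sysctl file written in the steps above contains all three settings. A minimal sketch that checks a sysctl.d file offline; the function name check_k8s_sysctl_conf is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that a sysctl.d file sets the three parameters
# from the steps above to 1. Prints each missing key and returns non-zero
# if any are absent or set to another value.
check_k8s_sysctl_conf() {
  local file="$1" key missing=0
  for key in \
    net.bridge.bridge-nf-call-iptables \
    net.bridge.bridge-nf-call-ip6tables \
    net.ipv4.ip_forward
  do
    if ! grep -Eq "^${key}[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$file"; then
      echo "missing or not 1: ${key}"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the host configured above:
#   check_k8s_sysctl_conf /etc/sysctl.d/k8s.conf && echo "sysctl settings look good"
```

    This only inspects the file; sudo sysctl --system is still what applies the values to the running kernel.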
    
  • Switching to use Kubernetes with Flannel on RHEL on P10

    I needed to switch from calico to flannel. Here is the recipe I followed to set up Kubernetes 1.25.2 on a Power 10 using Flannel.

    1. Connect to both VMs (in split terminal)
    ssh root@control-1
    ssh root@worker-1
    
    1. Run Reset (acknowledge that you want to proceed)
    kubeadm reset
    
    1. Remove Calico
    rm /etc/cni/net.d/10-calico.conflist 
    rm /etc/cni/net.d/calico-kubeconfig
    # Reload the rule set without any Calico chains or rules
    iptables-save | grep -v -i cali | iptables-restore
    
    1. Initialize the cluster
    kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    
    1. Setup kubeconfig
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    1. Add the plugins:
    curl -O https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-ppc64le-v1.1.1.tgz -L
    cp cni-plugins-linux-ppc64le-v1.1.1.tgz /opt/cni/bin
    cd /opt/cni/bin
    tar xvfz cni-plugins-linux-ppc64le-v1.1.1.tgz 
    chmod +x /opt/cni/bin/*
    cd ~
    systemctl restart crio kubelet
    
    1. Download https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

    2. Edit the container images to point to the ppc64le manifests, per the notes in the yaml

    3. Update net-conf.json

      net-conf.json: |
        {
          "Network": "192.168.0.0/16",
          "Backend": {
            "Type": "vxlan"
          }
        }
    
    1. Join the Cluster

    kubeadm join 1.1.1.1:6443 --token y004bg.sc65cp7fqqm7ladg \
            --discovery-token-ca-cert-hash sha256:1c32dacdf9b934b7bbd6d13fde9312a35709e2f5849008acec8f597eb5a5dad9

    1. Add role to the workers
    kubectl label node worker-01.ocp-power.xyz node-role.kubernetes.io/worker=worker
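
    For the Remove Calico step, one way to preview which iptables lines a cleanup would drop is to filter a saved dump offline before touching the live rules. A minimal sketch; the function name strip_calico_rules and the sample dump are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an iptables-save style dump on stdin and write it
# back out without any line that mentions Calico (case-insensitive).
strip_calico_rules() {
  grep -v -i 'cali'
}

# Made-up sample dump with one Calico chain and one rule that jumps to it.
sample_dump='*filter
:INPUT ACCEPT [0:0]
:cali-INPUT - [0:0]
-A INPUT -j cali-INPUT
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT'

printf '%s\n' "$sample_dump" | strip_calico_rules
```

    On a real host the input would come from iptables-save, and the filtered result could be reviewed before it is passed to iptables-restore.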
    

    Ref: https://gist.github.com/rkaramandi/44c7cea91501e735ea99e356e9ae7883
    Ref: https://www.buzzwrd.me/index.php/2022/02/16/calico-to-flannel-changing-kubernetes-cni-plugin/