Tag: redhat

  • Using Kubernetes v1.25.2 on RHEL9 with Power10

    My squad is working with Kubernetes v1.25.2 on Red Hat Enterprise Linux 9 and IBM Power10.

    As a prerequisite for the work, we set up two RHEL9 VMs on a Power10 system, each with 8 CPUs, 16 GB RAM, and a 100 GB disk.

    Steps

    1. Set the subscription-manager credentials on each machine (kept out of the shell history)
    set +o history
    export rhel_subscription_username="rhn-ee-xxx"
    export rhel_subscription_password="xxxxxx"
    set -o history
    
    2. Register the RHEL VMs
    subscription-manager register --username="${rhel_subscription_username}" --password="${rhel_subscription_password}"
    subscription-manager refresh
    
    3. Disable swap
    sudo swapoff -a
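
    Note that swapoff only lasts until the next reboot; to keep swap off permanently, one option (assuming swap is mounted via /etc/fstab) is to comment out its entry:

    # comment out any swap lines so swap stays off after a reboot
    sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab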
    
    4. On each node, run echo $(hostname -i) $(hostname --long) and use the primary IPv4 address.

    5. Update /etc/hosts on each node with the output, for example:

    10.47.90.180 ocp4daily70.ocp-power.xyz
    10.47.90.127 ocp4daily17.ocp-power.xyz
    
    6. Install podman, podman-remote, socat, runc, and conmon
    yum install -y podman podman-remote socat runc conmon
    
    7. Enable the podman socket
    systemctl enable --now podman.socket
    
    8. Check the remote connection: podman-remote info should show information
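
    For example, a quick check that the socket enabled in the previous step responds:

    podman-remote info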

    9. Enable these repos

    subscription-manager repos --enable="rhel-9-for-ppc64le-appstream-rpms" --enable="rhel-9-for-ppc64le-baseos-rpms" --enable="rhv-4-tools-for-rhel-9-ppc64le-source-rpms" --enable="fast-datapath-for-rhel-9-ppc64le-rpms"
    
    10. Install cri-o
    rpm -ivh https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.25:/1.25.0/Fedora_36/ppc64le/cri-o-1.25.0-2.1.fc36.ppc64le.rpm
    
    11. Start crio
    $ sudo systemctl enable crio
    Created symlink /etc/systemd/system/cri-o.service → /usr/lib/systemd/system/crio.service.
    Created symlink /etc/systemd/system/multi-user.target.wants/crio.service → /usr/lib/systemd/system/crio.service.
    $ sudo systemctl start crio
    
    12. Put SELinux in permissive mode
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    
    13. Download the Kubernetes v1.25.2 release binaries
    sudo curl -L --remote-name-all https://dl.k8s.io/v1.25.2/bin/linux/ppc64le/{kubeadm,kubelet,kubectl}
    sudo chmod +x {kubeadm,kubelet,kubectl}
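
    Optionally, verify the downloads; dl.k8s.io publishes a .sha256 file next to each binary, so a check along these lines should work:

    # fetch the published checksums and compare them against the binaries
    sudo curl -L --remote-name-all https://dl.k8s.io/v1.25.2/bin/linux/ppc64le/{kubeadm,kubelet,kubectl}.sha256
    for f in kubeadm kubelet kubectl; do
      echo "$(cat ${f}.sha256)  ${f}" | sha256sum --check
    done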
    
    14. Move the files to /bin
    mv kube* /bin/
    
    15. Add kubelet.service
    DOWNLOAD_DIR="/bin"   # where the binaries were placed; the template's /usr/bin paths get rewritten to this
    RELEASE_VERSION="v0.14.0"
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
    sudo mkdir -p /etc/systemd/system/kubelet.service.d
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    
    16. Enable and start the service
    systemctl daemon-reload
    systemctl enable --now kubelet
    
    17. Download crictl
    curl -L --remote-name-all https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.25.0/crictl-v1.25.0-linux-ppc64le.tar.gz
    tar xvfz crictl-v1.25.0-linux-ppc64le.tar.gz
    chmod +x crictl
    mv crictl /bin
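
    crictl probes several runtime endpoints unless told otherwise; a small sketch to point it at the CRI-O socket used later by kubeadm:

    cat <<EOF | sudo tee /etc/crictl.yaml
    runtime-endpoint: unix:///var/run/crio/crio.sock
    image-endpoint: unix:///var/run/crio/crio.sock
    EOF
    crictl info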
    
    18. Configure the required kernel modules to load at boot
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    19. Load the modules with modprobe
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    20. Set up the sysctl.d k8s.conf
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    21. Apply sysctl params without reboot

    sysctl --system
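
    To confirm the settings took effect, read them back:

    sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward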

    22. Install libnetfilter and conntrack-tools
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_queue-1.0.5-1.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cttimeout-1.0.0-19.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cthelper-1.0.0-22.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/conntrack-tools-1.4.5-15.el9.ppc64le.rpm
    
    23. Just in case, set up a NetworkManager ignore rule for the Calico interfaces
    cat << EOF > /etc/NetworkManager/conf.d/calico.conf
    [keyfile]
    unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:vxlan-v6.calico;interface-name:wireguard.cali;interface-name:wg-v6.cali
    EOF
    
    24. Download calicoctl
    curl -L -o calicoctl https://github.com/projectcalico/calico/releases/download/v3.24.1/calicoctl-linux-ppc64le
    chmod +x calicoctl
    mv calicoctl /bin
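
    calicoctl is only the CLI; the Calico CNI itself still has to be applied once the control plane is up (see the pod network note in the kubeadm output below). Assuming the standard manifest for the same v3.24.1 release, that is:

    kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.24.1/manifests/calico.yaml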
    
    25. Edit /etc/crio/crio.conf and add these two values under the [crio.runtime] section
    vi /etc/crio/crio.conf
    
    [crio.runtime]
    conmon_cgroup = "pod"
    cgroup_manager = "systemd"
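
    CRI-O only reads crio.conf at startup, so restart it to pick up the change:

    sudo systemctl restart crio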
    
    26. Set up the master node.
    [root@ocp4daily17 ~]# kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    [init] Using Kubernetes version: v1.25.2
    [preflight] Running pre-flight checks
    	[WARNING SystemVerification]: missing optional cgroups: blkio
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local ocp4daily17.xxxx] and IPs [10.96.0.1 x.x.x.x]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    ...
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    
    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    Alternatively, if you are the root user, you can run:
    
      export KUBECONFIG=/etc/kubernetes/admin.conf
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    Then you can join any number of worker nodes by running the following on each as root:
    
    kubeadm join x.x.x.x:6443 --token dagtwm.98989 \
    	--discovery-token-ca-cert-hash sha256:9898989 
    
    27. Run the join command on the worker
    kubeadm join 9.47.90.127:6443 --token dagtwm.9898989 	--discovery-token-ca-cert-hash sha256:9898989
    
    28. Configure kubectl on the master node.
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    29. Confirm that you are running on a P10 and the nodes are ready.

    a. Confirm CPU architecture

    [root@ocp4daily70 ~]# cat /proc/cpuinfo | grep cpu | uniq
    cpu		: POWER10 (architected), altivec supported
    

    b. Confirm the nodes are ready

    [root@ocp4daily70 ~]# kubectl get nodes
    NAME                 STATUS   ROLES           AGE   VERSION
    ocp4daily17.nip.io   Ready    control-plane   40m   v1.25.2
    ocp4daily70.nip.io   Ready    <none>          38m   v1.25.2
    

    You now have a working Kubernetes cluster on RHEL9 and Power10.

    Debugging

    If you see a NetworkReady error like the following:

    Sep 29 13:17:00 ocp4daily17.x.x.x.x kubelet[67264]: E0929 13:17:00.108806 67264 kubelet.go:2373] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?"

    1. Check that CRI-O is configured with systemd and not cgroupfs

    2. Restart CRI-O

    systemctl stop crio; sleep 10s; systemctl start crio
    

    Warnings that lead to the cgroupfs cgroup driver

    You should use systemd as the cgroup driver. Check that there is no /etc/default/kubelet file carrying a cgroup-driver setting.
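
    A quick way to check both sides of the cgroup driver (paths assume a kubeadm-managed node and the crio.conf edited earlier):

    # kubelet side: kubeadm records the driver here and defaults to systemd
    grep cgroupDriver /var/lib/kubelet/config.yaml
    # CRI-O side: should report cgroup_manager = "systemd"
    grep cgroup_manager /etc/crio/crio.conf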

    References

    • http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
    • https://upcloud.com/resources/tutorials/install-kubernetes-cluster-centos-8
    • https://github.com/cri-o/cri-o/blob/main/tutorials/kubeadm.md
    • https://www.linuxtechi.com/how-to-install-kubernetes-cluster-rhel/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
    • https://kubernetes.io/docs/setup/production-environment/container-runtimes/
    • https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/
  • Learning Resources for Operators – First Two Weeks Notes

    To quote the Kubernetes website, “The Operator pattern captures how you can write code to automate a task beyond what Kubernetes itself provides.” The following is a compendium to use while learning Operators.

    The de facto SDK is the Operator SDK, which provides Helm, Ansible, and Go scaffolding to support your implementation of the Operator pattern.

    The following are education classes on the Operator SDK:

    When running through the CO0201EN intermediate operators course, I hit a case where I had to create a ClusterRole and ClusterRoleBinding for the ServiceAccount; here is a snippet that might help others:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      namespace: memcached-operator-system
      name: service-reader-cr-mc
    rules:
    - apiGroups: ["cache.bastide.org"] # the API group of the Memcached CRD
      resources: ["memcacheds"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      namespace: memcached-operator-system
      name: ext-role-binding
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: service-reader-cr-mc
    subjects:
    - kind: ServiceAccount
      namespace: memcached-operator-system
      name: memcached-operator-controller-manager

    The reason for the above: I had missed adding a kubebuilder RBAC marker:

    //+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
    //+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
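
    For the custom resource in the RBAC snippet above, the matching marker would presumably look like:

    //+kubebuilder:rbac:groups=cache.bastide.org,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete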

    Thanks to https://stackoverflow.com/a/60334649/1873438

    The following are articles worth reviewing:

    The following are good Go resources:

    1. Go Code Comments – To write idiomatic Go, you should review the Code Review comments.
    2. Getting to Go: The Journey of Go’s Garbage Collector – The reference for Go and garbage collection in Go
    3. An overview of memory management in Go – A good overview of Go memory management
    4. Golang: Cost of using the heap – Allocations up to about 1M seem to stay on the stack; beyond that they seem to land on the heap
    5. golangci-lint – The aggregated linters project is worthy of installation and use. It’ll catch many issues and has a corresponding GitHub Action.
    6. Go in 3 Weeks – A comprehensive training for Go; companion to its GitHub repo
    7. Defensive Coding Guide: The Go Programming Language

    The following are good OpenShift resources:

    1. Create OpenShift Plugins – You must have a CLI plug-in file whose name begins with oc- or kubectl-. You create the file and put it in /usr/local/bin/ (a minimal sketch follows this list).
    2. Details on running Code Ready Containers on Linux – The key hack I learned was to ssh -i ~/.crc/machines/crc/id_ecdsa core@<any host in the /etc/hosts>
      1. I ran on VirtualBox Ubuntu 20.04 with Guest Additions Installed
      2. VirtualBox settings for the machine – 6 CPUs, 18 GB
        1. System > Processor > Enable PAE/NX and Enable Nested VT-X/AMD-V (which is a must for it to work)
        2. Network > Change Adapter Type to virtio-net and set Promiscuous Mode to Allow VMs
      3. Install openssh-server so you can login remotely
      4. It will not install without a windowing system, so I installed the default windowing environment.
      5. Note, I still get a failure on startup complaining about a timeout. I waited about 15 minutes after this, and the command oc get nodes --context admin --cluster crc --kubeconfig .crc/cache/crc_libvirt_4.10.3_amd64/kubeconfig now works.
    3. CRC virsh cheatsheet – If you are running Code Ready Containers and need to debug, you can use the virsh cheatsheet.
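
    As a minimal illustration of the plugin mechanism in item 1 (the oc-hello name and its body are hypothetical):

    # any executable on the PATH named oc-<name> becomes "oc <name>"
    cat <<'EOF' | sudo tee /usr/local/bin/oc-hello
    #!/bin/bash
    echo "hello from an oc plugin"
    EOF
    sudo chmod +x /usr/local/bin/oc-hello
    oc hello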