Category: Application Development

  • Use QEMU to Build s390x Images

    Tips for building s390x images with QEMU

    1. Connect to a build machine

    ssh root@ip

    1. Clone the operator

    git clone https://github.com/prb112/operator.git

    1. Install qemu and buildah and podman-docker

    yum install -y qemu-kvm buildah podman-docker

    /usr/bin/docker run --rm --privileged tonistiigi/binfmt:latest --install all
    Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
    ✔ docker.io/tonistiigi/binfmt:latest
    Trying to pull docker.io/tonistiigi/binfmt:latest...
    Getting image source signatures
    Copying blob e9c608ddc3cb done  
    Copying blob 8d4d64c318a5 done  
    Copying config 354472a378 done  
    Writing manifest to image destination
    Storing signatures
    installing: arm64 OK
    installing: arm OK
    installing: ppc64le OK
    installing: mips64 OK
    installing: riscv64 OK
    installing: mips64le OK
    installing: s390x OK
    {
      "supported": [
        "linux/amd64",
        "linux/arm64",
        "linux/riscv64",
        "linux/ppc64le",
        "linux/s390x",
        "linux/386",
        "linux/mips64le",
        "linux/mips64",
        "linux/arm/v7",
        "linux/arm/v6"
      ],
      "emulators": [
        "kshcomp",
        "qemu-aarch64",
        "qemu-arm",
        "qemu-mips64",
        "qemu-mips64el",
        "qemu-ppc64le",
        "qemu-riscv64",
        "qemu-s390x"
      ]
    }
    

    /usr/bin/buildah bud --arch s390x -f $(pwd)/build/Dockerfile --format docker --tls-verify=true -t op:v0.1.1-linux-s390x $(pwd)/
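
    To sanity-check the result before pushing, you can inspect the image architecture (a quick check against the op:v0.1.1-linux-s390x tag built above):

    /usr/bin/buildah inspect --type image --format '{{.OCIv1.Architecture}}' op:v0.1.1-linux-s390x
    # expected output: s390x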

  • openshift-install-power – quick notes

    FYI: this is a small recipe for deploying the latest code with the UPI automation from the master branch of my repo.

    git clone https://github.com/ocp-power-automation/openshift-install-power.git
    chmod +x openshift-install-powervs
    export IBMCLOUD_API_KEY="<<redacted>>"
    export RELEASE_VER=latest
    export ARTIFACTS_VERSION="master"
    export ARTIFACTS_REPO="<<MY REPO>>"
    ./openshift-install-powervs setup
    ./openshift-install-powervs create -var-file mon01-20220930.tfvars -flavor small -trace
    

    This also recovers from errors in ocp4-upi-powervs/terraform.
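
    The recovery is usually just a re-run; since the automation is terraform-based, invoking create again with the same var file generally resumes from the failed step:

    ./openshift-install-powervs create -var-file mon01-20220930.tfvars -flavor small -trace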

  • Topology Manager and OpenShift/Kubernetes

    I recently had to work with the Kubernetes Topology Manager and OpenShift. Here is a braindump on Topology Manager:

    If the Topology Manager Feature Gate is enabled, then any active HintProviders are registered to the TopologyManager.

    If the CPU Manager and its feature gate are enabled, then the CPU Manager can be used to help workloads that are sensitive to CPU throttling, context switches, and cache misses, that require hyperthreads on the same physical CPU core, that need low latency, or that benefit from sharing processor resources (e.g., data and instruction caches). The manager has two policies, none and static, which register a no-op provider or statically pin the container to a set of CPUs, respectively.

    If the Memory Manager and its feature gate are enabled, then the Memory Manager can operate independently of the CPU Manager – e.g., to allocate HugePages or guaranteed memory.

    If Device Plugins are enabled, they can allocate devices aligned with NUMA node resources (e.g., SR-IOV NICs). This may be used independently of the typical CPU/memory management, for GPUs and other machine devices.

    Generally, these are all used together to generate a BitMask that admits a pod using a best-effort, restricted, or single-numa-node policy.
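
    Note that the static CPU Manager only pins CPUs for pods in the Guaranteed QoS class with integer CPU requests, so workloads that want NUMA alignment are typically shaped like this minimal sketch (the name and image are hypothetical):

    apiVersion: v1
    kind: Pod
    metadata:
      name: numa-aligned-demo
    spec:
      containers:
      - name: app
        image: registry.fedoraproject.org/fedora:36
        command: ["sleep", "infinity"]
        resources:
          requests:
            cpu: "2"        # integer CPU count makes the pod eligible for static CPU Manager pinning
            memory: 1Gi
          limits:
            cpu: "2"        # limits == requests puts the pod in the Guaranteed QoS class
            memory: 1Gi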

    An important limitation is that the maximum number of NUMA nodes is hard-coded to 8. When there are more than eight NUMA nodes, admission errors out when assigning to the topology. The reason is state explosion and the resulting computational complexity.

    1. Check the worker node’s CPU topology: if NUMA node(s) returns 1, it’s a single NUMA node; if it returns 2 or more, there are multiple NUMA nodes.
    sh-4.4# lscpu | grep 'NUMA node(s)'
    NUMA node(s):        1
    

    The kubernetes/enhancements repo contains great detail on the flows and weaknesses of the TopologyManager.

    To enable the Topology Manager, one uses Feature Gates; OpenShift prefers the FeatureSet LatencySensitive.

    1. Via FeatureGate
    $ oc patch featuregate cluster -p '{"spec": {"featureSet": "LatencySensitive"}}' --type merge
    

    Which turns on the basic TopologyManager feature gate in /etc/kubernetes/kubelet.conf:

      "featureGates": {
        "APIPriorityAndFairness": true,
        "CSIMigrationAzureFile": false,
        "CSIMigrationvSphere": false,
        "DownwardAPIHugePages": true,
        "RotateKubeletServerCertificate": true,
        "TopologyManager": true
      },
    
    1. Create a custom KubeletConfig; this allows targeted TopologyManager feature enablement.

    file: cpumanager-kubeletconfig.yaml

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: cpumanager-enabled
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: cpumanager-enabled
      kubeletConfig:
         cpuManagerPolicy: static 
         cpuManagerReconcilePeriod: 5s 
    
    $ oc create -f cpumanager-kubeletconfig.yaml
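
    To set the Topology Manager policy itself, the same KubeletConfig can also carry topologyManagerPolicy (a sketch following the OpenShift Topology Manager docs; single-numa-node is the strictest policy, and the name is hypothetical):

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: topologymanager-enabled
    spec:
      machineConfigPoolSelector:
        matchLabels:
          custom-kubelet: cpumanager-enabled
      kubeletConfig:
        cpuManagerPolicy: static
        cpuManagerReconcilePeriod: 5s
        topologyManagerPolicy: single-numa-node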
    

    Net: they can be used independently of each other, but they should be turned on at the same time to maximize the benefits.

    There are some examples and test cases out there for Kubernetes and OpenShift

    1. Red Hat Sys Engineering Team test cases for the Performance Addon Operator, which is now the Cluster Node Tuning Operator – these are the clearest tests and apply directly to the Topology Manager.
    2. Kube Test Cases

    One of the best examples is k8stopologyawareschedwg/sample-device-plugin.

    Tools to know about

    1. GitHub: numalign (amd64) – you can download this from the releases. In the fork prb112/numalign I added ppc64le to the build.
    2. numactl and numastat are superbly helpful for seeing the topology spread on a node (link to a handy PDF on NUMA). I’ve been starting up a Fedora container with numactl and numastat installed.

    Final note: I had written down that Fedora is a great combination with taskset and numactl if you copy in the binaries. I think I used Fedora 35/36 as a container. link
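
    A minimal sketch of that kind of session (assuming a Fedora 36 container; the numactl package provides both numactl and numastat, and util-linux provides taskset):

    podman run -it --rm --privileged fedora:36 bash
    dnf install -y numactl util-linux
    numactl --hardware          # NUMA nodes with their CPUs and memory sizes
    numastat                    # per-node allocation hit/miss statistics
    taskset -c 0-3 sleep 60 &   # pin a process to CPUs 0-3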

    Yes, I built a hugepages-hungry container (Hugepages). I also looked at hugepages_tests.go and the test plan.

    When it came down to it, I used my hungry container with the example.
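
    For reference, the shape of such a hugepages-hungry pod (a sketch based on the upstream Kubernetes HugePages example; the node must have 2Mi hugepages pre-allocated, and the names are hypothetical):

    apiVersion: v1
    kind: Pod
    metadata:
      name: hugepages-demo
    spec:
      containers:
      - name: app
        image: registry.fedoraproject.org/fedora:36
        command: ["sleep", "infinity"]
        volumeMounts:
        - mountPath: /hugepages
          name: hugepage
        resources:
          requests:
            hugepages-2Mi: 100Mi
            memory: 100Mi
            cpu: "1"
          limits:
            hugepages-2Mi: 100Mi
            memory: 100Mi
            cpu: "1"
      volumes:
      - name: hugepage
        emptyDir:
          medium: HugePages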

    I hope this helps others as they start to work with Topology Manager.

    References

    Red Hat

    1. Red Hat Topology Aware Scheduling in Kubernetes Part 1: The High Level Business Case
    2. Red Hat Topology Awareness in Kubernetes Part 2: Don’t we already have a Topology Manager?

    OpenShift

    1. OpenShift 4.11: Using the Topology Manager
    2. OpenShift 4.11: Using device plug-ins to access external resources with pods
    3. OpenShift 4.11: Using Device Manager to make devices available to nodes
    4. OpenShift 4.11: About Single Root I/O Virtualization (SR-IOV) hardware networks – Device Manager
    5. OpenShift 4.11: Adding a pod to an SR-IOV additional network
    6. OpenShift 4.11: Using CPU Manager

    Kubernetes

    1. Kubernetes: Topology Manager Blog
    2. Feature Highlight: CPU Manager
    3. Feature: Utilizing the NUMA-aware Memory Manager

    Kubernetes Enhancement

    1. KEP-693: Node Topology Manager e2e tests: Link
    2. KEP-2625: CPU Manager e2e tests: Link
    3. KEP-1769: Memory Manager Source: Link PR: Link
  • Kube 1.25.2 on RHEL9 P10

    1. Update Hosts
    9.0.90.0 ocp4daily70.ibm.com
    9.0.90.1 ocp4daily98.ibm.com
    
    1. Setup the Subscription Manager
    set +o history
    export rhel_subscription_username="rhn-ee-xxxxx"
    export rhel_subscription_password="xxxxx"
    set -o history
    subscription-manager register --username="${rhel_subscription_username}" --password="${rhel_subscription_password}"
    subscription-manager refresh
    
    1. Disable the swap
    sudo swapoff -a
    
    1. Install the libraries
    yum install -y podman podman-remote socat runc
    
    1. Install the cri-o package
    rpm -ivh https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.25:/1.25.0/Fedora_36/ppc64le/cri-o-1.25.0-2.1.fc36.ppc64le.rpm
    
    1. Enable podman socket
    systemctl enable --now podman.socket
    
    1. Enable crio service
    sudo systemctl enable crio
    sudo systemctl start crio
    
    1. Disable selinux
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    
    1. Download Release
    export RELEASE=1.25
    sudo curl -L --remote-name-all https://dl.k8s.io/v1.25.2/bin/linux/ppc64le/{kubeadm,kubelet,kubectl}
    sudo chmod +x {kubeadm,kubelet,kubectl}
    
    1. Move files to /bin
    mv kube* /bin/
    
    1. Add kubelet.service
    RELEASE_VERSION="v0.14.0"
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
    sudo mkdir -p /etc/systemd/system/kubelet.service.d
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    
    1. Enable and start service
    systemctl enable --now kubelet
    systemctl start kubelet
    
    1. Configure the kernel modules to load at boot
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    1. Load the modules
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    1. Set the sysctl params required by the setup; these persist across reboots
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    1. Apply sysctl params without reboot
    sudo sysctl --system
    
    1. Install libnetfilter and conntrack-tools
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_queue-1.0.5-1.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cttimeout-1.0.0-19.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cthelper-1.0.0-22.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/conntrack-tools-1.4.5-15.el9.ppc64le.rpm
    
    1. Copy Kubelet
    cp /bin/kubelet /kubelet
    
    1. Edit crio.conf to use the systemd cgroup manager
    /etc/crio/crio.conf

    [crio.runtime]
    conmon_cgroup = "pod"
    cgroup_manager = "systemd"
    
    1. Add the plugins:
    curl -O https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-ppc64le-v1.1.1.tgz -L
    cp cni-plugins-linux-ppc64le-v1.1.1.tgz /opt/cni/bin
    cd /opt/cni/bin
    tar xvfz cni-plugins-linux-ppc64le-v1.1.1.tgz 
    chmod +x /opt/cni/bin/*
    cd ~
    systemctl restart crio kubelet
    
    1. Download crictl
    curl -L --remote-name-all https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.25.0/crictl-v1.25.0-linux-ppc64le.tar.gz
    tar xvfz crictl-v1.25.0-linux-ppc64le.tar.gz
    chmod +x crictl
    mv crictl /bin
    
    1. Initialize the cluster with kubeadm
    kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    
    1. Setup the configuration
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    1. Manually copy the .kube/config over to the worker node and do a kubeadm reset

    2. Download https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

    3. Edit the containers to point to the right image, per the notes in the YAML, for the ppc64le manifests

    4. Update net-conf.json (then apply the manifest, as shown below)

      net-conf.json: |
        {
          "Network": "192.168.0.0/16",
          "Backend": {
            "Type": "vxlan"
          }
        }
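
    With the image references and net-conf.json updated, apply the manifest from the control-plane node:

    kubectl apply -f kube-flannel.yml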
    
    1. Join the Cluster
    kubeadm join 9.0.90.1:6443 --token xbp7gy.9eem3bta75v0ccw8 \
            --discovery-token-ca-cert-hash sha256:a822342f231db2e730559b4962325a2c2c685d7fc440ae41987e123da47f9118
    
    1. Add role to the workers
    kubectl label node ocp4daily70.ibm.com node-role.kubernetes.io/worker=worker
    
  • Switching to use Kubernetes with Flannel on RHEL on P10

    I needed to switch from Calico to Flannel. Here is the recipe I followed to set up Kubernetes 1.25.2 on a Power 10 using Flannel.

    1. Connect to both VMs (in split terminal)
    ssh root@control-1
    ssh root@worker-1
    
    1. Run Reset (acknowledge that you want to proceed)
    kubeadm reset
    
    1. Remove Calico
    rm /etc/cni/net.d/10-calico.conflist 
    rm /etc/cni/net.d/calico-kubeconfig
    iptables-save | grep -i cali | iptables -F
    iptables-save | grep -i cali | iptables -X 
    
    1. Initialize the cluster
    kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    
    1. Setup kubeconfig
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    1. Add the plugins:
    curl -O https://github.com/containernetworking/plugins/releases/download/v1.1.1/cni-plugins-linux-ppc64le-v1.1.1.tgz -L
    cp cni-plugins-linux-ppc64le-v1.1.1.tgz /opt/cni/bin
    cd /opt/cni/bin
    tar xvfz cni-plugins-linux-ppc64le-v1.1.1.tgz 
    chmod +x /opt/cni/bin/*
    cd ~
    systemctl restart crio kubelet
    
    1. Download https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

    2. Edit the containers to point to the right image, per the notes in the YAML, for the ppc64le manifests

    3. Update net-conf.json (then apply the manifest, as shown below)

      net-conf.json: |
        {
          "Network": "192.168.0.0/16",
          "Backend": {
            "Type": "vxlan"
          }
        }
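
    Apply the edited manifest and confirm the flannel pods come up and the nodes go Ready once the CNI is running (the flannel namespace varies by manifest version):

    kubectl apply -f kube-flannel.yml
    kubectl get pods -A -o wide | grep -i flannel
    kubectl get nodes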
    
    1. Join the Cluster

    kubeadm join 1.1.1.1:6443 --token y004bg.sc65cp7fqqm7ladg \
            --discovery-token-ca-cert-hash sha256:1c32dacdf9b934b7bbd6d13fde9312a35709e2f5849008acec8f597eb5a5dad9

    1. Add role to the workers
    kubectl label node worker-01.ocp-power.xyz node-role.kubernetes.io/worker=worker
    

    Ref: https://gist.github.com/rkaramandi/44c7cea91501e735ea99e356e9ae7883
    Ref: https://www.buzzwrd.me/index.php/2022/02/16/calico-to-flannel-changing-kubernetes-cni-plugin/

  • Using Kubernetes v1.25.2 on RHEL9 with Power10

    My squad is doing work with Kubernetes v1.25.2 on Red Hat Enterprise Linux 9 and IBM Power 10.

    As a prerequisite for the work, we set up two RHEL9 VMs on a P10, each with 8 CPUs, 16 GB RAM, and 100 GB of disk.

    Steps

    1. Add the subscription-manager credentials on each machine
    set +o history
    export rhel_subscription_username="rhn-ee-xxx"
    export rhel_subscription_password="xxxxxx"
    set -o history
    
    1. Register the RHEL VMs
    subscription-manager register --username="${rhel_subscription_username}" --password="${rhel_subscription_password}"
    subscription-manager refresh
    
    1. Disable swap
    sudo swapoff -a
    
    1. On each node, run echo $(hostname -i) $(hostname --long) and use the primary IPv4 address.

    2. Update /etc/hosts with the output on each node

    10.47.90.180 ocp4daily70.ocp-power.xyz
    10.47.90.127 ocp4daily17.ocp-power.xyz
    
    1. Install podman, podman remotes, socat, runc, conmon
    yum install -y podman podman-remote socat runc conmon
    
    1. Enable the podman socket
    systemctl enable --now podman.socket
    
    1. Check remote access: podman-remote info should show information

    2. Add these repos

    subscription-manager repos --enable="rhel-9-for-ppc64le-appstream-rpms" --enable="rhel-9-for-ppc64le-baseos-rpms" --enable="rhv-4-tools-for-rhel-9-ppc64le-source-rpms" --enable="fast-datapath-for-rhel-9-ppc64le-rpms"
    
    1. Install cri-o
    rpm -ivh https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.25:/1.25.0/Fedora_36/ppc64le/cri-o-1.25.0-2.1.fc36.ppc64le.rpm
    
    1. Start crio
    $ sudo systemctl enable crio
    Created symlink /etc/systemd/system/cri-o.service → /usr/lib/systemd/system/crio.service.
    Created symlink /etc/systemd/system/multi-user.target.wants/crio.service → /usr/lib/systemd/system/crio.service.
    $ sudo systemctl start crio
    
    1. Disable selinux
    sudo setenforce 0
    sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
    
    1. Download Release
    sudo curl -L --remote-name-all https://dl.k8s.io/v1.25.2/bin/linux/ppc64le/{kubeadm,kubelet,kubectl}
    sudo chmod +x {kubeadm,kubelet,kubectl}
    
    1. Move the files to /bin and copy kubelet to /
    mv kube* /bin/
    cp kubelet /
    
    1. Add kubelet.service
    RELEASE_VERSION="v0.14.0"
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service
    sudo mkdir -p /etc/systemd/system/kubelet.service.d
    curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sed "s:/usr/bin:${DOWNLOAD_DIR}:g" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
    
    1. Enable and start service
    systemctl enable --now kubelet
    systemctl start kubelet
    
    1. Download crictl
    curl -L --remote-name-all https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.25.0/crictl-v1.25.0-linux-ppc64le.tar.gz
    tar xvfz crictl-v1.25.0-linux-ppc64le.tar.gz
    chmod +x crictl
    mv crictl /bin
    
    1. Configure the kernel modules to load at boot
    cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
    overlay
    br_netfilter
    EOF
    
    1. Use modprobe for the modules
    sudo modprobe overlay
    sudo modprobe br_netfilter
    
    1. Setup the sysctl.d for k8s.conf
    cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables  = 1
    net.bridge.bridge-nf-call-ip6tables = 1
    net.ipv4.ip_forward                 = 1
    EOF
    
    1. Apply sysctl params without reboot

    sysctl --system

    1. Install libnetfilter and conntrack-tools
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_queue-1.0.5-1.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cttimeout-1.0.0-19.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/libnetfilter_cthelper-1.0.0-22.el9.ppc64le.rpm
    rpm -ivh http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/conntrack-tools-1.4.5-15.el9.ppc64le.rpm
    
    1. Just in case, I set up a NetworkManager ignore for the Calico interfaces and loaded calicoctl
    cat << EOF > /etc/NetworkManager/conf.d/calico.conf
    [keyfile]
    unmanaged-devices=interface-name:cali*;interface-name:tunl*;interface-name:vxlan.calico;interface-name:vxlan-v6.calico;interface-name:wireguard.cali;interface-name:wg-v6.cali
    EOF
    
    1. Download the ctl for calico
    curl -L -o calicoctl https://github.com/projectcalico/calico/releases/download/v3.24.1/calicoctl-linux-ppc64le
    chmod +x calicoctl
    mv calicoctl /bin
    
    1. Edit crio.conf to add the last two values
    vi /etc/crio/crio.conf
    
    [crio.runtime]
    conmon_cgroup = "pod"
    cgroup_manager = "systemd"
    
    1. Setup the master node.
    [root@ocp4daily17 ~]# kubeadm init --cri-socket=unix:///var/run/crio/crio.sock --pod-network-cidr=192.168.0.0/16
    [init] Using Kubernetes version: v1.25.2
    [preflight] Running pre-flight checks
    	[WARNING SystemVerification]: missing optional cgroups: blkio
    [preflight] Pulling images required for setting up a Kubernetes cluster
    [preflight] This might take a minute or two, depending on the speed of your internet connection
    [preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
    [certs] Using certificateDir folder "/etc/kubernetes/pki"
    [certs] Generating "ca" certificate and key
    [certs] Generating "apiserver" certificate and key
    [certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local ocp4daily17.xxxx] and IPs [10.96.0.1 x.x.x.x]
    [certs] Generating "apiserver-kubelet-client" certificate and key
    [certs] Generating "front-proxy-ca" certificate and key
    [certs] Generating "front-proxy-client" certificate and key
    [certs] Generating "etcd/ca" certificate and key
    [certs] Generating "etcd/server" certificate and key
    ...
    [addons] Applied essential addon: CoreDNS
    [addons] Applied essential addon: kube-proxy
    
    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    Alternatively, if you are the root user, you can run:
    
      export KUBECONFIG=/etc/kubernetes/admin.conf
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    Then you can join any number of worker nodes by running the following on each as root:
    
    kubeadm join x.x.x.x:6443 --token dagtwm.98989 \
    	--discovery-token-ca-cert-hash sha256:9898989 
    
    1. Run join on worker
    kubeadm join 9.47.90.127:6443 --token dagtwm.9898989 	--discovery-token-ca-cert-hash sha256:9898989
    
    1. Configure kubectl on the master node.
    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    1. Confirm that you are running on a P10 and the nodes are ready.

    a. Confirm CPU architecture

    [root@ocp4daily70 ~]# cat /proc/cpuinfo | grep cpu | uniq
    cpu		: POWER10 (architected), altivec supported
    

    b. Confirm the nodes are ready

    [root@ocp4daily70 ~]# kubectl get nodes
    NAME                 STATUS   ROLES           AGE   VERSION
    ocp4daily17.nip.io   Ready    control-plane   40m   v1.25.2
    ocp4daily70.nip.io   Ready    <none>          38m   v1.25.2
    

    You now have a working P10 with RHEL and Kubernetes.

    Debugging

    If you see a NetworkReady error like this…

    Sep 29 13:17:00 ocp4daily17.x.x.x.x kubelet[67264]: E0929 13:17:00.108806 67264 kubelet.go:2373] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?"

    1. Check that CRI-O is configured with systemd and not cgroupfs

    2. Restart CRIO

    systemctl stop crio; sleep 10s; systemctl start crio
    

    Warnings that indicate the cgroupfs cgroup driver

    You should use systemd as the cgroup driver. Check that there is no /etc/default/kubelet file overriding the cgroup-driver setting.
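
    A quick way to check both sides (a sketch):

    grep -E 'cgroup_manager|conmon_cgroup' /etc/crio/crio.conf   # expect cgroup_manager = "systemd"
    cat /etc/default/kubelet 2>/dev/null                         # should not set a cgroupfs cgroup driver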

    References

    • http://mirror.stream.centos.org/9-stream/AppStream/ppc64le/os/Packages/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
    • https://upcloud.com/resources/tutorials/install-kubernetes-cluster-centos-8
    • https://github.com/cri-o/cri-o/blob/main/tutorials/kubeadm.md
    • https://www.linuxtechi.com/how-to-install-kubernetes-cluster-rhel/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
    • https://kubernetes.io/docs/setup/production-environment/container-runtimes/
    • https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/
    • https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/
  • Operator Doesn’t Install Successfully: How to restart it

    You see there is an issue with unpacking your operator in the OperatorHub.

    Fix it by recreating the Job that does the download, along with the Subscription.

    1. Find the Job (per RH 6459071)
    $ oc get job -n openshift-marketplace -o json | jq -r '.items[] | select(.spec.template.spec.containers[].env[].value|contains ("myop")) | .metadata.name'

    2. Reset the download Job

    for i in $(oc get job -n openshift-marketplace -o json | jq -r '.items[] | select(.spec.template.spec.containers[].env[].value|contains ("myop")) | .metadata.name'); do
      oc delete job $i -n openshift-marketplace; 
      oc delete configmap $i -n openshift-marketplace; 
    done

    3. Recreate your Subscription and you’ll see more details on the Job’s failure. Keep an eagle eye on the updates as it rolls over quickly.

    Message: rpc error: code = Unknown desc = pinging container registry registry.stage.redhat.io: Get "https://xyz/v2/": x509: certificate signed by unknown authority.
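
    To catch details like the message above before they roll over, watch the namespace while the new Job runs (a sketch using standard oc commands):

    oc get jobs,pods -n openshift-marketplace -w
    oc get events -n openshift-marketplace --sort-by=.lastTimestamp | tail -n 20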

    You’ve seen how to restart the download/pull-through job.

  • Downloading pvsadm and getting VIP details

    pvsadm is an unsupported tool that helps with Power Virtual Server administration. I needed this detail for my CAPI tests.

    1. Get the latest download_url per StackOverflow
    $ curl -s https://api.github.com/repos/ppc64le-cloud/pvsadm/releases/latest | grep browser_download_url | cut -d '"' -f 4
    ...
    https://github.com/ppc64le-cloud/pvsadm/releases/download/v0.1.7/pvsadm-linux-ppc64le
    ...
    
    1. Download the pvsadm tool using the url from above.
    $ curl -o pvsadm -L https://github.com/ppc64le-cloud/pvsadm/releases/download/v0.1.7/pvsadm-linux-ppc64le
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    100 21.4M  100 21.4M    0     0  34.9M      0 --:--:-- --:--:-- --:--:-- 34.9M
    
    1. Make the pvsadm tool executable
    $ chmod +x pvsadm
    
    1. Create the API Key at https://cloud.ibm.com/iam/apikeys

    2. On the terminal, export the IBMCLOUD_API_KEY.

    $ export IBMCLOUD_API_KEY=...REDACTED...      
    
    1. Grab the details of your network VIP using your service name and network.
    $ ./pvsadm get ports --instance-name demo --network topman-pub-net
    I0808 10:41:26.781531  125151 root.go:49] Using an API key from IBMCLOUD_API_KEY environment variable
    +-------------+----------------+----------------+-------------------+--------------------------------------+--------+
    | DESCRIPTION |   EXTERNALIP   |   IPADDRESS    |    MACADDRESS     |                PORTID                | STATUS |
    +-------------+----------------+----------------+-------------------+--------------------------------------+--------+
    |             | 1.1.1.1        | 2.2.2.2        | aa:24:7c:5d:cb:bb | aaa-bbb-ccc-ddd-eee                  | ACTIVE |
    +-------------+----------------+----------------+-------------------+--------------------------------------+--------+
    
  • PowerVS: Grabbing a VM Instance Console

    1. Create the API Key at https://cloud.ibm.com/iam/apikeys

    2. On the terminal, export the IBMCLOUD_API_KEY.

    $  export IBMCLOUD_API_KEY=...REDACTED...      
    
    1. Log in to IBM Cloud using the command-line tool https://www.ibm.com/cloud/cli
    $ ibmcloud login --apikey "${IBMCLOUD_API_KEY}" -r ca-tor
    API endpoint: https://cloud.ibm.com
    Authenticating...
    OK
    
    Targeted account Demo <-> 1012
    
    Targeted region ca-tor
    
    Users of 'ibmcloud login --vpc-cri' need to use this API to login until July 6, 2022: https://cloud.ibm.com/apidocs/vpc-metadata#create-iam-token
                          
    API endpoint:      https://cloud.ibm.com   
    Region:            ca-tor   
    User:              myuser@us.ibm.com   
    Account:           Demo <-> 1012   
    Resource group:    No resource group targeted, use 'ibmcloud target -g RESOURCE_GROUP'   
    CF API endpoint:      
    Org:                  
    Space:  
    
    1. List your PowerVS services
    $ ibmcloud pi sl
    Listing services under account Demo as user myuser@us.ibm.com...
    ID                                                                                                                   Name   
    crn:v1:bluemix:public:power-iaas:mon01:a/999999c1f1c29460e8c2e4bb8888888:ADE123-8232-4a75-a9d4-0e1248fa30c6::     demo-service   
    
    1. Target your PowerVS instance
    $ ibmcloud pi st crn:v1:bluemix:public:power-iaas:mon01:a/999999c1f1c29460e8c2e4bb8888888:ADE123-8232-4a75-a9d4-0e1248fa30c6::    
    
    1. List the PowerVS Services’ VMs
    $ ibmcloud pi ins                                                  
    Listing instances under account Demo as user myuser@us.ibm.com...
    ID                                     Name                                   Path   
    12345-ae8f-494b-89f3-5678   control-plane-x       /pcloud/v1/cloud-instances/abc-def-ghi-jkl/pvm-instances/12345-ae8f-494b-89f3-5678   
    
    1. Create a Console for the VM instance you want to look at:
    $ ibmcloud pi ingc control-plane-x
    Getting console for instance control-plane-x under account Demo as user myuser@us.ibm.com...
                     
    Name          control-plane-x   
    Console URL   https://mon01-console.power-iaas.cloud.ibm.com/console/index.html?path=%3Ftoken%3not-real  
    
    1. Click on the Console URL and view it in your browser; it can be very helpful.

    I was able to diagnose that I had the wrong reference image.