Tag: linux

  • A couple IBM Power related updates

    A couple quick updates…

    OpenTofu – a Terraform-compatible build for ppc64le

    The Oregon State University Open Source Lab (OSU OSL) provides Power servers to develop and test open source projects on the Power Architecture platform. OSU OSL provides ppc64le VMs and bare metal machines as well as CI. Read more about their Power services here.

    You can download the latest version of OpenTofu for ppc64le here. A pull request for a documentation update has now merged. View the official OpenTofu documentation here.

    https://community.ibm.com/community/user/powerdeveloper/blogs/mick-tarsel/2024/03/04/opentofu-openshift-ppc64le

    Cost Management for OpenShift is a SaaS offering that gives users cost visibility across their hybrid cloud environments. The Cost Management Operator obtains OpenShift usage data by querying Prometheus every hour to create usage reports, which are then uploaded to Cost Management at console.redhat.com to be processed and viewed.

    Red Hat Cost Management is now available on IBM Power with the latest release, version 3.2.

     https://community.ibm.com/community/user/powerdeveloper/blogs/jason-cho2/2024/03/04/red-hat-cost-management-on-ibm-power?CommunityKey=daf9dca2-95e4-4b2c-8722-03cd2275ab63

    FYI: Chandan posted Multi-Architecture Compute: Supporting Architecture Specific Operating System and Kernel Parameters https://community.ibm.com/community/user/powerdeveloper/blogs/chandan-abhyankar/2024/03/06/multi-architecture-compute-supporting-architecture

  • January 2023 – Lessons Learned

    Over the month I learned a lot, and I want to share some of it as snippets that you might find useful.

    Create a virtual server instance in IBM Power Virtual Server using Red Hat Ansible Automation Platform

    The Power Developer Exchange article dives into using the Red Hat Ansible Automation Platform and how to create PowerVS instances with Ansible. The collection is available at https://github.com/IBM-Cloud/ansible-collection-ibm

    Per the blog, you learn to start a sample controller UI and run a sample program, such as the hello_world.yaml playbook, to say hello to Ansible. With Ansible the options are infinite, and there is always something more to explore. We would like to know how you are using this solution, so drop us a comment.
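
    As a rough illustration of the kind of playbook the blog refers to, the sketch below writes a minimal hello-world playbook to disk. The playbook content and the helper name write_hello_playbook are assumptions, not the blog's actual hello_world.yaml; the only module used is ansible.builtin.debug.

```shell
# Sketch only: writes a stand-in hello-world playbook (not the blog's file).
# write_hello_playbook is a hypothetical helper name.
write_hello_playbook() {
    target="${1:-hello_world.yaml}"
    cat <<'EOF' > "$target"
---
- name: Say hello to Ansible
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Print a greeting
      ansible.builtin.debug:
        msg: "Hello, Ansible!"
EOF
}
```

    Outside the controller UI, something like `write_hello_playbook hello_world.yaml && ansible-playbook hello_world.yaml` should run it against localhost.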

    IBM Power Developer Exchange

    kube-burner is now a CNCF project

    kube-burner is a Kubernetes performance and scale test orchestration framework written in Go.

    kube-burner

    Clock Drift Fix for Podman

    To fix clock drift on the default Podman machine, update chrony's makestep setting and restart chronyd:

    podman machine ssh --username root -- sed -i 's/^makestep\ .*$/makestep\ 1\ -1/' /etc/chrony.conf
    podman machine ssh --username root -- systemctl restart chronyd

    https://github.com/containers/podman/issues/11541#issuecomment-1416695974
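
    To preview what that sed expression does before touching the VM, the same substitution can be tried on a scratch copy of the config. This is a minimal sketch: the helper name apply_makestep and the scratch file are assumptions. The resulting `makestep 1 -1` tells chronyd to step the clock whenever the offset exceeds 1 second, with no limit on how many updates may be stepped.

```shell
# Sketch: rewrite any existing makestep directive to "makestep 1 -1".
# apply_makestep is a hypothetical helper; pass the path to a copy of chrony.conf.
apply_makestep() {
    conf="$1"
    # Portable alternative to `sed -i`: write to a temp file, then replace the original.
    sed 's/^makestep .*$/makestep 1 -1/' "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

    For example, copy /etc/chrony.conf out of the machine, run `apply_makestep ./chrony.conf`, and diff the result against the original.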

    Advanced Cluster Management across Networks

    The cluster wasn’t getting loaded, so I checked the following, and it pointed to an issue with a callback to a cluster inside my firewall setup. The klusterlet status shows that the callback is the problem.

    oc get pod -n open-cluster-management-agent


    ❯ oc get klusterlet klusterlet -oyaml
    Failed to create &SelfSubjectAccessReview{ObjectMeta:{ 0 0001-01-01 00:00:00 +0000 UTC map[] map[] [] [] []},Spec:SelfSubjectAccessReviewSpec{ResourceAttributes:&ResourceAttributes{Namespace:,Verb:create,Group:cluster.open-cluster-management.io,Version:,Resource:managedclusters,Subresource:,Name:,},NonResourceAttributes:nil,},Status:SubjectAccessReviewStatus{Allowed:false,Reason:,EvaluationError:,Denied:false,},} with bootstrap secret “open-cluster-management-agent” “bootstrap-hub-kubeconfig”: Post “https://api.<XYZ>.com:6443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews”: dial tcp: lookup api.acmfunc.cp.fyre.ibm.com on 172.30.0.10:53: no such host
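
    The error ends in `no such host`, meaning the DNS server used by the managed cluster cannot resolve the hub's API hostname. Below is a minimal sketch for checking that from a machine on the same network. api_host_from_url and check_resolves are hypothetical helper names, and getent is assumed to be installed.

```shell
# Sketch: pull the hostname out of an API URL and test whether it resolves.
# Both function names are hypothetical helpers.
api_host_from_url() {
    url="$1"
    host="${url#*://}"   # drop the scheme
    host="${host%%/*}"   # drop the path
    host="${host%%:*}"   # drop the port
    printf '%s\n' "$host"
}

check_resolves() {
    # getent resolves names through the system's name service switch.
    if getent hosts "$1" > /dev/null; then
        echo "$1 resolves"
    else
        echo "$1 does not resolve"
        return 1
    fi
}
```

    Usage would look like `check_resolves "$(api_host_from_url "$HUB_API_URL")"`, where HUB_API_URL is a placeholder for the hub's API endpoint.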

    Fun way to look at design

  • Identifying Kernel Memory Usage Culprits

    If you suspect a kernel memory leak because slabtop --sort c shows high memory usage, you can use the following steps to confirm the culprit with slub_debug=U. (Thanks to ServerFault.)

    1. Log in to OpenShift
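
    For a non-interactive view of roughly what slabtop --sort c ranks, the caches listed in /proc/slabinfo can be ordered by an approximate size. This is a rough sketch: slab_cache_sizes is a hypothetical helper, and the estimate (number of objects times object size) is a simplification of what slabtop reports.

```shell
# Sketch: rank slab caches by approximate size in bytes (num_objs * objsize).
# Reads /proc/slabinfo-format text on stdin; the optional argument is how many rows to show.
slab_cache_sizes() {
    # Skip the two header lines, then print "<bytes> <cache name>".
    awk '$1 != "slabinfo" && $1 != "#" { printf "%.0f %s\n", $3 * $4, $1 }' \
        | sort -n -r \
        | head -n "${1:-10}"
}
```

    On a node, `slab_cache_sizes 10 < /proc/slabinfo` (as root) should list the ten largest caches by this estimate.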
    $ oc login
    
    1. Check that you don’t already see 99-master-kargs-slub.
    $ oc get mc 99-master-kargs-slub
    
    1. Create the slub_debug=U kernel argument. Note that it’s assigned to the master role.
    cat << EOF > 99-master-kargs-slub.yaml
    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      labels:
        machineconfiguration.openshift.io/role: master
      name: 99-master-kargs-slub
    spec:
      kernelArguments:
      - slub_debug=U
    EOF
    
    1. Create the Kernel Arguments Machine Config.
    $ oc apply -f 99-master-kargs-slub.yaml 
    machineconfig.machineconfiguration.openshift.io/99-master-kargs-slub created
    
    1. Wait until the master nodes are updated.
    $ oc wait mcp/master --for condition=updated --timeout=25m
    machineconfigpool.machineconfiguration.openshift.io/master condition met
    
    1. Confirm the node status as soon as it’s up, and list the master nodes.
    $ oc get nodes -l machineconfiguration.openshift.io/role=master
    NAME                                                    STATUS   ROLES    AGE   VERSION
    lon06-master-0.xip.io   Ready    master   30d   v1.23.5+3afdacb
    lon06-master-1.xip.io   Ready    master   30d   v1.23.5+3afdacb
    lon06-master-2.xip.io   Ready    master   30d   v1.23.5+3afdacb
    
    1. Connect to the master node and switch to the root user
    $ ssh core@lon06-master-0.xip.io
    $ sudo su -
    
    1. Check the kmalloc-32 allocation
    $  cat /sys/kernel/slab/kmalloc-32/alloc_calls | sort -n  | tail -n 5
       4334 iomap_page_create+0x80/0x190 age=0/654342/2594020 pid=1-39569 cpus=0-7
       5655 selinux_sk_alloc_security+0x5c/0xd0 age=916/1870136/2594937 pid=0-39217 cpus=0-7
      41908 __kernfs_new_node+0x70/0x2d0 age=406911/2326294/2594938 pid=0-38398 cpus=0-7
    9969728 memcg_update_all_list_lrus+0x1bc/0x550 age=2564414/2567167/2594607 pid=1 cpus=0-7
    19861376 __list_lru_init+0x2b8/0x480 age=406870/2007921/2594449 pid=1-38406 cpus=0-7
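
    The alloc_calls listing from the last step can be condensed to a count and a function name per caller. A minimal sketch follows; top_slab_callers is a hypothetical helper that reads alloc_calls-format lines on standard input.

```shell
# Sketch: show the N largest allocators as "<count> <function>".
# top_slab_callers is a hypothetical helper; N defaults to 5.
top_slab_callers() {
    sort -n -r \
        | head -n "${1:-5}" \
        | awk '{ sub(/\+.*/, "", $2); print $1, $2 }'   # strip "+offset/size" from the symbol
}
```

    As root on the node, `top_slab_callers 5 < /sys/kernel/slab/kmalloc-32/alloc_calls` gives the same ranking as the command above, largest first.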
    

    This shows that memcg_update_all_list_lrus accounts for a very large number of kmalloc-32 allocations, an issue that is fixed in a patch to the Linux kernel.
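
    The MachineConfig used in the steps is tied to the master role. As a sketch of how the same manifest could be generated for another pool, the hypothetical helper render_kargs_mc takes the role and the kernel argument as parameters. The 99-<role>-kargs-slub naming simply mirrors the name used in the steps.

```shell
# Sketch: print a kernel-argument MachineConfig for a given role.
# render_kargs_mc is a hypothetical helper; defaults match the steps in this post.
render_kargs_mc() {
    role="${1:-master}"
    karg="${2:-slub_debug=U}"
    cat <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: ${role}
  name: 99-${role}-kargs-slub
spec:
  kernelArguments:
  - ${karg}
EOF
}
```

    For example, `render_kargs_mc worker | oc apply -f -` would target the worker pool. Since slub_debug adds tracking overhead, it is probably worth deleting the MachineConfig again once the investigation is done.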

    References

    1. ServerFault: Debugging kmalloc-64 slab allocations / memory leak, https://serverfault.com/questions/1020241/debugging-kmalloc-64-slab-allocations-memory-leak
    2. Kmalloc Internals: Exploring Linux Kernel Memory Allocation, http://www.jikos.cz/jikos/Kmalloc_Internals.html
    3. https://stackoverflow.com/questions/20079767/what-is-different-functions-malloc-and-kmalloc
    4. How I investigated memory leaks in Go using pprof on a large codebase
    5. Using Go 1.10 new trace features to debug an integration test
    6. Kernel Memory Leak Detector
    7. go-slab – slab allocator in go
    8. Red Hat Customer Support Portal: Interpreting /proc/meminfo and free output for Red Hat Enterprise Linux
    9. Red Hat Customer Support Portal: Determine how much memory is being used on the system
    10. Red Hat Customer Support Portal: Determine how much memory and what kind of objects the kernel is allocating
  • Playing with buildah and ubi-micro: Part 1

    buildah is an intriguing open source tool for building Open Container Initiative (OCI) container images using a scripted approach rather than a traditional Dockerfile. It’s fascinating, and I’ve started to use podman and buildah to build my project’s images.

    I picked ubi-micro as my starting point. Per Red Hat, ubi-micro is the smallest possible image, excluding the package manager and all of its dependencies, which are normally included in a container image. This approach is an alternative to the current release of the IBM FHIR Server image. The following only documents my first stages with Java testing.

    1. On Fedora, install the prerequisites.
    # sudo dnf install buildah -y
    Last metadata expiration check: 0:23:36 ago on Thu 02 Sep 2021 10:06:55 AM EDT.
    Dependencies resolved.
    =====================================================================================================================================================================
     Package                               Architecture                         Version                                      Repository                             Size
    =====================================================================================================================================================================
    Installing:
     buildah                               x86_64                               1.21.4-5.fc33                                updates                               7.9 M
    
    Transaction Summary
    =====================================================================================================================================================================
    Install  1 Package
    
    Total download size: 7.9 M
    Installed size: 29 M
    Downloading Packages:
    buildah-1.21.4-5.fc33.x86_64.rpm                                                                                                     7.2 MB/s | 7.9 MB     00:01
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Total                                                                                                                                6.2 MB/s | 7.9 MB     00:01
    Running transaction check
    Transaction check succeeded.
    Running transaction test
    Transaction test succeeded.
    Running transaction
      Preparing        :                                                                                                                                             1/1
      Installing       : buildah-1.21.4-5.fc33.x86_64                                                                                                                1/1
      Running scriptlet: buildah-1.21.4-5.fc33.x86_64                                                                                                                1/1
      Verifying        : buildah-1.21.4-5.fc33.x86_64                                                                                                                1/1
    
    Installed:
      buildah-1.21.4-5.fc33.x86_64
    
    Complete!
    
    1. Start the new image
    # microcontainer=$(buildah from registry.access.redhat.com/ubi8/ubi-micro)
    Trying to pull registry.access.redhat.com/ubi8/ubi-micro:latest...
    Getting image source signatures
    Copying blob 4f4fb700ef54 done
    Copying blob 098a109c8679 done
    Copying config c5ba898d36 done
    Writing manifest to image destination
    Storing signatures
    
    1. Confirm the container name.
    # echo $microcontainer
    ubi-micro-working-container
    
    1. Mount the layer locally and display the path.
    # micromount=$(buildah mount $microcontainer)
    # echo $micromount
    /var/lib/containers/storage/overlay/14c524d6a5ef0e94887bc52685dbe911b40a5a9e39a6df00dc3b02e5f5ad7796/merged
    
    1. Set up the AdoptOpenJDK repository.
    cat <<'EOF' > $micromount/etc/yum.repos.d/adoptopenjdk.repo
    [AdoptOpenJDK]
    name=AdoptOpenJDK
    baseurl=http://adoptopenjdk.jfrog.io/adoptopenjdk/rpm/rhel/8/$basearch
    enabled=1
    gpgcheck=1
    gpgkey=https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public
    EOF
    
    1. Install to micromount without any ancillary dependencies.
    yum install \
        --installroot $micromount \
        --releasever 8 \
        --setopt install_weak_deps=false \
        --nodocs -y \
        adoptopenjdk-11-openj9xl.x86_64
    

    Results in:

    ------------------------------------------------------------------------------------------------------------------------------------
    Total                                                                                               8.9 MB/s | 193 MB     00:21
    warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
    warning: /var/lib/containers/storage/overlay/14c524d6a5ef0e94887bc52685dbe911b40a5a9e39a6df00dc3b02e5f5ad7796/merged/var/cache/dnf/AdoptOpenJDK-096a01411439d076/packages/adoptopenjdk-11-openj9xl-11.0.10+9.openj9-0.24.0-3.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 74885c03: NOKEY
    AdoptOpenJDK                                                                                         13 kB/s | 3.1 kB     00:00
    warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
    Importing GPG key 0x74885C03:
     Userid     : "AdoptOpenJDK (used for publishing RPM and DEB files) <adoptopenjdk@gmail.com>"
     Fingerprint: 8ED1 7AF5 D7E6 75EB 3EE3 BCE9 8AC3 B291 7488 5C03
     From       : https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public
    
    1. Clean up the yum cache
    # yum clean all \
     --installroot $micromount
    warning: Found bdb Packages database while attempting sqlite backend: using bdb backend.
    61 files removed
    
    1. Unmount the container
    buildah umount $microcontainer
    
    1. Commit the image
    buildah commit $microcontainer ubi-micro-java
    
    1. Confirm the image
    # buildah images
    REPOSITORY                                  TAG        IMAGE ID       CREATED          SIZE
    localhost/ubi-micro-java                    latest     334404b8ebf2   22 seconds ago   43 MB
    

    It’s about 40 MB smaller than ubi-minimal, as it has no docs or ancillary dependencies.
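
    The manual steps can be collected into one function so the build is repeatable. This is a sketch under assumptions, not a tested build script: build_ubi_micro_java is a hypothetical name, it is assumed to run as root (rootless use would need `buildah unshare`), and yum must be available on the host. The commands and the repository definition are the ones from the steps.

```shell
# Sketch: scripted version of the manual buildah steps.
# build_ubi_micro_java is a hypothetical helper; arguments are the image name and the JDK package.
build_ubi_micro_java() {
    image_name="${1:-ubi-micro-java}"
    package="${2:-adoptopenjdk-11-openj9xl.x86_64}"

    # Start a working container from ubi-micro and mount its root filesystem.
    container=$(buildah from registry.access.redhat.com/ubi8/ubi-micro) || return 1
    mountpoint=$(buildah mount "$container") || return 1

    # Add the AdoptOpenJDK repository inside the mounted root ($basearch stays literal).
    mkdir -p "$mountpoint/etc/yum.repos.d"
    cat <<'EOF' > "$mountpoint/etc/yum.repos.d/adoptopenjdk.repo"
[AdoptOpenJDK]
name=AdoptOpenJDK
baseurl=http://adoptopenjdk.jfrog.io/adoptopenjdk/rpm/rhel/8/$basearch
enabled=1
gpgcheck=1
gpgkey=https://adoptopenjdk.jfrog.io/adoptopenjdk/api/gpg/key/public
EOF

    # Install the JDK without weak dependencies or docs, then drop the yum cache.
    yum install --installroot "$mountpoint" --releasever 8 \
        --setopt install_weak_deps=false --nodocs -y "$package" || return 1
    yum clean all --installroot "$mountpoint"

    # Unmount and commit the result as a new image.
    buildah umount "$container"
    buildah commit "$container" "$image_name"
}
```

    Calling `build_ubi_micro_java ubi-micro-java` should leave the same localhost/ubi-micro-java image as the manual steps.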

    Tip: Starting with the IBM FHIR Server

    To start with the IBM FHIR Server image, you can use:

    buildah from --pull docker.io/ibmcom/ibm-fhir-server:latest
    
    [root@localhost ~]# buildah from --pull docker.io/ibmcom/ibm-fhir-server:latest
    Trying to pull docker.io/ibmcom/ibm-fhir-server:latest...
    Getting image source signatures
    Copying blob e2bef77118c7 done
    Copying blob 45cc8b7f2b43 done
    Copying blob 5627e846e80f done
    Copying blob 5f6bf015319e done
    Copying blob 87212cfd39ea done
    Copying blob b89ea354ae59 done
    Copying blob 4a939b72e1c6 done
    Copying blob d3cbf41efb4e done
    Copying blob 4feff1abc28e done
    Copying blob 9ff4465d271b done
    Copying blob 5e41012b4001 done
    Copying blob 410af8b678f6 done
    Copying blob 2f26dc40d01f done
    Copying blob 1415c9c2e161 done
    Copying blob e374de62001e done
    Copying blob 94d978ce0b1f done
    Copying blob 1fabae8675b6 done
    Copying blob 7b088cbebf16 done
    Copying blob 4167c1ebbd85 done
    Copying config 637552c186 done
    Writing manifest to image destination
    Storing signatures
    ibm-fhir-server-working-container
    

    Tip: Pulling Fedora

    If you need to use Fedora, you can use fedora-minimal.

    # buildah from --pull registry.fedoraproject.org/fedora-minimal
    

    To remove the image

    $ podman image rm registry.fedoraproject.org/fedora-minimal:34
    

    Tip: Running with SELinux

    If you are running with SELinux enabled, you should set the following SELinux boolean.

    1. Set the permission
    $ setsebool -P container_manage_cgroup 1
    
    1. Confirm the permission
    $ getsebool container_manage_cgroup
    container_manage_cgroup --> on
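
    To make the setting idempotent in a script, the boolean can be checked first and changed only when it is off. A minimal sketch follows; ensure_sebool is a hypothetical helper, and it relies on the getsebool output format shown above.

```shell
# Sketch: switch an SELinux boolean on only if it is not already on.
# ensure_sebool is a hypothetical helper; setsebool -P needs root.
ensure_sebool() {
    name="$1"
    # getsebool prints "<name> --> on|off"; take the last field.
    state=$(getsebool "$name" | awk '{ print $NF }')
    if [ "$state" = "on" ]; then
        echo "$name already on"
    else
        setsebool -P "$name" 1 && echo "$name switched on"
    fi
}
```

    Usage: `ensure_sebool container_manage_cgroup`.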
    


  • htop and docker

    I was recently introduced to htop – it provides a really nice UI to see what’s going on.

      • Install htop
    yum install htop -y
    ...
    Transaction test succeeded
    Running transaction
      Installing : htop-2.2.0-3.el7.x86_64                                                                                          1/1
      Verifying  : htop-2.2.0-3.el7.x86_64                                                                                         1/1
    Installed:
      htop.x86_64 0:2.2.0-3.el7
    Complete!
    

      • Type htop and start diagnosing (iostat also helps)
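
    htop shows host PIDs, so it can help to map a busy process back to its Docker container. One possible approach, not from the original post, is to look for the 64-character container ID in the process's cgroup file. container_id_from_cgroup is a hypothetical helper, and the cgroup path layout it expects is an assumption that varies with the cgroup version and driver.

```shell
# Sketch: extract the first 64-hex-character container ID from cgroup text on stdin.
# container_id_from_cgroup is a hypothetical helper; prints nothing if no ID is found.
container_id_from_cgroup() {
    grep -oE '[0-9a-f]{64}' | head -n 1
}
```

    For a PID picked in htop, `container_id_from_cgroup < /proc/<pid>/cgroup` yields an ID that can be matched against `docker ps --no-trunc`.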