Category: Application Development

  • Bash Fu  ${%%}

    Thanks to Gerrit for cluing me in.

    In Bash, symbols like # and % aren’t just random noise—they are powerful operators used for Parameter Expansion. They allow you to “trim” or “slice” strings stored in variables without needing external tools like sed or awk.

    To understand ${%%}, we have to break down how Bash sees those symbols.

    1. The Core Logic: Front vs. Back

    Think of these symbols as “knives” that cut parts of your string based on a pattern:
    Symbol   Action                           Mnemonic
    #        Removes from the front (left)    The # is on the left side of a standard keyboard (Shift+3).
    %        Removes from the back (right)    The % is on the right side of the # (Shift+5).
    2. Doubling Up: Small vs. Large

    The number of symbols determines how “aggressive” the cut is:
    • Single (# or %): Non-greedy. It removes the shortest possible match.
    • Double (## or %%): Greedy. It removes the longest possible match.
    3. Practical Examples

    Let’s say we have a variable: file="image.jpg.backup"

    Using # and ## (Removing from the Front)

    • ${file#*.} → Result: jpg.backup (Cut the shortest bit ending in a dot).
    • ${file##*.} → Result: backup (Cut everything up to the last dot).

    Using % and %% (Removing from the Back)

    • ${file%.*} → Result: image.jpg (Cut the shortest bit starting from a dot at the end).
    • ${file%%.*} → Result: image (Cut everything from the first dot to the end).

    If you have VAR="long.file.name.txt":

    Syntax        Logic                              Result
    ${VAR#*.}     Delete shortest match from front   file.name.txt
    ${VAR##*.}    Delete longest match from front    txt
    ${VAR%.*}     Delete shortest match from back    long.file.name
    ${VAR%%.*}    Delete longest match from back     long

    Quick Tip: If you ever forget which is which, remember that on the keyboard, # is to the left of %. Therefore, # handles the left (start) of the string, and % handles the right (end).
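    The four expansions above can be exercised in one short, self-contained script:

```shell
#!/bin/sh
# The same pattern ('*.' or '.*') applied with all four operators.
file="image.jpg.backup"

echo "${file#*.}"    # shortest match trimmed from the front -> jpg.backup
echo "${file##*.}"   # longest match trimmed from the front  -> backup
echo "${file%.*}"    # shortest match trimmed from the back  -> image.jpg
echo "${file%%.*}"   # longest match trimmed from the back   -> image
```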

  • Docling with IBM Power

    Originally posted to https://community.ibm.com/community/user/blogs/paul-bastide/2026/03/20/docling-with-ibm-power

    If you’ve been following the rapid evolution of document parsing in AI, you’ve likely encountered Docling. It’s a powerhouse for converting complex PDFs and documents into machine-readable formats. The AI Services team and the IBM Power Python Ecosystem team have provided all of the requirements, so you can use Docling and stay up to date as it iterates rapidly.

    For Python developers using IBM Power, this article provides a recipe for using Docling. You can also learn more about using the Python Ecosystem at https://community.ibm.com/community/user/blogs/janani-janakiraman/2025/09/10/developing-apps-using-python-packages-on-ibm-power

    The Recipe: Step-by-Step Installation

    This guide assumes you are working in a Linux environment (specifically optimized for ppc64le architectures, though the logic holds for most setups).

    1. Prepare Your Environment

    Start by setting up a fresh virtual environment to avoid dependency issues:

    # create and activate a fresh virtual environment
    python3 -m venv ./test-venv
    source ./test-venv/bin/activate
    # upgrade the environment in place to Python 3.12
    python3.12 -m venv --upgrade test-venv/
    

    2. Define the Requirements

    The AI Services team has identified a specific “golden set” of versions that play well together. Create a requirements.txt file containing the necessary packages, including docling, torch, and transformers.

    accelerate==1.13.0
    annotated-doc==0.0.4
    annotated-types==0.7.0
    antlr4-python3-runtime==4.9.3
    attrs==26.1.0
    beautifulsoup4==4.14.3
    certifi==2026.2.25
    charset-normalizer==3.4.6
    click==8.3.1
    colorlog==6.10.1
    defusedxml==0.7.1
    dill==0.4.1
    docling==2.77.0
    docling-core==2.70.2
    docling-ibm-models==3.12.0
    docling-parse==5.3.2
    et_xmlfile==2.0.0
    Faker==40.11.0
    filelock==3.25.2
    filetype==1.2.0
    fsspec==2026.2.0
    huggingface_hub==0.36.2
    idna==3.11
    Jinja2==3.1.6
    jsonlines==4.0.0
    jsonref==1.1.0
    jsonschema==4.26.0
    jsonschema-specifications==2025.9.1
    latex2mathml==3.79.0
    lxml==6.0.2
    markdown-it-py==4.0.0
    marko==2.2.2
    MarkupSafe==3.0.3
    mdurl==0.1.2
    mpire==2.10.2
    mpmath==1.3.0
    multiprocess==0.70.19
    networkx==3.6.1
    numpy==2.4.1
    omegaconf==2.3.0
    opencv-python==4.10.0.84+ppc64le2
    openpyxl==3.1.5
    packaging==26.0
    pandas==2.3.3
    pillow==12.1.1
    pip==26.0.1
    pluggy==1.6.0
    polyfactory==3.3.0
    psutil==7.2.2
    pyclipper==1.4.0
    pydantic==2.12.5
    pydantic_core==2.41.5
    pydantic-settings==2.13.1
    Pygments==2.19.2
    pylatexenc==2.10
    pypdfium2==5.6.0
    python-dateutil==2.9.0.post0
    python-docx==1.2.0
    python-dotenv==1.2.2
    python-pptx==1.0.2
    pytz==2026.1.post1
    PyYAML==6.0.3
    rapidocr==3.7.0
    referencing==0.37.0
    regex==2026.2.28
    requests==2.32.5
    rich==14.3.3
    rpds-py==0.30.0
    rtree==1.4.1
    safetensors==0.7.0
    scipy==1.17.0
    semchunk==3.2.5
    shapely==2.1.2
    shellingham==1.5.4
    six==1.17.0
    soupsieve==2.8.3
    sympy==1.14.0
    tabulate==0.10.0
    tokenizers==0.22.2
    torch==2.9.1
    torchvision==0.24.1
    tqdm==4.67.3
    transformers==4.57.6
    tree-sitter==0.25.2
    tree-sitter-c==0.24.1
    tree-sitter-javascript==0.25.0
    tree-sitter-python==0.25.0
    tree-sitter-typescript==0.23.2
    typer==0.21.2
    typing_extensions==4.15.0
    typing-inspection==0.4.2
    tzdata==2025.3
    urllib3==2.6.3
    xlsxwriter==3.2.9

    Note: Ensure you include the full list of dependencies (like docling==2.77.0 and docling-core==2.70.2) to maintain stability across your build.

    If you need OCR, you will need to run:

     yum install -y --setopt=tsflags=nodocs python3.12-devel python3.12-pip \
            lcms2-devel openblas-devel freetype libicu libjpeg-turbo && \
        yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm && \
        yum install -y spatialindex-devel

    3. The Installation Secret Sauce

    Before running the install, ensure pip is at its latest version. Then, use the --extra-index-url flag to point to the optimized IBM developer wheels. This is the trick to getting prebuilt, Power-optimized binaries instead of lengthy source compilation.

    pip install --upgrade pip
    pip install -r requirements.txt \
        --extra-index-url=https://wheels.developerfirst.ibm.com/ppc64le/linux \
        --prefer-binary
    

    Verifying the Build

    Once the installation completes, it’s a good idea to run a “smoke test” to ensure the models can be fetched properly. You can use a simple script to trigger the model downloads:

    # download_docling_models.py
    from docling.pipeline.standard_pdf_pipeline import StandardPdfPipeline
    
    # This triggers the download of Layout & TableFormer models
    pipeline = StandardPdfPipeline()
    print("Download complete.")
    

    When you see the output Downloading ds4sd--docling-models (Layout & TableFormer)..., you’re officially ready to start parsing.

    Why This Matters

    By focusing on the dependencies rather than the wheel itself, the AI Services team has given us a way to stay agile. We get the latest features of Docling without the overhead of waiting for official distribution builds to catch up to the repo’s velocity.

    Special credit to Yussuf and his test!

  • Dynamic GOMAXPROCS

    Go 1.25 adds container-aware GOMAXPROCS. Instead of assuming it has all available processors, Go respects the CPU limits specified via cgroup v2. This feature ensures containers aren’t oversubscribed or killed for trying to use too much CPU.

    You can disable this feature using GODEBUG=containermaxprocs=0, or tweak GOMAXPROCS as you need (for instance, specifying only 1 CPU when you have 2 or 8 threads available).
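    As a sketch of how the knobs are passed: here env | grep stands in for a real Go binary (e.g. a hypothetical ./my-go-service), so the lines are runnable anywhere:

```shell
# GODEBUG knob added in Go 1.25: disable container-aware GOMAXPROCS.
GODEBUG=containermaxprocs=0 env | grep '^GODEBUG='

# An explicit GOMAXPROCS always takes precedence over automatic detection.
GOMAXPROCS=1 env | grep '^GOMAXPROCS='
```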

    Thanks to Karthik for the heads up….

    Go 1.25 Release Notes

  • Great News… IBM has Open Source Wheel Packages for Linux on Power

    Priya Seth posted about Open Source Wheel Packages for Linux on Power:

    IBM provides a dedicated repository of Python wheel packages optimized for the Linux on Power (ppc64le) architecture. These pre-built binaries simplify Python development on Power systems by eliminating the need to compile packages from source—saving time and reducing complexity.

    Wheel files (.whl) are the standard for distributing pre-compiled Python packages. For developers working on Power architecture, having access to architecture-specific wheels ensures compatibility and speeds up development.

    IBM hosts a curated collection of open-source Python wheels for the ppc64le platform listed at https://open-source-edge.developerfirst.ibm.com/

    Use pip to download the package without installing it:

    pip download <package_name>==<version> --prefer-binary --index-url=https://wheels.developerfirst.ibm.com/ppc64le/linux --verbose --no-deps
    

    Replace <package_name> and <version> with the desired values.

    Whether you’re building AI models, data pipelines, or enterprise applications, this repository helps accelerate your Python development on Power.

    You can also refer to https://community.ibm.com/community/user/blogs/nikhil-kalbande/2025/08/01/install-wheels-from-ibm-python-wheel-repository

  • Red Hat OpenShift Container Platform on IBM Power Systems: Exploring Red Hat’s Multi-Arch Tuning Operator

    The Red Hat Multi-Arch Tuning Operator optimizes workload placement within multi-architecture compute clusters. Pods run on the compute architecture for which the containers declare support. Where Operators, Deployments, ReplicaSets, Jobs, CronJobs, or Pods don’t declare a nodeAffinity, in most cases the Pods that are generated are updated with a node affinity so they land on the supported (declared) CPU architecture.

    For version 1.1.0, the Red Hat Multi-Arch Team (@Prashanth684, @aleskandro, @AnnaZivkovic) and the IBM Power Systems team (@pkenchap) have worked together to give cluster administrators better control and flexibility. The feature adds a plugins field in ClusterPodPlacementConfig and introduces a first plugin called nodeAffinityScoring.

    Per the docs, the nodeAffinityScoring plugin adds weights and influence to the scheduler with this process:

    1. Analyze the Pod’s containers for the supported architectures.
    2. Generate the scheduling predicates for nodeAffinity, e.g., a weight of 75 on ppc64le.
    3. Filter out nodes that do not meet the Pod requirements, using the predicates.
    4. Prioritize the remaining nodes based on the architecture scores defined in the nodeAffinityScoring.platforms field.

    To take advantage of this feature, use the following to asymmetrically load the Power nodes with work.

    apiVersion: multiarch.openshift.io/v1beta1
    kind: ClusterPodPlacementConfig
    metadata:
      name: cluster
    spec:
      logVerbosityLevel: Normal
      namespaceSelector:
        matchExpressions:
          - key: multiarch.openshift.io/exclude-pod-placement
            operator: Exists
      plugins:
        nodeAffinityScoring:
          enabled: true
          platforms:
            - architecture: ppc64le
              weight: 100
            - architecture: amd64
              weight: 50
    

    Best wishes, and looking forward to hearing how you use the Multi-Arch Tuning Operator on IBM Power with Multi-Arch Compute.

    References

    1. [RHOCP][TE] Multi-arch Tuning Operator: Cluster-wide architecture preferred/weighted affinity
    2. OpenShift 4.18 Docs: Chapter 4. Configuring multi-architecture compute machines on an OpenShift cluster
    3. OpenShift 4.18 Docs: 4.11. Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator
    4. Enhancement: Introducing the namespace-scoped PodPlacementConfig
  • nx-gzip requires active_mem_expansion_capable

    nx-gzip requires the licensed process capability active_mem_expansion_capable.

    Log in to your HMC and run:

    for MACHINE in my-ranier1 my-ranier2
    do
        echo "MACHINE: ${MACHINE}"
        for CAPABILITY in $(lssyscfg -r sys -F capabilities -m "${MACHINE}" | sed 's|,| |g' | sed 's|"||g')
        do
            echo "CAPABILITY: ${CAPABILITY}" | grep active_mem_expansion_capable
        done
        echo
    done

    The following shows:

    MACHINE: my-ranier1
    CAPABILITY: active_mem_expansion_capable
    CAPABILITY: hardware_active_mem_expansion_capable
    CAPABILITY: active_mem_mirroring_hypervisor_capable
    CAPABILITY: cod_mem_capable
    CAPABILITY: huge_page_mem_capable
    CAPABILITY: persistent_mem_capable
    
    MACHINE: my-ranier2
    CAPABILITY: cod_mem_capable
    CAPABILITY: huge_page_mem_capable
    CAPABILITY: persistent_mem_capable

    Then you should be all set to use nx-gzip on my-ranier1.
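    The capability parsing can be tried locally with a canned capabilities string (the quoted, comma-separated format lssyscfg returns), no HMC required:

```shell
# Sample value standing in for: lssyscfg -r sys -F capabilities -m <machine>
CAPS='"cod_mem_capable,huge_page_mem_capable,active_mem_expansion_capable"'

# Same sed parsing as the HMC loop above, then filter for the capability.
for CAPABILITY in $(echo "${CAPS}" | sed 's|,| |g' | sed 's|"||g')
do
    echo "CAPABILITY: ${CAPABILITY}" | grep active_mem_expansion_capable
done
```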

    Best wishes

  • Helpful Tool – mtr

    I was not aware of mtr, a network diagnostic tool combining 'traceroute' and 'ping'.

    You can quickly install it on RHEL/CentOS with sudo dnf install -y mtr

    The output is super helpful for seeing where you have drops:

     mtr --report bastide.org
    Start: 2025-04-04T12:44:43-0400
    HOST: nx-gzip-d557-bastion-0.x Loss%   Snt   Last   Avg  Best  Wrst StDev
      1.|-- 10.20.176.3                0.0%    10    1.9   1.2   0.9   1.9   0.3
      2.|-- 172.16.32.4                0.0%    10    0.7   0.7   0.7   0.8   0.0
      3.|-- att-vc-srx-interconnect.p  0.0%    10   30.2  33.9  25.4  62.6  11.0
      4.|-- XX.5.16.XXX                0.0%    10   11.8  11.8  11.7  12.0   0.1
      5.|-- po97.prv-leaf6a.net.unifi  0.0%    10   62.5  63.2  62.5  67.5   1.5
  • DNS Resolver Hangs with OpenVPN

    Running multiple OpenVPN connections on the Mac, sometimes my DNS resolution hangs and I can’t reach the VPNs. I use this hack to get around it.

    ❯ sudo networksetup -setdnsservers Wi-Fi "Empty"
  • Kernel Stack Trace

    Quick hack to find a kernel stack trace.

    Look in /proc: find /proc -name stack

    You can see the last kernel stack, for example: cat /proc/479260/stack

    [<0>] hrtimer_nanosleep+0x89/0x120
    [<0>] __x64_sys_nanosleep+0x96/0xd0
    [<0>] do_syscall_64+0x5b/0x1a0
    [<0>] entry_SYSCALL_64_after_hwframe+0x66/0xcb
    

    It’s superb for figuring out a real-time hang and its pattern.

  • vim versus plain vi: One Compelling Reason

    My colleague, Michael Q, introduced me to a vim extension that left me saying… that’s awesome.

    set cuc enables Cursor Column; when I use it with set number, it’s awesome to see correct indenting.

    The commands are:

    1. Shift + :
    2. set cuc and enter
    3. Shift + :
    4. set number and enter

    Use set nocuc to disable it.
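    To make the settings permanent, the long-form options can go in your ~/.vimrc (a minimal sketch):

```vim
" long forms of 'cuc' and 'nu'
set cursorcolumn
set number
```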

    Good luck…

    Post Script

    • Install vim with dnf install -y vim

    Reference VimTrick: set cuc