Remove Duplicates in DB2 Columnar Format

I had duplicate data in my columnar OLAP table: I had loaded the data twice, so every row appeared two times for the same event id. To tell the copies apart, I assigned a row_number() over a partition by event id and stored it in the APP_NM column.

I hope this helps you.
db2 "update (select OME.*, row_number() over(partition by IDN_EVENT_ID order by IDN_EVENT_ID) as rnk from X.OLAP OME) set APP_NM = rnk"

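Then I removed the second copy of each event with this. (If the data had been loaded more than twice, every rank above 1 would have to go.)
db2 "DELETE FROM X.OLAP OME WHERE APP_NM = 2"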

I recommend the two-phase approach: you can, in theory, run each step in batch or asynchronously and double-check the marked rows before deleting, rather than run a single statement and hope it works.
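
The mark-then-delete flow can be tried safely away from the real table. Here is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for DB2. The table and column names (olap, event_id, payload, rnk) are hypothetical, and a correlated COUNT(*) stands in for row_number() so that it runs on any SQLite version.

```python
import sqlite3

# In-memory SQLite stands in for DB2; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE olap (event_id TEXT, payload TEXT, rnk INTEGER)")
rows = [("e1", "a"), ("e2", "b"), ("e3", "c")]
# Simulate loading the same data twice.
conn.executemany("INSERT INTO olap (event_id, payload) VALUES (?, ?)", rows + rows)

# Phase 1: mark. Number the copies of each event_id 1, 2, ... by counting
# the rows with the same event_id at or before the current rowid.
conn.execute("""
    UPDATE olap
       SET rnk = (SELECT COUNT(*) FROM olap AS o2
                   WHERE o2.event_id = olap.event_id
                     AND o2.rowid <= olap.rowid)
""")

# Double-check before deleting: how many rows are marked as extra copies?
marked = conn.execute("SELECT COUNT(*) FROM olap WHERE rnk > 1").fetchone()[0]

# Phase 2: delete everything except the first copy of each event.
conn.execute("DELETE FROM olap WHERE rnk > 1")
remaining = conn.execute("SELECT event_id FROM olap ORDER BY event_id").fetchall()
print(marked, remaining)
```

The count between the two phases is the point of splitting the work: if the number of marked rows is not what you expect, nothing has been deleted yet.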

Advanced HBase Shell

Recently, I’ve had to do some advanced HBase Shell scripting to check data consistency.

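Tip # 1 – You can easily establish min/max times in microseconds from BASH and feed them into your script.

  $(date +%s%6N)

gets you epoch seconds followed by 6 fractional digits: 3 for millis and 3 more for micros. (%N on its own gives nanoseconds; the digit count in front of it is a GNU date feature.)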
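
If the bounds are easier to compute outside of BASH, the same value can be produced in Python. This is a small sketch, assuming a one-hour window, which is a made-up example; min_ts and max_ts are hypothetical names.

```python
import time

# Epoch time in microseconds, the same shape as `date +%s%6N`.
now_micros = time.time_ns() // 1000

# Hypothetical window: the last hour.
window_seconds = 3600
max_ts = now_micros
min_ts = now_micros - window_seconds * 1_000_000

print(min_ts, max_ts)
```

HBase's default cell timestamps are epoch milliseconds, so drop the last three digits (or use %3N) when comparing against those.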

Tip # 2 – Use the ‘include Java’ line to get access to all JARs that HBase has access to.

Tip # 3 – Forget about raw Bytes. Convert to Base64 to make row keys human readable. You don’t want the output broken across lines, and passing the option 8 (DONT_BREAK_LINES) keeps it from wrapping. https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Base64.html#DONT_BREAK_LINES

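include Java
import org.apache.hadoop.hbase.util.Base64
import org.apache.hadoop.hbase.util.Bytes
# tableName and r must be Result objects here: getValue and getRow are Result methods
content = Bytes.toString(tableName.getValue(Bytes.toBytes("m"),Bytes.toBytes("d")))
# 8 is Base64::DONT_BREAK_LINES
x = Base64.encodeBytes(r.getRow(), 8)
puts "#{x}"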
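
To read one of those printed keys back outside the shell, any Base64 decoder will do. A minimal sketch in Python, assuming a made-up 100-byte row key; it also shows why the no-wrap option matters, since a wrapping encoder inserts newlines into long keys.

```python
import base64

# Hypothetical binary row key, long enough that a wrapping encoder would break it.
row_key = bytes(range(100))

# b64encode never inserts line breaks, like DONT_BREAK_LINES in the HBase helper.
encoded = base64.b64encode(row_key).decode("ascii")

# encodebytes wraps every 76 characters, which is what you want to avoid in a log.
wrapped = base64.encodebytes(row_key)

print(encoded)
# Decoding gives back the original key bytes.
decoded = base64.b64decode(encoded)
```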

Tip # 4 – use GSON to parse JSON efficiently across a scan.

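import com.google.gson.JsonParser;
# Create the parser once, outside the scan loop, and reuse it for every row
parser = JsonParser.new
# Read the JSON document stored in column d:b of the current Result
jsonBody = Bytes.toString(tableName.getValue(Bytes.toBytes("d"),Bytes.toBytes("b")))

json = parser.parse(jsonBody)
object = json.getAsJsonObject()
# 'mx' is a nested object that carries the version id
metaObj = object.get('mx')
objVer = metaObj.get('vid').getAsString()
objId = object.get('id').getAsString()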
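
For checking a single document by hand, the same field access is easy to reproduce outside HBase. A rough sketch in Python, assuming a made-up document with the same shape as the one read above ('id' at the top level, 'vid' nested under 'mx'); the values are invented.

```python
import json

# Hypothetical document body with the shape used in the scan.
json_body = '{"id": "abc-123", "mx": {"vid": "7"}}'

obj = json.loads(json_body)
obj_id = obj["id"]
# .get with a default avoids a KeyError when a row has no 'mx' block.
obj_ver = obj.get("mx", {}).get("vid")

print(obj_id, obj_ver)
```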

Tip # 5 – Run the whole thing as a script through the JRuby that ships with HBase

time /usr/iop/current/hbase-client/bin/hbase org.jruby.Main /tmp/check.rb > check.log

Maven Animal Sniffer Plugin

For the past few years, most of my personal and professional projects have been built using Maven. The dependency management and corresponding build lifecycle enable me to do some complex builds (for instance, HBase 1.2.5 Client Jars).

Command

 mvn dependency:tree -f demo-app/pom.xml

Result

[INFO] demo.app:demo-app:jar:1.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.12:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] \- org.apache.hbase:hbase:pom:1.2.5:compile
[INFO]    +- com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:compile
[INFO]    \- log4j:log4j:jar:1.2.17:compile

When the dependencies underlying the build change, building and debugging become a pain. I ran into the Maven Animal Sniffer Plugin, which can record the API signatures of a build so that later builds can be checked against them. I did have to enable a specific configuration:

<configuration>
  <includeJavaHome>false</includeJavaHome>
</configuration>

Command

mvn animal-sniffer:build -f demo-app/pom.xml 

Result

[INFO] --- animal-sniffer-maven-plugin:1.16:build (default-cli) @ demo-app ---
[INFO] Parsing signatures from com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1
[INFO] Parsing signatures from log4j:log4j:jar:1.2.17
[INFO] Parsing signatures from /Users/paulbastide/Downloads/animal/demo-app/target/classes
[INFO] Wrote signatures for 337 classes.

I can now store this along with my build. The generated file demo-app/target/demo-app-1.0-SNAPSHOT.signature can then be used by the plugin's check goal.

Jenkinsfile Triggers

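It took me far too long to get my Jenkinsfile to stop overwriting my triggers, as noted in https://github.com/jenkinsci/gitlab-plugin/issues/692 . The cause is the properties step, which replaces whatever job properties were set before it, triggers included. You’ll see this warning: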

WARNING: The properties step will remove all JobPropertys currently configured in this job, either from the UI or from an earlier properties step.

References
https://github.com/jenkinsci/gitlab-plugin
https://github.com/jenkinsci/gitlab-plugin/blob/master/src/main/java/com/dabsquared/gitlabjenkins/GitLabPushTrigger.java
https://github.com/jenkinsci/gitlab-plugin/blob/master/src/main/resources/com/dabsquared/gitlabjenkins/GitLabPushTrigger/config.jelly
https://github.com/jenkinsci/job-dsl-plugin/blob/master/job-dsl-core/src/main/groovy/javaposse/jobdsl/dsl/helpers/triggers/GitLabTriggerContext.groovy
https://github.com/jenkinsci/pipeline-model-definition-plugin/wiki/Trigger-runs
https://dev.to/pencillr/jenkins-pipelines-and-their-dirty-secrets-2
https://github.com/jenkinsci/pipeline-model-definition-plugin/wiki/Parametrized-pipelines
https://issues.jenkins-ci.org/browse/JENKINS-45053
https://jenkins.io/doc/book/pipeline/syntax/#declarative-pipeline
https://jenkins.io/doc/pipeline/steps/
https://gitlab.switch.ch/etienne.dysli-metref/idpv3-mfa/commit/6d2974b1199fb2101f5f4299a974cac66a220080
https://github.com/jenkinsci/gitlab-plugin/issues/417

Using jjs to confirm issue with DatatypeConverter in WebSphere Liberty

I kept running into a funky ‘java.lang.NullPointerException’ with the DatatypeConverter included with WebSphere Liberty. To debug the issue, I used jjs, the command-line shell for the Nashorn engine.

If you need to figure out which jar the class is loaded from, run this:

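try {
    // Ask the class which jar actually supplied it.
    // Note: getCodeSource() is null for classes loaded by the bootstrap loader.
    Class<?> c = Class.forName("javax.xml.bind.DatatypeConverter");
    System.out.println("Location " + c.getProtectionDomain().getCodeSource().getLocation());
} catch (ClassNotFoundException e1) {
    // The class is not on the classpath at all.
    e1.printStackTrace();
}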

You’ll get output like this:

 
Location /opt/ibm/wlp/dev/api/spec/com.ibm.ws.javaee.jaxb.2.2_1.0.12.jar

Then you can try the call against the default JDK (IBM Java SDK 1.8) on its own, where it works:

 
/opt/ibm/ibm-java-sdk-8.0-4.5/jre/bin/jjs 
jjs> javax.xml.bind.DatatypeConverter.printHexBinary("get".getBytes())
676574
jjs>

Then you can try the same call with the Liberty JAXB jar on the classpath (Java SDK 1.8 + Liberty JAXB), where it fails:

 
/opt/ibm/ibm-java-sdk-8.0-4.5/jre/bin/jjs --dump-on-error -classpath "/opt/ibm/wlp/dev/api/spec/com.ibm.ws.javaee.jaxb.2.2_1.0.12.jar:." 
jjs> javax.xml.bind.DatatypeConverter.printHexBinary("get".getBytes())
java.lang.NullPointerException
jjs>

From this, I determined that the problem was in the Liberty JAXB jar, and that I needed to update to “com.ibm.websphere.javaee.jaxb.2.2_1.0.20.jar”
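
If you want to confirm the expected hex value without any JAXB jar involved, a quick cross-check in another runtime does the job. A small sketch in Python, assuming the bytes of "get" are the same as in the jjs session (true for ASCII text in any common charset):

```python
# Hex-encode the bytes of "get", upper-cased like printHexBinary output.
expected = "get".encode("utf-8").hex().upper()
print(expected)  # 676574
```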

Formatting JSON with VIM

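I am working on an analytics project where we generate very complicated medical analyses and store them in a hierarchical data model. The JSON is much easier to inspect once it is pretty-printed. Take a one-line document such as:

{ "test" : { "test1" : "val1" } }

Open the JSON in vim and filter the whole buffer through python -m json.tool: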

:%!python -m json.tool

Results

{
    "test": {
        "test1": "val1"
    }
}
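
Under the hood, json.tool parses the text and writes it back with an indent of 4. A minimal sketch of the same thing in plain Python, in case you want it inside a script instead of vim (raw and pretty are just illustrative names):

```python
import json

raw = '{ "test" : { "test1" : "val1" } }'

# Parse, then re-serialize with the 4-space indent that json.tool uses by default.
pretty = json.dumps(json.loads(raw), indent=4)
print(pretty)
```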

References
https://coderwall.com/p/faceag/format-json-in-vim
https://til.hashrocket.com/posts/ha0ci0pvkj-format-json-in-vim-with-jq (jq is another option, but it is not available on every system; python tends to work everywhere).

Co-Inventing Model with Gephi

I was curious who my frequent co-inventors were, so I downloaded Gephi. I downloaded my patent data from Google Patents and built a fun little model from CSV data (similar to the attached).

SOURCE,WEIGHT,TARGET
Matthew E. Broomhall,121,PAUL Bastide
Robert E. Loredo,119,PAUL Bastide
Fang Lu,60,PAUL Bastide
Alaa Abou Mahmoud,43,PAUL Bastide
Lisa Seacat Deluca,23,PAUL Bastide
Lydia M. Do,17,PAUL Bastide
Dale M. Schultz,13,PAUL Bastide
Andrew E. Davis,7,PAUL Bastide
Ralph E. LeBlanc,7,PAUL Bastide
Sean Callanan,7,PAUL Bastide
Donna K. Byron,5,PAUL Bastide
Sandra L. Kogan,5,PAUL Bastide
Asima Silva,4,PAUL Bastide
Aaron J. Quirk,3,PAUL Bastide
Aaron M. Cohen,3,PAUL Bastide
Daniel B. Harris,3,PAUL Bastide
Eric S. Portner,3,PAUL Bastide
John M. Boyer,3,PAUL Bastide
Michael L. Taylor,3,PAUL Bastide
Alexander Pikovsky,2,PAUL Bastide
Corville O. Allen,2,PAUL Bastide
Dana L. Price,2,PAUL Bastide
Eric M. Wilcox,2,PAUL Bastide
Jeffrey R. Hoy,2,PAUL Bastide
John A. Jacobson,2,PAUL Bastide
Kulvir S. Bhogal,2,PAUL Bastide
Liam Harpur,2,PAUL Bastide
Marco A. Vicente,2,PAUL Bastide
Patrick J. O’Sullivan,2,PAUL Bastide
Scott J. Martin,2,PAUL Bastide
Shane M. Kilmon,2,PAUL Bastide
Stephen Crawford,2,PAUL Bastide
Thomas J. Evans IV,2,PAUL Bastide
Vijay Francis,2,PAUL Bastide
Weisong Wang,2,PAUL Bastide
Adam L. Cutler,1,PAUL Bastide
Amanda N. Savitzky,1,PAUL Bastide
Andrew L. Schirmer,1,PAUL Bastide
Arun Vishwanath,1,PAUL Bastide
Bernadette A. Carter,1,PAUL Bastide
Beth Anne M. Collopy,1,PAUL Bastide
Beth L. Hoffman,1,PAUL Bastide
Bradley W. Hurley,1,PAUL Bastide
Brenton P. Chasse,1,PAUL Bastide
Brian M. Walsh,1,PAUL Bastide
Carl J. Kraenzel,1,PAUL Bastide
Christopher W. Desforges,1,PAUL Bastide
Damian E.A. Garcia,1,PAUL Bastide
Dan DUMONT,1,PAUL Bastide
Dwarikanath Mahapatra,1,PAUL Bastide
Fred Raguillat,1,PAUL Bastide
Isabell Kiral-Kornek,1,PAUL Bastide
Jaime M. Stockton,1,PAUL Bastide
James A. Hart,1,PAUL Bastide
Jennifer L. Vargus,1,PAUL Bastide
Jodi RAJANIEMI,1,PAUL Bastide
Jose L. Lopez,1,PAUL Bastide
Juliana M. Leong,1,PAUL Bastide
Katherine M. Parsons,1,PAUL Bastide
Kelley L. ANDERS,1,PAUL Bastide
King Shing K. Lui,1,PAUL Bastide
Leah A. Lawrence,1,PAUL Bastide
Leho Nigul,1,PAUL Bastide
Lei Wang,1,PAUL Bastide
Lorelei M. McCollum,1,PAUL Bastide
Margo L. Ezekiel,1,PAUL Bastide
Mark Gargan,1,PAUL Bastide
Mary E. Miller,1,PAUL Bastide
Matthew Stephen Rosno,1,PAUL Bastide
Melissa A. Lord,1,PAUL Bastide
Michael G. Alexander,1,PAUL Bastide
Na Pei,1,PAUL Bastide
Neal Fishman,1,PAUL Bastide
Pei Sun,1,PAUL Bastide
Richard Gorzela,1,PAUL Bastide
Richard T. Bassemir,1,PAUL Bastide
Shelbee D. Smith-Eigenbrode,1,PAUL Bastide
Shu Qiang Li,1,PAUL Bastide
Shunguo Yan,1,PAUL Bastide
Stacy M. Cannon,1,PAUL Bastide
Stanley K. Jerrard-Dunne,1,PAUL Bastide
Stefan von Cavallar,1,PAUL Bastide
Susmita Saha,1,PAUL Bastide
Tamer E. Abuelsaad,1,PAUL Bastide
Thomas J. Evans,1,PAUL Bastide
Trudy L. Hewitt,1,PAUL Bastide
Xujin Liu,1,PAUL Bastide
Ying Mo,1,PAUL Bastide
PAUL Bastide,0,PAUL Bastide
Graph of Co-Inventors
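
If you want to build the same kind of edge list for your own data, here is a rough sketch in Python. It assumes rows in the (source, weight, target) order shown above and writes them with Source, Target and Weight column headers, which is my assumption about what Gephi's spreadsheet import expects for an edge table. The helper name to_edge_csv is hypothetical, and dropping the self-loop row is my own choice, not something Gephi requires.

```python
import csv
import io

# A few rows in the same (source, weight, target) order as the data above.
rows = [
    ("Matthew E. Broomhall", 121, "PAUL Bastide"),
    ("Robert E. Loredo", 119, "PAUL Bastide"),
    ("PAUL Bastide", 0, "PAUL Bastide"),
]

def to_edge_csv(rows):
    """Return the rows as CSV text with Source,Target,Weight headers."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for source, weight, target in rows:
        if source == target:
            continue  # skip the self-loop row; it adds no co-inventor edge
        writer.writerow([source, target, weight])
    return buf.getvalue()

edge_csv = to_edge_csv(rows)
print(edge_csv)
```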