Remove Duplicates in DB2 Columnar Format

I had dupe data in my OLAP table, where the columnar data can be duplicated based on event id. (I loaded data 2x). I had to differentiate the data and remove the duplicates, so I assigned row_numbers over a partition ordered by.

I hope this helps you.
db2 "update (select OME.*, row_number() over(partition by IDN_EVENT_ID order by IDN_EVENT_ID) as rnk from X.OLAP OME) set APP_NM = rnk"

Then I removed using this.
db2 "DELETE X.OLAP OME WHERE APP_NM = 2"

I recommend the two-phase, as you can in theory run this en batch, or async, and double check, versus hope it works.

Advanced HBase Shell

Recently, I’ve had to do some advanced HBase Shell scripting to check data consistency.

Tip # 1 – You can easily establish min/max times in nanoseconds from BASH and feed them into your script.

  $(date +%s%6N) 

gets you 6 digits of precision 3 for millis 3 for nanos.

Tip # 2 – Use the ‘include Java’ line to get access to all JARs that HBase has access to.

Tip # 3 – Forget about Bytes…. convert to Base64 to make it human readable (and you don’t want to break lines – the number 8 keeps it from wrapping). https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Base64.html#DONT_BREAK_LINES

import org.apache.hadoop.hbase.util.Base64;
content = Bytes.toString(tableName.getValue(Bytes.toBytes("m"),Bytes.toBytes("d")))
x = Base64.encodeBytes(r.getRow(), 8)
puts "#{x}"

Tip # 4 – use GSON to parse JSON efficiently across a scan.

import com.google.gson.JsonParser;
parser = JsonParser.new
jsonBody = Bytes.toString(tableName.getValue(Bytes.toBytes("d"),Bytes.toBytes("b")))

json = parser.parse(jsonBody)
object = json.getAsJsonObject()
metaObj = object.get('mx')
objVer = metaObj.get('vid').getAsString()
objId = object.get('id').getAsString()

Tip #5 – use it as a script

time /usr/iop/current/hbase-client/bin/hbase org.jruby.Main /tmp/check.rb > check.log

Maven Animal Sniffer Plugin

For the past few years, most of my personal and professional projects are built using Maven.  The dependency management and corresponding build lifecycle enable me to do some complex builds (for instance HBase 1.2.5 Client Jars).

Command

 mvn dependency:tree -f demo-app/pom.xml

Result

[INFO] demo.app:demo-app:jar:1.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.12:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] \- org.apache.hbase:hbase:pom:1.2.5:compile
[INFO]    +- com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:compile
[INFO]    \- log4j:log4j:jar:1.2.17:compile

When these dependencies change underlying the build change, the build and debugging is a pain. I ran into this plugin Maven Animal Sniffer Plugin. I did have to enable a specific configuration <configuration><includeJavaHome>false</includeJavaHome></configuration>

Command

mvn animal-sniffer:build -f demo-app/pom.xml 

Result

[INFO] --- animal-sniffer-maven-plugin:1.16:build (default-cli) @ demo-app ---
[INFO] Parsing signatures from com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1
[INFO] Parsing signatures from log4j:log4j:jar:1.2.17
[INFO] Parsing signatures from /Users/paulbastide/Downloads/animal/demo-app/target/classes
[INFO] Wrote signatures for 337 classes.

I can now store this along with my build. The generated file demo-app/target/demo-app-1.0-SNAPSHOT.signature can then be used to check plugin

Using jjs to confirm issue with DatatypeConverter in WebSphere Liberty

I kept running into a funky ‘java.lang.NullPointerException’ with the WebSphere Liberty included DataValidator. To debug the issue, I used the jjs – nashorn engine

If you need, to figure out where the class is located

try {
Class c = Class.forName("javax.xml.bind.DatatypeConverter");
System.out.println("Location " + c.getProtectionDomain().getCodeSource().getLocation());
} catch (ClassNotFoundException e1) {
TODO Auto-generated catch block
e1.printStackTrace();
}

You’ll get output like this:

 
Location /opt/ibm/wlp/dev/api/spec/com.ibm.ws.javaee.jaxb.2.2_1.0.12.jar

Then you can look at the default JDK – Java SDK – 1.8

 
/opt/ibm/ibm-java-sdk-8.0-4.5/jre/bin/jjs 
jjs> javax.xml.bind.DatatypeConverter.printHexBinary("get".getBytes())
676574
jjs>

Then you can look at the specific jar with jdk Java SDK – 1.8 + Liberty Jaxb

 
/opt/ibm/ibm-java-sdk-8.0-4.5/jre/bin/jjs --dump-on-error -classpath "/opt/ibm/wlp/dev/api/spec/com.ibm.ws.javaee.jaxb.2.2_1.0.12.jar:." 
jjs> javax.xml.bind.DatatypeConverter.printHexBinary("get".getBytes())
java.lang.NullPointerException
jjs>

From this, I determined that I needed to update in “com.ibm.websphere.javaee.jaxb.2.2_1.0.20.jar”

Formatting JSON with VIM

I am working on an analytics project where we generate very complicated medical analysis and put it in a hierarchical data model.

{ "test" : { "test1" : "val1" } }

Open the JSON in vim and use python -m json.tool

:%!python -m json.tool

Results

{
    "test": {
        "test1": "val1"
    }
}

References
https://coderwall.com/p/faceag/format-json-in-vim
https://til.hashrocket.com/posts/ha0ci0pvkj-format-json-in-vim-with-jq (jq is another option, but… not always available on every system, python tends to work everywhere).

Urban Code Deploy: When a value doesn’t exist?

Urban Code Deploy: When a value doesn’t exist?

Recently, I ran into an issue with a resource referece in an Urban Code Deploy (UCD) resource that did not yet exist. When the value, that doesn’t yet exist I found this a great tip (UCD Documentation).

Change from
${p:resource/value-not-yet-populated}
to
${p?:resource/value-not-yet-populated}

UCD replaces the missing value with an empty string. It saved my sanity and ease of use.

Generating Swagger as part of a Maven Build

Most of my projects use Maven to build and coordinate dependencies and run unit and build integration tests. I’ve found it a real pain that I could not generate my swagger docs as part of the build.

I have found a really easy way. In Jenkins, I have a downstream job that runs specific maven goals. The WebSphere Liberty maven goals are documented WebSphere Liberty Maven Documentation. Please note my downstream job is an experimental downstream project; I am a bit concerned about edge cases where the process dies.

The commands of the downstream job are:

mvn dependency:copy-dependencies liberty:install-apps liberty:test-start-server

The commands execute three key elements:

  1. Dummy Server.xml – attached at the bottom
  2. Execution for Goal for pre-integration-test
  3. Execution must follow the copy-dependencies so the war file is copied on in

You can check your server is up and running:

tail -f target/liberty/wlp/usr/servers/defaultServer/logs/messages.log

Download the Swagger (YAML)

curl http://localhost:9080/swagger/api/swagger.yaml -o myswagger.yaml

The full code is below

Locked out of Terminal Services

<bang-head/>

I locked my team out of a terminal services session when I stopped the terminal server.  I am on a Mac and could not restart the service using the MMC snap-in.

I installed Samba4 – samba4-common and used the net rpc service start on TermService and bam… back in business.

[root@hostname ~]# net rpc service start TermService -I 1.1.1.1 --user Administrator
Enter Administrator's password:
...
Successfully started service: TermService

https://blog.trackets.com/2014/05/17/ssh-tunnel-local-and-remote-port-forwarding-explained-with-examples.html
https://www.linuxquestions.org/questions/linux-networking-3/samba-has-nt_status_logon_failure-67766/