Ambari All Sorts of Messed Up

My team and I run Ambari and the Ambari agents that control our HDFS/HBase and general Hadoop/Apache ecosystem machines. Our bare-metal machines hung, and we could not get anything restarted.

In the logs, we had:

{'msg': 'Unable to read structured output from /var/lib/ambari-agent/data/structured-out-status.json'}

We found a link at https://community.hortonworks.com/content/supportkb/49517/services-are-running-but-ambari-reports-them-faile.html and the fix.

  1. Remove /var/lib/ambari-agent/data/structured-out-status.json
  2. Restart ambari agent.
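On our nodes the two steps looked roughly like the following. This is a sketch: the file path is the Ambari agent default, and the restart is guarded so it is a no-op on machines without the agent installed.

```shell
# Remove the stale structured-output status file (path is the agent default).
# rm -f exits cleanly even if the file is already gone.
STATUS_JSON=/var/lib/ambari-agent/data/structured-out-status.json
rm -f "$STATUS_JSON"

# Restart the agent so it regenerates the file on its next status check.
if command -v ambari-agent >/dev/null 2>&1; then
  ambari-agent restart
fi
```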

Our Ambari setup now works.

VIM – JOIN Conditions with Unicode and ASCII


I cannot stress enough the dangers of copying data from Excel or HTML and assuming that it’s plain ASCII. For example, @ is U+0040, and what looks like a bare @ can carry invisible characters around it. We ingested such a value and couldn’t see why a JOIN condition on the data table wasn’t working.

I looked at the source JSON (a FHIR DSTU2 Group resource), loaded it in Vim, and used the following trick to see the raw bytes:

set encoding=latin1

We ended up showing that our data table’s contents were different using:

SELECT HEX(RESOURCE_VALUE) FROM FHIR.DIM_GROUP
0A40 vs 40 – the stored value had a leading line feed (0x0A) in front of the @ (0x40).
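The difference is easy to reproduce outside the database. A quick sketch with od (any POSIX system) shows the same two byte sequences:

```shell
# A bare @ is a single byte, 0x40.
printf '@' | od -An -tx1     # prints: 40
# The ingested value carried a leading line feed, 0x0a, in front of the @.
printf '\n@' | od -An -tx1   # prints: 0a 40
```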

References

https://unix.stackexchange.com/questions/108020/can-vim-display-ascii-characters-only-and-treat-other-bytes-as-binary-data

Remove Duplicates in DB2 Columnar Format

I had duplicate data in my OLAP table, where the columnar data can be duplicated based on event id (I loaded the data twice). I had to differentiate the rows and remove the duplicates, so I assigned row numbers over a partition ordered by the event id.

I hope this helps you. First, I tagged each row with its row number within its event partition:
db2 "update (select OME.*, row_number() over(partition by IDN_EVENT_ID order by IDN_EVENT_ID) as rnk from X.OLAP OME) set APP_NM = rnk"

Then I removed the duplicates (every row tagged with row number 2) using this:
db2 "DELETE FROM X.OLAP WHERE APP_NM = 2"

I recommend the two-phase approach, as you can in theory run it in batch or async and double-check the results, versus hoping it works.

Advanced HBase Shell

Recently, I’ve had to do some advanced HBase Shell scripting to check data consistency.

Tip # 1 – You can easily establish min/max times in microseconds from BASH and feed them into your script.

  $(date +%s%6N) 

gets you six digits of sub-second precision: three for milliseconds and three for microseconds.
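For example, to bracket a scan window before invoking the check script (GNU date; the script path is the one from Tip # 5, and the environment-variable handoff is just one way to pass the values in):

```shell
# Window: one hour ago to now, as epoch micros (seconds + 6 digits of %N).
export MIN_TS=$(date -d '1 hour ago' +%s%6N)
export MAX_TS=$(date +%s%6N)
echo "scan window: $MIN_TS .. $MAX_TS"
# The JRuby script can then read MIN_TS/MAX_TS from ENV:
# /usr/iop/current/hbase-client/bin/hbase org.jruby.Main /tmp/check.rb
```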

Tip # 2 – Add the line include Java to your script to get access to all of the JARs on HBase’s classpath.

Tip # 3 – Forget about raw bytes…. convert to Base64 to make the output human readable. (You don’t want to break lines – passing the option value 8, DONT_BREAK_LINES, keeps the encoder from wrapping.) https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Base64.html#DONT_BREAK_LINES

java_import org.apache.hadoop.hbase.util.Base64
# r is the current Result in the scan; read family "m", qualifier "d"
content = Bytes.toString(r.getValue(Bytes.toBytes("m"), Bytes.toBytes("d")))
# 8 = Base64.DONT_BREAK_LINES
x = Base64.encodeBytes(r.getRow(), 8)
puts "#{x}"

Tip # 4 – use GSON to parse JSON efficiently across a scan.

java_import com.google.gson.JsonParser
parser = JsonParser.new
# r is the current Result in the scan; read family "d", qualifier "b"
jsonBody = Bytes.toString(r.getValue(Bytes.toBytes("d"), Bytes.toBytes("b")))

json = parser.parse(jsonBody)
object = json.getAsJsonObject()
metaObj = object.get('mx').getAsJsonObject()
objVer = metaObj.get('vid').getAsString()
objId = object.get('id').getAsString()

Tip # 5 – Use it as a script.

time /usr/iop/current/hbase-client/bin/hbase org.jruby.Main /tmp/check.rb > check.log

Maven Animal Sniffer Plugin

For the past few years, most of my personal and professional projects have been built using Maven.  The dependency management and corresponding build lifecycle enable me to do some complex builds (for instance, building against the HBase 1.2.5 client JARs).

Command

 mvn dependency:tree -f demo-app/pom.xml

Result

[INFO] demo.app:demo-app:jar:1.0-SNAPSHOT
[INFO] +- junit:junit:jar:4.12:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] \- org.apache.hbase:hbase:pom:1.2.5:compile
[INFO]    +- com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1:compile
[INFO]    \- log4j:log4j:jar:1.2.17:compile

When the dependencies underlying the build change, building and debugging is a pain. I ran into the Maven Animal Sniffer Plugin, which generates (and later checks) API signatures for a build’s dependencies. I did have to enable a specific configuration: <configuration><includeJavaHome>false</includeJavaHome></configuration>
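For reference, the plugin entry in the pom looks roughly like this (a sketch: version 1.16 matches the output below, and the coordinates are the usual org.codehaus.mojo ones):

```xml
<!-- Sketch of the plugin configuration in demo-app/pom.xml -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>animal-sniffer-maven-plugin</artifactId>
  <version>1.16</version>
  <configuration>
    <!-- Skip the JDK's own classes when generating signatures -->
    <includeJavaHome>false</includeJavaHome>
  </configuration>
</plugin>
```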

Command

mvn animal-sniffer:build -f demo-app/pom.xml 

Result

[INFO] --- animal-sniffer-maven-plugin:1.16:build (default-cli) @ demo-app ---
[INFO] Parsing signatures from com.github.stephenc.findbugs:findbugs-annotations:jar:1.3.9-1
[INFO] Parsing signatures from log4j:log4j:jar:1.2.17
[INFO] Parsing signatures from /Users/paulbastide/Downloads/animal/demo-app/target/classes
[INFO] Wrote signatures for 337 classes.

I can now store this along with my build. The generated file demo-app/target/demo-app-1.0-SNAPSHOT.signature can then be used by the plugin’s check goal to verify that future builds stay compatible with these signatures.

Jenkinsfile Triggers

It took me far too long to get my Jenkinsfile to stop overwriting my triggers, as noted in https://github.com/jenkinsci/gitlab-plugin/issues/692 . You’ll see

WARNING: The properties step will remove all JobPropertys currently configured in this job, either from the UI or from an earlier properties step.
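The fix that finally stuck for me was declaring the trigger inside the Jenkinsfile itself, so it is re-registered on every run instead of being silently dropped along with the UI-configured job properties. A declarative sketch (the gitlab trigger options shown are illustrative; check the gitlab-plugin docs for the full set):

```groovy
pipeline {
    agent any
    // Declared here, the trigger survives the "properties step will
    // remove all JobPropertys" behavior: each run re-creates it.
    triggers {
        gitlab(triggerOnPush: true, triggerOnMergeRequest: true, branchFilterType: 'All')
    }
    stages {
        stage('Build') {
            steps {
                echo 'build'
            }
        }
    }
}
```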

References
https://github.com/jenkinsci/gitlab-plugin
https://github.com/jenkinsci/gitlab-plugin/blob/master/src/main/java/com/dabsquared/gitlabjenkins/GitLabPushTrigger.java
https://github.com/jenkinsci/gitlab-plugin/blob/master/src/main/resources/com/dabsquared/gitlabjenkins/GitLabPushTrigger/config.jelly
https://github.com/jenkinsci/job-dsl-plugin/blob/master/job-dsl-core/src/main/groovy/javaposse/jobdsl/dsl/helpers/triggers/GitLabTriggerContext.groovy
https://github.com/jenkinsci/pipeline-model-definition-plugin/wiki/Trigger-runs
https://dev.to/pencillr/jenkins-pipelines-and-their-dirty-secrets-2
https://github.com/jenkinsci/pipeline-model-definition-plugin/wiki/Parametrized-pipelines
https://issues.jenkins-ci.org/browse/JENKINS-45053
https://jenkins.io/doc/book/pipeline/syntax/#declarative-pipeline
https://jenkins.io/doc/pipeline/steps/
https://gitlab.switch.ch/etienne.dysli-metref/idpv3-mfa/commit/6d2974b1199fb2101f5f4299a974cac66a220080
https://github.com/jenkinsci/gitlab-plugin/issues/417