Helpful Tips – October 2018

Xcode Tools Shortcut

xcode-select --install
If you hit an issue installing the Command Line Tools after you install Mavericks, run xcode-select --install. It made things so much easier. (I was installing LaTeX.)
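If you want to script the check, here is a minimal sketch. It assumes that xcode-select -p (print the active developer directory) exits non-zero when nothing is installed; the function name ensure_clt is made up for illustration.

```shell
# Hypothetical helper: install the Command Line Tools only when they are missing.
ensure_clt() {
  # xcode-select -p prints the active developer directory; it fails if none is set
  if xcode-select -p >/dev/null 2>&1; then
    echo "Command Line Tools already installed"
  else
    # Opens the macOS installer dialog for the Command Line Tools
    xcode-select --install
  fi
}
```

Call ensure_clt from a setup script before anything that needs a compiler.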

SOLID OO Design

I was clued into this software design approach on HackerNoon. It’s a great approach to software design, and I use the principles in my current software projects.

Extreme Presentation

The Extreme Presentation process is a brilliantly simple way of presenting useful data insights in the right way. All credit goes to the authors and their website.

Telemetry Reporting in Visual Studio Code

You should disable telemetry reporting in Visual Studio Code. It reports all the time, and that continual reporting is a potential issue.
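For reference, a settings.json fragment that turns telemetry off might look like the following. This is a sketch, not a definitive list: the first two keys are the ones VS Code releases used around 2018, and newer releases replaced them with a single telemetry.telemetryLevel setting, so check which ones your version honors.

```json
{
  // Settings used by VS Code releases from around 2018 (assumption: your version still reads them)
  "telemetry.enableTelemetry": false,
  "telemetry.enableCrashReporter": false,

  // Newer releases use this single setting instead
  "telemetry.telemetryLevel": "off"
}
```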

Some other ancillary links from the month:
Training and Other Links
https://jupyter.readthedocs.io/en/latest/install.html
https://github.com/SciSpike/kafka-lab/
https://www.ibm.com/support/knowledgecenter/en/SS6NHC/com.ibm.swg.im.dashdb.admin.mon.doc/doc/r0054077.html
https://dzone.com/articles/kafka-clients-at-most-once-at-least-once-exactly-o

Software Design
https://www.12factor.net/
https://martinfowler.com/articles/microservices.html
https://samnewman.io/patterns/architectural/bff/
https://www.ibm.com/design/thinking/page/toolkit

Maven Thread Speed Up

Like many developers, I have tons of jobs running to compile, unit test, and integration test my code.  These jobs take anywhere from 30 seconds to 30 minutes.

Some simple operations took a while, and I wondered why. Thanks to Oleg at ZeroTurnaround, I have an answer: Your Maven build is slow. Speed it up!

I applied the parallel-build setting (-T 4 runs the build with four threads), and my build dropped from 30 minutes to 10 minutes.

mvn clean package -T 4 -s local-m2/settings.xml
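If your machines have different core counts, a small wrapper can pick the thread count for you. This is a minimal sketch: parallel_build is a made-up name, and the default of 1C (Maven's syntax for one thread per CPU core) is my assumption about a sensible starting point, not something I benchmarked.

```shell
# Hypothetical wrapper around the build command above.
# Default "1C" asks Maven for one thread per CPU core;
# pass a plain number such as 4 for a fixed thread count.
parallel_build() {
  threads="${1:-1C}"
  mvn clean package -T "$threads" -s local-m2/settings.xml
}
```

Running parallel_build 4 reproduces the command above, while parallel_build with no argument scales with the machine. Parallel builds only help when the modules are independent enough to build side by side, and some older plugins are not thread-safe.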

I hope this helps others.

Hadoop KMS Ranger API – Tips and cURLs

I use Hadoop KMS with Ranger in one environment. Here are two tips for calling its REST API.

The versionName of a key is used in multiple queries, so keep it handy.

When not using Kerberos, set ?user.name=hdfs on the URL.
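A minimal sketch of what such calls can look like, using the endpoint paths from the KMS HTTP REST API documentation linked in the references. The host, port, and function names are hypothetical placeholders, and this assumes simple (non-Kerberos) authentication.

```shell
# Hypothetical defaults; replace with your own KMS host, port, and user.
KMS_BASE="${KMS_BASE:-http://kms.example.com:9292/kms/v1}"
KMS_USER="${KMS_USER:-hdfs}"

# GET a KMS REST path, appending user.name for simple authentication.
kms_get() {
  curl -s "${KMS_BASE}/$1?user.name=${KMS_USER}"
}

# List all key names.
list_key_names() { kms_get "keys/names"; }

# Fetch the current version of a key; $1 is the key name.
current_key_version() { kms_get "key/$1/_currentversion"; }

# Fetch a specific key version; $1 is the versionName.
key_version() { kms_get "keyversion/$1"; }
```

For example, current_key_version mykey returns the current version of a key named mykey, and its versionName can then be passed to key_version. With Kerberos you would drop the user.name parameter and let curl negotiate instead (curl --negotiate -u :).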

 

References
https://hadoop.apache.org/docs/current/hadoop-kms/index.html#KMS_HTTP_REST_API
https://hadoop.apache.org/docs/current/hadoop-kms/index.html#Get_Key_Names
https://stackoverflow.com/questions/37601763/authentication-issue-with-kms-hadoop

Ambari All Sorts of Messed Up

My team and I run Ambari and Ambari agents, which control our HDFS/HBase and general Hadoop/Apache ecosystem machines.  Our bare metal machines hung, and we could not get anything restarted.

In the logs, we had:

{'msg': 'Unable to read structured output from /var/lib/ambari-agent/data/structured-out-status.json'}

We found the fix at https://community.hortonworks.com/content/supportkb/49517/services-are-running-but-ambari-reports-them-faile.html:

  1. Remove /var/lib/ambari-agent/data/structured-out-status.json
  2. Restart the Ambari agent.
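The two steps above can be sketched as a small shell function to run on each affected host. This assumes the ambari-agent command is on the PATH and that you have the privileges to restart it; reset_agent_status is a made-up name.

```shell
# Hypothetical helper: clear the stale status file, then restart the agent.
reset_agent_status() {
  # Default path is the one from the log message; pass another path to override it
  status_file="${1:-/var/lib/ambari-agent/data/structured-out-status.json}"
  rm -f "$status_file"
  ambari-agent restart
}
```

If you would rather keep the old file for a post-mortem, swap the rm for an mv to a backup name before restarting.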

Our Ambari setup now works.

VIM – JOIN Conditions with Unicode and ASCII

I cannot stress enough the dangers of copying data from Excel or HTML and assuming that it’s plain ASCII. For example, a value that displays as @ (U+0040, a single byte 0x40) can carry extra bytes that you cannot see. We ingested such a value and couldn’t see why a JOIN condition on the data table wasn’t working.

I looked at the source JSON (a FHIR DSTU2 Group), loaded it in Vim, and used the following trick:

set encoding=latin1

We ended up showing that our data table’s contents were different using:

SELECT HEX(RESOURCE_VALUE) FROM FHIR.DIM_GROUP
0A40 vs 40 (the first value carries an extra leading 0x0A byte, a line feed, in front of the 0x40 for @)
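You can run the same kind of check on the source data before it ever reaches the database. Here is a minimal sketch that mirrors HEX() from the shell using od; hex_bytes is a made-up helper name, and the output is lowercase where DB2 prints uppercase.

```shell
# Hypothetical helper: print the bytes of a string as one run of hex digits.
hex_bytes() {
  # od -An drops the offset column, -v keeps repeated lines, -tx1 prints one byte per group;
  # tr then strips the spaces and newlines od puts between bytes
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
}

hex_bytes '@'                    # prints 40
echo
hex_bytes "$(printf '\n@')"      # prints 0a40
echo
```

Two values that look identical on screen but print different hex strings here will not match in a JOIN either.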

References

https://unix.stackexchange.com/questions/108020/can-vim-display-ascii-characters-only-and-treat-other-bytes-as-binary-data

Remove Duplicates in DB2 Columnar Format

I had dupe data in my OLAP table, where the columnar data can be duplicated based on event id. (I loaded the data 2x.) I had to differentiate the rows and remove the duplicates, so I assigned row numbers over a partition by IDN_EVENT_ID and wrote them into a column I could filter on. Note that this overwrites APP_NM, so pick a column you can afford to clobber.

db2 "update (select OME.*, row_number() over(partition by IDN_EVENT_ID order by IDN_EVENT_ID) as rnk from X.OLAP OME) set APP_NM = rnk"

Then I removed the duplicates with this:
db2 "DELETE FROM X.OLAP OME WHERE APP_NM = 2"

I recommend the two-phase approach: you can, in theory, run each phase in batch or asynchronously and double-check the numbering before you delete, rather than hoping it works. I hope this helps you.
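One way to do that double check between the two phases is sketched below. It assumes an open db2 connection, that every duplicated event was loaded exactly twice, and that db2 -x returns just the value without column headings; the function names are made up for illustration.

```shell
# Rows the DELETE would remove (APP_NM holds the row number after the UPDATE).
count_marked_rows() {
  db2 -x "SELECT COUNT(*) FROM X.OLAP WHERE APP_NM = 2"
}

# Event ids that appear more than once in the table.
count_duplicated_events() {
  db2 -x "SELECT COUNT(*) FROM (SELECT IDN_EVENT_ID FROM X.OLAP GROUP BY IDN_EVENT_ID HAVING COUNT(*) > 1) AS D"
}

# Succeeds only when the two counts agree, i.e. one marked row per duplicated event.
safe_to_delete() {
  marked=$(count_marked_rows | tr -d '[:space:]')
  dupes=$(count_duplicated_events | tr -d '[:space:]')
  [ "$marked" = "$dupes" ]
}
```

A guard such as safe_to_delete && db2 "DELETE FROM X.OLAP OME WHERE APP_NM = 2" then only deletes when the numbering looks the way you expect. If some events were loaded three or more times the counts will differ, which is exactly the case you want to stop and look at.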