DataStage Randomly Locked Out

Suddenly, my DataStage pipeline stopped working. I hit this error:

DB2_Connector_2: [Input link 0] SQLConnect reported: SQLSTATE = 42724: Native Error Code = -10,013: Msg = [IBM][CLI Driver] SQL10013N The specified library "GSKit Error: 408" could not be loaded. SQLSTATE=42724 (CC_DB2Connection::connect, file CC_DB2Connection.cpp, line 856)

The error was due to a permission change on our SSL credentials (p12/jks).
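A quick way to confirm this kind of problem is to test whether the user that runs the jobs can still read the keystore files. The following is a minimal sketch; the function name is made up for illustration, and you pass in your own p12/jks paths since the real locations depend on your installation.

```shell
#!/bin/bash
# Report keystore files the current user cannot read.
# Returns 0 when every file given is readable, 1 otherwise.
# check_keystores_readable is an illustrative name, not a DataStage tool.
check_keystores_readable() {
  local status=0 f
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "not readable: $f" >&2
      # Show owner and mode to help spot the permission change
      ls -l "$f" >&2 || true
      status=1
    fi
  done
  return "$status"
}
```

Run it as the account that executes the jobs (dsadm in my case) and pass the keystore paths as arguments.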

DSEngine Status Code 81016

I hit the following issue with dsadmin:

[dsadm@server-1 DSEngine]$ bin/dsadmin -listprojects
ERROR: Connection to the specified engine tier host failed or was refused. Check that the RPC daemon service is running on the host and that no firewall is blocking the connection
Status code = 81016

As I don’t own the server implementation, I had to check the services setup:

cat /etc/services | grep -i dsrpc 

You should see:

dsrpc 31538/tcp # RPCdaemon DSEngine@/opt/IBM/InformationServer/Server/DSEngine

Then check whether anything is listening on that port:

netstat -plutn | grep -i 31538

Nothing was listening, so the RPC daemon wasn’t running, and I restarted DSEngine.
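The two checks above can be combined so the port number is not hard-coded. This is a rough sketch under a few assumptions: the function names are mine, the service is registered as `dsrpc` with a `/tcp` entry as shown above, and the listening check reads `netstat -ltn` style output (without `-p`, so the state is the sixth column).

```shell
#!/bin/bash
# Print the TCP port registered for a service name in a services file.
# Usage: service_port NAME [FILE]   (FILE defaults to /etc/services)
service_port() {
  local name="$1" file="${2:-/etc/services}"
  awk -v svc="$name" \
    '$1 == svc && $2 ~ /\/tcp$/ { split($2, p, "/"); print p[1]; exit }' "$file"
}

# Return 0 if the given TCP port appears as LISTEN in `netstat -ltn`
# style output read from stdin, 1 otherwise.
port_is_listening() {
  local port="$1"
  awk -v port="$port" '
    $1 ~ /^tcp/ && $6 == "LISTEN" {
      n = split($4, a, ":")          # local address is host:port
      if (a[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }'
}
```

For example, `netstat -ltn | port_is_listening "$(service_port dsrpc)"` returns a non-zero status when the RPC daemon is not listening.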

Stop the InfoSphere DataStage engine services by using the following command:

su - dsadm 
cd /opt/IBM/InformationServer/Server/DSEngine
source dsenv
bin/uv -admin -stop

Stopping JobMonApp
No running PX Yarn Client for dsadm found to stop.
JobMonApp has been shut down.
Stopping DSAppWatcher
AppWatcher:ALREADY_STOPPED
ResMonApp:STOPPING
ResMonApp:ALREADY_STOPPED
ODBQueryApp:STOPPING
ODBQueryApp:ALREADY_STOPPED
EngMonApp:STOPPING
EngMonApp:ALREADY_STOPPED
DataStage Engine <<VERSION>> instance "ade" has been brought down.

Wait 30 seconds so that the engine services stop.

Start the engine services by using the following command:

su - dsadm 
cd /opt/IBM/InformationServer/Server/DSEngine
source dsenv
bin/uv -admin -start

Checking NLS locale OFF
Checking NLS locale DEFAULT
Checking NLS locale CA-ENGLISH
Checking NLS locale GB-ENGLISH
Checking NLS locale IE-ENGLISH
Checking NLS locale AU-ENGLISH
Checking NLS locale ZA-ENGLISH
Checking NLS locale US-ENGLISH
Checking NLS locale NZ-ENGLISH
Loading NLS map file UNICODE
Loading NLS map file UTF8
Loading NLS map file MS1256-WIN2K-CS
Loading NLS map file ISO8859-2-CS
Loading NLS map file MNEMONICS
Loading NLS map file ISO8859-3
Loading NLS map file ISO8859-7+MARKS
Loading NLS map file MS949
Loading NLS map file ASCII
Loading NLS map file MS1255
Loading NLS map file PC437
Loading NLS map file ISO8859-5
Loading NLS map file ISO8859-4
Loading NLS map file MS949-CS
Loading NLS map file PC874-CS
Loading NLS map file ISO8859-10-CS
Loading NLS map file ISO8859-10+MARKS
Loading NLS map file ISO8859-9-CS
Loading NLS map file ISO8859-6+MARKS
Loading NLS map file ISO8859-4-CS
Loading NLS map file MS1252+MARKS
Loading NLS map file ISO8859-2+MARKS
Loading NLS map file ISO8859-10
Loading NLS map file MS1256-CS
Loading NLS map file ISO8859-8
Loading NLS map file MS1251
Loading NLS map file MS936
Loading NLS map file ISO8859-3+MARKS
Loading NLS map file MS936-CS
Loading NLS map file MS1256
Loading NLS map file ISO8859-5+MARKS
Loading NLS map file ISO8859-6-CS
Loading NLS map file MS932-CS
Loading NLS map file ISO8859-8+MARKS
Loading NLS map file MS1250-CS
Loading NLS map file MS1253
Loading NLS map file MS1256-WIN2K
Loading NLS map file ISO8859-9
Loading NLS map file TIS620-CS
Loading NLS map file MS932
Loading NLS map file MS1253-CS
Loading NLS map file ISO8859-2
Loading NLS map file MS1254
Loading NLS map file ISO8859-6
Loading NLS map file MS1252-CS
Loading NLS map file ISO8859-1-CS
Loading NLS map file MS1251-CS
Loading NLS map file PC850
Loading NLS map file MS1250
Loading NLS map file ISO8859-1+MARKS
Loading NLS map file ISO8859-1
Loading NLS map file MS1254-CS
Loading NLS map file MS950-CS
Loading NLS map file ISO8859-5-CS
Loading NLS map file ISO8859-3-CS
Loading NLS map file ISO8859-8-CS
Loading NLS map file MS1255-CS
Loading NLS map file ISO8859-15-CS
Loading NLS map file ISO8859-7
Loading NLS map file UTF8-CS
Loading NLS map file ISO8859-4+MARKS
Loading NLS map file MS1252
Loading NLS map file ISO8859-15
Loading NLS map file ISO8859-9+MARKS
Loading NLS map file ASCII+MARKS
Loading NLS map file ISO8859-7-CS
Loading NLS map file MS950
Loading NLS map file ISO8859-15+MARKS
68 NLS Character Set Maps loaded in 4055120 bytes.
Loading 9 NLS Locales
9 NLS Locales loaded in 793824 bytes
DataStage Engine <<VERSION>> instance "ade" has been brought up.
Starting JobMonApp
JobMonApp has been started.
resource_tracker has been started.
Starting DSAppWatcher

The engine was running again after the stop/start.
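The stop, wait, and start steps above can be wrapped in one function. This is only a sketch of the sequence I ran by hand: `DSENGINE_HOME` and `STOP_WAIT_SECONDS` are variable names I made up for illustration, the default path is the one from my server, and it should be run as dsadm.

```shell
#!/bin/bash
# Restart the DataStage engine: stop, wait, start. Run as dsadm.
# DSENGINE_HOME and STOP_WAIT_SECONDS are illustrative names (assumptions).
restart_dsengine() {
  local home="${DSENGINE_HOME:-/opt/IBM/InformationServer/Server/DSEngine}"
  local wait_seconds="${STOP_WAIT_SECONDS:-30}"
  (
    # Subshell keeps the cd and the dsenv settings out of the caller's shell
    cd "$home" || exit 1
    . ./dsenv || exit 1
    # The engine may already be down, so a failed stop is not fatal here
    bin/uv -admin -stop || echo "stop reported an error; continuing" >&2
    sleep "$wait_seconds"
    bin/uv -admin -start
  )
}
```

Afterwards, `bin/dsadmin -listprojects` is a reasonable check that the engine accepts connections again.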

References
http://www-01.ibm.com/support/docview.wss?uid=swg21402853
http://www-01.ibm.com/support/docview.wss?uid=swg21452589
https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_8.7.0/com.ibm.swg.im.iis.productization.iisinfsv.migrate.doc/topics/a_start_stop_ds_server.html

Spark and Data Tips for November 2018

Full Hadoop / HBase data platform for testing spark

I found the following Docker image very handy for testing Hadoop.
https://hub.docker.com/r/bigdatauniversity/spark2/
docker pull bigdatauniversity/spark2
docker run -it --name bdu_spark2 -P -p 4040:4040 -p 4041:4041 -p 8080:8080 -p 8081:8081 bigdatauniversity/spark2:latest /etc/bootstrap.sh -bash

Spark Notebooks

I found these sites useful – http://spark-notebook.io/ and https://github.com/spark-notebook/spark-notebook and https://github.com/IBM?language=jupyter+notebook

Version Mismatch

If you hit this error:

Exception: Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions

It’s a quick fix. Set PYSPARK_PYTHON in spark-env.sh so the workers and the driver use the same Python:
export PYSPARK_PYTHON=/usr/local/bin/python3

Thanks to http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/Python-in-worker-has-different-version-than-that-in-driver/td-p/60620
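If the driver still picks up a different interpreter, it may help to pin both sides explicitly. A minimal sketch of the spark-env.sh lines, assuming the same `/usr/local/bin/python3` path as above exists on the driver and on every worker node:

```shell
# spark-env.sh fragment: use one interpreter for workers and driver.
# The path is the one from the fix above; adjust it to your installation.
export PYSPARK_PYTHON=/usr/local/bin/python3
# PYSPARK_DRIVER_PYTHON controls the driver side; reuse the same value
export PYSPARK_DRIVER_PYTHON="$PYSPARK_PYTHON"
```

Both variables are read by PySpark at startup, so restart the shell or notebook after changing them.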

Kernels

A handy list of kernels https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

I used the Python 3 kernel.

pip3 install spylon-kernel

I also played with Toree:
https://github.com/apache/incubator-toree

pip3 install toree
jupyter toree install

UCD: Application Processes Branching

UrbanCode Deploy (UCD) is a tool we use to manage the deployment of our healthcare platform. I needed to branch between two different processes, and the setup and steps for branching between Application processes were not clearly documented.

I built some custom bash logic to switch based on results:

#!/bin/bash
if [ -f /etc/SHORT_CIRCUIT ]
then
    echo "result=YES"
else
    echo "result=NO"
fi

I then created a post-processing step:

var exit = properties.get('exitCode');

scanner.register("result=", function(lineNumber, line) {
var value=line.replace("result=","");
properties.put("ShortCircuit",value);
});

if (exit == 0) {
properties.put('Status', 'Success');
}
else {
properties.put('Status', 'Failure');
}

scanner.scan();

The scanner matches any line of the step's text output that contains result=. I highly suggest having this in your scanner; otherwise you might see ucd propertyValue=script content:, which indicates that it did not find any result to set.

Once you get the value in the step, it’ll be available downstream once you set Process Properties. In this case, I pick up the property from <STEP NAME><SLASH><PROPERTY NAME> and set it into ShortCircuitStatus.

I then use it to make a decision downstream in a switch statement. I suggest not using a default path in the switch statement; that ensures a deterministic and expected outcome from the prior process. I used YES/NO in the switch statement.

 

 


Maven Repository – Go Offline with dependencies

Maven Repository

My team uses the pom.xml to generate a repository, which is handed off to the secondary developers. For instance, I have a custom Db2 JAR.

## Update your localRepository
Start a Shell
cd ~/.m2
vim settings.xml
Add `<localRepository>/Users/userid/git/client-app/documentation/repo/local_repo</localRepository>`
Note: adjust the path to match where your repo is checked out
You may have to create the folder `local_repo`
Save the file

## Take the Repo Offline
Change directory to the `repo` folder
`mvn dependency:go-offline`
cd local_repo
Remove any previous zip
Run `zip -9 -v -r repo.zip 'base-path-for-jars-you-want/'` (archives only the dependencies you want into the zip)
Upload the zip to the `client-app` wiki

## Remove your local repository line
Start a shell
cd ~/.m2
vim settings.xml
remove the `<localRepository>` node
Save the file
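A variation on the steps above that avoids editing and then reverting settings.xml is to pass the repository location on the command line with `-Dmaven.repo.local`. This is a sketch of that alternative, not what my team originally did; the function name and its two arguments are made up for illustration, and it assumes you run it from the folder that holds the pom.xml.

```shell
#!/bin/bash
# Populate a throwaway local repository and zip part of it for hand-off.
# Usage: package_offline_repo REPO_DIR SUBPATH
#   REPO_DIR: directory to use as the local repository (created if missing)
#   SUBPATH:  path under REPO_DIR to archive (the jars you want to ship)
package_offline_repo() {
  local repo_dir="$1" subpath="$2"
  mkdir -p "$repo_dir" || return 1
  # Resolve to an absolute path so it stays valid after the cd below
  repo_dir="$(cd "$repo_dir" && pwd)" || return 1
  # -Dmaven.repo.local overrides <localRepository> for this run only
  mvn dependency:go-offline -Dmaven.repo.local="$repo_dir" || return 1
  (
    cd "$repo_dir" || exit 1
    rm -f repo.zip                      # remove any previous zip
    zip -9 -v -r repo.zip "$subpath"    # archive only the wanted subtree
  )
}
```

For example, `package_offline_repo local_repo 'base-path-for-jars-you-want/'` leaves `local_repo/repo.zip` ready to upload.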

Helpful Tips – October 2018

Xcode Tools shortcut

xcode-select --install
If you hit an issue installing the Command Line Tools after you install Mavericks, use xcode-select --install. It made things so much easier. (I was installing LaTeX.)

SOLID OO Design

I was clued into this software design approach on HackerNoon. It’s a great approach to software design and I use the principle in my current software projects.

Extreme Presentation

The Extreme Presentation process is a brilliantly simple way of presenting useful data insights in the right way. I give the authors all the credit on their website.

Telemetry Reporting in Visual Studio Code

You should disable telemetry reporting in Visual Studio Code. It reports all the time, and that continual reporting is a potential issue.

Some other ancillary links from the month:
Training and Other Links
https://jupyter.readthedocs.io/en/latest/install.html
https://github.com/SciSpike/kafka-lab/
https://www.ibm.com/support/knowledgecenter/en/SS6NHC/com.ibm.swg.im.dashdb.admin.mon.doc/doc/r0054077.html
https://dzone.com/articles/kafka-clients-at-most-once-at-least-once-exactly-o

Software Design
https://www.12factor.net/
https://martinfowler.com/articles/microservices.html
https://samnewman.io/patterns/architectural/bff/
https://www.ibm.com/design/thinking/page/toolkit

Maven Thread Speed Up

Like many developers, I have tons of jobs running to compile, unit test, and integration test my code. These jobs take anywhere from 30 seconds to 30 minutes.

Some simple operations took a while, and I wondered why. Thanks to Oleg @ ZeroTurnaround I have an answer: Your Maven build is slow. Speed it up!

I applied the parallel-build setting (-T) to speed up my build; 30 minutes dropped to 10 minutes:

mvn clean package -T 4 -s local-m2/settings.xml
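Maven's `-T` option also accepts a per-core multiplier, so `-T 1C` means one thread per CPU core instead of a fixed count. Here is a small wrapper as a sketch; the function name is mine, and any extra arguments (such as a settings file) are passed straight through to Maven.

```shell
#!/bin/bash
# Run a parallel build. The first argument is Maven's -T value:
# "4" means four threads, "1C" means one thread per CPU core.
# parallel_package is an illustrative name, not a Maven command.
parallel_package() {
  local threads="${1:-1C}"
  if [ "$#" -gt 0 ]; then shift; fi
  mvn -T "$threads" clean package "$@"
}
```

Called with no arguments it runs `mvn -T 1C clean package`. Whether more threads help depends on how independent your modules are, so it is worth timing a few values.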

I hope this helps others.

Hadoop KMS Ranger API – Tips and cURLs

I use Hadoop KMS with Ranger in one environment. Here are two tips for working with the REST API.

versionName is used in multiple queries.

When not using Kerberos, set ?user.name=hdfs on the URL.
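As an illustration of both tips, here is a sketch of a few read-only calls against the endpoints listed in the Hadoop KMS REST documentation. The host, port, and key names are placeholders, and the helper function names are mine; only the URL paths and the `user.name` parameter come from the documentation.

```shell
#!/bin/bash
# KMS_BASE is a placeholder (assumption); substitute your KMS host and port.
KMS_BASE="${KMS_BASE:-http://kms.example.com:9292/kms/v1}"
KMS_USER="${KMS_USER:-hdfs}"

# Build a KMS REST URL for a resource path, appending user.name for
# simple (non-Kerberos) authentication.
kms_url() {
  printf '%s/%s?user.name=%s\n' "$KMS_BASE" "$1" "$KMS_USER"
}

# List all key names
kms_key_names()       { curl -s "$(kms_url keys/names)"; }
# Metadata for one key: kms_key_metadata KEY_NAME
kms_key_metadata()    { curl -s "$(kms_url "key/$1/_metadata")"; }
# Current version of a key; the response includes versionName
kms_current_version() { curl -s "$(kms_url "key/$1/_currentversion")"; }
# Look up a specific version: kms_key_version VERSION_NAME
kms_key_version()     { curl -s "$(kms_url "keyversion/$1")"; }
```

The versionName returned by `kms_current_version` is the value to pass to `kms_key_version`.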

 

References
https://hadoop.apache.org/docs/current/hadoop-kms/index.html#KMS_HTTP_REST_API
https://hadoop.apache.org/docs/current/hadoop-kms/index.html#Get_Key_Names
https://stackoverflow.com/questions/37601763/authentication-issue-with-kms-hadoop

Lightweight HBase Client

As many developers know, HBase’s default client pulls in everything, adding up to tens of megabytes. Lilyproject reduces this to a more manageable and useful size.

https://github.com/NGDATA/lilyproject/blob/master/global/hbase-client/pom.xml

<dependency>
    <groupId>org.lilyproject</groupId>
    <artifactId>lily-hbase-client</artifactId>
    <version>2.6.1</version>
</dependency>

Avoiding Thrashing with Lots and Lots of Files on the Mac

To avoid thrashing with lots and lots of files on the Mac, I had to refer to http://blog.hostilefork.com/trashes-fseventsd-and-spotlight-v100/ and run the following, which stops Finder from writing .DS_Store files to network volumes:

defaults write com.apple.desktopservices DSDontWriteNetworkStores true