Advanced HBase Shell

Recently, I’ve had to do some advanced HBase Shell scripting to check data consistency.

Tip # 1 – You can easily establish min/max times in nanoseconds from BASH and feed them into your script.

  $(date +%s%6N) 

gets you 6 digits of precision 3 for millis 3 for nanos.

Tip # 2 – Use the ‘include Java’ line to get access to all JARs that HBase has access to.

Tip # 3 – Forget about Bytes…. convert to Base64 to make it human readable (and you don’t want to break lines – the number 8 keeps it from wrapping). https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Base64.html#DONT_BREAK_LINES

import org.apache.hadoop.hbase.util.Base64;
content = Bytes.toString(tableName.getValue(Bytes.toBytes("m"),Bytes.toBytes("d")))
x = Base64.encodeBytes(r.getRow(), 8)
puts "#{x}"

Tip # 4 – use GSON to parse JSON efficiently across a scan.

import com.google.gson.JsonParser;
parser = JsonParser.new
jsonBody = Bytes.toString(tableName.getValue(Bytes.toBytes("d"),Bytes.toBytes("b")))

json = parser.parse(jsonBody)
object = json.getAsJsonObject()
metaObj = object.get('mx')
objVer = metaObj.get('vid').getAsString()
objId = object.get('id').getAsString()

Tip #5 – use it as a script

time /usr/iop/current/hbase-client/bin/hbase org.jruby.Main /tmp/check.rb > check.log

Primer: Tips for using Random Test Files

On Mac and Linux, create a random file (https://stackoverflow.com/questions/257844/quickly-create-a-large-file-on-a-linux-system)
dd if=/dev/urandom of=random-test-file-100m bs=1024k count=100

Example:
$ dd if=/dev/urandom of=random-test-file-100m bs=1024k count=100
100+0 records in
100+0 records out
102400000 bytes transferred in 10.143507 secs (10095128 bytes/sec)

This is a random file. You should also use shasum -A 256 and make sure the file you are using for test is what you get back. Do also remember 1024 bytes to a KB… use blocksizes of 1024

On linux, create a fast file (https://stackoverflow.com/a/11779492):
fallocate -l 100M test-file-100m

This takes less than a second. The file is not good for testing random behaviors with large files as the generated file is huge, but the same bytes repated for 100M.

To check the size of the sample files, you can use human readable format settings (https://unix.stackexchange.com/a/281113).

MacBook-Air:test user$ ls -lSh
total 24
-rw-r–r– 1 user staff 10K Dec 20 19:53 setup.log

Maven Tips

Here are some tips for Maven Java Projects in Eclipse.

#1 If you pom.xml is missing maven-compiler-plugin, Eclipse (when you do Maven > Update Project), defaults the compiler level for the project to 1.5.

<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.0.2</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>

Now, it gets even more tricky… if your project has defined features that are from Java 1.6, 1.7 or 1.8

#2 if you your web.xml defines a version not compatible with a project facet you depend on, you get a lot of problems/errors

<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5"/>

Convert the web.xml to 3.1, and compatibility should now be fixed

<web-app id="WebApp_ID" version="3.1" xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd">

I found this stackoverflow entry helpful and this one.

#3 Update > Maven Project overwrites manual changes to #1/#2.

I found this stackoverflow entry helpful.

#4 Make sure you don’t have two web.xml files… the convert dynamic web project to maven project does some funky stuff.

OAuth Tips

For developers that are beginning to investigate oAuth and IBM Connections Cloud, you’ll find two interesting things about the oAuth 1.0a web flows and the oAuth 2.0 web flows.

1- The flows don’t support extra on the flow – for instance the state parameter. state=XYZ123

2- The oAuth 2.0 flow expects callback_uri, not the common redirect_uri parameter.

The various flows are located at oAuth 1.0a web flow and oAuth 2.0 web flow

A Few OAuth Notes

While I was working with a fellow developer building an integration for IBM Connections on premises, I found out about a couple of key items with the OAuth Provider.

1 – You can see a JSON Array of the Current User Tokens, when logged in as that user.

Navigate to https://sbdev.server:444/oauth2/authzMgmt/connectionsProvider

Login to IBM Connections

Look at the JSON Data to see the Granted Applications for the Logged in User

Apps
Apps

2 – You can see all the applications which are granted user oauth tokens – automatically authorized and manually authorized.

Navigate to https://sbtdev.server/common/oauth/apps?autoAuth=true

3 – OAuth Whitelists

Per
http://www-01.ibm.com/support/docview.wss?uid=swg21627911 , you can update your OAuth whitelists based on the client-id you set.

“As a measure to reduce hassle to users for trusted OAuth clients, IBM Connections implements an extension to the OAuth protocol that allows whitelisted clients to skip the authorization request when utilized from within the Connections user interface. In order to list an application as a trusted auto-authorization enabled client, an administrator must perform steps that are covered by the product documentation topic http://www-10.lotus.com/ldd/lcwiki.nsf/dx/Registering_an_OAuth_client_with_a_provider_ic40 .

Edit the connectionsProvider.xml in the Deployment Manager profile.

clients-oauth

Synchronize your nodes, and restart the server.

Volia… you have

Finally you can read more about OAuth at  http://www-01.ibm.com/support/knowledgecenter/SSYGQH_5.0.0/admin/admin/t_admin_registeroauthclientwprovider.dita