Created_at :Aug 2011

Hadoop Useful Utility Classes


Some handy classes for working with Hadoop / MapReduce / HBase.

IdentityMapper  / IdentityReducer


org.apache.hadoop.mapreduce.Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

org.apache.hadoop.mapreduce.Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>

jar : hadoop-core.jar

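If your mappers and reducers simply pass their inputs straight through to the output, then use these guys as they are. The default map() and reduce() implementations are identity functions, so there is no need to recreate them.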


Shell  / ShellCommandExecutor


org.apache.hadoop.util.Shell

org.apache.hadoop.util.Shell.ShellCommandExecutor

jar : hadoop-core.jar

Handy for executing commands on the local machine and inspecting their output.
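To show the run-a-command-and-read-its-output pattern, here is a rough JDK-only sketch built on ProcessBuilder. It is a stand-in so that it runs without Hadoop on the classpath, not Hadoop's implementation; the names LocalCommand, Result and run are made up for this example. With Hadoop available, ShellCommandExecutor covers the same ground through execute(), getOutput() and getExitCode().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Hadoop class): runs a local command and captures its output.
final class LocalCommand {

    /** Exit code plus combined stdout/stderr of a finished command. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static Result run(String... command) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectErrorStream(true); // merge stderr into stdout
            Process process = builder.start();

            // Drain the output before waiting, so a chatty command cannot block on a full pipe.
            ByteArrayOutputStream captured = new ByteArrayOutputStream();
            try (InputStream in = process.getInputStream()) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    captured.write(buffer, 0, n);
                }
            }
            int exitCode = process.waitFor();
            return new Result(exitCode, new String(captured.toByteArray(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new RuntimeException("could not run command", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        Result result = run("echo", "hello");
        System.out.println(result.exitCode + " -> " + result.output.trim());
    }
}
```

The sketch assumes a Unix-like machine where echo is on the PATH.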



StringUtils


org.apache.hadoop.util.StringUtils

jar : hadoop-core.jar

Lots of functions for dealing with Strings. I will highlight a few.


StringUtils.byteDesc() : User-friendly / human-readable byte lengths

How many megabytes is 10000000 bytes? This method will tell you.
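For a feel of the arithmetic involved, here is a small JDK-only sketch of a human-readable byte formatter. It is not Hadoop's code: the class name ByteSizes, the method describe and the exact output format are assumptions for this example, and StringUtils.byteDesc() may format its result differently in your Hadoop version.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration; with Hadoop on the classpath you
// would call org.apache.hadoop.util.StringUtils.byteDesc(long) instead.
final class ByteSizes {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    /** Formats a byte count with binary units (1 KB = 1024 B) and two decimals. */
    static String describe(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("byte count must not be negative");
        }
        double value = bytes;
        int unit = 0;
        // Divide by 1024 until the value fits the current unit.
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(describe(10000000L)); // prints "9.54 MB"
    }
}
```

So 10000000 bytes comes to roughly 9.54 MB when a megabyte is counted as 1024 * 1024 bytes.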



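StringUtils.byteToHexString() : Convert bytes to hex strings and vice versa (hexStringToByte() goes the other way)

We deal with byte arrays all the time in Hadoop / MapReduce, and they do not print readably. Turning them into hex strings is a handy way to print them when debugging.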
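The conversion itself is simple. Below is a minimal JDK-only sketch of the round trip, written so it runs without Hadoop; HexDump, toHex and fromHex are names invented for this example, and the lowercase output is a choice made here rather than a statement about what Hadoop prints.

```java
// Hypothetical stand-in for illustration; Hadoop's own helpers are
// StringUtils.byteToHexString(byte[]) and StringUtils.hexStringToByte(String).
final class HexDump {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Encodes each byte as two lowercase hex digits. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(DIGITS[(b >> 4) & 0x0f]); // high nibble
            sb.append(DIGITS[b & 0x0f]);        // low nibble
        }
        return sb.toString();
    }

    /** Decodes a hex string (two digits per byte) back into bytes. */
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string needs an even number of digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(toHex(raw)); // prints "68656c6c6f"
    }
}
```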


StringUtils.formatTime() :  human readable elapsed time

How long is 100000000 ms? A bit over 27 hours and 46 minutes; formatTime() does the arithmetic for you.
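The breakdown into hours, minutes and seconds can be sketched with plain JDK code as follows. Elapsed and format are hypothetical names, and the output style is an assumption that may not match what StringUtils.formatTime() prints in your Hadoop version.

```java
// Hypothetical stand-in for illustration; with Hadoop on the classpath you
// would call org.apache.hadoop.util.StringUtils.formatTime(long) instead.
final class Elapsed {

    /** Renders a duration in milliseconds as hours, minutes and seconds. */
    static String format(long millis) {
        long hours = millis / 3_600_000L;
        long minutes = (millis % 3_600_000L) / 60_000L;
        long seconds = (millis % 60_000L) / 1_000L;

        StringBuilder sb = new StringBuilder();
        if (hours != 0) {
            sb.append(hours).append("hrs, ");
        }
        if (minutes != 0) {
            sb.append(minutes).append("mins, ");
        }
        return sb.append(seconds).append("sec").toString();
    }

    public static void main(String[] args) {
        System.out.println(format(100000000L)); // prints "27hrs, 46mins, 40sec"
    }
}
```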



Hadoop Cluster Status


ClusterStatus : org.apache.hadoop.mapred.ClusterStatus

jar : hadoop-core.jar

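Find out how many nodes (task trackers) are in the cluster, how many map and reduce slots it has, how many tasks are currently running, etc. You get an instance from JobClient.getClusterStatus().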



HBase Handy Classes



Bytes


org.apache.hadoop.hbase.util.Bytes

jar : hbase*.jar

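Handy utility for dealing with bytes and byte arrays. HBase stores row keys, column names and values as raw byte arrays, so you end up using it everywhere.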


Bytes.toBytes() : convert objects to bytes
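To show what such a conversion produces, here is a JDK-only sketch that encodes a String as UTF-8 and numbers as fixed-width big-endian bytes. ToBytesSketch and its method names are invented for this example so that it runs without the HBase jar; with HBase on the classpath, the overloaded Bytes.toBytes() methods do this for you.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the HBase Bytes class.
final class ToBytesSketch {

    /** Strings become their UTF-8 bytes. */
    static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** An int becomes 4 bytes, most significant byte first. */
    static byte[] fromInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** A long becomes 8 bytes, most significant byte first. */
    static byte[] fromLong(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    /** Reads back a long written by fromLong. */
    static long toLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    public static void main(String[] args) {
        System.out.println(fromString("row1").length); // 4 bytes
        System.out.println(toLong(fromLong(258L)));    // round trip gives 258
    }
}
```

Fixed-width big-endian encoding has a nice side effect for keys: non-negative numbers sort in the same order as their byte arrays.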




Bytes.add()  : create composite keys
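A composite key is just two (or more) byte arrays glued together. The JDK-only sketch below concatenates a user id and a timestamp; the key layout and the names CompositeKey, concat and userTimestampKey are made up for this example. With HBase on the classpath, Bytes.add(a, b) does the concatenation step.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration; not the HBase Bytes class.
final class CompositeKey {

    /** Returns a new array holding the bytes of a followed by the bytes of b. */
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = new byte[a.length + b.length];
        System.arraycopy(a, 0, out, 0, a.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /** Builds an example row key: UTF-8 user id, then an 8-byte big-endian timestamp. */
    static byte[] userTimestampKey(String userId, long timestamp) {
        byte[] user = userId.getBytes(StandardCharsets.UTF_8);
        byte[] ts = ByteBuffer.allocate(8).putLong(timestamp).array();
        return concat(user, ts);
    }

    public static void main(String[] args) {
        byte[] key = userTimestampKey("u1", 1L);
        System.out.println(key.length); // 2 bytes of user id + 8 bytes of timestamp = 10
    }
}
```

Putting the user id first keeps all rows for one user next to each other, ordered by timestamp.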



