Created_at :Aug 2011
Hadoop Useful Utility Classes
Some handy classes for working with Hadoop / MapReduce / HBase
IdentityMapper / IdentityReducer
org.apache.hadoop.mapreduce.Mapper<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
org.apache.hadoop.mapreduce.Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
jar : hadoop-core.jar
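If your mappers and reducers simply write their inputs to the outputs, use these. In the new (org.apache.hadoop.mapreduce) API the base Mapper and Reducer classes already behave as identity implementations, so there is no need to recreate them.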
Shell / ShellCommandExecutor
org.apache.hadoop.util.Shell
org.apache.hadoop.util.Shell.ShellCommandExecutor
jar : hadoop-core.jar
Handy for executing commands on the local machine and inspecting their output.
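The run-and-capture pattern can be sketched without Hadoop on the classpath. The class below is a hypothetical stand-in (`LocalCommand` is my name, not a Hadoop class) built on the JDK's `ProcessBuilder`; it merges stderr into stdout, which is a simplification of my own. With hadoop-core.jar available, `new ShellCommandExecutor(new String[] {"echo", "hello"})` followed by `execute()`, `getOutput()` and `getExitCode()` would play the same roles.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: runs a local command and keeps its output and exit code. */
class LocalCommand {
    final int exitCode;
    final String output;

    private LocalCommand(int exitCode, String output) {
        this.exitCode = exitCode;
        this.output = output;
    }

    /** Runs the command, waits for it to finish, and returns what it printed. */
    static LocalCommand run(String... command) {
        try {
            ProcessBuilder builder = new ProcessBuilder(command);
            builder.redirectErrorStream(true); // assumption: stderr merged into stdout
            Process process = builder.start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    out.append(line).append('\n');
                }
            }
            return new LocalCommand(process.waitFor(), out.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        LocalCommand result = LocalCommand.run("echo", "hello");
        System.out.println("exit code: " + result.exitCode);
        System.out.print(result.output);
    }
}
```

Checking the exit code before trusting the output is the main habit to carry over, whichever class does the executing.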
StringUtils
org.apache.hadoop.util.StringUtils
jar : hadoop-core.jar
Lots of functions for dealing with strings. I will highlight a few:
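To get a feel for what the helpers highlighted in this section return, here is a rough stdlib-only sketch. It is an approximation written for illustration, not Hadoop's own code: the class name `StringHelpersSketch` is hypothetical, and the exact strings Hadoop prints (units, rounding, separators) may differ between versions. With hadoop-core.jar on the classpath you would call `StringUtils.byteDesc(len)`, `StringUtils.byteToHexString(bytes)`, `StringUtils.hexStringToByte(hex)` and `StringUtils.formatTime(millis)` instead.

```java
import java.util.Locale;

/** Stdlib-only approximations of a few StringUtils helpers (illustrative only). */
class StringHelpersSketch {

    /** Human-readable size using 1024-based units. */
    static String byteDesc(long len) {
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = len;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format(Locale.ROOT, "%.2f %s", value, units[unit]);
    }

    /** Two lowercase hex digits per byte. */
    static String byteToHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }

    /** Reverse of byteToHexString; expects an even number of hex digits. */
    static byte[] hexStringToByte(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Elapsed milliseconds broken into hours, minutes and seconds. */
    static String formatTime(long millis) {
        long totalSeconds = millis / 1000;
        long hours = totalSeconds / 3600;
        long minutes = (totalSeconds % 3600) / 60;
        long seconds = totalSeconds % 60;
        return hours + "hrs, " + minutes + "mins, " + seconds + "sec";
    }

    public static void main(String[] args) {
        System.out.println(byteDesc(10000000L));      // 9.54 MB
        System.out.println(byteToHexString(new byte[] {0x0f, (byte) 0xa0})); // 0fa0
        System.out.println(formatTime(100000000L));   // 27hrs, 46mins, 40sec
    }
}
```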
StringUtils.byteDesc() : user-friendly / human-readable byte lengths
How many megabytes is 10000000 bytes? This will tell you.
StringUtils.byteToHexString() : convert bytes to hex strings (StringUtils.hexStringToByte() goes the other way)
We deal with byte arrays a lot in Hadoop / MapReduce. This is a handy way to print them when debugging issues.
StringUtils.formatTime() : human-readable elapsed time
How long is 100000000 ms? This will tell you.
Hadoop Cluster Status
ClusterStatus : org.apache.hadoop.mapred.ClusterStatus
jar : hadoop-core.jar
Find out how many nodes are in the cluster, how many mappers and reducers, etc.
HBase Handy Classes
Bytes
org.apache.hadoop.hbase.util.Bytes
jar : hbase*.jar
Handy utility for dealing with bytes and byte arrays.
Bytes.toBytes() : convert objects to bytes
Bytes.add() : create composite keys
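A composite key is just the byte encodings of its parts laid end to end. The sketch below shows the idea in plain Java so it runs without HBase; `CompositeKeySketch` and `rowKey` are hypothetical names, and the user-id-plus-timestamp layout is only an example. With the HBase jar available, the same key would be `Bytes.add(Bytes.toBytes(userId), Bytes.toBytes(timestamp))`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Stdlib-only illustration of building a composite row key (not HBase code). */
class CompositeKeySketch {

    /** Concatenates two byte arrays; the HBase call would be Bytes.add(a, b). */
    static byte[] add(byte[] a, byte[] b) {
        byte[] result = new byte[a.length + b.length];
        System.arraycopy(a, 0, result, 0, a.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    /** 8-byte big-endian encoding of a long, as Bytes.toBytes(long) produces. */
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    /** UTF-8 encoding of a string, as Bytes.toBytes(String) produces. */
    static byte[] toBytes(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    /** Example layout (an assumption): user id bytes followed by a timestamp. */
    static byte[] rowKey(String userId, long timestamp) {
        return add(toBytes(userId), toBytes(timestamp));
    }

    public static void main(String[] args) {
        byte[] key = rowKey("user42", 1L);
        System.out.println("key length: " + key.length); // 6 + 8 = 14
    }
}
```

Because the string part has no fixed width here, reading the key back requires knowing where the user id ends; putting fixed-width parts last, as in this layout, keeps that easy.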