org.apache.hadoop.tools.mapred
Class UniformSizeInputFormat
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>
org.apache.hadoop.tools.mapred.UniformSizeInputFormat
public class UniformSizeInputFormat
- extends org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>
UniformSizeInputFormat extends the InputFormat<> class to produce
input-splits for DistCp.
It reads the copy-listing and groups its entries into input-splits such
that the total number of bytes to be copied in each input-split is
approximately uniform.
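The grouping idea can be illustrated with a minimal sketch. This is not the DistCp implementation: the class UniformSplitSketch and the method groupBySize are hypothetical names, plain file lengths stand in for the FileStatus entries of the copy-listing, and the greedy cut-off strategy is an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of org.apache.hadoop.tools.mapred.
public class UniformSplitSketch {

    /**
     * Groups file lengths, in listing order, into splits whose byte totals
     * stay at or below a per-split target wherever possible.
     */
    public static List<List<Long>> groupBySize(long[] fileLengths, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        long totalBytes = 0;
        for (long len : fileLengths) {
            totalBytes += len;
        }
        // Target bytes per split, rounded up so numSplits targets cover the total.
        long bytesPerSplit = (totalBytes + numSplits - 1) / numSplits;

        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long len : fileLengths) {
            // Close the current split if adding this file would exceed the target.
            if (!current.isEmpty() && currentBytes + len > bytesPerSplit) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(len);
            currentBytes += len;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        long[] lengths = {40, 10, 30, 20, 50, 50};
        List<List<Long>> splits = groupBySize(lengths, 4);
        for (int i = 0; i < splits.size(); i++) {
            System.out.println("split " + i + ": " + splits.get(i));
        }
    }
}
```

Because files are never divided in this sketch, a split holding a single large file can exceed the target, and the number of splits produced can differ from the number requested. Split sizes are therefore only approximately equal.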
Method Summary
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
- Implementation of InputFormat::createRecordReader().
List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
- Implementation of InputFormat::getSplits().
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
UniformSizeInputFormat
public UniformSizeInputFormat()
getSplits
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
throws IOException,
InterruptedException
- Implementation of InputFormat::getSplits(). Returns a list of InputSplits
such that the number of bytes to be copied is approximately equal across
all the splits.
- Specified by:
getSplits
in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>
- Parameters:
context:
- JobContext for the job.
- Returns:
- The list of uniformly-distributed input-splits.
- Throws:
IOException:
- On failure.
InterruptedException
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws IOException,
InterruptedException
- Implementation of InputFormat::createRecordReader().
- Specified by:
createRecordReader
in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>
- Parameters:
split:
- The split for which the RecordReader is sought.
context:
- The context of the current task-attempt.
- Returns:
- A SequenceFileRecordReader instance (since the copy-listing is a
simple sequence-file).
- Throws:
IOException
InterruptedException
Copyright © 2014 InMobi. All rights reserved.