UniformSizeInputFormat (Conduit 2.3.0 API)

org.apache.hadoop.tools.mapred
Class UniformSizeInputFormat

java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>
      org.apache.hadoop.tools.mapred.UniformSizeInputFormat

public class UniformSizeInputFormat
extends org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>

UniformSizeInputFormat extends InputFormat to produce input splits for DistCp. It reads the copy listing and groups its entries into input splits so that the total number of bytes to be copied is approximately the same for each split.

Constructor Summary
`UniformSizeInputFormat()`

Method Summary
`org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>`	`createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)` Implementation of InputFormat::createRecordReader().
`List<org.apache.hadoop.mapreduce.InputSplit>`	`getSplits(org.apache.hadoop.mapreduce.JobContext context)` Implementation of InputFormat::getSplits().

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

UniformSizeInputFormat

public UniformSizeInputFormat()

Method Detail

getSplits

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
        throws IOException, InterruptedException

Implementation of InputFormat::getSplits(). Returns a list of InputSplits such that the number of bytes to be copied is approximately equal across all splits.

Specified by: getSplits in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>

Parameters: context - JobContext for the job.
Returns: The list of uniformly-distributed input splits.
Throws: IOException - On failure; InterruptedException
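
The grouping described for getSplits() can be illustrated with a minimal, Hadoop-free sketch. It assumes a greedy strategy: compute a target byte count per split (total bytes divided by the requested number of splits, rounded up), then walk the listing in order and close the current split whenever adding the next file would exceed that target. The class name `UniformSplitSketch`, the method `groupBySize`, and the use of plain file sizes in place of copy-listing entries are hypothetical and chosen only for illustration; the actual implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the documented API.
final class UniformSplitSketch {

    /**
     * Groups consecutive file sizes so that each group holds roughly
     * totalBytes / numSplits bytes. A single oversized file still gets
     * its own group, so the result may contain more than numSplits groups.
     */
    static List<List<Long>> groupBySize(List<Long> fileSizes, int numSplits) {
        long totalBytes = 0;
        for (long size : fileSizes) {
            totalBytes += size;
        }
        // Target number of bytes per split, rounded up.
        long bytesPerSplit = (totalBytes + numSplits - 1) / numSplits;

        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current split if this file would push it past the target.
            if (!current.isEmpty() && currentBytes + size > bytesPerSplit) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(10L, 20L, 30L, 40L, 50L, 50L);
        // 200 bytes over 2 splits gives a target of 100 bytes per split.
        System.out.println(groupBySize(sizes, 2)); // prints [[10, 20, 30, 40], [50, 50]]
    }
}
```

Because files are never divided in this sketch, the splits are only approximately equal in size: a listing dominated by a few large files yields uneven groups.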

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
        org.apache.hadoop.mapreduce.TaskAttemptContext context)
        throws IOException, InterruptedException

Implementation of InputFormat::createRecordReader().

Specified by: createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.fs.FileStatus>

Parameters: split - The split for which the RecordReader is sought; context - The context of the current task attempt.
Returns: A SequenceFileRecordReader instance (since the copy listing is a simple sequence file).
Throws: IOException; InterruptedException