org.apache.hadoop.tools.mapred.lib
Class DynamicInputChunkSet

java.lang.Object
  extended by org.apache.hadoop.tools.mapred.lib.DynamicInputChunkSet

public class DynamicInputChunkSet
extends Object

The DynamicInputChunkSet abstracts the context in which a DynamicInputChunk is constructed, acquired or released. There is one instance of DynamicInputChunkSet for each unique Hadoop job that uses the DynamicInputFormat.


Constructor Summary
DynamicInputChunkSet(org.apache.hadoop.conf.Configuration configuration)
          Constructor, to initialize the context in which DynamicInputChunks are used.
 
Method Summary
 org.apache.hadoop.tools.mapred.lib.DynamicInputChunk acquire(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
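          Factory method that acquires a chunk for the specified map-task attempt and returns the DynamicInputChunk associated with the acquired chunk-file.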
 org.apache.hadoop.tools.mapred.lib.DynamicInputChunk createChunkForWrite(String chunkId)
          Factory method to create chunk-files for writing to.
 String getChunkFilePrefix()
          The string with which all chunk-file-names are prefixed.
 org.apache.hadoop.fs.Path getChunkRootPath()
          The root-path of the directory where DynamicInputChunks are stored.
 org.apache.hadoop.conf.Configuration getConf()
          Getter for the Configuration with which the DynamicInputChunkSet was constructed.
 org.apache.hadoop.fs.FileSystem getFileSystem()
          FileSystem instance, for the file-system where the chunk-files are stored.
 int getNumChunksLeft()
          Number of chunk-files left, on last directory scan.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DynamicInputChunkSet

public DynamicInputChunkSet(org.apache.hadoop.conf.Configuration configuration)
                     throws IOException
Constructor, to initialize the context in which DynamicInputChunks are used.

Parameters:
configuration - The Configuration instance, as received from the DynamicInputFormat or DynamicRecordReader.
Throws:
IOException - If initializing the chunk context fails.
Method Detail

getConf

public org.apache.hadoop.conf.Configuration getConf()
Getter for the Configuration with which the DynamicInputChunkSet was constructed.

Returns:
Configuration object.

getChunkRootPath

public org.apache.hadoop.fs.Path getChunkRootPath()
The root-path of the directory where DynamicInputChunks are stored.

Returns:
The chunk-directory location.

getChunkFilePrefix

public String getChunkFilePrefix()
The string with which all chunk-file-names are prefixed.

Returns:
Prefix string, for all chunk files.

getNumChunksLeft

public int getNumChunksLeft()
Number of chunk-files left, on last directory scan.

Returns:
-1 if the chunk-directory has not been scanned yet; otherwise, the number of chunk-files found on the last directory scan.
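
The "-1 before the first scan" contract can be illustrated with a small stand-alone sketch. This is not the DynamicInputChunkSet implementation: it uses java.nio.file on a local directory instead of a Hadoop FileSystem, and the class name ChunkCountSketch, the scan method, and the "chunk." prefix in the usage note are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-filesystem analogue of the "chunks left on last scan" counter.
final class ChunkCountSketch {
    // -1 signals that the chunk directory has not been scanned yet.
    private int numChunksLeft = -1;

    // Counts the files under chunkRoot whose names start with chunkPrefix
    // and remembers the result as the "last scan" value.
    int scan(Path chunkRoot, String chunkPrefix) throws IOException {
        int count = 0;
        try (DirectoryStream<Path> chunks =
                 Files.newDirectoryStream(chunkRoot, chunkPrefix + "*")) {
            for (Path ignored : chunks) {
                count++;
            }
        }
        numChunksLeft = count;
        return count;
    }

    // Returns the cached count; it is a snapshot and may be stale
    // if other tasks have claimed chunks since the last scan.
    int getNumChunksLeft() {
        return numChunksLeft;
    }
}
```

A caller would invoke scan(dir, "chunk.") once and then read getNumChunksLeft(); the value reflects the directory only as of that scan.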

getFileSystem

public org.apache.hadoop.fs.FileSystem getFileSystem()
FileSystem instance, for the file-system where the chunk-files are stored.

Returns:
FileSystem instance.

createChunkForWrite

public org.apache.hadoop.tools.mapred.lib.DynamicInputChunk createChunkForWrite(String chunkId)
                                                                         throws IOException
Factory method to create chunk-files for writing to. (For instance, when the DynamicInputFormat splits the input-file into chunks.)

Parameters:
chunkId - String to identify the chunk.
Returns:
A DynamicInputChunk, corresponding to a chunk-file, with the name incorporating the chunk-id.
Throws:
IOException - Exception on failure to create the chunk.
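
As a rough illustration of a chunk-file name that incorporates the chunk-id, the sketch below creates an empty chunk file on the local filesystem. It assumes the name is simply the chunk-file prefix followed by the chunk-id; the page does not state the exact naming scheme, so that concatenation, the class name ChunkCreateSketch, and the use of java.nio.file in place of a Hadoop FileSystem are all assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical local-filesystem analogue of creating a chunk file for writing.
final class ChunkCreateSketch {
    // Assumed naming scheme: <chunkRoot>/<chunkPrefix><chunkId>.
    static Path chunkPath(Path chunkRoot, String chunkPrefix, String chunkId) {
        return chunkRoot.resolve(chunkPrefix + chunkId);
    }

    // Creates the chunk directory if needed, then an empty chunk file.
    // Files.createFile throws FileAlreadyExistsException (an IOException)
    // if a chunk with the same id already exists.
    static Path createChunkForWrite(Path chunkRoot, String chunkPrefix, String chunkId)
            throws IOException {
        Files.createDirectories(chunkRoot);
        return Files.createFile(chunkPath(chunkRoot, chunkPrefix, chunkId));
    }
}
```

Failing on a duplicate chunk-id, rather than overwriting, is a design choice of this sketch and is not claimed for the real class.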

acquire

public org.apache.hadoop.tools.mapred.lib.DynamicInputChunk acquire(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
                                                             throws IOException,
                                                                    InterruptedException
Factory method that (1) acquires a chunk for the specified map-task attempt, and (2) returns a DynamicInputChunk associated with the acquired chunk-file.

Parameters:
taskAttemptContext - The attempt-context for the map task that's trying to acquire a chunk.
Returns:
The acquired dynamic-chunk. The chunk-file is renamed to the attempt-id (taken from the attempt-context).
Throws:
IOException - If the chunk cannot be acquired.
InterruptedException - If the thread is interrupted while acquiring a chunk.
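
The claim-by-rename idea described above can be sketched without Hadoop. The example below is a minimal local-filesystem analogue, not the actual acquire implementation: it assumes that a chunk is claimed by atomically renaming one chunk file to the attempt-id, and that losing a race to another attempt shows up as a missing source file. The class name ChunkAcquireSketch, the String attempt-id parameter, and the java.nio.file calls are stand-ins for the Hadoop types named in the signature.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical local-filesystem analogue of acquiring a chunk by rename.
final class ChunkAcquireSketch {
    // Tries to claim one chunk file by renaming it to the attempt id.
    // Returns the renamed path, or empty if no chunk could be claimed.
    // Assumes the attempt does not already hold a claimed chunk.
    static Optional<Path> acquire(Path chunkRoot, String chunkPrefix, String attemptId)
            throws IOException {
        Path claimed = chunkRoot.resolve(attemptId);
        try (DirectoryStream<Path> chunks =
                 Files.newDirectoryStream(chunkRoot, chunkPrefix + "*")) {
            for (Path chunk : chunks) {
                try {
                    // The atomic rename is what makes the claim exclusive.
                    Files.move(chunk, claimed, StandardCopyOption.ATOMIC_MOVE);
                    return Optional.of(claimed);
                } catch (NoSuchFileException lostRace) {
                    // Another attempt renamed this chunk first; try the next one.
                }
            }
        }
        return Optional.empty();
    }
}
```

Because each chunk file can be renamed only once, two attempts calling acquire concurrently cannot both end up owning the same chunk; the loser moves on to the next candidate or gets an empty result.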


Copyright © 2014 InMobi. All rights reserved.