org.apache.hadoop.tools
Class DistCp

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.tools.DistCp
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool
Direct Known Subclasses:
ConduitDistCp

public class DistCp
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool


Field Summary
protected  DistCpOptions inputOptions
           
protected  org.apache.hadoop.fs.Path metaFolder
           
static Random rand
           
 
Constructor Summary
DistCp(org.apache.hadoop.conf.Configuration configuration, DistCpOptions inputOptions)
          Public Constructor.
 
Method Summary
protected  org.apache.hadoop.fs.Path createInputFileListing(org.apache.hadoop.mapreduce.Job job)
          Create input listing by invoking an appropriate copy listing implementation.
protected  org.apache.hadoop.mapreduce.Job createJob()
          Creates the Job object for submission, with all required configuration.
 org.apache.hadoop.mapreduce.Job execute()
          Implements the core execution.
protected  org.apache.hadoop.fs.Path getFileListingPath()
          Get default name of the copy listing file.
static void main(String[] argv)
          Main function of the DistCp program.
 int run(String[] argv)
          Implementation of Tool::run().
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

inputOptions

protected DistCpOptions inputOptions

metaFolder

protected org.apache.hadoop.fs.Path metaFolder

rand

public static final Random rand
Constructor Detail

DistCp

public DistCp(org.apache.hadoop.conf.Configuration configuration,
              DistCpOptions inputOptions)
       throws Exception
Public constructor. Creates a DistCp object with the specified input parameters (e.g. source paths and target location).

Parameters:
configuration - The Hadoop configuration against which the copy mapper must run.
inputOptions - Options (indicating source paths and target location).
Throws:
Exception - on failure.
Method Detail

run

public int run(String[] argv)
Implementation of Tool::run(). Orchestrates the copy of source file(s) to the target location by:
1. Creating a list of files to be copied to the target.
2. Launching a map-only job to copy the files (delegates to execute()).

Specified by:
run in interface org.apache.hadoop.util.Tool
Parameters:
argv - List of arguments passed to DistCp by the ToolRunner.
Returns:
0 on success; -1 otherwise.
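
The return contract above (0 on success, -1 otherwise) can be illustrated with a minimal, self-contained sketch. It does not use the Hadoop classes: `RunContractSketch` and the `copyStep` parameter are hypothetical stand-ins, with `copyStep` playing the role of execute(). The actual DistCp implementation may log or handle failures differently.

```java
import java.util.concurrent.Callable;

/** Stand-in sketch; none of these names come from Hadoop or DistCp. */
final class RunContractSketch {

    /** Maps the outcome of a copy step to the documented exit codes 0 and -1. */
    static int run(Callable<Object> copyStep) {
        try {
            copyStep.call();   // plays the role of execute()
            return 0;          // documented success code
        } catch (Exception e) {
            System.err.println("Copy failed: " + e.getMessage());
            return -1;         // documented failure code
        }
    }

    public static void main(String[] args) {
        // A step that completes normally, then one that throws.
        System.out.println(run(() -> "copied"));
        System.out.println(run(() -> {
            throw new java.io.IOException("target unreachable");
        }));
    }
}
```

Returning a status code instead of propagating the exception lets a command-line caller pass the result straight on as the process exit status.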

execute

public org.apache.hadoop.mapreduce.Job execute()
                                        throws Exception
Implements the core execution. Creates the file list for the copy and launches the Hadoop job that performs it.

Returns:
Job handle
Throws:
Exception - on failure.

createJob

protected org.apache.hadoop.mapreduce.Job createJob()
                                             throws IOException
Creates the Job object for submission, with all required configuration.

Returns:
Reference to the job object.
Throws:
IOException - on any I/O failure.

createInputFileListing

protected org.apache.hadoop.fs.Path createInputFileListing(org.apache.hadoop.mapreduce.Job job)
                                                    throws IOException
Creates the input listing by invoking an appropriate copy listing implementation. Also adds delegation tokens for each path to the job's credential store.

Parameters:
job - Handle to the job.
Returns:
The path where the copy listing is created.
Throws:
IOException - on any I/O failure.

getFileListingPath

protected org.apache.hadoop.fs.Path getFileListingPath()
                                                throws IOException
Gets the default name of the copy listing file. The copy listing file is created under the meta folder.

Returns:
Path where the copy listing file is to be saved.
Throws:
IOException - on any I/O failure.
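
As a rough illustration of placing the listing file under the meta folder, the sketch below resolves a file name against a folder path. It uses java.nio.file.Path rather than org.apache.hadoop.fs.Path so that it runs without Hadoop. The class name, the file name `fileList.seq`, and the folder location are assumptions made for this example only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Stand-in sketch using java.nio paths instead of org.apache.hadoop.fs.Path. */
final class ListingPathSketch {

    // Hypothetical file name, chosen only for this illustration.
    static final String LISTING_FILE_NAME = "fileList.seq";

    /** Places the copy listing file directly inside the given meta folder. */
    static Path listingPathIn(Path metaFolder) {
        return metaFolder.resolve(LISTING_FILE_NAME);
    }

    public static void main(String[] args) {
        Path metaFolder = Paths.get("staging", "_distcp_meta"); // hypothetical location
        System.out.println(listingPathIn(metaFolder));
    }
}
```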

main

public static void main(String[] argv)
Main function of the DistCp program. Parses the input arguments (via OptionsParser) and invokes the DistCp::run() method via the ToolRunner.

Parameters:
argv - Command-line arguments sent to DistCp.
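
To make the argument-parsing step more concrete, here is a simplified sketch of splitting positional arguments into source paths and a target location. It assumes the usual DistCp command-line convention that the last positional argument is the target; option switches, which OptionsParser also handles, are ignored. `ArgSplitSketch` and `ParsedArgs` are hypothetical names and are not the real OptionsParser or DistCpOptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in sketch of positional-argument handling; not the real OptionsParser. */
final class ArgSplitSketch {

    /** Hypothetical holder for parsed arguments; not the real DistCpOptions. */
    static final class ParsedArgs {
        final List<String> sources;
        final String target;

        ParsedArgs(List<String> sources, String target) {
            this.sources = sources;
            this.target = target;
        }
    }

    /** Treats the last argument as the target and everything before it as sources. */
    static ParsedArgs split(String[] argv) {
        if (argv.length < 2) {
            throw new IllegalArgumentException("need at least one source and one target");
        }
        List<String> sources =
                new ArrayList<>(Arrays.asList(argv).subList(0, argv.length - 1));
        return new ParsedArgs(sources, argv[argv.length - 1]);
    }

    public static void main(String[] args) {
        ParsedArgs parsed = split(new String[] {
            "hdfs://nn1/logs/a", "hdfs://nn1/logs/b", "hdfs://nn2/backup"
        });
        System.out.println(parsed.sources + " -> " + parsed.target);
    }
}
```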


Copyright © 2014 InMobi. All rights reserved.