org.apache.hadoop.tools
Class FileBasedCopyListing

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.tools.CopyListing
          extended by org.apache.hadoop.tools.FileBasedCopyListing
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable

public class FileBasedCopyListing
extends CopyListing

FileBasedCopyListing extends the CopyListing class to create the copy-listing for DistCp by iterating over all source paths listed in a specified input file.


Constructor Summary
FileBasedCopyListing(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.security.Credentials credentials)
          Constructor, to initialize base-class.
 
Method Summary
 void doBuildListing(org.apache.hadoop.fs.Path pathToListFile, DistCpOptions options)
          Implementation of CopyListing::doBuildListing().
protected static List<org.apache.hadoop.fs.Path> fetchFileList(org.apache.hadoop.fs.Path sourceListing, org.apache.hadoop.conf.Configuration conf)
           
protected  long getBytesToCopy()
          Return the total number of bytes that DistCp should copy for the source paths. This does not consider whether a file is unchanged and should be skipped during the copy.
protected  long getNumberOfPaths()
          Return the total number of paths to copy with DistCp, including directories. This does not consider whether a file or directory is already present and should be skipped during the copy.
protected  void validatePaths(DistCpOptions options)
          Validate input and output paths
 
Methods inherited from class org.apache.hadoop.tools.CopyListing
buildListing, checkForDuplicates, getCopyListing, getCredentials, setCredentials
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileBasedCopyListing

public FileBasedCopyListing(org.apache.hadoop.conf.Configuration configuration,
                            org.apache.hadoop.security.Credentials credentials)
Constructor, to initialize base-class.

Parameters:
configuration - The input Configuration object.
credentials - Credentials object on which the FS delegation tokens are cached. If null, delegation token caching is skipped.
Method Detail

validatePaths

protected void validatePaths(DistCpOptions options)
                      throws IOException,
                             org.apache.hadoop.tools.CopyListing.InvalidInputException
Validate input and output paths

Specified by:
validatePaths in class CopyListing
Parameters:
options - Input options
Throws:
IOException
org.apache.hadoop.tools.CopyListing.InvalidInputException

doBuildListing

public void doBuildListing(org.apache.hadoop.fs.Path pathToListFile,
                           DistCpOptions options)
                    throws IOException
Implementation of CopyListing::doBuildListing(). Iterates over all source paths listed in the input file.

Specified by:
doBuildListing in class CopyListing
Parameters:
pathToListFile - Path on HDFS where the listing file is written.
options - Input options for DistCp (indicating source/target paths).
Throws:
IOException

fetchFileList

protected static List<org.apache.hadoop.fs.Path> fetchFileList(org.apache.hadoop.fs.Path sourceListing,
                                                               org.apache.hadoop.conf.Configuration conf)
                                                        throws IOException
Throws:
IOException
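
This page does not describe how fetchFileList reads the source listing. As a rough illustration only, the sketch below parses listing contents under the assumption that the file holds one source path per line. It uses plain Java strings rather than Hadoop's Path and FileSystem so that it runs without Hadoop on the classpath. The class name ListingFileSketch, the method parseListing, and the choice to skip blank lines are all hypothetical and are not part of this API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the FileBasedCopyListing implementation.
class ListingFileSketch {

    /**
     * Splits the contents of a listing file into source paths.
     * Assumes one path per line; blank lines are skipped, which is
     * a choice made for this sketch and not documented behavior.
     */
    static List<String> parseListing(String contents) {
        List<String> paths = new ArrayList<>();
        // "\\R" matches any line break (LF, CRLF, and others).
        for (String line : contents.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                paths.add(trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Example paths are made up for the demonstration.
        String contents = "hdfs://nn1/data/a\nhdfs://nn1/data/b\n";
        for (String path : parseListing(contents)) {
            System.out.println(path);
        }
    }
}
```

In a real DistCp setting each returned string would presumably be wrapped in an org.apache.hadoop.fs.Path, matching the List<Path> return type shown in the signature above.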

getBytesToCopy

protected long getBytesToCopy()
Return the total number of bytes that DistCp should copy for the source paths. This does not consider whether a file is unchanged and should be skipped during the copy.

Specified by:
getBytesToCopy in class CopyListing
Returns:
total bytes to copy

getNumberOfPaths

protected long getNumberOfPaths()
Return the total number of paths to copy with DistCp, including directories. This does not consider whether a file or directory is already present and should be skipped during the copy.

Specified by:
getNumberOfPaths in class CopyListing
Returns:
Total number of paths to distcp
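
The two totals described above, getBytesToCopy and getNumberOfPaths, can be pictured with a small stand-alone sketch. It assumes a listing is a flat list of entries, each either a file with a length or a directory, and that no skip logic is applied, as the descriptions state. The Entry type and the method names bytesToCopy and numberOfPaths are hypothetical stand-ins, not the fields or methods of this class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the two totals; not the actual implementation.
class ListingTotalsSketch {

    /** Stand-in for one entry of a copy listing. */
    static final class Entry {
        final String path;
        final boolean directory;
        final long length; // size in bytes; ignored for directories

        Entry(String path, boolean directory, long length) {
            this.path = path;
            this.directory = directory;
            this.length = length;
        }
    }

    /** Sums file lengths; no check for files that a later copy would skip. */
    static long bytesToCopy(List<Entry> entries) {
        long total = 0;
        for (Entry e : entries) {
            if (!e.directory) {
                total += e.length;
            }
        }
        return total;
    }

    /** Counts every entry, directories included. */
    static long numberOfPaths(List<Entry> entries) {
        return entries.size();
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("/src/dir", true, 0L),
                new Entry("/src/dir/a", false, 100L),
                new Entry("/src/dir/b", false, 250L));
        System.out.println("bytes=" + bytesToCopy(entries)
                + " paths=" + numberOfPaths(entries));
    }
}
```

Because neither total accounts for entries that are skipped later, both are best read as upper bounds on the work a copy actually performs.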


Copyright © 2014 InMobi. All rights reserved.