public interface InputFormat
An input data format. Input files are stored in a FileSystem.
The processing of an input file may be split across multiple machines.
Files are processed as sequences of records, implementing RecordReader. Files must thus be split on record boundaries.
Method Summary | |
---|---|
boolean[] | areValidInputDirectories(FileSystem fileSys, Path[] inputDirs) Are the input directories valid? This method is used to test the input directories when a job is submitted so that the framework can fail early with a useful error message when the input directory does not exist. |
RecordReader | getRecordReader(FileSystem fs, FileSplit split, JobConf job, Reporter reporter) Construct a RecordReader for a FileSplit. |
FileSplit[] | getSplits(FileSystem fs, JobConf job, int numSplits) Splits a set of input files. |
Method Detail |
---|
boolean[] areValidInputDirectories(FileSystem fileSys, Path[] inputDirs) throws IOException
Parameters:
fileSys - the file system to check for the directories
inputDirs - the list of input directories
Throws:
IOException
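
As an illustration of the early-failure check described for areValidInputDirectories, here is a minimal sketch. It does not use the Hadoop classes: `java.nio.file.Path` stands in for the FileSystem and Path pair, and the class name `InputDirCheck` and the helper `failEarly` are hypothetical names introduced only for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in: java.nio.file.Path replaces Hadoop's FileSystem/Path pair.
final class InputDirCheck {
    // Returns one flag per input directory, in the same order as the argument.
    // (The interface method also declares IOException; this local check never throws it.)
    static boolean[] areValidInputDirectories(Path[] inputDirs) {
        boolean[] valid = new boolean[inputDirs.length];
        for (int i = 0; i < inputDirs.length; i++) {
            valid[i] = Files.isDirectory(inputDirs[i]);
        }
        return valid;
    }

    // Assumed caller-side use: reject the job before any work is scheduled.
    static void failEarly(Path[] inputDirs) throws IOException {
        boolean[] valid = areValidInputDirectories(inputDirs);
        for (int i = 0; i < valid.length; i++) {
            if (!valid[i]) {
                throw new IOException("Input directory " + inputDirs[i] + " is invalid.");
            }
        }
    }
}
```

Returning one flag per directory, instead of a single boolean, lets the caller name the offending directory in the error message.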
FileSplit[] getSplits(FileSystem fs, JobConf job, int numSplits) throws IOException
Parameters:
fs - the filesystem containing the files to be split
job - the job whose input files are to be split
numSplits - the desired number of splits
Throws:
IOException
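
One plausible way to plan splits is to cut each file into byte ranges of roughly equal size. The sketch below assumes that approach; it is not taken from any particular implementation of this interface. `ByteRange` is a hypothetical stand-in for FileSplit, file sizes are passed in directly instead of being read from a FileSystem, and `numSplits` is treated as a hint, in keeping with its description as the desired number of splits.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for FileSplit: a byte range within one file.
final class ByteRange {
    final String file;
    final long start;
    final long length;

    ByteRange(String file, long start, long length) {
        this.file = file;
        this.start = start;
        this.length = length;
    }
}

final class SplitPlanner {
    // Cuts each file into chunks of about totalBytes / numSplits bytes.
    // The result may hold more or fewer ranges than numSplits.
    static List<ByteRange> getSplits(String[] files, long[] sizes, int numSplits) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        long target = Math.max(1, total / Math.max(1, numSplits));
        List<ByteRange> ranges = new ArrayList<>();
        for (int i = 0; i < files.length; i++) {
            for (long offset = 0; offset < sizes[i]; offset += target) {
                long length = Math.min(target, sizes[i] - offset);
                ranges.add(new ByteRange(files[i], offset, length));
            }
        }
        return ranges;
    }
}
```

These ranges are cut at arbitrary byte offsets, so whatever reads them would still have to realign to the nearest record boundary.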
RecordReader getRecordReader(FileSystem fs, FileSplit split, JobConf job, Reporter reporter) throws IOException
Construct a RecordReader for a FileSplit.
Parameters:
fs - the FileSystem
split - the FileSplit
job - the job that this split belongs to
Returns:
a RecordReader
Throws:
IOException
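
The requirement that files be split on record boundaries can be met on the reading side. The sketch below assumes newline-terminated records and the convention that a record belongs to the split in which it starts; both are assumptions made for illustration, not something this page specifies. `LineRangeReader` is a hypothetical name, and an in-memory byte array stands in for a file in a FileSystem.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reads newline-terminated records that start inside
// the byte range [start, start + length) of an in-memory file.
final class LineRangeReader {
    static List<String> readRecords(byte[] data, long start, long length) {
        int pos = (int) start;
        int end = (int) Math.min(data.length, start + length);

        // If the range begins in the middle of a record, that record belongs
        // to the previous range: advance to just past the next newline.
        if (pos > 0) {
            while (pos < data.length && data[pos - 1] != '\n') {
                pos++;
            }
        }

        List<String> records = new ArrayList<>();
        // A record that starts before 'end' is read in full, even if its
        // bytes run past the end of the range.
        while (pos < end) {
            int lineEnd = pos;
            while (lineEnd < data.length && data[lineEnd] != '\n') {
                lineEnd++;
            }
            records.add(new String(data, pos, lineEnd - pos, StandardCharsets.UTF_8));
            pos = lineEnd + 1;
        }
        return records;
    }
}
```

With this convention, adjacent ranges can be read on different machines and every record is still returned exactly once.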