org.apache.hadoop.mapred
Class TaskTracker

java.lang.Object
  extended by org.apache.hadoop.mapred.TaskTracker
All Implemented Interfaces:
Runnable

public class TaskTracker
extends Object
implements Runnable

TaskTracker is a process that starts and tracks MR Tasks in a networked environment. It contacts the JobTracker for Task assignments and reporting results.

Author:
Mike Cafarella

Nested Class Summary
static class TaskTracker.Child
          The main() for child processes.
static class TaskTracker.MapOutputServlet
          This class is used in TaskTracker's Jetty to serve the map outputs to other nodes.
 
Field Summary
static int FILE_NOT_FOUND
           
static long HEARTBEAT_INTERVAL
           
static org.apache.commons.logging.Log LOG
           
static int SUCCESS
           
static long TASKTRACKER_EXPIRY_INTERVAL
           
static long versionID
           
 
Constructor Summary
TaskTracker(JobConf conf)
          Start with the local machine name, and the default JobTracker
 
Method Summary
 void close()
          Close down the TaskTracker and all its components.
 void done(String taskid)
          The task is done.
 void fsError(String message)
          A child task had a local filesystem error.
 FileSystem getFileSystem()
          Return the DFS filesystem
 org.apache.hadoop.mapred.InterTrackerProtocol getJobClient()
          The connection to the JobTracker, used by the TaskRunner for locating remote files.
 long getProtocolVersion(String protocol, long clientVersion)
          Return protocol version corresponding to protocol interface.
 org.apache.hadoop.mapred.Task getTask(String taskid)
          Called upon startup by the child process, to fetch Task data.
 boolean isIdle()
          Is this task tracker idle?
static void main(String[] argv)
          Start the TaskTracker, point toward the indicated JobTracker
 void mapOutputLost(String taskid, String errorMsg)
          A completed map task's output has been lost.
 boolean ping(String taskid)
          Child checking to see if we're alive.
 void progress(String taskid, float progress, String state, org.apache.hadoop.mapred.Phase phase)
          Called periodically to report Task progress, from 0.0 to 1.0.
 void reportDiagnosticInfo(String taskid, String info)
          Called when the task dies before completion, and we want to report back diagnostic info
 void run()
          The server retry loop.
 void shutdown()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

HEARTBEAT_INTERVAL

public static final long HEARTBEAT_INTERVAL
See Also:
Constant Field Values

TASKTRACKER_EXPIRY_INTERVAL

public static final long TASKTRACKER_EXPIRY_INTERVAL
See Also:
Constant Field Values

SUCCESS

public static final int SUCCESS
See Also:
Constant Field Values

FILE_NOT_FOUND

public static final int FILE_NOT_FOUND
See Also:
Constant Field Values

versionID

public static final long versionID
See Also:
Constant Field Values
Constructor Detail

TaskTracker

public TaskTracker(JobConf conf)
            throws IOException
Start with the local machine name, and the default JobTracker

Throws:
IOException
Method Detail

getProtocolVersion

public long getProtocolVersion(String protocol,
                               long clientVersion)
Description copied from interface: VersionedProtocol
Return protocol version corresponding to protocol interface.

Parameters:
protocol - The classname of the protocol interface
clientVersion - The version of the protocol that the client speaks
Returns:
the version that the server will speak

shutdown

public void shutdown()
              throws IOException
Throws:
IOException

close

public void close()
           throws IOException
Close down the TaskTracker and all its components. We must also shutdown any running tasks or threads, and cleanup disk space. A new TaskTracker within the same process space might be restarted, so everything must be clean.

Throws:
IOException

getJobClient

public org.apache.hadoop.mapred.InterTrackerProtocol getJobClient()
The connection to the JobTracker, used by the TaskRunner for locating remote files.


getFileSystem

public FileSystem getFileSystem()
Return the DFS filesystem

Returns:

run

public void run()
The server retry loop. This while-loop attempts to connect to the JobTracker. It only loops when the old TaskTracker has gone bad (its state is stale somehow) and we need to reinitialize everything.

Specified by:
run in interface Runnable

getTask

public org.apache.hadoop.mapred.Task getTask(String taskid)
                                      throws IOException
Called upon startup by the child process, to fetch Task data.

Throws:
IOException

progress

public void progress(String taskid,
                     float progress,
                     String state,
                     org.apache.hadoop.mapred.Phase phase)
              throws IOException
Called periodically to report Task progress, from 0.0 to 1.0.

Parameters:
taskid - the id of the task
progress - value between zero and one
state - description of task's current state
phase - current phase of the task.
Throws:
IOException

reportDiagnosticInfo

public void reportDiagnosticInfo(String taskid,
                                 String info)
                          throws IOException
Called when the task dies before completion, and we want to report back diagnostic info

Parameters:
taskid - the id of the task involved
info - the text to report
Throws:
IOException

ping

public boolean ping(String taskid)
             throws IOException
Child checking to see if we're alive. Normally does nothing.

Returns:
True if the task is known
Throws:
IOException

done

public void done(String taskid)
          throws IOException
The task is done.

Throws:
IOException

fsError

public void fsError(String message)
             throws IOException
A child task had a local filesystem error. Exit, so that no future jobs are accepted.

Throws:
IOException

mapOutputLost

public void mapOutputLost(String taskid,
                          String errorMsg)
                   throws IOException
A completed map task's output has been lost.

Throws:
IOException

isIdle

public boolean isIdle()
Is this task tracker idle?

Returns:
has this task tracker finished and cleaned up all of its tasks?

main

public static void main(String[] argv)
                 throws Exception
Start the TaskTracker, point toward the indicated JobTracker

Throws:
Exception


Copyright © 2006 The Apache Software Foundation