StarCluster - Mailing List Archive

New Grid Engine Hadoop Integration HOWTO

From: Rayson Ho <no email>
Date: Fri, 1 Jun 2012 14:52:04 -0400

If you are running Hadoop on StarCluster, you may also be interested
in this new method contributed by Prakashan Korambath of UCLA.

http://gridscheduler.sourceforge.net/howto/GridEngineHadoop.html

The difference between the original SGE 6.2u5 method and the new one is
that with Prakashan's approach, Grid Engine is used for resource
allocation, and the Hadoop job scheduler (JobTracker) handles
all the MapReduce operations. Prakashan's approach creates a Hadoop
cluster on demand, whereas in the original SGE 6.2u5 method Grid
Engine replaces the Hadoop job scheduler.

Because this new approach uses standard Grid Engine PEs (parallel
environments), one can call "qrsh -inherit" to start Hadoop services
on remote nodes through Grid Engine, and thus get full job control,
job accounting, and cleanup at termination, like any other tightly
integrated PE job!
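
To illustrate the "qrsh -inherit" idea, here is a rough sketch (not
taken from the HOWTO) of how a PE job script might turn the host list
Grid Engine allocated into one "qrsh -inherit" command per node. It
assumes the usual $PE_HOSTFILE layout of "hostname slots queue
processor-range" per line. The function names and the Hadoop 1.x
"hadoop-daemon.sh start tasktracker" command are illustrative
assumptions, and the sketch only builds the commands without running
them.

```python
import shlex


def parse_pe_hostfile(text):
    """Return (hostname, slots) pairs from the contents of $PE_HOSTFILE.

    Assumes each line looks like: "hostname slots queue processor-range".
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


def build_qrsh_commands(hosts, remote_command):
    """Build one "qrsh -inherit <host> <command...>" argv list per host.

    remote_command is a hypothetical service start command; adjust it to
    whatever your Hadoop installation actually provides.
    """
    remote_argv = shlex.split(remote_command)
    return [["qrsh", "-inherit", host] + remote_argv for host, _slots in hosts]
```

Inside a real PE job, one would read the file named by the PE_HOSTFILE
environment variable and pass each command list to something like
subprocess.Popen, so that the remote daemons run under Grid Engine's
control and accounting.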

Rayson

================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
Received on Fri Jun 01 2012 - 14:52:05 EDT