StarCluster - Mailing List Archive

Re: SGE issue

From: Justin Riley <no email>
Date: Sun, 19 Dec 2010 01:09:51 -0500

On 12/6/10 7:30 PM, Boris Fain wrote:
> A great 12 node cluster computing away on Amazon. We did encounter an
> issue when using Ubuntu 64 bit 9.04 AMI's.
> The queue was not properly set up on all the nodes - the queue all.q
> only had the head node in it, and none of the compute nodes.
Sorry for the late response. I just tested a 2-node spot cluster with
the 64-bit 9.04 AMI (ami-a5c42dcc). I'm able to see both the master and
a worker node in all.q:

root@master:~# qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
domU-12-31-38-06-F8-C2  lx24-amd64      2  0.04    7.5G  373.1M     0.0     0.0
domU-12-31-38-06-F9-D2  lx24-amd64      2  0.05    7.5G  442.4M     0.0     0.0
root@master:~# qconf -sq all.q
qname                 all.q
hostlist              @allhosts
seq_no                0
load_thresholds       np_load_avg=1.75
suspend_thresholds    NONE
nsuspend              1
suspend_interval      00:05:00
priority              0
min_cpu_interval      00:05:00
processors            UNDEFINED
qtype                 BATCH INTERACTIVE
ckpt_list             NONE
pe_list               make orte
rerun                 FALSE
slots                 1,[domU-12-31-38-06-F9-D2.compute-1.internal=2], \
                      [domU-12-31-38-06-F8-C2.compute-1.internal=2]
...
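
For anyone who wants to repeat this comparison on a larger cluster, here is
a minimal sketch in Python. It is not part of StarCluster; the function
names (qhost_hosts, slot_hosts, missing_from_queue) and the sample strings
are made up for illustration. It assumes the text of "qhost" and
"qconf -sq all.q" has already been captured, and it treats the per-host
entries in the "slots" attribute as a rough proxy for queue membership.
Membership is really decided by the hostlist (@allhosts here), so inspecting
that host group, for example with "qconf -shgrp @allhosts", is the more
direct check.

```python
import re

# Sample output copied from the session above (abridged).
QHOST_SAMPLE = r"""
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
domU-12-31-38-06-F8-C2  lx24-amd64      2  0.04    7.5G  373.1M     0.0     0.0
domU-12-31-38-06-F9-D2  lx24-amd64      2  0.05    7.5G  442.4M     0.0     0.0
"""

QCONF_SAMPLE = r"""
qname                 all.q
hostlist              @allhosts
slots                 1,[domU-12-31-38-06-F9-D2.compute-1.internal=2], \
                      [domU-12-31-38-06-F8-C2.compute-1.internal=2]
"""


def qhost_hosts(qhost_output):
    """Return host names from qhost text, skipping header, rule and 'global'."""
    hosts = []
    for line in qhost_output.splitlines():
        fields = line.split()
        if not fields or fields[0] in ("HOSTNAME", "global"):
            continue
        if set(line.strip()) == {"-"}:  # the dashed rule under the header
            continue
        hosts.append(fields[0])
    return hosts


def slot_hosts(qconf_output):
    """Return {short_hostname: slots} from the per-host 'slots' entries."""
    # Join lines that qconf wrapped with a trailing backslash.
    joined = re.sub(r"\\\s*\n\s*", "", qconf_output)
    for line in joined.splitlines():
        if line.split()[:1] == ["slots"]:
            return {
                m.group(1).split(".")[0]: int(m.group(2))
                for m in re.finditer(r"\[([^=\]]+)=(\d+)\]", line)
            }
    return {}


def missing_from_queue(qhost_output, qconf_output):
    """List hosts that qhost reports but that have no per-host slots entry."""
    configured = slot_hosts(qconf_output)
    return [h for h in qhost_hosts(qhost_output)
            if h.split(".")[0] not in configured]


if __name__ == "__main__":
    print(missing_from_queue(QHOST_SAMPLE, QCONF_SAMPLE))
```

With the two-node output above, both hosts have a slots entry, so nothing is
reported as missing. A queue that only knew about the head node would show
the compute nodes in the returned list.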

I'm using the latest GitHub code here, so I need to install the stable
version and try it there before drawing any conclusions. Have you tried
multiple times with ami-a5c42dcc?
> We then changed the config back to the default 64 bit image and there
> SGE is set up perfectly. We're actually totally fine with the
> configuration now, but just a heads up. (We wanted to use 9.04 because
> it had older gcc's available and there was no need to
> add a package).
Just to clarify, which AMI id worked for you?

~Justin
Received on Sun Dec 19 2010 - 01:09:37 EST