StarCluster - Mailing List Archive

[Starcluster] Load Balancer Problems

From: Amaro Taylor <no email>
Date: Fri, 30 Jul 2010 13:40:18 -0700

Hey,

So I was testing out the Load Balancer today and it doesn't appear to be
working. Here is the output I was getting and the output from the job on
StarCluster.

ssh.py:248 - ERROR - command source /etc/profile && qacct -j -b 201007301725 failed with status 1
>>> Oldest job is from None. # queued jobs = 0. # hosts = 2.
>>> Avg job duration = 0 sec, Avg wait time = 0 sec.
>>> Cluster change was made less than 180 seconds ago (2010-07-30 20:24:13.398974).
>>> Not changing cluster size until cluster stabilizes.
>>> Sleeping, looping again in 60 seconds.


It says 0 queued jobs, but that's not accurate.
This is what qstat says on the master node:

#########################################################################
      1 0.55500 Bone_Estim sgeadmin qw 07/30/2010 20:26:20 1 7-1000:1
sgeadmin@domU-12-31-39-01-5D-67:~/jacobian-parallel/test/bone$ qstat -q all.q -f -u "*"
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5C-97.c BIP 0/1/1 0.52 lx24-x86
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:29:03 1 6
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5D-67.c BIP 0/1/1 1.22 lx24-x86
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:28:33 1 5

############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
      1 0.55500 Bone_Estim sgeadmin qw 07/30/2010 20:26:20 1 7-1000:1
sgeadmin@domU-12-31-39-01-5D-67:~/jacobian-parallel/test/bone$ qstat -q all.q -f -u "*"
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5C-97.c BIP 0/1/1 0.63 lx24-x86
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:31:03 1 8
---------------------------------------------------------------------------------
all.q@domU-12-31-39-01-5D-67.c BIP 0/1/1 1.38 lx24-x86
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:28:33 1 5
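
For anyone cross-checking the balancer's "# queued jobs = 0" against qstat, here is a minimal Python sketch that counts job rows in the qw state and expands the array task range. This is not StarCluster's own code; the function name count_pending and the SAMPLE text are assumptions for illustration, and the rows are taken from the qstat output in this message with wrapped lines rejoined.

```python
import re

# Rows taken from the qstat output in this message (wrapped lines rejoined).
SAMPLE = """\
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:29:03 1 6
      1 0.55500 Bone_Estim sgeadmin r 07/30/2010 20:28:33 1 5
      1 0.55500 Bone_Estim sgeadmin qw 07/30/2010 20:26:20 1 7-1000:1
"""

# Array task range as printed in the last column, e.g. "7-1000:1".
TASK_RANGE = re.compile(r"^(\d+)-(\d+):(\d+)$")


def count_pending(qstat_text):
    """Return (pending_jobs, pending_tasks) for rows in the 'qw' state.

    Hypothetical helper: assumes whitespace-separated rows laid out as
    job-ID, prior, name, user, state, date, time, slots, [task-ID].
    """
    jobs = 0
    tasks = 0
    for line in qstat_text.splitlines():
        fields = line.split()
        # Skip headers, separators, and anything that is not a pending job row.
        if len(fields) < 8 or not fields[0].isdigit() or fields[4] != "qw":
            continue
        jobs += 1
        match = TASK_RANGE.match(fields[-1]) if len(fields) > 8 else None
        if match:
            first, last, step = map(int, match.groups())
            tasks += (last - first) // step + 1
        else:
            # No task column, or a task list this sketch does not expand.
            tasks += 1
    return jobs, tasks


print(count_pending(SAMPLE))  # (1, 994)
```

On the sample rows this reports one pending job with 994 pending tasks (7 through 1000), which is what I would expect the balancer to count as queued.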

Any suggestions?



Best,
Amaro Taylor
RES Group, Inc.
1 Broadway • Cambridge, MA 02142 • U.S.A.
Tel: 310 880-1906 (Direct) • Fax: 617-812-8042 • Email:
amaro.taylor_at_resgroupinc.com

Disclaimer: The information contained in this email message may be
confidential. Please be careful if you forward, copy or print this message.
If you have received this email in error, please immediately notify the
sender and delete the message.
Received on Fri Jul 30 2010 - 16:40:40 EDT