StarCluster - Mailing List Archive

master node not in qstat

From: Adam Marsh <no email>
Date: Fri, 9 Mar 2012 16:46:33 -0500


I set up a cluster today (0.93.2) and noticed that the 'master' node was not being reported by "qstat -f" and was not accepting jobs from the queue. For example, with 12 nodes x 8 CPUs each (96 slots), when 96 jobs are submitted, only 88 run (nodes 1-11) while 8 remain waiting in the queue.

I tried restarting the cluster using the 'sge' plugin to manually ensure that master_is_exec_host was set to 'True', but the result was the same: 88 running, 8 waiting.
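
For anyone who wants to confirm the same symptom, here is a minimal Python sketch that lists which expected hosts are absent from "qstat -f" output. It assumes queue instances appear on lines of the form "queue@host" at the start of the line. The helper names (queue_hosts, missing_hosts), the host names, and the SAMPLE text are all made up for illustration; SAMPLE is not output captured from this cluster.

```python
import re


def queue_hosts(qstat_output):
    """Return the set of hosts that appear as queue instances (queue@host)."""
    hosts = set()
    for line in qstat_output.splitlines():
        # Assumption: a queue-instance line starts with "<queue>@<host>".
        match = re.match(r"^(\S+)@(\S+)\s", line)
        if match:
            hosts.add(match.group(2))
    return hosts


def missing_hosts(qstat_output, expected):
    """Return the expected hosts that have no queue instance, sorted."""
    return sorted(set(expected) - queue_hosts(qstat_output))


# Illustrative text only, shaped like "qstat -f" output (hypothetical values).
SAMPLE = """\
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@node001                  BIP   0/8/8          8.01     lx24-amd64
---------------------------------------------------------------------------------
all.q@node002                  BIP   0/8/8          7.98     lx24-amd64
"""

expected = ["master", "node001", "node002"]
print(missing_hosts(SAMPLE, expected))
```

On a live cluster the text would come from running "qstat -f" on the master, and host names may be fully qualified or truncated, so the expected list would need to match whatever form the scheduler prints.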

This brings up a request for the future. I would like to be able to run a cluster of 8-core servers with the MASTER as a non_exec node that has a different configuration (a simple 2-core m1.large), just to handle file and job monitoring tasks independently of the cluster activity. Anyway, I know you've put more work than I can imagine into configuring and maintaining this package, and I'm deeply appreciative of your skills and dedication. I don't want to seem ungrateful by requesting a feature that is more of a luxury than anything else, so just file it aside.



Adam G. Marsh, Ph.D.
Associate Professor
Marine Biological Sciences
University of Delaware
Received on Fri Mar 09 2012 - 16:46:53 EST

