Re: cluster with 30 nodes, 2 slots each, 50 job max?
Hi John,
You can look through your debug log to see whether there were errors
adding nodes to SGE: $HOME/.starcluster/logs/debug.log. Also, when this
happens, check whether the nodes are listed in the queue configuration
for all.q:
$ starcluster sshmaster mycluster
$ qconf -sq all.q
If you can still get this info I'd like to take a look to see what
happened.
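
For what it's worth, the numbers are consistent with your diagnosis: if 5 of the 30 nodes never made it into SGE, that leaves 25 nodes x 2 slots = 50 runnable jobs. A rough sketch for spotting which hosts are missing, assuming you keep the expected node aliases one per line in a file (expected_nodes.txt is a hypothetical name) and dump SGE's execution host list with `qconf -sel` on the master. The helper name missing_hosts is made up for illustration and is not part of StarCluster or SGE:

```shell
# Sketch only: print hosts that appear in the expected list but not in
# the list of hosts SGE knows about.
#   $1: file with the expected hostnames, one per line (hypothetical file)
#   $2: file with the hosts known to SGE, one per line
missing_hosts() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file:
    # keep only the lines of $1 that match no line of $2 exactly.
    grep -vxF -f "$2" "$1"
}

# Possible usage on the master node (assumes `qconf -sel` prints one
# execution host per line under the same names used in expected_nodes.txt):
#   qconf -sel > sge_hosts.txt
#   missing_hosts expected_nodes.txt sge_hosts.txt
```

Any host this prints would be one that StarCluster believes is in the cluster but SGE has no execution host entry for. A host can also be a known execution host and still be absent from all.q, so the `qconf -sq all.q` output is still worth checking.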
~Justin
On 10/18/2012 06:55 PM, John St. John wrote:
> The issue appears to be that the nodes were added with
> `starcluster loadbalance` and for some reason SGE was not
> configured properly for one round of 5 nodes that were added. They
> show up as being part of the cluster in starcluster, but SGE does
> not recognize them. Wonder what went wrong?
>
> On Oct 18, 2012, at 3:25 PM, John St. John
> <johnthesaintjohn_at_gmail.com
> <mailto:johnthesaintjohn_at_gmail.com>> wrote:
>
>> Hello, I am running a cluster with 30 nodes that have 2 slots
>> each. That should give me up to 60 single-slot jobs that can run
>> at a time. For some reason, though, once 50 jobs are running, the
>> system just queues any further jobs. I have tried kicking off
>> simple sleep jobs (qsub -V -b y -cwd sleep 10) to see if those can
>> run past the 50-job wall I am hitting, but no luck. I do not see
>> anything odd about my settings. Is this something hard-coded
>> somewhere that I can change? Has anyone been able to run more than
>> 50 jobs at a time with starcluster?
>>
>> Thanks! John
>
>
>
> _______________________________________________ StarCluster mailing
> list StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Mon Oct 22 2012 - 10:50:35 EDT