StarCluster - Mailing List Archive

Re: cluster with 30 nodes, 2 slots each, 50 job max?

From: Justin Riley <no email>
Date: Mon, 22 Oct 2012 10:50:31 -0400

Hi John,

You can look through your debug log to see if there were errors adding
nodes to SGE: $HOME/.starcluster/logs/debug.log. Also, when this
happens, check whether the nodes are listed in the qconf output for
all.q:

$ starcluster sshmaster mycluster
$ qconf -sq all.q

If you can still get this info I'd like to take a look to see what
happened.

~Justin
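
The numbers in the thread are consistent with this diagnosis: 30 nodes x 2
slots is 60 slots, and 5 nodes missing from SGE leaves 25 x 2 = 50. As a
rough sketch of the log check suggested above, the shell helper below prints
debug-log lines that look like errors. The function name `show_log_errors`
and the search pattern are assumptions for illustration, not part of
StarCluster. The commented-out commands are standard SGE tools that were not
mentioned in the thread and only work on the master node.

```shell
# Hypothetical helper (not part of StarCluster): print numbered lines from
# the debug log that mention an error, traceback, or exception.
# The pattern is a guess at what a failure looks like, so adjust as needed.
show_log_errors() {
    log_file="${1:-$HOME/.starcluster/logs/debug.log}"
    grep -n -i -E 'error|traceback|exception' "$log_file"
}

# On the master node (after `starcluster sshmaster mycluster`), these
# standard SGE commands may help cross-check which hosts SGE knows about:
#   qconf -sel      # list execution hosts registered with SGE
#   qhost           # show each host SGE knows about and its load
#   qstat -g c      # per-queue summary, including total slots
```

Calling `show_log_errors` with no argument reads the default log path; pass
a different path as the first argument to inspect a copy of the log.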

On 10/18/2012 06:55 PM, John St. John wrote:
> The issue appears to be that the nodes were added with
> `starcluster loadbalance`, and for some reason SGE was not
> configured properly for one round of 5 nodes that were added. They
> show up as part of the cluster in starcluster, but SGE does
> not recognize them. I wonder what went wrong?
>
> On Oct 18, 2012, at 3:25 PM, John St. John
> <johnthesaintjohn_at_gmail.com> wrote:
>
>> Hello, I am running a cluster with 30 nodes that have 2 slots
>> each. That should let me run up to 60 single-slot jobs at a
>> time. For some reason, though, once 50 jobs are running, the
>> system just queues further jobs. I have tried kicking off simple
>> sleep jobs to see if those can get past the 50-job wall I am
>> hitting, but no luck (qsub -V -b y -cwd sleep 10). I do not see
>> anything odd about my settings. Is this limit hard-coded somewhere
>> that I can change? Has anyone been able to run more than 50 jobs
>> at a time with starcluster?
>>
>> Thanks! John

Received on Mon Oct 22 2012 - 10:50:35 EDT