Re: starcluster starts but not all nodes added as exec nodes
I just requested an increase to my EC2 instance limit so that I can test
at this scale and see what the issue is. In the meantime, would you mind
sending me any logs found in /opt/sge6/default/common/install_logs, along
with /opt/sge6/ec2_sge.conf, from a failed run?
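If it helps, the two items can be bundled into a single tarball on the
master before sending. This is only a rough sketch: the function name
collect_sge_debug and the output filename are made up for illustration,
and the paths are the ones mentioned above.

```shell
# Hypothetical helper (not part of StarCluster or SGE): bundle the SGE
# install logs and the generated config into one gzip-compressed tarball.
#   $1: path of the tarball to create
#   $2: SGE root directory (defaults to /opt/sge6, as used in this thread)
collect_sge_debug() {
    out=$1
    root=${2:-/opt/sge6}
    # -C switches into the SGE root so the archive holds relative paths
    tar -czf "$out" -C "$root" default/common/install_logs ec2_sge.conf
}

# Example, run on the master after a failed start:
#   collect_sge_debug /tmp/sge_debug.tar.gz
```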
Also, if this happens again, you could try reinstalling SGE manually,
assuming all the nodes are up:
$ starcluster sshmaster mycluster
$ cd /opt/sge6
$ ./inst_sge -m -x -auto ./ec2_sge.conf
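To check which nodes are missing before and after the reinstall, one
option is to compare the hosts you expect against the output of
'qconf -sel'. A minimal sketch, assuming the nodes carry aliases like
node001..node024 and that 'qconf -sel' prints those same aliases (both
are assumptions here); missing_exec_hosts is a made-up helper name:

```shell
# Hypothetical helper: print every expected host that is absent from the
# list of registered exec hosts.
#   $1: file with one expected host alias per line
#   $2: file with one registered exec host per line
#       (for example, the saved output of `qconf -sel`)
missing_exec_hosts() {
    # -F: fixed strings, -x: match whole lines, -v: keep non-matching
    # lines, -f: read the patterns (registered hosts) from a file
    grep -vxF -f "$2" "$1"
}

# Example, run on the master of a 25-node cluster (aliases are assumed):
#   seq -f 'node%03g' 1 24 > expected.txt
#   qconf -sel > registered.txt
#   missing_exec_hosts expected.txt registered.txt
```

If the function prints nothing, every expected host is registered.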
On 03/15/2011 06:30 PM, Kyeong Soo (Joseph) Kim wrote:
> Hi Jeff,
> I experienced the same thing with my 50-node configuration (c1.xlarge).
> Out of 50 nodes, only 29 were successfully identified by SGE.
> On Sat, Mar 5, 2011 at 10:15 PM, Jeff White <jeff_at_decide.com> wrote:
>> I can frequently reproduce an issue where 'starcluster start' completes
>> without error, but not all nodes are added to the SGE pool, which I verify
>> by running 'qconf -sel' on the master. The latest example I have is creating
>> a 25-node cluster, where only the first 12 nodes are successfully installed.
>> The remaining instances are running and I can ssh to them but they aren't
>> running sge_execd. There are only install log files for the first 12 nodes
>> in /opt/sge6/default/common/install_logs. I have not found any clues in the
>> starcluster debug log or the logs inside master:/opt/sge6/.
>> I am running starcluster development snapshot 8ef48a3 downloaded on
>> 2011-02-15, with the following relevant settings:
>> NODE_INSTANCE_TYPE = m1.small
>> I have seen this behavior with the latest 32-bit and 64-bit starcluster
>> AMIs. Our workaround is to start a small cluster and progressively add nodes
>> one at a time, which is time-consuming.
>> Has anyone else noticed this and have a better workaround or an idea for a
Received on Wed Mar 16 2011 - 11:58:16 EDT