StarCluster - Mailing List Archive

Re: starcluster starts but not all nodes added as exec nodes

From: Joseph <Kyeong>
Date: Tue, 15 Mar 2011 22:30:27 +0000

Hi Jeff,

I experienced the same thing with my 50-node configuration (c1.xlarge).
Out of 50 nodes, only 29 were successfully registered with SGE.


On Sat, Mar 5, 2011 at 10:15 PM, Jeff White wrote:
> I can frequently reproduce an issue where 'starcluster start' completes
> without error, but not all nodes are added to the SGE pool, which I verify
> by running 'qconf -sel' on the master. The latest example I have is creating
> a 25-node cluster, where only the first 12 nodes are successfully installed.
> The remaining instances are running and I can ssh to them but they aren't
> running sge_execd. There are only install log files for the first 12 nodes
> in /opt/sge6/default/common/install_logs. I have not found any clues in the
> starcluster debug log or the logs inside master:/opt/sge6/.
> I am running starcluster development snapshot 8ef48a3 downloaded on
> 2011-02-15, with the following relevant settings:
> NODE_IMAGE_ID=ami-8cf913e5
> I have seen this behavior with the latest 32-bit and 64-bit starcluster
> AMIs. Our workaround is to start a small cluster and progressively add nodes
> one at a time, which is time-consuming.
> Has anyone else noticed this, and does anyone have a better workaround or an
> idea for a fix?
> jeff
> _______________________________________________
> StarCluster mailing list
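
The 'qconf -sel' check described above can be scripted so that the missing
nodes are listed by name instead of being counted by eye. The following is a
minimal sketch, not part of StarCluster. It assumes the hosts are named
'master', 'node001', 'node002', and so on, that the master is itself an
execution host, and that 'qconf' is on the PATH of the master. The function
names are hypothetical.

```python
import subprocess


def expected_node_names(cluster_size):
    """Build the expected host list for a cluster of the given size.

    Assumption: hosts are named 'master', 'node001', 'node002', ...
    and the master also acts as an execution host.
    """
    return ["master"] + ["node%03d" % i for i in range(1, cluster_size)]


def missing_exec_nodes(expected, qconf_sel_output):
    """Return the expected hosts that are absent from 'qconf -sel' output."""
    registered = set()
    for line in qconf_sel_output.splitlines():
        host = line.strip()
        if host:
            # Compare on the short name in case SGE reports a full domain name.
            registered.add(host.split(".")[0])
    return [name for name in expected if name not in registered]


def report_missing(cluster_size):
    """Run 'qconf -sel' on the master and print each unregistered host."""
    result = subprocess.run(
        ["qconf", "-sel"], capture_output=True, text=True, check=True
    )
    for name in missing_exec_nodes(expected_node_names(cluster_size), result.stdout):
        print(name)
```

Calling report_missing(25) on the master of a 25-node cluster would print one
line per host that has no execution daemon registered, which gives a list of
the nodes that still need to be installed or re-added.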
Received on Tue Mar 15 2011 - 18:30:28 EDT