StarCluster - Mailing List Archive

Re: Issue creating a cluster of 30 nodes with starcluster

From: Sumita Sinha <no email>
Date: Wed, 9 Nov 2011 05:30:23 +0530

Thanks for your response.

I tried creating the cluster again with 30 nodes, and this time it
completed successfully in 14 minutes. I am not using spot instances.

When a cluster create request is sent, I see these messages on the terminal:
>>> Waiting for all nodes to be in a 'running' state...
>>> Waiting for SSH to come up on all nodes...
>>> Setting up the cluster...
>>> Configuring hostnames...
>>> Creating cluster user: sgeadmin (uid: 1001, gid: 1001)

So once any node is up and running in EC2, does StarCluster wait for all
the nodes to be up and then configure them all at one time?
Is there any parameter in the config file, or any option to the starcluster
start command, that makes configuring the cluster (installing SGE,
configuring NFS) a parallel operation? That is, a node should not have to
wait for the other nodes to be up before getting configured, so that if we
post a job to a ready node, it starts executing with however many nodes are
running and configured.

If the above is not possible, is there a specific reason why StarCluster
configures the nodes only once all of them are running?
And if anything bad happens at the EC2 level and some of the nodes take a
long time to start, is there any "fault tolerant technique" or "time out"?
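
To make the "time out" idea concrete, here is a rough sketch in Python of
the behaviour I am asking about. It is not StarCluster code and does not use
any StarCluster or EC2 API: wait_for_running and get_states are hypothetical
names, and get_states is assumed to be some callable that returns a mapping
of node alias to its current state ('pending', 'running', ...).

```python
import time
from typing import Callable, Dict, List, Tuple


def wait_for_running(
    get_states: Callable[[], Dict[str, str]],  # hypothetical state lookup
    timeout: float = 900.0,                    # give up after 15 minutes
    poll_interval: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Poll node states until all are 'running' or the timeout expires.

    Returns (running, stuck): the nodes that reached 'running' and the
    nodes that had not done so by the deadline.
    """
    deadline = clock() + timeout
    while True:
        states = get_states()
        running = sorted(n for n, s in states.items() if s == "running")
        stuck = sorted(n for n, s in states.items() if s != "running")
        # Stop as soon as everything is up, or once the deadline has passed.
        if not stuck or clock() >= deadline:
            return running, stuck
        sleep(poll_interval)
```

With something like this, the nodes in the second list are the ones that
could be reported or given up on instead of blocking the whole cluster.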

Regards
Sumita

On Tue, Nov 8, 2011 at 7:42 PM, Justin Riley <jtriley_at_mit.edu> wrote:

>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Sumita,
>
> Were you using spot instances? If not, I believe there's a default limit of
> 20 instances for flat-rate instances which *could* be related to
> your issue. With spot instances you can create up to 100 instances by
> default. So, if you need more than 20 nodes and do not wish to submit a
> request to Amazon to increase your flat-rate instance limit, you should be
> using spot instances:
>
> $ starcluster start -s 30 -b 0.50 mycluster
>
> With that said, StarCluster has no limit to the number of nodes you can
> create, however, as you've seen, sometimes EC2 instances can take longer to
> become 'running' than usual. Unfortunately this is purely an EC2 back-end
> issue that cannot be resolved directly by StarCluster. In my experience 22
> minutes *is* quite a while to wait for any instance to come up, however, I
> have had instances take up to 15 min in the past, so this is not a
> total surprise to me.
>
> In the future if you run into this problem of waiting for an instance to
> change from 'pending' to 'running' for too long (e.g. 15min+) I would
> recommend simply terminating the faulty instance from the AWS console and
> then restarting the cluster using:
>
> $ starcluster restart mycluster
>
> This should reboot all the currently running instances and begin
> configuring the cluster, which avoids having to terminate the entire
> cluster and lose instance hours.
>
> HTH,
>
> ~Justin
>
>
> On 11/8/11 6:39 AM, Sumita Sinha wrote:
> > Hello,
> >
> > Currently working with starcluster on EC2.
> >
> > Tried creating a cluster with 30 nodes of type m1.small using AMI
> > ami-8cf913e5.
> > Cluster creation never completed, as I found that one node,
> > node025, was showing pending status.
> > I waited for almost 22 minutes, then terminated the cluster.
> > The cluster was terminated properly. Is there any limit to the number
> > of nodes that can be created?
> >
> >
> >
> >
> > --
> > Regards
> > Sumita Sinha
> >
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk65OL4ACgkQ4llAkMfDcrm9MACghU/Ey4v653fsD8XmpbQKONNp
> vdkAniIfFExWjqGAOWRolMrtePHfl4AL
> =Q8NI
> -----END PGP SIGNATURE-----
>
>


-- 
Regards
Sumita Sinha
Received on Tue Nov 08 2011 - 19:00:25 EST
This archive was generated by hypermail 2.3.0.