StarCluster - Mailing List Archive

Re: Issue creating a cluster of 30 nodes with starcluster

From: Sumita Sinha <no email>
Date: Wed, 9 Nov 2011 11:03:12 +0530

Hi,

Awaiting your response to the query below.

On Wed, Nov 9, 2011 at 5:30 AM, Sumita Sinha <sumita.sinha_at_claricetechnologies.com> wrote:

> Thanks for the quick response.
>
>
> I tried with instance-store instances. Is there any reason that EBS-backed
> instances take less time to boot?
>
> I tried creating the cluster again with 30 nodes, and this time it
> completed successfully in 14 min.
>
> When a cluster create request is sent, I see these messages on the
> terminal:
> >>> Waiting for all nodes to be in a 'running' state...
> >>> Waiting for SSH to come up on all nodes...
> >>> Setting up the cluster...
> >>> Configuring hostnames...
> >>> Creating cluster user: sgeadmin (uid: 1001, gid: 1001)
>
> So when any node is up and running in EC2, does StarCluster wait for all
> the nodes to be up and then start configuring them all at once?
> Is there any parameter in the config file, or any option to the
> starcluster start command, that makes cluster configuration (installing
> SGE, configuring NFS) a parallel operation? That is, a node should not
> have to wait for the other nodes to be up before getting configured, so
> that if we post a job on a ready node, it starts executing with however
> many nodes are running and configured at that point.
>
> If the above is not possible, is there any specific reason why, when
> starting a cluster, StarCluster configures the nodes only once all of
> them are running?
> If anything bad happens at the EC2 level and some of the nodes take a
> long time to start, is there any fault-tolerance technique or timeout?
>
> Regards
> Sumita
>
>
>
>
> On Tue, Nov 8, 2011 at 7:55 PM, Paolo Di Tommaso <Paolo.DiTommaso_at_crg.eu> wrote:
>
>> Are you using instance-store instances or EBS-backed instances?
>>
>> The latter are much faster to boot.
>>
>>
>> Cheers,
>> Paolo
>>
>>
>>
>> On Nov 8, 2011, at 3:12 PM, Justin Riley wrote:
>>
>>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hi Sumita,
>>
>> Were you using spot instances? If not, I believe there's a default limit
>> of 20 instances for flat-rate instances, which *could* be related to your
>> issue. With spot instances you can create up to 100 instances by default.
>> So, if you need more than 20 nodes and do not wish to submit a request to
>> Amazon to increase your flat-rate instance limit, you should be using
>> spot instances:
>>
>> $ starcluster start -s 30 -b 0.50 mycluster
>>
>> With that said, StarCluster has no limit on the number of nodes you can
>> create. However, as you've seen, EC2 instances can sometimes take longer
>> than usual to become 'running'. Unfortunately this is purely an EC2
>> back-end issue that cannot be resolved directly by StarCluster. In my
>> experience 22 minutes *is* quite a while to wait for any instance to come
>> up. However, I have had instances take up to 15 min in the past, so this
>> is not a total surprise to me.
>>
>> In the future, if you run into this problem of waiting too long for an
>> instance to change from 'pending' to 'running' (e.g. 15min+), I would
>> recommend simply terminating the faulty instance from the AWS console and
>> then restarting the cluster using:
>>
>> $ starcluster restart mycluster
>>
>> This should reboot all the currently running instances and begin
>> configuring the cluster, which avoids having to terminate the entire
>> cluster and lose instance hours.
>>
>> HTH,
>>
>> ~Justin
>>
>> On 11/8/11 6:39 AM, Sumita Sinha wrote:
>> > Hello,
>> >
>> > Currently working with StarCluster on EC2.
>> >
>> > Tried creating a cluster with 30 nodes of type m1.small using AMI
>> > ami-8cf913e5.
>> > Cluster creation never completed, as I found that one node,
>> > node025, was showing pending status.
>> > I waited for almost 22 minutes, then terminated the cluster.
>> > The cluster was terminated properly. Is there any limit on the number
>> > of nodes that can be created?
>> >
>> >
>> >
>> >
>> > --
>> > Regards
>> > Sumita Sinha
>> >
>> >
>>
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.11 (Darwin)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>
>> iEYEARECAAYFAk65OL4ACgkQ4llAkMfDcrm9MACghU/Ey4v653fsD8XmpbQKONNp
>> vdkAniIfFExWjqGAOWRolMrtePHfl4AL
>> =Q8NI
>> -----END PGP SIGNATURE-----
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>
>>
>>
>
>
> --
> Regards
> Sumita Sinha
>
>
>


-- 
Regards
Sumita Sinha
Received on Wed Nov 09 2011 - 00:33:16 EST
This archive was generated by hypermail 2.3.0.
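
The thread asks whether there is a timeout for nodes that stay 'pending', and
Justin's rule of thumb is to give up on an instance after roughly 15 minutes.
The following is a minimal Python sketch of that idea, not StarCluster's own
code. The function wait_for_running and the get_state callback are
hypothetical names introduced here for illustration; in practice get_state
would wrap whatever EC2 query you use to read an instance's state. The
15-minute default simply mirrors the figure mentioned in the thread.

```python
import time
from typing import Callable, Iterable, List, Tuple


def wait_for_running(
    node_ids: Iterable[str],
    get_state: Callable[[str], str],  # hypothetical: returns e.g. 'pending' or 'running'
    timeout_s: float = 15 * 60,       # assumption: the ~15 min rule of thumb from the thread
    poll_interval_s: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[List[str], List[str]]:
    """Poll until every node is 'running' or the timeout expires.

    Returns (running, stuck): the nodes that reached 'running', and the
    nodes still in another state (such as 'pending') at the deadline.
    """
    pending = list(node_ids)
    running: List[str] = []
    deadline = clock() + timeout_s
    while pending:
        still_pending = []
        for node in pending:
            if get_state(node) == "running":
                running.append(node)
            else:
                still_pending.append(node)
        pending = still_pending
        # Stop when everything is up or the deadline has passed.
        if not pending or clock() >= deadline:
            break
        sleep(poll_interval_s)
    return running, pending
```

The nodes returned in the second list are the candidates for manual
termination from the AWS console before running "starcluster restart", as
described in the thread. The clock and sleep parameters exist only so the
loop can be exercised without real waiting.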
