StarCluster - Mailing List Archive

Re: Large cluster (125 nodes) launch failure

From: Kyeong Soo (Joseph) Kim
Date: Tue, 15 Mar 2011 22:03:34 +0000

Hi Austin,

Yes, I requested an increase of the limit to 300 (and got a confirmation, of
course), and I am now successfully running a 50-node cluster (it took 32
minutes, BTW).
I now wonder whether it is simply a matter of the huge delay involved in
setting up such a large number of nodes.
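
As a back-of-the-envelope check on that guess, here is a minimal sketch. It assumes setup time grows linearly with the number of nodes, which I have not verified, and it uses only the single data point above (50 nodes in 32 minutes). The function name estimate_setup_minutes is purely illustrative and is not part of StarCluster.

```python
# Rough extrapolation, ASSUMING setup time scales linearly with node count.
# The only measured data point is the one reported in this message.
MEASURED_NODES = 50
MEASURED_MINUTES = 32.0


def estimate_setup_minutes(nodes, ref_nodes=MEASURED_NODES,
                           ref_minutes=MEASURED_MINUTES):
    """Scale the measured launch time linearly to another cluster size."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return ref_minutes * nodes / ref_nodes


if __name__ == "__main__":
    for n in (15, 50, 125):
        print("%3d nodes: ~%.0f min" % (n, estimate_setup_minutes(n)))
```

Under that assumption, 125 nodes would need roughly 80 minutes, so a wait far beyond that would suggest something other than ordinary setup delay.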

Regards,
Joseph


On Tue, Mar 15, 2011 at 9:53 PM, Austin Godber <godber_at_uberhip.com> wrote:
> Does it work at 20 and fail at 21?  I think Amazon still has a 20-instance
> limit, which you can request that they raise.  Have you done that?
>
> http://aws.amazon.com/ec2/faqs/#How_many_instances_can_I_run_in_Amazon_EC2
>
> Austin
>
> On 03/15/2011 05:29 PM, Kyeong Soo (Joseph) Kim wrote:
>> Hi Justin and All,
>>
>> This is to report a failure in launching a large cluster with 125
>> nodes (c1.xlarge).
>>
>> I tried to launch this cluster twice, but starcluster hung (for
>> hours) at the following steps:
>>
>> .....
>>
>>>>> Launching node121 (ami: ami-2857a641, type: c1.xlarge)
>>>>> Launching node122 (ami: ami-2857a641, type: c1.xlarge)
>>>>> Launching node123 (ami: ami-2857a641, type: c1.xlarge)
>>>>> Launching node124 (ami: ami-2857a641, type: c1.xlarge)
>>>>> Creating security group @sc-hnrlcluster...
>> Reservation:r-7c264911
>>>>> Waiting for cluster to come up... (updating every 30s)
>>>>> Waiting for all nodes to be in a 'running' state...
>> 125/125 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%
>>>>> Waiting for SSH to come up on all nodes...
>> 125/125 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100%
>>>>> The master node is ec2-75-101-230-197.compute-1.amazonaws.com
>>>>> Setting up the cluster...
>>>>> Attaching volume vol-467ecc2e to master node on /dev/sdz ...
>>>>> Configuring hostnames...
>>>>> Mounting EBS volume vol-467ecc2e on /home...
>>>>> Creating cluster user: kks (uid: 1001, gid: 1001)
>>>>> Configuring scratch space for user: kks
>>>>> Configuring /etc/hosts on each node
>>
>> I have succeeded with the configuration up to 15 nodes so far.
>>
>> Any idea?
>>
>> With Regards,
>> Joseph
>> --
>> Kyeong Soo (Joseph) Kim, Ph.D.
>> Senior Lecturer in Networking
>> Room 112, Digital Technium
>> Multidisciplinary Nanotechnology Centre, College of Engineering
>> Swansea University, Singleton Park, Swansea SA2 8PP, Wales UK
>> TEL: +44 (0)1792 602024
>> EMAIL: k.s.kim_at_swansea.ac.uk
>> HOME: http://iat-hnrl.swan.ac.uk/ (group)
>>              http://iat-hnrl.swan.ac.uk/~kks/ (personal)
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Tue Mar 15 2011 - 18:03:35 EDT
This archive was generated by hypermail 2.3.0.
