StarCluster - Mailing List Archive

Re: Starcluster stuck during setup

From: Steve Darnell <no email>
Date: Tue, 25 Mar 2014 22:14:27 +0000

Hi Cory,

Rayson offered the following advice last year: http://star.mit.edu/cluster/mlarchives/1742.html

The "right" way is to first boot a cluster of say, 8-10 nodes, submit
jobs, and then use the StarCluster addnode command to grow your
cluster.

I do not see why you cannot add spot instances using addnode. The documentation says it is supported: http://star.mit.edu/cluster/docs/latest/manual/addremovenode.html

The addnode command has additional options for customizing the new node's instance type, AMI, spot bid, and more.
See the help menu for a detailed list of all available options:
$ starcluster addnode --help
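
As a rough sketch of the "start small, then grow" pattern described above, the following Python snippet only prints the addnode commands needed to grow a cluster in batches; it does not run them. The cluster name, node counts, batch size, and bid are hypothetical. The -n (number of nodes) and -b (spot bid) flags are my assumption about the addnode options, so please check them against the help menu before use.

```python
import shlex


def plan_addnode_commands(cluster, current_nodes, target_nodes,
                          batch_size=10, spot_bid=None):
    """Return a list of argv lists that grow `cluster` in batches.

    Nothing is executed here; the caller decides whether to print
    the commands or run them one at a time.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    commands = []
    remaining = max(0, target_nodes - current_nodes)
    while remaining > 0:
        n = min(batch_size, remaining)
        # Assumed flags: -n = number of nodes to add, -b = spot bid.
        argv = ["starcluster", "addnode", "-n", str(n)]
        if spot_bid is not None:
            argv += ["-b", f"{spot_bid:.2f}"]
        argv.append(cluster)
        commands.append(argv)
        remaining -= n
    return commands


if __name__ == "__main__":
    # Hypothetical values: grow "mycluster" from 10 to 80 nodes,
    # 10 nodes at a time, with an illustrative spot bid of 0.50.
    for argv in plan_addnode_commands("mycluster", 10, 80,
                                      batch_size=10, spot_bid=0.50):
        print(shlex.join(argv))
```

Running the printed commands one at a time, and waiting for each to finish, keeps every setup phase small compared with configuring all nodes at once.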

Best regards,
Steve

--
Steve Darnell
DNASTAR, Inc.
Madison, WI USA
From: starcluster-bounces_at_mit.edu [mailto:starcluster-bounces_at_mit.edu] On Behalf Of Cory Dolphin
Sent: Tuesday, March 25, 2014 4:16 PM
To: Butson, Christopher
Cc: starcluster_at_mit.edu
Subject: Re: [StarCluster] Starcluster stuck during setup
I have had similar issues starting large (30+) node clusters. Anyone else find a good pattern for doing so? Sadly I cannot add nodes incrementally since I need spot instances.
On Tue, Mar 25, 2014 at 8:04 AM, Butson, Christopher <cbutson_at_mcw.edu> wrote:
Interesting: I let it go and it eventually continued, but it spent over an hour on "Configuring passwordless ssh for root". Still waiting for the cluster to finish startup...
Christopher R. Butson, Ph.D.
Associate Professor
Biotechnology & Bioengineering Center
Departments of Neurology, Neurosurgery, Psychiatry & Behavioral Medicine
Medical College of Wisconsin
(414) 955-2678
cbutson_at_mcw.edu
From: Butson, Christopher <cbutson_at_mcw.edu>
Date: Tuesday, March 25, 2014 12:13 PM
To: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
Subject: Starcluster stuck during setup
I'm on a slow internet connection overseas, trying to initiate a cluster using StarCluster. Once I type "starcluster start mycluster" everything seems to go ok but it gets stuck at the following point and never seems to get past it:
>>> Mounting all NFS export path(s) on 79 worker node(s)
79/79 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 2.777 mins
>>> Configuring passwordless ssh for root
Any idea why this might occur? Thanks,
Chris
Christopher R. Butson, Ph.D.
Associate Professor
Biotechnology & Bioengineering Center
Departments of Neurology, Neurosurgery, Psychiatry & Behavioral Medicine
Medical College of Wisconsin
(414) 955-2678
cbutson_at_mcw.edu
_______________________________________________
StarCluster mailing list
StarCluster_at_mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Tue Mar 25 2014 - 18:14:31 EDT
This archive was generated by hypermail 2.3.0.
