StarCluster - Mailing List Archive

can I remove a pending node?

From: Stephen How <no email>
Date: Tue, 12 Mar 2013 06:55:35 -0700

Hi everyone,

I'm having problems starting a 50-node cluster. 49 of the 50 nodes are running, but node027 will not enter the running state (it's stuck in pending).

I've had this problem before. I'd prefer not to terminate the cluster, because then I have to wait 15 minutes before I can re-request the 50 spot instances, and I could hit the same problem again.

I killed the starcluster start command. It was stuck:
>>> Waiting for all nodes to be in a 'running' state...
49/50

The starcluster removenode command doesn't work, because the SGE setup never completed.

Is there any way to recover from this point, and to get the cluster running with 49 nodes?

Thanks,
Steve
Received on Tue Mar 12 2013 - 09:55:38 EDT
