StarCluster - Mailing List Archive

[StarCluster] error tolerance design when adding nodes

From: Jin Yu <no email>
Date: Sun, 20 Jul 2014 14:08:00 -0500

Hello,

For example, I have found that it is not uncommon for one or two
instances to be unreachable after adding 50 instances to the cluster.
The progress bar gets stuck while waiting for SSH, and I have to manually
restart the problematic instances.

I have not yet gone through the StarCluster code. Does StarCluster
already have some error-tolerance design for these situations?
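
To make the question concrete, here is a minimal sketch of the kind of
tolerance I have in mind: wait for SSH with an overall deadline, then report
which nodes never became reachable instead of blocking forever. This is not
StarCluster code. The function names, the default timeouts, and the idea of
probing TCP port 22 are all assumptions made for illustration only.

```python
import socket
import time


def ssh_port_open(host, port=22, connect_timeout=5.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical probe)."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def wait_for_ssh(hosts, deadline_s=300.0, poll_s=10.0, probe=ssh_port_open,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll hosts until all answer or the deadline passes.

    Returns (ready, stuck): sorted lists of hosts that answered and hosts
    that were still unreachable when the deadline expired.
    """
    pending = set(hosts)
    ready = set()
    start = clock()
    while pending:
        # Probe every host that has not answered yet.
        for host in sorted(pending):
            if probe(host):
                ready.add(host)
        pending -= ready
        # Stop once everyone is up or we have waited long enough.
        if not pending or clock() - start >= deadline_s:
            break
        sleep(poll_s)
    return sorted(ready), sorted(pending)
```

The caller could then decide what to do with the "stuck" list, for example
reboot those instances or terminate and replace them, rather than having the
whole add-node operation hang on one or two bad instances.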

Thanks!
Jin
Received on Sun Jul 20 2014 - 15:08:01 EDT
