StarCluster - Mailing List Archive

[Starcluster] StarCluster timeout problem

From: Justin Riley <no email>
Date: Mon, 29 Mar 2010 14:39:50 -0400

Hi Nasser

I've cc'd the starcluster mailing list, hope you don't mind.

BTW, I'd like to invite you to join the starcluster mailing list. It's a
good place to keep up with things and submit issues such as these. You
can join the list here:

Thanks for reporting this issue. I've made a quick-fix change in the
development version of the code on github by bumping the socket timeout
from 0.25 to 5 seconds. This still might not help you if the latency is
really bad.

My current thinking on this is to 'throttle' the timeout, raising it the
longer the cluster takes to appear to be up. At first it would attempt a
5 second timeout, and then incrementally raise it up to 15 seconds as
necessary. Once it has reached the 15 second maximum and used up enough
retries, it would likely just error out.
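
A minimal sketch of that throttling idea in Python, for illustration only.
The names is_port_open, timeout_schedule and wait_for_ssh are hypothetical
and are not StarCluster's actual code; the 5 second step, the three retries
at the maximum and the 1 second pause between attempts are also assumptions.

```python
import socket
import time


def is_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection applies the timeout to the connect attempt
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # covers timeouts, refused connections and unreachable hosts
        return False


def timeout_schedule(start=5.0, maximum=15.0, step=5.0, retries_at_max=3):
    """Yield timeouts rising from start to maximum, then repeat the maximum."""
    timeout = start
    while timeout < maximum:
        yield timeout
        timeout = min(timeout + step, maximum)
    for _ in range(retries_at_max):
        yield maximum


def wait_for_ssh(host, port=22, probe=is_port_open, pause=1.0, **schedule):
    """Probe with growing timeouts; raise once the schedule is exhausted."""
    for timeout in timeout_schedule(**schedule):
        if probe(host, port, timeout):
            return timeout  # the timeout that finally succeeded
        time.sleep(pause)
    raise RuntimeError("%s:%d did not come up" % (host, port))
```

With these defaults the attempts use 5, 10, 15, 15 and 15 second timeouts
before giving up, so a slow link gets more room on each try while a dead
node still produces an error instead of an endless wait.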

This is on my list for the next version.

Thanks for reporting!


> Problem:
> I've installed and configured StarCluster correctly. However, when I
> try to start it with "startcluster -s", everything goes fine until it
> reaches the line ">>> Waiting for cluster to start..." and that is when
> it runs forever (infinite loop), even after all the instances are in
> the "running" state.
> Solution:
> After debugging, I found out that the value of the socket's timeout (0.25) in:
> File: starcluster/
> Function: is_ssh_up()
> Line: s.settimeout(0.25)
> is too small for my connection, due to a latency issue.
> So, as a quick fix, I've commented out that line and everything works fine.
> A bigger value would solve this.
> Thanks for your great work and keep it up
> Nasser

Received on Mon Mar 29 2010 - 14:39:52 EDT
This archive was generated by hypermail 2.3.0.

