Hey Starcluster,
I'm getting the same error as this guy:
http://star.mit.edu/cluster/mlarchives/1592.html
Briefly:
When I use addnode, a spot request opens on Amazon (I'm starting a
spot cluster, so addnode bids). But StarCluster then goes on to try to set
up SSH without waiting for the node to come up.
>>> Launching node(s): node002
SpotInstanceRequest:sir-b35acc5e
>>> Waiting for spot requests to propagate...
>>> Waiting for node(s) to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 0.020 mins
!!! ERROR - node 'node002' does not exist
Moreover, this only happens when addnode tries to bid (either by default,
because I'm running a spot cluster, or by an inline directive).
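
For illustration, the wait that appears to be skipped could look like the minimal Python 3 sketch below. It is not StarCluster's actual code: `wait_for_spot_fulfillment` and `fetch_state` are hypothetical names, and `fetch_state` stands in for whatever call returns the spot request's current state. The state names (open, active, closed, cancelled, failed) are the ones EC2 reports for spot requests.

```python
import time


def wait_for_spot_fulfillment(fetch_state, poll_interval=30.0, timeout=600.0,
                              sleep=time.sleep):
    """Poll a spot request until it is fulfilled; return the number of polls.

    fetch_state is a hypothetical callable that returns the request's
    current state as a string ('open', 'active', 'closed', ...).
    """
    waited = 0.0
    polls = 0
    while True:
        state = fetch_state()
        polls += 1
        if state == "active":
            # The request has been fulfilled, so an instance now exists.
            return polls
        if state in ("cancelled", "closed", "failed"):
            # The request can never be fulfilled; stop instead of looping.
            raise RuntimeError("spot request ended in state %r" % state)
        if waited >= timeout:
            raise TimeoutError("spot request still %r after %.0fs"
                               % (state, waited))
        sleep(poll_interval)
        waited += poll_interval
```

Only once a loop like this returns would it make sense to start waiting for the 'running' state and for SSH on the new node.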
I don't know what to try next, though. Do you have any ideas on where to
start?
Thanks,
Yoshi
Received on Thu Feb 13 2014 - 23:32:09 EST