Hi all,
We're seeing problems with multi-node clusters today.
Launching a cluster with >=3 nodes fails, with a report of being unable to
assign aliases to nodes other than "master". The same problem shows up with
addnode: adding one node is fine, but adding more than one gives the alias
problem again. Terminating these clusters also fails because of the missing
alias, so you have to use the ec2toolkit to shut down the offending nodes
that were not named.
I'm on the latest development version, and I've noticed that, as of today,
there's a lot of gibberish in the node user data (as viewed through the web
console).
Debug output is below. Thanks for the help.
Dan
> starcluster -d start -c testing brokenCluster -s 3
...
...
>>> Waiting for all nodes to be in a 'running' state...
2013-11-25 14:13:06,323 cluster.py:734 - DEBUG - existing nodes: {}
2013-11-25 14:13:06,323 cluster.py:742 - DEBUG - adding node i-323f504f to self._nodes list
2013-11-25 14:13:06,839 cluster.py:742 - DEBUG - adding node i-2c3f5051 to self._nodes list
2013-11-25 14:13:07,001 node.py:147 - DEBUG - invalid aliases file in user_data:
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
!!! ERROR - instance i-2c3f5051 has no alias
2013-11-25 14:13:07,003 cli.py:301 - DEBUG - instance i-2c3f5051 has no alias
Traceback (most recent call last):
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cli.py", line 274, in main
    sc.execute(args)
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/commands/start.py", line 220, in execute
    validate_running=validate_running)
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 1534, in start
    return self._start(create=create, create_only=create_only)
  File "<string>", line 2, in _start
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/utils.py", line 111, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 1557, in _start
    self.setup_cluster()
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 1565, in setup_cluster
    self.wait_for_cluster()
  File "<string>", line 2, in wait_for_cluster
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/utils.py", line 111, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 1350, in wait_for_cluster
    self.wait_for_running_instances()
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 1305, in wait_for_running_instances
    nodes = nodes or self.get_nodes_or_raise()
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 754, in get_nodes_or_raise
    nodes = self.nodes
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/cluster.py", line 744, in nodes
    if n.is_master():
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 898, in is_master
    return self.alias == 'master' or self.alias.endswith("-master")
  File "/Library/Python/2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 150, in alias
    "instance %s has no alias" % self.id)
BaseException: instance i-2c3f5051 has no alias
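
For anyone skimming the traceback, here is a reduced sketch of what the failure path appears to be: the master check reads each node's alias, and the alias lookup raises when nothing usable came out of user_data, so a single unaliased node breaks any walk over the node list (start, and presumably terminate as well). Everything except the is_master expression, which is copied from the traceback, is a hypothetical stand-in: FakeNode, NoAliasError, find_master, unaliased_ids and the instance ids are made up for illustration and are not the StarCluster source.

```python
class NoAliasError(Exception):
    """Hypothetical error type; the real run reported a BaseException."""


class FakeNode(object):
    """Hypothetical stand-in for a cluster node, not the StarCluster class."""

    def __init__(self, instance_id, alias=None):
        self.id = instance_id
        self._alias = alias

    @property
    def alias(self):
        # As in the traceback: a missing alias raises rather than returning None.
        if not self._alias:
            raise NoAliasError("instance %s has no alias" % self.id)
        return self._alias

    def is_master(self):
        # Expression copied from the is_master line shown in the traceback.
        return self.alias == 'master' or self.alias.endswith("-master")


def find_master(nodes):
    """Walk the nodes looking for the master; one unaliased node aborts the walk."""
    master = None
    for n in nodes:
        if n.is_master():
            master = n
    return master


def unaliased_ids(nodes):
    """Return the ids of nodes whose alias lookup fails."""
    bad = []
    for n in nodes:
        try:
            n.alias
        except NoAliasError:
            bad.append(n.id)
    return bad


if __name__ == "__main__":
    # Made-up ids: one properly aliased master plus one node with no alias.
    nodes = [FakeNode("i-aaaa1111", "master"), FakeNode("i-bbbb2222")]
    print(unaliased_ids(nodes))
    try:
        find_master(nodes)
    except NoAliasError as e:
        print("!!! ERROR - %s" % e)
```

If this reading is right, the nodes returned by something like unaliased_ids are the ones that currently have to be shut down by hand.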
--
Daniel G Polhamus, PhD
Metrum Research Group, LLC
Received on Mon Nov 25 2013 - 14:14:54 EST