[Starcluster] Starcluster hangs at Creating Cluster User
Hi,
I'm using Starcluster from the git repo. I think I have everything
configured properly. But when I try to start a 1-node cluster, the process
hangs at the "create user" step:
>>> Validating cluster settings...
>>> Cluster settings are valid
>>> Starting cluster...
>>> Launching a 1-node cluster...
>>> Launching master node...
>>> Master AMI: ami-a19e71c8
>>> Creating security group @sc-testcluster...
Reservation:r-56c3ca3e
>>> Waiting for cluster to start...
>>> The master node is ec2-184-73-33-230.compute-1.amazonaws.com
>>> Attaching volume vol-c3d927aa to master node...
>>> Setting up the cluster...
>>> Mounting EBS volume vol-c3d927aa on /home...
>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)
>>> Creating cluster user: gotdata
... and that's where it hangs.
I CAN log into the individual nodes -- both as master AND as "gotdata" --
using passwordless ssh. Here's what the /etc/hosts file looks like:
127.0.0.1 localhost.localdomain localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
Since this is a 1-node cluster, I can't test passwordless login between nodes.
I can reproduce this problem with both the 32-bit and 64-bit base
starcluster AMIs as well as the AMIs that I created from those.
When I try to create a 2-node cluster, the process hangs a step later:
>>> Validating cluster settings...
>>> Cluster settings are valid
>>> Starting cluster...
>>> Launching a 2-node cluster...
>>> Launching master node...
>>> Master AMI: ami-f129c798
>>> Creating security group @sc-testcluster...
Reservation:r-e8d9d080
>>> Launching worker nodes...
>>> Node AMI: ami-f129c798
Reservation:r-ead9d082
>>> Waiting for cluster to start...
>>> The master node is ec2-184-73-111-239.compute-1.amazonaws.com
>>> Attaching volume vol-c3d927aa to master node...
>>> Setting up the cluster...
>>> Mounting EBS volume vol-c3d927aa on /home...
>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)
>>> Creating cluster user: gotdata
>>> Using private key /Users/danielyamins/amazon/id_rsa-gsg-keypair (rsa)
.... and there it hangs.
In this case, I can:
-- log into the master and worker nodes as root: e.g. "starcluster
sshmaster testcluster" and "starcluster sshnode testcluster 1" work fine
-- log into the master as user gotdata, but NOT into the other worker
node, e.g. "starcluster sshnode -u gotdata testcluster 0" works but
"starcluster sshnode -u gotdata testcluster 1" DOESN'T.
Thanks!
Dan
Received on Thu Apr 15 2010 - 14:18:09 EDT