StarCluster - Mailing List Archive

Re: starcluster (potential) bug report

From: Yevgeny Popkov <no email>
Date: Wed, 13 Mar 2013 22:02:30 -0400

Just FYI, once I changed CLUSTER_USER in the config back to sgeadmin, the
error disappeared.
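
For anyone who wants to check which user a cluster template is set to before starting it, here is a minimal sketch. It assumes the config uses the usual INI layout with a `[cluster smallcluster]` section; the sample text and the helper name `cluster_user` are hypothetical and only for illustration, not part of StarCluster itself.

```python
import configparser

# Hypothetical sample config; a real one normally lives at ~/.starcluster/config.
SAMPLE_CONFIG = """
[cluster smallcluster]
CLUSTER_SIZE = 2
CLUSTER_USER = sgeadmin
"""


def cluster_user(config_text, template="smallcluster"):
    """Return the CLUSTER_USER value for the given cluster template, or None."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = "cluster " + template
    if not parser.has_section(section):
        return None
    # configparser lowercases option names, so this lookup is case-insensitive.
    return parser.get(section, "CLUSTER_USER", fallback=None)


if __name__ == "__main__":
    print(cluster_user(SAMPLE_CONFIG))
```

To check a real config, read the file's text and pass it in place of `SAMPLE_CONFIG`.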

Thanks,
Yevgeny

On Wed, Mar 13, 2013 at 9:44 PM, Yevgeny Popkov <ypopkov_at_gmail.com> wrote:

> ubuntu@ip-10-149-30-54:~$ sc start smallcluster
> StarCluster - (http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to starcluster_at_mit.edu
>
> >>> Using default cluster template: smallcluster
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 2-node cluster...
> >>> Launching master node (ami: ami-c4801ead, type: m3.xlarge)...
> >>> Creating security group @sc-smallcluster...
> Reservation:r-a3a18bd9
> >>> Launching node001 (ami: ami-c4801ead, type: m3.xlarge)
> SpotInstanceRequest:sir-f3ac7014
> >>> Waiting for cluster to come up... (updating every 30s)
> >>> Waiting for all nodes to be in a 'running' state...
> 1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> >>> Waiting for SSH to come up on all nodes...
> 1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> >>> Waiting for cluster to come up took 2.215 mins
> >>> The master node is ec2-50-19-195-49.compute-1.amazonaws.com
> >>> Configuring cluster...
> >>> Attaching volume vol-79faa709 to master node on /dev/sdz ...
> >>> Waiting for vol-79faa709 to transition to: attached...
> >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
> >>> Configuring hostnames...
> 2/2 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
> !!! ERROR - Error occured while running plugin 'starcluster.clustersetup.DefaultClusterSetup':
> !!! ERROR - error occurred in job (id=node001): failed to connect to host ec2-23-22-91-158.compute-1.amazonaws.com on port 22
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 31, in run
>     job.run()
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/threadpool.py", line 58, in run
>     r = self.method(*self.args, **self.kwargs)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/node.py", line 788, in set_hostname
>     hostname_file = self.ssh.remote_file("/etc/hostname", "w")
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 291, in remote_file
>     rfile = self.sftp.open(file, mode)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 181, in sftp
>     self._sftp = paramiko.SFTPClient.from_transport(self.transport)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 130, in transport
>     port=self._port, timeout=self._timeout)
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 97, in connect
>     raise exception.SSHConnectionError(host, port)
> SSHConnectionError: failed to connect to host ec2-23-22-91-158.compute-1.amazonaws.com on port 22
>
>
> !!! ERROR - Oops! Looks like you've found a bug in StarCluster
> !!! ERROR - Crash report written to: /home/ubuntu/.starcluster/logs/crash-report-2092.txt
> !!! ERROR - Please remove any sensitive data from the crash report
> !!! ERROR - and submit it to starcluster_at_mit.edu
>
>
Received on Wed Mar 13 2013 - 22:02:31 EDT