StarCluster - Mailing List Archive

Re: ? re CreateUsers error in .95.6

From: Lyn Gerner <no email>
Date: Fri, 7 Aug 2015 12:45:15 -1000

Update/Close: Strangely, this particular issue was resolved by logging in to
the master and truncating the known_hosts file to zero length (as in
"> known_hosts").
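
For anyone landing on this thread later, here is a minimal Python sketch of
that reset. It is only an illustration: the function name reset_known_hosts
and the ".bak" backup suffix are made up for this example and are not part of
StarCluster, and keeping a backup is an extra precaution beyond what was done
above. Which known_hosts file to pass is also an assumption; it is commonly
~/.ssh/known_hosts for the affected user on the master.

```python
import os
import shutil


def reset_known_hosts(path):
    """Back up the file at `path` if it exists, then truncate it to zero bytes.

    Returns the backup path, or None if there was nothing to back up.
    (Hypothetical helper for illustration; not a StarCluster function.)
    """
    backup = None
    if os.path.exists(path):
        backup = path + ".bak"
        # Preserve the old entries so they can be restored if needed.
        shutil.copy2(path, backup)
    # Opening in 'w' mode creates the file if it is missing and truncates it
    # otherwise, which mirrors the shell redirection "> known_hosts".
    with open(path, "w"):
        pass
    return backup
```

It would be called on the master as, for example,
reset_known_hosts(os.path.expanduser("~/.ssh/known_hosts")), after which the
addnode can be retried.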

On Fri, Aug 7, 2015 at 11:39 AM, Lyn Gerner <schedulerqueen_at_gmail.com>
wrote:

> Hi Developers,
>
> Sorry for the Fri afternoon query, but I'm getting an error never before
> seen on an addnode, and it recurs even on a -x retry. Appreciate any
> workaround/recovery suggestions for the following:
>
>     # sc an -x -a node002 w2c
>
>     StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)
>     Software Tools for Academics and Researchers (STAR)
>     Please submit bug reports to starcluster_at_mit.edu
>
>     >>> Waiting for node(s) to come up... (updating every 30s)
>     >>> Waiting for all nodes to be in a 'running' state...
>     3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Waiting for SSH to come up on all nodes...
>     3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Waiting for cluster to come up took 0.206 mins
>     >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>     >>> Configuring hostnames...
>     1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Configuring /etc/hosts on each node
>     3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Configuring NFS exports path(s):
>     /home /jobs/ /usr/share/jobs/ /pipe/
>     >>> Mounting all NFS export path(s) on 1 worker node(s)
>     1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Setting up NFS took 0.021 mins
>     1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Configuring scratch space for user(s): sgeadmin
>     1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>     >>> Configuring passwordless ssh for root
>     >>> Configuring passwordless ssh for sgeadmin
>     >>> Running plugin swap_addnode_w2c.VISwapConfigurator
>     >>> Configuring Swap on node002
>     >>> Running plugin starcluster.plugins.users.CreateUsers
>     >>> Creating 1 users on node002
>     >>> Adding node002 to known_hosts for 1 users
>     !!! ERROR - Error occured while running plugin 'starcluster.plugins.users.CreateUsers':
>     !!! ERROR - Unhandled exception occured
>
>     Traceback (most recent call last):
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cli.py", line 274, in main
>         sc.execute(args)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/commands/addnode.py", line 128, in execute
>         no_create=self.opts.no_create)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 189, in add_nodes
>         no_create=no_create)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1042, in add_nodes
>         self.run_plugins(method_name="on_add_node", node=node)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1690, in run_plugins
>         self.run_plugin(plug, method_name=method_name, node=node)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1715, in run_plugin
>         func(*args)
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/plugins/users.py", line 164, in on_add_node
>         master.add_to_known_hosts(user, [node])
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/node.py", line 578, in add_to_known_hosts
>         khostsf = self.ssh.remote_file(known_hosts_file, 'a')
>       File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/sshutils.py", line 320, in remote_file
>         rfile = self.sftp.open(file, mode)
>       File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 327, in open
>         t, msg = self._request(CMD_OPEN, filename, imode, attrblock)
>       File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 729, in _request
>         return self._read_response(num)
>       File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 776, in _read_response
>         self._convert_status(msg)
>       File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 802, in _convert_status
>         raise IOError(errno.ENOENT, text)
>     IOError: [Errno 2] No such file
>
>
>     !!! ERROR - Oops! Looks like you've found a bug in StarCluster
>     !!! ERROR - Crash report written to: /root/.starcluster/logs/crash-report-11317.txt
>     !!! ERROR - Please remove any sensitive data from the crash report
>     !!! ERROR - and submit it to starcluster_at_mit.edu
>
>
> There's not much more in the crash report, but I can send it, if it will
> help. Thanks in advance.
>
>
> Best,
>
> Lyn
>
Received on Fri Aug 07 2015 - 18:45:19 EDT