Hi Developers,
Sorry for the Friday afternoon query, but I'm getting an error I've never
seen before on an addnode, and it recurs even on a -x retry. I'd appreciate
any workaround/recovery suggestions for the following:
*# sc an -x -a node002 w2c*
*StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)*
*Software Tools for Academics and Researchers (STAR)*
*Please submit bug reports to starcluster_at_mit.edu*
*>>> Waiting for node(s) to come up... (updating every 30s)*
*>>> Waiting for all nodes to be in a 'running' state...*
*3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Waiting for SSH to come up on all nodes...*
*3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Waiting for cluster to come up took 0.206 mins*
*>>> Running plugin starcluster.clustersetup.DefaultClusterSetup*
*>>> Configuring hostnames...*
*1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Configuring /etc/hosts on each node*
*3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Configuring NFS exports path(s):*
*/home /jobs/ /usr/share/jobs/ /pipe/*
*>>> Mounting all NFS export path(s) on 1 worker node(s)*
*1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Setting up NFS took 0.021 mins*
*1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Configuring scratch space for user(s): sgeadmin*
*1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%*
*>>> Configuring passwordless ssh for root*
*>>> Configuring passwordless ssh for sgeadmin*
*>>> Running plugin swap_addnode_w2c.VISwapConfigurator*
*>>> Configuring Swap on node002*
*>>> Running plugin starcluster.plugins.users.CreateUsers*
*>>> Creating 1 users on node002*
*>>> Adding node002 to known_hosts for 1 users*
*!!! ERROR - Error occured while running plugin 'starcluster.plugins.users.CreateUsers':*
*!!! ERROR - Unhandled exception occured*
*Traceback (most recent call last):*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cli.py", line 274, in main*
*    sc.execute(args)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/commands/addnode.py", line 128, in execute*
*    no_create=self.opts.no_create)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 189, in add_nodes*
*    no_create=no_create)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1042, in add_nodes*
*    self.run_plugins(method_name="on_add_node", node=node)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1690, in run_plugins*
*    self.run_plugin(plug, method_name=method_name, node=node)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py", line 1715, in run_plugin*
*    func(*args)*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/plugins/users.py", line 164, in on_add_node*
*    master.add_to_known_hosts(user, [node])*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/node.py", line 578, in add_to_known_hosts*
*    khostsf = self.ssh.remote_file(known_hosts_file, 'a')*
*  File "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/sshutils.py", line 320, in remote_file*
*    rfile = self.sftp.open(file, mode)*
*  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 327, in open*
*    t, msg = self._request(CMD_OPEN, filename, imode, attrblock)*
*  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 729, in _request*
*    return self._read_response(num)*
*  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 776, in _read_response*
*    self._convert_status(msg)*
*  File "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py", line 802, in _convert_status*
*    raise IOError(errno.ENOENT, text)*
*IOError: [Errno 2] No such file*
*!!! ERROR - Oops! Looks like you've found a bug in StarCluster*
*!!! ERROR - Crash report written to: /root/.starcluster/logs/crash-report-11317.txt*
*!!! ERROR - Please remove any sensitive data from the crash report*
*!!! ERROR - and submit it to starcluster_at_mit.edu*
There's not much more in the crash report, but I can send it if that would
help. Thanks in advance.
Best,
Lyn
Received on Fri Aug 07 2015 - 17:39:53 EDT