Re: ? re CreateUsers error in .95.6
I wonder if, instead of zeroing out the file, '> known_hosts' actually
created it?
I noticed the mode for opening the file originally is:
*add_to_known_hosts*
* khostsf = self.ssh.remote_file(known_hosts_file, 'a')*
The mode is 'a' rather than 'a+', so it may fail if the file doesn't exist for some
reason.
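
For comparison, here is a small local sketch of how append mode behaves with Python's built-in open(). It uses the local filesystem only, not StarCluster's remote_file or paramiko, so whether the SFTP open treats 'a' the same way is an assumption that would need checking on the master. The helper names try_append and ensure_parent_then_append are hypothetical and not part of StarCluster.

```python
import errno
import os
import tempfile


def try_append(path, mode):
    """Append one placeholder line to path; return None on success, else the errno."""
    try:
        with open(path, mode) as f:
            f.write("example-host example-key\n")
        return None
    except IOError as e:
        return e.errno


def ensure_parent_then_append(path, line):
    """Hypothetical guard: create the parent directory if needed, then append."""
    parent = os.path.dirname(path)
    if parent and not os.path.isdir(parent):
        os.makedirs(parent, 0o700)
    with open(path, "a") as f:
        f.write(line)


base = tempfile.mkdtemp()

# Case 1: the directory exists but the file does not.
missing_file = os.path.join(base, "known_hosts")
result_missing_file = try_append(missing_file, "a")

# Case 2: the parent directory itself does not exist.
missing_dir_file = os.path.join(base, "no_such_dir", "known_hosts")
result_missing_dir = try_append(missing_dir_file, "a")

# The guard creates the directory first, so the append succeeds.
ensure_parent_then_append(missing_dir_file, "example-host example-key\n")
```

With the built-in open(), the first case creates the file and only the second case reports ENOENT, so on a local filesystem it is the missing parent directory, not the missing file, that triggers "No such file". If the remote side behaves similarly, checking that the user's .ssh directory exists on the master might be worth a look.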
–
C
On Fri, Aug 7, 2015 at 3:47 PM Lyn Gerner <schedulerqueen_at_gmail.com> wrote:
> Update/Close: Strangely, this particular issue was resolved by going to
> the master and zeroing the known_hosts file (as in "> known_hosts").
>
> On Fri, Aug 7, 2015 at 11:39 AM, Lyn Gerner <schedulerqueen_at_gmail.com>
> wrote:
>
>> Hi Developers,
>>
>> Sorry for the Fri afternoon query, but I'm getting an error never before
>> seen on an addnode, and it recurs even on a -x retry. Appreciate any
>> workaround/recovery suggestions for the following:
>>
>> *# sc an -x -a node002 w2c*
>>
>> *StarCluster - (http://star.mit.edu/cluster) (v. 0.95.6)*
>>
>> *Software Tools for Academics and Researchers (STAR)*
>>
>> *Please submit bug reports to starcluster_at_mit.edu*
>>
>>
>> *>>> Waiting for node(s) to come up... (updating every 30s)*
>>
>> *>>> Waiting for all nodes to be in a 'running' state...*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Waiting for SSH to come up on all nodes...*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Waiting for cluster to come up took 0.206 mins*
>>
>> *>>> Running plugin starcluster.clustersetup.DefaultClusterSetup*
>>
>> *>>> Configuring hostnames...*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Configuring /etc/hosts on each node*
>>
>> *3/3 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Configuring NFS exports path(s):*
>>
>> */home /jobs/ /usr/share/jobs/ /pipe/*
>>
>> *>>> Mounting all NFS export path(s) on 1 worker node(s)*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Setting up NFS took 0.021 mins*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Configuring scratch space for user(s): sgeadmin*
>>
>> *1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
>> 100% *
>>
>> *>>> Configuring passwordless ssh for root*
>>
>> *>>> Configuring passwordless ssh for sgeadmin*
>>
>> *>>> Running plugin swap_addnode_w2c.VISwapConfigurator*
>>
>> *>>> Configuring Swap on node002*
>>
>> *>>> Running plugin starcluster.plugins.users.CreateUsers*
>>
>> *>>> Creating 1 users on node002*
>>
>> *>>> Adding node002 to known_hosts for 1 users*
>>
>> *!!! ERROR - Error occured while running plugin
>> 'starcluster.plugins.users.CreateUsers':*
>>
>> *!!! ERROR - Unhandled exception occured*
>>
>> *Traceback (most recent call last):*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cli.py",
>> line 274, in main*
>>
>> * sc.execute(args)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/commands/addnode.py",
>> line 128, in execute*
>>
>> * no_create=self.opts.no_create)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py",
>> line 189, in add_nodes*
>>
>> * no_create=no_create)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py",
>> line 1042, in add_nodes*
>>
>> * self.run_plugins(method_name="on_add_node", node=node)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py",
>> line 1690, in run_plugins*
>>
>> * self.run_plugin(plug, method_name=method_name, node=node)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/cluster.py",
>> line 1715, in run_plugin*
>>
>> * func(*args)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/plugins/users.py",
>> line 164, in on_add_node*
>>
>> * master.add_to_known_hosts(user, [node])*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/node.py",
>> line 578, in add_to_known_hosts*
>>
>> * khostsf = self.ssh.remote_file(known_hosts_file, 'a')*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/StarCluster-0.95.6-py2.6.egg/starcluster/sshutils.py",
>> line 320, in remote_file*
>>
>> * rfile = self.sftp.open(file, mode)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py",
>> line 327, in open*
>>
>> * t, msg = self._request(CMD_OPEN, filename, imode, attrblock)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py",
>> line 729, in _request*
>>
>> * return self._read_response(num)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py",
>> line 776, in _read_response*
>>
>> * self._convert_status(msg)*
>>
>> * File
>> "/usr/lib/python2.6/site-packages/paramiko-1.15.1-py2.6.egg/paramiko/sftp_client.py",
>> line 802, in _convert_status*
>>
>> * raise IOError(errno.ENOENT, text)*
>>
>> *IOError: [Errno 2] No such file*
>>
>>
>> *!!! ERROR - Oops! Looks like you've found a bug in StarCluster*
>>
>> *!!! ERROR - Crash report written to:
>> /root/.starcluster/logs/crash-report-11317.txt*
>>
>> *!!! ERROR - Please remove any sensitive data from the crash report*
>>
>> *!!! ERROR - and submit it to starcluster_at_mit.edu*
>>
>>
>> There's not much more in the crash report, but I can send it, if it will
>> help. Thanks in advance.
>>
>>
>> Best,
>>
>> Lyn
>>
>
Received on Fri Aug 07 2015 - 18:52:00 EDT