What is the output of qstat and qacct when you run them without any arguments? And has your cluster finished running any jobs yet?
The file /opt/sge6/default/common/accounting exists only once at least one job has finished running.
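If it helps, here is a minimal Python sketch of how a caller could treat the missing accounting file as "no job history yet" instead of as a failure. This is not StarCluster's actual code: the function name job_history and the injectable run parameter are made up for illustration, and the path is simply the one from your error message.

```python
import os
import subprocess

# Path taken from the error message in this thread; other installs may differ.
ACCOUNTING_FILE = "/opt/sge6/default/common/accounting"


def job_history(begin_time, accounting_file=ACCOUNTING_FILE,
                run=subprocess.check_output):
    """Return qacct output as text, or "" if no job has finished yet.

    begin_time is the string passed to "qacct -b", e.g. "201302232051".
    run is a hypothetical hook so the command runner can be swapped out.
    """
    if not os.path.exists(accounting_file):
        # The accounting file only appears after a job has finished, so a
        # missing file is treated here as an empty history, not an error.
        return ""
    out = run(["qacct", "-j", "-b", begin_time])
    # check_output returns bytes on Python 3 and str on Python 2.
    return out.decode() if isinstance(out, bytes) else out
```

On a fresh cluster, job_history("201302232051") would return an empty string until the first job completes; after that it returns whatever qacct prints.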
-Ron
************************************************************************
Open Grid Scheduler - the official open source Grid Engine:
http://gridscheduler.sourceforge.net/
________________________________
From: Kai Li <kai.li.jx_at_gmail.com>
To: starcluster_at_mit.edu
Sent: Saturday, February 23, 2013 7:32 PM
Subject: [StarCluster] error of loadbalance ( can not list current job )
Hi,
When I run "starcluster loadbalance" with StarCluster, I get the following error message:
>>> Loading full job history
*** WARNING - Failed to retrieve stats (5/5):
Traceback (most recent call last):
  File "/home/kli/.local/lib/python2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/balancers/sge/__init__.py", line 515, in get_stats
    self.stat = self._get_stats()
  File "/home/kli/.local/lib/python2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/balancers/sge/__init__.py", line 493, in _get_stats
    qacct = '\n'.join(master.ssh.execute(qacct_cmd))
  File "/home/kli/.local/lib/python2.7/site-packages/StarCluster-0.9999-py2.7.egg/starcluster/sshutils/__init__.py", line 538, in execute
    msg, command, exit_status, out_str)
RemoteCommandFailed: remote command 'source /etc/profile && qacct -j -b 201302232051' failed with status 1:
no jobs running since startup
/opt/sge6/default/common/accounting: No such file or directory
*** WARNING - Retrying in 60s
!!! ERROR - Failed to retrieve SGE stats after trying 5 times,
!!! ERROR - exiting...
I also tried qacct -j -b 201302232046 on the master node and got the same error message: "/opt/sge6/default/common/accounting: No such file or directory". Can anyone give me a hint on how to fix it? Thanks!
--
李凯 ( Kai Li )
_______________________________________________
StarCluster mailing list
StarCluster_at_mit.edu
http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Sun Feb 24 2013 - 18:04:27 EST