Possible bug in loadbalancer?
Hi,
Trolling through the log files, I found several errors, all of the
following form.
I believe I have been using ELB correctly, but anything is possible and I do
not yet have much experience with it. This version is from the git repo, which
I cloned last Friday.
I doubt that I can do much to reproduce this error, so if this trace helps,
that's great.
I will keep an eye out for other misbehavior as well.
Thanks for the great work you guys have put into StarCluster and the ELB.
Regards,
Don MacMillen
PhysWare
PID: 1822 __init__.py:481 - INFO - Jobstats cache is not full. Pulling full job history.
PID: 1822 __init__.py:486 - DEBUG - getting past 10800 seconds worth of job history.
PID: 1822 ssh.py:397 - ERROR - command 'source /etc/profile && qhost -xml' failed with status 127
PID: 1822 ssh.py:397 - ERROR - command 'source /etc/profile && qstat -q all.q -u "*" -xml' failed with status 127
PID: 1822 ssh.py:400 - DEBUG - command source /etc/profile && qacct -j -b 201105162111 failed with status 127
PID: 1822 __init__.py:524 - DEBUG - sizes: qhost: 30, qstat: 30, qacct: 30.
PID: 1822 cli.py:184 - DEBUG - Traceback (most recent call last):
  File "build/bdist.linux-i686/egg/starcluster/cli.py", line 160, in main
    sc.execute(args)
  File "build/bdist.linux-i686/egg/starcluster/commands/loadbalance.py", line 91, in execute
    lb.run(cluster)
  File "build/bdist.linux-i686/egg/starcluster/balancers/sge/__init__.py", line 570, in run
    if self.get_stats() == -1:
  File "build/bdist.linux-i686/egg/starcluster/balancers/sge/__init__.py", line 525, in get_stats
    self.stat.parse_qhost(qhostxml)
  File "build/bdist.linux-i686/egg/starcluster/balancers/sge/__init__.py", line 50, in parse_qhost
    doc = xml.dom.minidom.parseString(string)
  File "/usr/lib/python2.6/xml/dom/minidom.py", line 1928, in parseString
    return expatbuilder.parseString(string)
  File "/usr/lib/python2.6/xml/dom/expatbuilder.py", line 940, in parseString
    return builder.parseString(string)
  File "/usr/lib/python2.6/xml/dom/expatbuilder.py", line 223, in parseString
    parser.Parse(string, True)
ExpatError: syntax error: line 1, column 0
PID: 1822 cli.py:129 - ERROR - Oops! Looks like you've found a bug in StarCluster
PID: 1822 cli.py:130 - ERROR - Debug file written to: /tmp/starcluster-debug-staruser.log
PID: 1822 cli.py:131 - ERROR - Look for lines starting with PID: 1822
PID: 1822 cli.py:132 - ERROR - Please submit this file, minus any private information,
PID: 1822 cli.py:133 - ERROR - to starcluster_at_mit.edu
PID: 1822 ssh.py:534 - DEBUG - __del__ called
PID: 1822 ssh.py:534 - DEBUG - __del__ called
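
A note on reading the trace: exit status 127 is the shell's "command not
found" code, so it looks as though qhost, qstat and qacct were not on the
PATH when the balancer ran them. If the shell's error text was then handed to
the XML parser, that would explain an ExpatError at line 1, column 0. The
snippet below is only a minimal sketch of that failure mode and of one way to
guard against it. It is not StarCluster's code: parse_xml_or_none is a
hypothetical helper, and the "command not found" string is an assumed
stand-in for whatever the remote shell actually printed (its length of 30
characters would at least fit the "sizes" line in the log).

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def parse_xml_or_none(text):
    """Hypothetical helper: return a parsed DOM, or None if text is not XML."""
    try:
        return xml.dom.minidom.parseString(text)
    except ExpatError:
        # Not well-formed XML, e.g. a shell error message instead of
        # the qhost -xml output.
        return None


# Assumed stand-in for the output of a failed "qhost -xml" call.
not_xml = "bash: qhost: command not found"

if parse_xml_or_none(not_xml) is None:
    print("qhost output was not XML; skipping this polling round")

# Well-formed input still parses as usual.
doc = parse_xml_or_none("<qhost><host name='node001'/></qhost>")
print(doc.documentElement.tagName)
```

Checking the command's exit status before parsing would probably be the
cleaner fix; the try/except above only keeps a bad poll from taking down the
whole balancer loop.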
Received on Tue May 17 2011 - 00:15:10 EDT