StarCluster - Mailing List Archive

Re: loadbalance error

From: Justin Riley <no email>
Date: Wed, 11 Jan 2012 17:49:24 -0500

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Wei,

How long did you run the loadbalancer before you got the error? This
is clearly a memory-leak issue as you can see by the exception raised
(MemoryError). I'll have to look closer to be sure but it appears that
the balancer just endlessly appends to a list which eventually causes
the load balancer process to run out of memory.

I've filed an issue on GitHub to keep track of this:

https://github.com/jtriley/StarCluster/issues/65

~Justin
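
The failure mode described above (a list that is appended to on every polling pass and never trimmed) can be illustrated with a minimal sketch. This is not the StarCluster balancer code: the class names, the `record` method, and the `max_samples` parameter are hypothetical, and the sketch is written for Python 3 rather than the Python 2.6 shown in the traceback. It only contrasts an unbounded list with a `collections.deque` capped by `maxlen`, which is one possible way to keep a long-running process from accumulating history forever.

```python
from collections import deque


class UnboundedHistory:
    """Hypothetical stats store that keeps every sample ever recorded."""

    def __init__(self):
        self.samples = []

    def record(self, sample):
        # Grows by one entry per polling pass and is never trimmed.
        self.samples.append(sample)


class BoundedHistory:
    """Hypothetical stats store that keeps only the newest samples."""

    def __init__(self, max_samples=1000):
        # A deque with maxlen silently drops the oldest entry when full.
        self.samples = deque(maxlen=max_samples)

    def record(self, sample):
        self.samples.append(sample)


def simulate(history, iterations):
    """Record one sample per simulated polling pass; return the stored count."""
    for i in range(iterations):
        history.record({"poll": i})
    return len(history.samples)
```

With the unbounded store, the number of retained samples equals the number of polling passes; with the bounded one, it never exceeds `max_samples`, so memory use stays flat however long the loop runs.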


On 01/11/2012 10:01 AM, Wei Tao wrote:
> Hi all,
>
> I was running loadbalance. After a while, I got the following
> error. Can someone shed some light on this? This happened before
> with earlier versions of StarCluster as well.
>
>>>> Loading full job history
> !!! ERROR - command 'source /etc/profile && qhost -xml' failed with status 1
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/cli.py", line 251, in main
>     sc.execute(args)
>   File "/usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/commands/loadbalance.py", line 89, in execute
>     lb.run(cluster)
>   File "/usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/balancers/sge/__init__.py", line 583, in run
>     if self.get_stats() == -1:
>   File "/usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/balancers/sge/__init__.py", line 529, in get_stats
>     self.stat.parse_qhost(qhostxml)
>   File "/usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/balancers/sge/__init__.py", line 49, in parse_qhost
>     doc = xml.dom.minidom.parseString(string)
>   File "/usr/lib/python2.6/xml/dom/minidom.py", line 1928, in parseString
>     return expatbuilder.parseString(string)
>   File "/usr/lib/python2.6/xml/dom/expatbuilder.py", line 940, in parseString
>     return builder.parseString(string)
>   File "/usr/lib/python2.6/xml/dom/expatbuilder.py", line 223, in parseString
>     parser.Parse(string, True)
> ExpatError: syntax error: line 1, column 0
>
> ---------------------------------------------------------------------------
> MemoryError                               Traceback (most recent call last)
>
> /usr/local/bin/starcluster in <module>()
>       7 if __name__ == '__main__':
>       8     sys.exit(
> ----> 9         load_entry_point('StarCluster==0.93', 'console_scripts', 'starcluster')()
>      10     )
>      11
>
> /usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/cli.pyc in main()
>     306     logger.configure_sc_logging()
>     307     warn_debug_file_moved()
> --> 308     StarClusterCLI().main()
>     309
>     310 if __name__ == '__main__':
>
> /usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/cli.pyc in main(self)
>     283             log.debug(traceback.format_exc())
>     284             print
> --> 285             self.bug_found()
>     286
>     287
>
> /usr/local/lib/python2.6/dist-packages/StarCluster-0.93-py2.6.egg/starcluster/cli.pyc in bug_found(self)
>     150         crashfile = open(static.CRASH_FILE, 'w')
>     151         crashfile.write(header % "CRASH DETAILS")
> --> 152         crashfile.write(session.stream.getvalue())
>     153         crashfile.write(header % "SYSTEM INFO")
>     154         crashfile.write("StarCluster: %s\n" % __version__)
>
> /usr/lib/python2.6/StringIO.pyc in getvalue(self)
>     268         """
>     269         if self.buflist:
> --> 270             self.buf += ''.join(self.buflist)
>     271             self.buflist = []
>     272         return self.buf
>
> MemoryError:
>
>
> Thanks!
>
> -Wei
>
>
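
The first traceback in the quoted message shows `xml.dom.minidom.parseString` raising `ExpatError` at line 1, column 0, right after the `qhost -xml` command failed with status 1, which suggests the parser was handed output that was not XML. A minimal sketch of guarding such a parse is shown below. It is an assumption-laden illustration, not the StarCluster implementation: the function name `parse_xml_or_none` is hypothetical, and the code targets Python 3 rather than the Python 2.6 in the traceback.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def parse_xml_or_none(output):
    """Return a DOM document for well-formed XML, or None otherwise.

    Hypothetical helper: a caller could treat None as "skip this polling
    pass" instead of letting the exception end the whole process.
    """
    try:
        return xml.dom.minidom.parseString(output)
    except ExpatError:
        # Raised for empty output or plain-text error messages, for example
        # when the command that should have produced XML failed.
        return None
```

Whether skipping a pass is the right recovery depends on the caller; the point of the sketch is only that a failed command's output can be detected before it is treated as valid XML.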

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk8OEfQACgkQ4llAkMfDcrm5hACdH3Lu7/h2ef1VQ4lXf3oRcxLK
yQgAn2snn/KkJR9n/aqf7wPhIyw++pu+
=sWbl
-----END PGP SIGNATURE-----
Received on Wed Jan 11 2012 - 17:49:27 EST
This archive was generated by hypermail 2.3.0.