Re: load balanced nodes accepting jobs before ready
Hey Stewart,
You can fix this issue by setting disable_queue=True in your config to
disable the default SGE plugin. Then you can define the SGE plugin in
your config, add it to your plugins list, and then move the pkginstaller
(and any other plugins that need to run before the node gets added)
*before* SGE in the list. This will ensure all other plugins get
executed before the node gets added to SGE. See the following doc for
more details on setting disable_queue and defining the SGE plugin in
your config:
http://star.mit.edu/cluster/docs/latest/plugins/sge.html#advanced-options
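
As a rough sketch, the relevant parts of the config might look
something like the fragment below. The cluster template name
"mycluster" and the package list are placeholders of mine, and the
setup_class paths are written from memory of the StarCluster docs, so
please check them against the page linked above before relying on them.

```ini
[plugin pkginstaller]
setup_class = starcluster.plugins.pkginstaller.PackageInstaller
# placeholder package list; substitute your own dependencies
packages = htop, git

[plugin sge]
setup_class = starcluster.plugins.sge.SGEPlugin

[cluster mycluster]
# turn off the built-in SGE setup so the explicit plugin below
# decides when the node joins the queue
disable_queue = True
# plugins run in the order listed: pkginstaller first, sge last
plugins = pkginstaller, sge
```

Any other plugins that must finish before the node starts accepting
jobs (IPCluster, for example) would go ahead of sge in the same
plugins line.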
~Justin
On Mon, Apr 14, 2014 at 06:02:20PM +0000, Stewart, Andrew wrote:
> pkginstaller was called during add_node, but the node was added to the
> host list and its queue enabled before pkginstaller had a chance to finish
> installing dependencies. So it looks like a race condition. I did bump
> pkginstaller to the front of the plugins line (ahead of IPCluster) but I
> haven’t yet bothered to test whether that helps the situation any. The
> most certain way to handle it would be to just disable the queue until
> provisioning is complete.
> I actually think the simpler solution would be to bypass pkginstaller and
> just share managed packages with compute nodes via NFS. Why reinstall the
> same package N times?
> --
> Andrew Stewart
> Office of Research Information Services (ORIS),
> Office of the Chief Information Officer (OCIO),
> Smithsonian Institution
> 202-505-3633
> From: Rajat Banerjee <rajatb_at_post.harvard.edu>
> Date: Monday, April 14, 2014 at 10:49 AM
> To: Andrew Stewart <stewarta_at_si.edu>
> Cc: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
> Subject: Re: [StarCluster] load balanced nodes accepting jobs before ready
> Hi,
> Does that mean that the pkginstaller plugin doesn't get called during
> add_node, before the host is added to the SGE host list?
> Raj
Received on Thu Apr 24 2014 - 12:09:04 EDT