Re: load balanced nodes accepting jobs before ready
Awesome, I'll try that. The thought occurred to me but I wasn't sure if
the SGE plugin was a special case that had to be run outside of the
context of the optional plugin list.
Thanks!
Andrew
--
Andrew Stewart
Office of Research Information Services (ORIS),
Office of the Chief Information Officer (OCIO),
Smithsonian Institution
202-505-3633
On 4/24/14, 12:09 PM, "Justin Riley" <jtriley_at_MIT.EDU> wrote:
>Hey Stewart,
>
>You can fix this issue by setting disable_queue=True in your config to
>disable the default SGE plugin. Then you can define the SGE plugin in
>your config, add it to your plugins list, and then move the pkginstaller
>(and any other plugins that need to run before the node gets added)
>*before* SGE in the list. This will ensure all other plugins get
>executed before the node gets added to SGE. See the following doc for
>more details on setting disable_queue and defining the SGE plugin in
>your config:
>
>http://star.mit.edu/cluster/docs/latest/plugins/sge.html#advanced-options
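
A minimal sketch of the config layout described above. The option
disable_queue and the two plugin names come from this thread; the
template name "smallcluster" and the package name are placeholders,
and the setup_class paths are given as I understand them from the
StarCluster plugin docs, so check them against your installed version.

```ini
[cluster smallcluster]
# Placeholder template name; keep your existing cluster settings here.
# Turn off the built-in SGE setup so SGE can be ordered explicitly.
disable_queue = True
# Plugins run in list order: pkginstaller finishes before sge adds
# the node to the queue.
plugins = pkginstaller, sge

[plugin pkginstaller]
setup_class = starcluster.plugins.pkginstaller.PackageInstaller
# Placeholder package list.
packages = htop

[plugin sge]
setup_class = starcluster.plugins.sge.SGEPlugin
```

Any other plugin that must finish before the node accepts jobs would
likewise go ahead of sge in the plugins line.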
>
>~Justin
>
>On Mon, Apr 14, 2014 at 06:02:20PM +0000, Stewart, Andrew wrote:
>> pkginstaller was called during add_node, but the node was added to
>> the host list and its queue enabled before pkginstaller had a chance
>> to finish installing dependencies. So it looks like a race
>> condition. I did bump pkginstaller to the front of the plugins line
>> (ahead of IPCluster) but I haven't yet bothered to test whether that
>> helps the situation any. The most certain way to handle it would be
>> to just disable the queue until provisioning is complete.
>> I actually think the simpler solution would be to bypass
>> pkginstaller and just share managed packages with compute nodes via
>> NFS. Why reinstall the same package N times?
>> --
>> Andrew Stewart
>> Office of Research Information Services (ORIS),
>> Office of the Chief Information Officer (OCIO),
>> Smithsonian Institution
>> 202-505-3633
>> From: Rajat Banerjee <rajatb_at_post.harvard.edu>
>> Date: Monday, April 14, 2014 at 10:49 AM
>> To: Andrew Stewart <stewarta_at_si.edu>
>> Cc: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
>> Subject: Re: [StarCluster] load balanced nodes accepting jobs before ready
>> Hi,
>> Does that mean that the pkginstaller plugin doesn't get called during
>> add_node, before the host is added to the SGE host list?
>> Raj
>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>>
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Thu Apr 24 2014 - 12:11:34 EDT