Re: load balanced nodes accepting jobs before ready
pkginstaller was called during add_node, but the node was added to the host list and its queue enabled before pkginstaller had a chance to finish installing dependencies. So it looks like a race condition. I did bump pkginstaller to the front of the plugins line (ahead of IPCluster) but I haven't yet bothered to test whether that helps the situation any. The most certain way to handle it would be to just disable the queue until provisioning is complete.
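For what it's worth, the "disable the queue until provisioning is complete" idea could be sketched roughly as below. This is only an illustration, not tested against StarCluster: `run` and `install` are hypothetical callables (one that executes a shell command on the SGE master, one that installs the packages on the new node), and the queue name `all.q` is an assumption. The only real commands it relies on are SGE's `qmod -d` and `qmod -e`, which disable and enable a queue instance.

```python
def provision_with_queue_disabled(hostname, run, install, queue="all.q"):
    """Keep a node's queue instance disabled while packages are installed.

    `run` and `install` are hypothetical callables supplied by the caller;
    `queue` defaults to "all.q", which is an assumption about the cluster.
    """
    instance = "%s@%s" % (queue, hostname)   # e.g. "all.q@node001"
    run("qmod -d %s" % instance)   # disable: no new jobs get scheduled here
    install()                      # if this raises, the queue stays disabled
    run("qmod -e %s" % instance)   # enable only after a successful install
```

Leaving the queue disabled when the install fails is deliberate in this sketch: a half-provisioned node never starts accepting jobs, which is the failure we are trying to avoid.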
I actually think the simpler solution would be to bypass pkginstaller and just share managed packages with compute nodes via NFS. Why reinstall the same package N times?
Office of Research Information Services (ORIS),
Office of the Chief Information Officer (OCIO),
From: Rajat Banerjee <rajatb_at_post.harvard.edu>
Date: Monday, April 14, 2014 at 10:49 AM
To: Andrew Stewart <stewarta_at_si.edu>
Cc: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
Subject: Re: [StarCluster] load balanced nodes accepting jobs before ready
Does that mean that the pkginstaller plugin doesn't get called during add_node, before the host is added to the SGE host list?
Received on Mon Apr 14 2014 - 14:02:25 EDT