StarCluster - Mailing List Archive

Re: Loadbalancing error against ubuntu QIIME AMI

From: Rayson Ho <no email>
Date: Wed, 4 Sep 2013 12:18:29 -0400

James,

One quick workaround: the last time I looked at the load balancer, it
issued qstat only on the master node, so you should be able to run the
standard StarCluster AMI on the Grid Engine master host and the QIIME
AMI on the Grid Engine execution hosts by specifying:

MASTER_IMAGE_ID = <Standard StarCluster AMI>
NODE_IMAGE_ID = <QIIME AMI>
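
As a minimal sketch of where these two settings would sit in the
StarCluster config file: the template name, KEYNAME, CLUSTER_SIZE, and
NODE_INSTANCE_TYPE values below are illustrative assumptions, and the
two AMI placeholders still need to be replaced with real AMI IDs.

```ini
# Hypothetical cluster template; the template name, KEYNAME, CLUSTER_SIZE,
# and NODE_INSTANCE_TYPE values are illustrative assumptions.
[cluster qiimecluster]
KEYNAME = mykey
CLUSTER_SIZE = 4
NODE_INSTANCE_TYPE = m1.large
# The load balancer runs qstat on the master, so the master uses the
# standard StarCluster image, which has the SGE environment set up.
MASTER_IMAGE_ID = <Standard StarCluster AMI>
# The execution hosts run the QIIME software.
NODE_IMAGE_ID = <QIIME AMI>
```

With a template like this, the cluster would be started with something
like "starcluster start -c qiimecluster mycluster" (the cluster tag
"mycluster" is also just an example) before running the load balancer
against it.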

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html



On Sat, Aug 31, 2013 at 8:55 AM, Rayson Ho <raysonlogin_at_gmail.com> wrote:
> Hi James,
>
> The SGE Load Balancer needs the SGE executables to be in $PATH, and it
> looks like the QIIME AMI does not have the modifications from
> StarCluster -- specifically /etc/profile.d/sge.sh, which sets the
> environment variables needed by SGE qstat. (The "qstat" that complains
> about an invalid option is the PBS qstat that happens to be in the
> execution $PATH.)
>
> I don't have access to the QIIME AMI (I tried to find the
> ami-d5cc8fbc AMI but couldn't find it - is it available to the
> public?), but I believe there are at least 2 ways to fix it:
>
> 1) Patch the QIIME AMI - Just look at the execution host of
> StarCluster and see how /etc/profile.d/sge.sh is introduced into the
> default environment. Then create a new AMI based on the modified
> instance (would be easy if it is EBS-based -- it's just a few steps in
> the AWS Management Console).
>
> 2) Write a StarCluster plugin that fixes this $PATH problem on the
> fly, or even add that to the SGEPlugin so that, if the environment
> settings are not available, it injects them during the StarCluster-SGE
> bootstrap.
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
> http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html
>
>
> On Thu, Aug 29, 2013 at 8:14 AM, james pettengill <fixtgear_at_gmail.com> wrote:
>>
>> We are trying to run the load balancer when launching a cluster of QIIME AMIs (QIIME is software for analysis of next-gen sequencing data) and are running into some errors.
>
>
>> >>> Writing stats to file: /home/ubuntu/.starcluster/sge/STAR-ELASTIC/sge-stats.csv
>> >>> Loading full job history
>> *** WARNING - Failed to retrieve stats (1/5):
>> Traceback (most recent call last):
>> File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 536, in get_stats
>> return self._get_stats()
>> File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 507, in _get_stats
>> qstatxml = '\n'.join(master.ssh.execute(qstat_cmd))
>> File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/sshutils/__init__.py", line 555, in execute
>> msg, command, exit_status, out_str)
>> RemoteCommandFailed: remote command 'source /etc/profile && qstat -u \* -xml -f -r' failed with status 2:
>> qstat: invalid option -- 'm'
>> qstat: conflicting options.
>> usage:
>> qstat [-f [-1]] [-W site_specific] [-x] [ job_identifier... | destination... ]
>> qstat [-a|-i|-r|-e] [-u user] [-n [-1]] [-s] [-G|-M] [-R] [job_id... | destination...]
>> qstat -Q [-f [-1]] [-W site_specific] [ destination... ]
>> qstat -q [-G|-M] [ destination... ]
>> qstat -B [-f [-1]] [-W site_specific] [ server_name... ]
>> *** WARNING - Retrying in 60s
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>
Received on Wed Sep 04 2013 - 12:18:31 EDT
