StarCluster - Mailing List Archive

Re: Loadbalancing error against ubuntu QIIME AMI

From: Rayson Ho <no email>
Date: Sat, 31 Aug 2013 08:55:36 -0400

Hi James,

The SGE Load Balancer needs the SGE executables to be in $PATH, and it
looks like the QIIME AMI does not have the modifications from
StarCluster -- specifically /etc/profile.d/sge.sh that sets the
environment variables needed by SGE qstat. (The "qstat" that complains
about invalid option is the PBS qstat that happens to be in the
execution $PATH.)
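
A quick way to confirm this on the master node is to check which qstat
comes first in $PATH and whether SGE_ROOT is set. The following is only
a rough sketch: the function name diagnose_qstat is made up for
illustration, and it assumes the usual SGE layout where the binaries
live under $SGE_ROOT/bin/<arch>/.

```python
import os
import shutil


def diagnose_qstat(env=None):
    """Report which qstat would run and whether it belongs to SGE.

    `env` is a mapping like os.environ; it defaults to the current
    environment. The function name is hypothetical, not part of StarCluster.
    """
    env = os.environ if env is None else env
    # First qstat found when searching the given PATH, or None.
    qstat_path = shutil.which("qstat", path=env.get("PATH", ""))
    sge_root = env.get("SGE_ROOT")
    # Assumption: SGE binaries live under $SGE_ROOT/bin/<arch>/, so a qstat
    # outside that tree probably belongs to another scheduler (e.g. PBS).
    under_sge_root = bool(
        qstat_path
        and sge_root
        and os.path.realpath(qstat_path).startswith(
            os.path.realpath(sge_root) + os.sep
        )
    )
    return {
        "qstat": qstat_path,
        "SGE_ROOT": sge_root,
        "qstat_under_sge_root": under_sge_root,
    }


if __name__ == "__main__":
    print(diagnose_qstat())
```

If qstat_under_sge_root comes back False on the master, the load
balancer is most likely picking up a non-SGE qstat.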

I don't have access to the QIIME AMI (I tried to find the
ami-d5cc8fbc AMI but couldn't -- is it available to the
public?), but I believe there are at least 2 ways to fix it:

1) Patch the QIIME AMI - Just look at the execution host of
StarCluster and see how /etc/profile.d/sge.sh is introduced into the
default environment. Then create a new AMI based on the modified
instance (would be easy if it is EBS-based -- it's just a few steps in
the AWS Management Console).

2) Write a StarCluster plugin that fixes this $PATH problem on the
fly, or even add that to the SGEPlugin so that if the environment
settings are not available, inject them during StarCluster-SGE
bootstrap.
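
For option 2, a plugin along these lines might be a starting point. It
is an untested sketch, not the SGEPlugin's actual code: the class name
SGEProfileFix, the helper render_sge_profile, and the default SGE root
and architecture strings are all assumptions that should be checked
against a working StarCluster node. It puts the SGE bin directory first
in $PATH so that SGE's qstat is found before any other qstat.

```python
try:
    from starcluster.clustersetup import ClusterSetup
except ImportError:  # lets the sketch be imported without StarCluster
    ClusterSetup = object

SGE_PROFILE = "/etc/profile.d/sge.sh"


def render_sge_profile(sge_root, arch, cell="default"):
    """Build a profile script that puts SGE's binaries first in $PATH."""
    return (
        'export SGE_ROOT="%s"\n'
        'export SGE_CELL="%s"\n'
        'export PATH="$SGE_ROOT/bin/%s:$PATH"\n'
    ) % (sge_root, cell, arch)


class SGEProfileFix(ClusterSetup):
    """Hypothetical plugin: write sge.sh on nodes that lack it."""

    def __init__(self, sge_root="/opt/sge6", arch="linux-x64"):
        # Both defaults are assumptions; copy the real values from a
        # node where /etc/profile.d/sge.sh already exists.
        self.sge_root = sge_root
        self.arch = arch

    def run(self, nodes, master, user, user_shell, volumes):
        content = render_sge_profile(self.sge_root, self.arch)
        for node in nodes:
            if node.ssh.path_exists(SGE_PROFILE):
                continue  # leave an existing profile untouched
            f = node.ssh.remote_file(SGE_PROFILE, "w")
            f.write(content)
            f.close()
```

The plugin would then be listed in the cluster template's plugins
setting like any other StarCluster plugin.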

Rayson

==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html


On Thu, Aug 29, 2013 at 8:14 AM, james pettengill <fixtgear_at_gmail.com> wrote:
>
> We are trying to run the load balancer when launching a cluster of QIIME AMIs (QIIME is software for analyzing next-gen sequencing data) and are running into some errors.


> >>> Writing stats to file: /home/ubuntu/.starcluster/sge/STAR-ELASTIC/sge-stats.csv
> >>> Loading full job history
> *** WARNING - Failed to retrieve stats (1/5):
> Traceback (most recent call last):
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 536, in get_stats
>     return self._get_stats()
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/balancers/sge/__init__.py", line 507, in _get_stats
>     qstatxml = '\n'.join(master.ssh.execute(qstat_cmd))
>   File "/usr/local/lib/python2.7/dist-packages/StarCluster-0.94-py2.7.egg/starcluster/sshutils/__init__.py", line 555, in execute
>     msg, command, exit_status, out_str)
> RemoteCommandFailed: remote command 'source /etc/profile && qstat -u \* -xml -f -r' failed with status 2:
> qstat: invalid option -- 'm'
> qstat: conflicting options.
> usage:
> qstat [-f [-1]] [-W site_specific] [-x] [ job_identifier... | destination... ]
> qstat [-a|-i|-r|-e] [-u user] [-n [-1]] [-s] [-G|-M] [-R] [job_id... | destination...]
> qstat -Q [-f [-1]] [-W site_specific] [ destination... ]
> qstat -q [-G|-M] [ destination... ]
> qstat -B [-f [-1]] [-W site_specific] [ server_name... ]
> *** WARNING - Retrying in 60s
>
> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Sat Aug 31 2013 - 08:55:39 EDT