StarCluster - Mailing List Archive

Re: StarCluster Digest, Vol 49, Issue 1

From: Lyn Gerner <no email>
Date: Sun, 1 Sep 2013 15:35:15 -1000

Jacob, Hugh's approach sounds good to me, but I believe you want to do
something like this:

qconf -ssconf > $file
vi $file              # change weight_priority to a non-zero value
qconf -Msconf $file   # feed the edited file back into the config

See the man pages sched_conf(5)
<http://gridscheduler.sourceforge.net/htmlman/htmlman5/sched_conf.html?pathrev=V62u5_TAG>
and sge_priority(5)
<http://gridscheduler.sourceforge.net/htmlman/htmlman5/sge_priority.html?pathrev=V62u5_TAG>
for reference.
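
If you would rather script that change than edit interactively, something
like this should work (a sketch only; the sed pattern and the 1.000000
value are assumptions, so check your own qconf -ssconf output first):

TMP=$(mktemp)           # scratch copy of the scheduler configuration
qconf -ssconf > "$TMP"  # dump the current scheduler configuration
# raise weight_priority so that qsub -p values influence dispatch order
sed -i 's/^weight_priority[[:space:]].*/weight_priority 1.000000/' "$TMP"
qconf -Msconf "$TMP"    # load the modified configuration back
rm -f "$TMP"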


On Sun, Sep 1, 2013 at 8:46 AM, Jacob Barhak <jacob.barhak_at_gmail.com> wrote:

> Thanks Lyn,
>
> Yes. The -hold_jid option works well. Yet the issue is more complicated
> and has to do with priorities along with dependencies.
>
> Someone who knows the scheduler may have an answer here.
>
> I am using the -hold_jid option to hold job 50001 until job 1 is
> complete. Yet I want job 50001 to start soon after job 1 completes,
> since it has a higher priority than jobs 2-50000.
> The ideal sequence I am trying to get is:
> 1, 50001, 2, 50002, ...
>
> I don't mind if a few more jobs run before 50001 kicks in after job 1
> is done. Yet currently the jobs are executed in FIFO order, so by the
> time 50001 is reached there will be 50000 files in the directory. In a
> cluster this may slow things down significantly. However, if the files
> are processed close to the way I want, there will not be too many files.
>
> I would rather not change the order of submission if possible.
>
> I hope someone knows how to get the desired behavior from SGE.
>
> Jacob
>
> Sent from my iPhone
>
> On Sep 1, 2013, at 12:08 PM, Lyn Gerner <schedulerqueen_at_gmail.com> wrote:
>
> Hey Jacob,
>
> Have you looked at the -hold_jid option in the qsub man page? You can
> give it multiple jids or job names (or regex patterns).
> http://gridscheduler.sourceforge.net/htmlman/manuals.html
> http://gridscheduler.sourceforge.net/htmlman/htmlman1/sge_types.html?pathrev=V62u5_TAG
>
> This preso has a couple of examples:
> http://www.bioteam.net/wp-content/uploads/2011/03/02-SGE-SimpleWorkflow.pdf
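>
> As a quick illustration of -hold_jid (job and script names here are
> hypothetical):
>
> qsub -N gen_1 gen.sh                     # producer job
> qsub -N proc_1 -hold_jid gen_1 proc.sh   # held until gen_1 completes
> qsub -hold_jid "gen_*" cleanup.sh        # waits on every job named gen_*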
>
> Best of luck,
> Lyn
>
>
>
> On Sun, Sep 1, 2013 at 6:26 AM, <starcluster-request_at_mit.edu> wrote:
>
>> Send StarCluster mailing list submissions to
>> starcluster_at_mit.edu
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>> or, via email, send a message with subject or body 'help' to
>> starcluster-request_at_mit.edu
>>
>> You can reach the person managing the list at
>> starcluster-owner_at_mit.edu
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of StarCluster digest..."
>>
>> Today's Topics:
>>
>> 1. SGE priorities and job dependency (Jacob Barhak)
>>
>>
>> ---------- Forwarded message ----------
>> From: Jacob Barhak <jacob.barhak_at_gmail.com>
>> To: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
>> Cc:
>> Date: Sun, 1 Sep 2013 02:19:12 -0500
>> Subject: [StarCluster] SGE priorities and job dependency
>> Hello,
>>
>> Does someone have experience with the SGE scheduler that comes with
>> StarCluster? Experienced enough to figure out how to make a dependent
>> job launch before other jobs once its dependencies are satisfied?
>>
>> I have been trying to give the dependent job a high priority, yet it
>> seems the scheduler ignores this and launches the jobs in FIFO order.
>>
>> Here is a simplified description of my problem. Let's say I have 100k
>> jobs. The first 50k generate files in a shared directory and the last
>> 50k process those files and delete them.
>>
>> Jobs 1-50000 are independent, while job 50001 is dependent on job 1
>> and, in general, job 50000+n is dependent on job n.
>>
>> I tried lowering the priority of the first 50k jobs using qsub -p -100.
>> I was hoping that once job 1 completed, job 50001's dependency would be
>> satisfied and, having the highest priority, it would be launched next.
>> The idea is to perform cleanup of the files after each job - otherwise
>> too many files accumulate in the shared directory, which may slow down
>> NFS I/O significantly - not to mention disk space.
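>>
>> In case it helps, my submission loop looks roughly like this (the
>> script names are placeholders for the actual generate and process
>> steps):
>>
>> for n in $(seq 1 50000); do
>>     qsub -N gen_$n  -p -100 gen.sh $n            # low-priority producer
>>     qsub -N proc_$n -hold_jid gen_$n proc.sh $n  # held until gen_$n ends
>> done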
>>
>> However, I cannot get this behavior from SGE on a single test machine
>> outside StarCluster. So I assume this needs some special configuration.
>>
>> I am trying to avoid the I/O bottleneck I experienced on the cloud due to
>> too many files in a shared directory.
>>
>> Can someone help with this without changing the order in which the jobs
>> are submitted?
>>
>> I hope there is a simple one-line qsub option for this.
>>
>> Jacob
>>
>>
>>
>> Sent from my iPhone
>>
>>
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>
>>
> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
>
Received on Sun Sep 01 2013 - 21:35:18 EDT