StarCluster - Mailing List Archive

Re: StarCluster Digest, Vol 49, Issue 1

From: Jacob Barhak <no email>
Date: Sun, 1 Sep 2013 19:52:01 -0500

Thanks Hugh,

Yes, it is a good idea for the simplified problem I posed, yet my actual problem is a bit more complicated. Your solution avoids the issue in the simplified version, but it does not resolve the full problem.

A less simplified description of my application: job 50001 depends on jobs 1-10 and collects the files they generate, but it does not depend on any other jobs; likewise job 50002 depends on jobs 11-20, and so on. So the ideal sequence would be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50001, 11 ... 20, 50002 ...

If the jobs are processed in FIFO order, 50k files are generated and stored before any of them are collected - this may take a toll on the cluster NFS. With the ideal ordering there would be at most 10 files present at any given time, which avoids the NFS I/O bottleneck on the cluster.
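
To make this concrete, the submission I have in mind looks roughly like the sketch below (untested; the script names, and the use of -terse to capture job ids, are placeholders for my real jobs):

    # submit one batch of ten producer jobs at lowered priority
    # and collect their job ids
    JIDS=""
    for i in $(seq 1 10); do
        JID=$(qsub -terse -p -100 produce_$i.sh)
        JIDS="$JIDS,$JID"
    done
    # the collector for this batch is held on all ten producers and left
    # at the default priority of 0, so ideally it runs as soon as they
    # finish, before the scheduler moves on to jobs 11-20
    qsub -hold_jid ${JIDS#,} collect_1.sh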

In my previous email I tried to simplify the problem - apparently I oversimplified it.

I guess the solution has to do with the scheduler configuration. I did try to adjust the scheduler a bit using the graphical interface, yet had no success. If someone knows how to get the scheduling behavior I describe, whether it is possible at all, and how to set it up for StarCluster, please do let me know.

In any case I appreciate the attention and willingness to help from those who responded.

        Jacob

Sent from my iPhone

On Sep 1, 2013, at 5:53 PM, "MacMullan, Hugh" <hughmac_at_wharton.upenn.edu> wrote:

> I would probably just combine jobs 1 & 50001, 2 & 50002, etc. so that each pair runs sequentially in the same script. That way job 1 'holds' a slot open for job 50001, which starts immediately when the job 1 part of the script is done.
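>
> Roughly something like this (untested; the script names are made up):
>
>     #!/bin/bash
>     # pair_1.sh: run job 1's work, then job 50001's work, in the same slot
>     ./generate_1.sh && ./collect_1.sh
>
> and then submit pair_1.sh, pair_2.sh, ... instead of the two separate job lists.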
>
> Possibly the simplest approach?
>
> -Hugh
>
> On Sep 1, 2013, at 2:50 PM, "Jacob Barhak" <jacob.barhak_at_gmail.com> wrote:
>
>> Thanks Lyn,
>>
>> Yes, the -hold_jid option works well. Yet the issue is more complicated: it involves priorities as well as dependencies.
>>
>> Someone who knows the scheduler may have an answer here.
>>
>> I am using -hold_jid to hold job 50001 until job 1 is complete. But I also want job 50001 to start right after job 1 completes, since it has a higher priority than jobs 2-50000.
>> The ideal sequence I am trying to get is:
>> 1, 50001 , 2, 50002, ...
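>>
>> For reference, my submission looks roughly like this (a simplified sketch; the real script names differ):
>>
>>     JID1=$(qsub -terse -p -100 job_1.sh)   # low-priority generator
>>     qsub -hold_jid $JID1 job_50001.sh      # held on job 1, default priority 0
>>     # ... and likewise for the pairs 2/50002, 3/50003, up to 50000/100000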
>>
>> I don't mind if a few more jobs run before 50001 kicks in after 1 is done. Yet currently the jobs are executed in FIFO order, so by the time 50001 is reached there will be 50000 files in the directory. On a cluster this may slow things down significantly. However, if the jobs are processed close to the order I want, there will never be too many files at once.
>>
>> I would rather not change the order of submission if possible.
>>
>> I hope someone knows how to get the desired behavior from SGE.
>>
>> Jacob
>>
>> Sent from my iPhone
>>
>> On Sep 1, 2013, at 12:08 PM, Lyn Gerner <schedulerqueen_at_gmail.com> wrote:
>>
>>> Hey Jacob,
>>>
>>> Have you looked at the -hold_jid option on the qsub man page? You can give it multiple jids or job names (or name patterns).
>>> http://gridscheduler.sourceforge.net/htmlman/manuals.html
>>> http://gridscheduler.sourceforge.net/htmlman/htmlman1/sge_types.html
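>>>
>>> For example (untested; the job ids and script names are made up):
>>>
>>>     # hold on a comma-separated list of job ids
>>>     qsub -hold_jid 101,102,103 collect.sh
>>>     # or hold on every job whose name matches a pattern
>>>     qsub -hold_jid "generate_*" collect.sh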
>>>
>>> This presentation has a couple of examples:
>>> http://www.bioteam.net/wp-content/uploads/2011/03/02-SGE-SimpleWorkflow.pdf
>>>
>>> Best of luck,
>>> Lyn
>>>
>>>
>>>
>>> On Sun, Sep 1, 2013 at 6:26 AM, <starcluster-request_at_mit.edu> wrote:
>>>> Send StarCluster mailing list submissions to
>>>> starcluster_at_mit.edu
>>>>
>>>> To subscribe or unsubscribe via the World Wide Web, visit
>>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>>> or, via email, send a message with subject or body 'help' to
>>>> starcluster-request_at_mit.edu
>>>>
>>>> You can reach the person managing the list at
>>>> starcluster-owner_at_mit.edu
>>>>
>>>> When replying, please edit your Subject line so it is more specific
>>>> than "Re: Contents of StarCluster digest..."
>>>>
>>>> Today's Topics:
>>>>
>>>> 1. SGE priorities and job dependency (Jacob Barhak)
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Jacob Barhak <jacob.barhak_at_gmail.com>
>>>> To: "starcluster_at_mit.edu" <starcluster_at_mit.edu>
>>>> Cc:
>>>> Date: Sun, 1 Sep 2013 02:19:12 -0500
>>>> Subject: [StarCluster] SGE priorities and job dependency
>>>> Hello,
>>>>
>>>> Does someone have experience with the SGE scheduler that comes with StarCluster? Experienced enough to figure out how to make a dependent job launch before other jobs once its dependencies are satisfied?
>>>>
>>>> I have been trying to give the dependent job a high priority, yet it seems the scheduler ignores this and launches the jobs in FIFO order.
>>>>
>>>> Here is a simplified description of my problem. Let's say I have 100k jobs. The first 50k generate files in a shared directory and the last 50k process those files and delete them.
>>>>
>>>> Jobs 1-50000 are independent, while job 50001 depends on job 1 and, in general, job 50000+n depends on job n.
>>>>
>>>> I tried lowering the priority of the first 50k jobs using qsub -p -100. I was hoping that once job 1 completed, job 50001's dependency would be satisfied and, having the highest priority, it would be launched next. The idea is to clean up the files after each job - otherwise too many files accumulate in the shared directory, which can slow down NFS I/O significantly, not to mention the disk space.
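>>>>
>>>> In other words, roughly (an untested sketch; the script names are placeholders and the dependency is expressed with -hold_jid):
>>>>
>>>>     JID=$(qsub -terse -p -100 generate_1.sh)   # one of the first 50k jobs, lowered priority
>>>>     qsub -hold_jid $JID process_1.sh           # job 50001: default priority 0, held on job 1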
>>>>
>>>> However, I cannot get this behavior from SGE on a single test machine outside StarCluster. So I assume this needs some special configuration.
>>>>
>>>> I am trying to avoid the I/O bottleneck I experienced on the cloud due to too many files in a shared directory.
>>>>
>>>> Can someone help with this without changing the order in which the jobs are submitted?
>>>>
>>>> I hope there is a simple one-line qsub option for this.
>>>>
>>>> Jacob
>>>>
>>>>
>>>>
>>>> Sent from my iPhone
>>>>
>>>>
>>>> _______________________________________________
>>>> StarCluster mailing list
>>>> StarCluster_at_mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>> _______________________________________________
>>> StarCluster mailing list
>>> StarCluster_at_mit.edu
>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Sun Sep 01 2013 - 20:52:15 EDT