StarCluster - Mailing List Archive

Re: SGE priorities and job dependency

From: Jacob Barhak <no email>
Date: Sat, 7 Sep 2013 14:59:07 -0500

Thanks Rayson,

This is good to know.

Can such array dependencies be extended to a "one after many" dependency with multiple arrays? Something like:

An element in Array B runs after corresponding elements in Arrays A1 and A2 have completed:

A1[0] -> B[0]
A2[0] -> B[0]

A1[1] -> B[1]
A2[1] -> B[1]
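
One possible way to express this, sketched under assumptions and not tested on a cluster: the Grid Engine qsub man page describes -hold_jid_ad as taking a comma-separated job list, so naming both arrays there should hold each task of B until the matching tasks of A1 and A2 have finished. The script names (a1.sh, a2.sh, b.sh) and the helper array_job_cmd are hypothetical. Task numbering starts at 1 here on the assumption that SGE task IDs must be positive. The snippet only builds and prints the command lines; it submits nothing.

```python
import shlex


def array_job_cmd(name, script, n_tasks, after=()):
    """Build a qsub command line for an array job (hypothetical helper).

    name    -- job name passed to -N
    script  -- job script to run (placeholder file name)
    n_tasks -- number of array tasks, numbered 1..n_tasks
    after   -- names of array jobs whose matching tasks must finish first
    """
    cmd = ["qsub", "-N", name, "-t", f"1-{n_tasks}"]
    if after:
        # -hold_jid_ad: task i waits for task i of every listed array job
        cmd += ["-hold_jid_ad", ",".join(after)]
    cmd.append(script)
    return cmd


N_TASKS = 50000  # task count taken from the thread

commands = [
    array_job_cmd("A1", "a1.sh", N_TASKS),
    array_job_cmd("A2", "a2.sh", N_TASKS),
    # B[i] is held until both A1[i] and A2[i] have completed
    array_job_cmd("B", "b.sh", N_TASKS, after=("A1", "A2")),
]

for cmd in commands:
    print(shlex.join(cmd))
```

All three arrays are given the same task range, since the array dependency is understood to pair tasks by index.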

And if possible, what is the advantage of using an array over using many single jobs? Is this just an organizational issue, or is there a real benefit such as improved performance? Are there restrictions?

This is beyond the scope of the issue that was resolved, yet it makes for an interesting discussion.

       Jacob


Sent from my iPhone

On Sep 4, 2013, at 10:50 AM, Rayson Ho <raysonlogin_at_gmail.com> wrote:

> Jacob,
>
> BTW, do the first 50,000 jobs all run the same program and the only
> difference between them is the input? If so, all the first 50,000
> jobs should be submitted as an array job, and then the next 50,000 jobs
> should be submitted as another array job. This way, you can even
> define inter-array element dependencies such that:
>
> a[0] -> b[0]
> a[1] -> b[1]
> ...
> a[49999] -> b[49999]
>
> Where a & b are array jobs that each have 50000 tasks.
>
> Rayson
>
> ==================================================
> Open Grid Scheduler - The Official Open Source Grid Engine
> http://gridscheduler.sourceforge.net/
> http://gridscheduler.sourceforge.net/GridEngine/GridEngineCloud.html
> ==================================================
>
>
> On Tue, Sep 3, 2013 at 4:13 AM, Jacob Barhak <jacob.barhak_at_gmail.com> wrote:
>> Thanks Adam, for trying to help,
>>
>> The option you pointed to was already used to define the dependency. Yet
>> what I was trying to do is a bit more complicated than that and had to do
>> with the job priority and the dependency.
>>
>> In any case, the issue was resolved with the help of Lyn.
>>
>> Here is a link to the resolution:
>> http://star.mit.edu/cluster/mlarchives/1848.html
>>
>> Nevertheless, thanks for your response.
>>
>> Jacob
>>
>>
>> On Mon, Sep 2, 2013 at 8:45 PM, Adam <adamnkraut_at_gmail.com> wrote:
>>>
>>> Hi Jacob,
>>>
>>> Check out the -hold_jid option to qsub. This should do exactly what you
>>> need.
>>>
>>> Best Regards,
>>> Adam
>>>
>>>
>>> On Sun, Sep 1, 2013 at 3:19 AM, Jacob Barhak <jacob.barhak_at_gmail.com>
>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> Does someone have experience with the SGE scheduler that comes with
>>>> StarCluster? Experienced enough to figure out how to make a dependent job
>>>> launch before other jobs once its dependencies are gone?
>>>>
>>>> I have been trying to give the dependent job a high priority, yet it
>>>> seems the scheduler ignores this and launches the jobs in FIFO order.
>>>>
>>>> Here is a simplified description of my problem. Let's say I have 100k
>>>> jobs. The first 50k are generating files in a shared directory and the last
>>>> 50k are processing those files and deleting them.
>>>>
>>>> Jobs 1-50000 are independent while job 50001 is dependent on job 50000
>>>> and job 50000+n is dependent on job n.
>>>>
>>>> I tried lowering the priority of the first 50k jobs using qsub -p -100. I
>>>> was hoping that once job 1 completed, job 50001 would be launched next,
>>>> since its dependency was satisfied and it has the highest priority. The
>>>> idea is to clean up the files after each job - otherwise too many files
>>>> can accumulate in a shared directory and may slow down the NFS I/O
>>>> significantly - not to mention disk space.
>>>>
>>>> However, I cannot get this behavior from SGE on a single test machine
>>>> outside StarCluster. So I assume this needs some special configuration.
>>>>
>>>> I am trying to avoid the I/O bottleneck I experienced on the cloud due to
>>>> too many files in a shared directory.
>>>>
>>>> Can someone help with this without changing the order of the jobs when
>>>> being launched?
>>>>
>>>> I hope there is a simple one line / qsub option for this.
>>>>
>>>> Jacob
>>>>
>>>>
>>>>
>>>> Sent from my iPhone
>>>> _______________________________________________
>>>> StarCluster mailing list
>>>> StarCluster_at_mit.edu
>>>> http://mailman.mit.edu/mailman/listinfo/starcluster
>>
>>
>>
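
For reference, the setup described in the original question can be sketched as follows. This illustrates the question only, not the resolution linked in the thread. The job names (gen_n, proc_n), the script names (generate.sh, process.sh), and the helper paired_job_cmds are hypothetical. The code uses -p -100 for the generating jobs, as in the original post, and -hold_jid, as Adam suggested, and it only builds qsub command lines without submitting them.

```python
import shlex


def paired_job_cmds(n_jobs, low_priority=-100):
    """Yield qsub command lines for n_jobs generate/process pairs.

    Submission order is kept as in the question: all generating jobs
    first, then all processing jobs (hypothetical helper).
    """
    for n in range(1, n_jobs + 1):
        # Generating job n: lowered priority, as with "qsub -p -100"
        yield ["qsub", "-N", f"gen_{n}", "-p", str(low_priority),
               "generate.sh", str(n)]
    for n in range(1, n_jobs + 1):
        # Processing job n (job 50000+n in the post) waits for job n
        yield ["qsub", "-N", f"proc_{n}", "-hold_jid", f"gen_{n}",
               "process.sh", str(n)]


# Small demonstration with 2 pairs instead of 50000
demo = list(paired_job_cmds(2))
for cmd in demo:
    print(shlex.join(cmd))
```

Whether the scheduler then picks proc_n ahead of the remaining generating jobs depends on the cluster's scheduler configuration, which this sketch does not address.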
Received on Sat Sep 07 2013 - 15:59:39 EDT
This archive was generated by hypermail 2.3.0.
