StarCluster - Mailing List Archive

Re: Starcluster SGE usage

From: John St. John <no email>
Date: Wed, 17 Oct 2012 14:23:12 -0700

Hi Gavin,
Thanks for pointing me in the right direction. I found a great solution, though, that seems to work really well. Since "slots" is already set equal to the core count on each node, I just needed a parallel environment that lets me submit jobs to nodes but request a certain number of slots on a single node, rather than spread out across N nodes. Changing the allocation rule to "$fill_up" would probably still overflow onto multiple nodes in the edge case. The proper way to do this is with the $pe_slots allocation rule in the parallel environment config file. Here is what I did:

qconf -sp by_node (create this with qconf -ap [name])

pe_name            by_node
slots              9999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE


Then I modify the parallel environment list in all.q:
qconf -mq all.q
pe_list make orte by_node
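
With that in place, a job should be able to reserve, say, 4 slots on a single node with something like "qsub -pe by_node 4 myjob.sh" (myjob.sh standing in for whatever actually gets submitted). $pe_slots keeps all of the requested slots on one host, and the oversized "slots" value on the PE just means the PE itself is never the limiting factor; the per-host slot counts in all.q still cap each node.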

That does it! Wahoo!

Ok now the problem is that I want this done automatically whenever a cluster is booted up, and if a node is added I want to make sure these configurations aren't clobbered. Any suggestions on making that happen?
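
For example, I'm picturing something along the lines of the plugin sketch below (completely untested, and the module/class names are just placeholders I made up), using the run and on_add_node hooks so the PE gets re-created idempotently both when the cluster comes up and whenever a node is added:

# by_node_pe.py -- untested sketch of a StarCluster plugin; names are placeholders
from starcluster.clustersetup import ClusterSetup
from starcluster.logger import log

PE_NAME = 'by_node'
PE_DEF = """pe_name            %s
slots              9999999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $pe_slots
control_slaves     TRUE
job_is_first_task  TRUE
urgency_slots      min
accounting_summary FALSE
""" % PE_NAME


class ByNodePE(ClusterSetup):
    def _setup_pe(self, master):
        # Drop the PE definition into a temp file on the master.
        master.ssh.execute("cat > /tmp/%s.pe <<'EOF'\n%sEOF" % (PE_NAME, PE_DEF))
        # Register the PE only if it doesn't already exist (keeps this idempotent).
        master.ssh.execute(
            "qconf -spl | grep -qx %s || qconf -Ap /tmp/%s.pe" % (PE_NAME, PE_NAME))
        # Append the PE to all.q's pe_list if it isn't already there.
        master.ssh.execute(
            "qconf -sq all.q | grep -q %s || "
            "qconf -aattr queue pe_list %s all.q" % (PE_NAME, PE_NAME))

    def run(self, nodes, master, user, user_shell, volumes):
        log.info("Ensuring the %s parallel environment exists" % PE_NAME)
        self._setup_pe(master)

    def on_add_node(self, node, nodes, master, user, user_shell, volumes):
        # Re-apply in case adding the node rewrote any of the queue config.
        self._setup_pe(master)

My thinking is I'd drop that file in ~/.starcluster/plugins/, add a section like

[plugin by_node_pe]
setup_class = by_node_pe.ByNodePE

to the config, and list by_node_pe in the cluster template's plugins setting. Does that seem like the right approach, or is there a better hook for this?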

Thanks everyone for your time!

Best,
John


On Oct 17, 2012, at 8:16 AM, Gavin W. Burris <bug_at_sas.upenn.edu> wrote:

> Hi John,
>
> The default configuration will distribute jobs based on load, meaning
> new jobs land on the least loaded node. If you want to fill nodes, you
> can change the load formula on the scheduler config:
> # qconf -msconf
> load_formula slots
>
> If you are using a parallel environment, the default can be changed to
> fill a node, as well:
> # qconf -mp orte
> allocation_rule $fill_up
>
> You may want to consider making memory consumable to prevent
> over-subscription. An easy option may be to make an arbitrary
> consumable complex resource, say john_jobs, and set it to the max number
> you want running at one time:
> # qconf -mc
> john_jobs jj INT <= YES YES 0 0
> # qconf -me global
> complex_values john_jobs=10
>
> Then, when you submit a job, specify the resource:
> $ qsub -l jj=1 ajob.sh
>
> Each job submitted in this way will consume one count of john_jobs,
> effectively limiting you to ten.
>
> Cheers.
>
>
> On 10/16/2012 06:32 PM, John St. John wrote:
>> Thanks Jesse!
>>
>> This does seem to work. I don't need to define -pe in this case because
>> the slots are actually limited per node.
>>
>> My only problem with this solution is that all jobs are now limited to
>> this hard-coded number of slots, and also when nodes are added to the
>> cluster while it is running the all.q configuration is modified and the
>> line would need to be edited again. On other systems I have seen the
>> ability to specify that a job will use a specific number of CPUs without
>> being in a special parallel environment; for example, I have seen the
>> "-l ncpus=X" option working, but it doesn't seem to work with the default
>> StarCluster setup. Also, it looks like the "orte" parallel environment has
>> some stuff very specific to MPI, and it has no problem splitting the
>> requested number of slots across multiple nodes, which I definitely don't
>> want. I just want to limit the number of jobs per node, but be able to
>> specify that at runtime.
>>
>> It looks like the grid engine is somehow aware of the number of CPUs
>> available on each node. I get this by running `qhost`:
>> HOSTNAME  ARCH       NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
>> ----------------------------------------------------------------
>> global    -             -     -       -       -       -       -
>> master    linux-x64     8  0.88   67.1G    1.5G     0.0     0.0
>> node001   linux-x64     8  0.36   67.1G  917.3M     0.0     0.0
>> node002   linux-x64     8  0.04   67.1G  920.4M     0.0     0.0
>> node003   linux-x64     8  0.04   67.1G  887.3M     0.0     0.0
>> node004   linux-x64     8  0.06   67.1G  911.4M     0.0     0.0
>>
>>
>> So it seems like there should be a way to tell qsub that job X is using
>> some subset of the available CPUs or RAM, so that it doesn't
>> oversubscribe the node.
>>
>> Thanks for your time!
>>
>> Best,
>> John
>>
>>
>>
>>
>>
>> On Oct 16, 2012, at 2:12 PM, Jesse Lu <jesselu_at_stanford.edu> wrote:
>>
>>> You can modify the all.q queue to assign a fixed number of slots to
>>> each node.
>>>
>>> * If I remember correctly, "$ qconf -mq all.q" will bring up the
>>> configuration of the all.q queue in an editor.
>>> * Under the "slots" attribute there should be a semi-lengthy string
>>> such as "[node001=16],[node002=16],..."
>>> * Try replacing the entire string with a single number such as "2".
>>> This should give each host only two slots.
>>> * Save the configuration and try a simple submission with the 'orte'
>>> parallel environment and let me know if it works.
>>>
>>> Jesse
>>>
>>> On Tue, Oct 16, 2012 at 1:37 PM, John St. John
>>> <johnthesaintjohn_at_gmail.com> wrote:
>>>
>>> Hello,
>>> I am having issues telling qsub to limit the number of jobs run at
>>> any one time on each node of the cluster. There are sometimes ways
>>> to do this with things like "qsub -l node=1:ppn=1" or "qsub -l
>>> procs=2" or something. I even tried "qsub -l slots=2", but that
>>> gave me an error and told me to use a parallel environment. When
>>> I tried to use the "orte" parallel environment with "-pe orte 2" I
>>> see "slots=2" in my qstat list, but everything gets executed on
>>> one node at the same parallelization as before. How do I limit the
>>> number of jobs per node? I am running a process that consumes a
>>> very large amount of RAM.
>>>
>>> Thanks,
>>> John
>>>
>>
>
> --
> Gavin W. Burris
> Senior Systems Programmer
> Information Security and Unix Systems
> School of Arts and Sciences
> University of Pennsylvania