StarCluster - Mailing List Archive

Re: CG1 plus StarCluster Questions

From: Ron Chen <no email>
Date: Fri, 11 May 2012 20:37:19 -0700 (PDT)

The funny thing is that once qmon is up, the latency is not bad. You can use qmon as if it were running on your local workstation.

The main slowness is in the initial startup phase of qmon. It could be caused by the Motif library or other X libraries, but we don't know which one.

 -Ron



----- Original Message -----
From: Rayson Ho <raysonlogin_at_gmail.com>
To: Scott Le Grand <varelse2005_at_gmail.com>
Cc: "starcluster_at_mit.edu" <starcluster_at_mit.edu>; Rayson Ho <raysonlogin_at_yahoo.com>
Sent: Friday, May 11, 2012 11:22 PM
Subject: Re: [StarCluster] CG1 plus StarCluster Questions

That's a known issue, and we would like to understand why it takes so long.

If you leave it there for around 3-5 minutes, qmon will show up. On a
LAN connection it is not painful, but on a high-latency network,
starting qmon takes forever :-(

Rayson

================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/


On Fri, May 11, 2012 at 11:18 PM, Scott Le Grand <varelse2005_at_gmail.com> wrote:
> StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
>
> If I run starcluster sshmaster -X mycluster and type qmon, the splash screen
> shows up but it doesn't seem to progress from there.  How long should it take
> to get past that?
>
> Scott
>
>
>
> On Fri, May 11, 2012 at 8:15 PM, Rayson Ho <raysonlogin_at_gmail.com> wrote:
>>
>> If you have a recent enough version of StarCluster, then you should be
>> able to run qmon without any special settings that forward X in SSH.
>>
>> This was added in: https://github.com/jtriley/StarCluster/issues/81
>>
>> Rayson
>>
>>
>>
>>
>> On Fri, May 11, 2012 at 10:58 PM, Scott Le Grand <varelse2005_at_gmail.com> wrote:
>> > This is a stupid question but...
>> >
>> > Given I access a starcluster cluster indirectly, how do I run an X
>> > application such that it displays on my remote system?
>> >
>> > I would normally type ssh -X ec2-user_at_amazoninstance.com qmon in order to
>> > fire up qmon, yes?
>> >
>> > How do I do the equivalent here?
>> >
>> > On Fri, May 11, 2012 at 2:45 PM, Rayson Ho <raysonlogin_at_yahoo.com> wrote:
>> >>
>> >> Hi Scott,
>> >>
>> >> You can set up a consumable resource to track usage of GPUs:
>> >>
>> >> http://gridscheduler.sourceforge.net/howto/consumable.html
>> >>
>> >> And we also have a load sensor that monitors the GPU devices:
>> >>
>> >>
>> >>
>> >> https://gridscheduler.svn.sourceforge.net/svnroot/gridscheduler/trunk/source/dist/gpu/gpu_sensor.c
>> >>
>> >> If you want to use the second (i.e. dynamic) method, you will need to
>> >> set it up by following this HOWTO:
>> >>
>> >> http://gridscheduler.sourceforge.net/howto/loadsensor.html
>> >>
>> >> The first method of using a consumable resource works best if you don't
>> >> run GPU programs outside of Open Grid Scheduler/Grid Engine.
>> >>
>> >> Also note that in the next release of StarCluster, GPU support will be
>> >> enhanced.
>> >>
>> >> Rayson
>> >>
>> >>
>> >>
>> >> ________________________________
>> >> From: Scott Le Grand <varelse2005_at_gmail.com>
>> >> To: starcluster_at_mit.edu
>> >> Sent: Friday, May 11, 2012 5:25 PM
>> >> Subject: [StarCluster] CG1 plus StarCluster Questions
>> >>
>> >> Hey guys, I'm really impressed with StarCluster and I've used it to create
>> >> clusters ranging from 2 to 70 instances...
>> >>
>> >> I've also customized it to use CUDA 4.2 and 295.41, the latest toolkit and
>> >> driver, because my code has GTX 680 support and I don't want to have to
>> >> comment it out just to build it (and 4.1 had a horrendous perf regression).
>> >>
>> >> Anyway, 2 questions, one of which I think you already answered:
>> >>
>> >> 1. I'd like to set up a custom AMI that by default has 2 GPUs configured as
>> >> a consumable resource.  I already have code in my app that uses exclusive
>> >> mode and chooses whichever GPU isn't in use, but that all falls down because
>> >> the queueing system is based on CPU cores rather than GPU count.  How would
>> >> I set this up once so I can save the customized AMI and never have to do it
>> >> again?
>> >>
>> >> 2. I'm also seeing the .ssh directories disappear on restart.  But I'll look
>> >> at your solution as I've just been restarting the whole cluster up to now.
>> >>
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> StarCluster mailing list
>> >> StarCluster_at_mit.edu
>> >> http://mailman.mit.edu/mailman/listinfo/starcluster
>> >>
>> >>
>>
>>
>>
>> --
>> ==================================================
>> Open Grid Scheduler - The Official Open Source Grid Engine
>> http://gridscheduler.sourceforge.net/
>
>



Received on Fri May 11 2012 - 23:37:21 EDT
This archive was generated by hypermail 2.3.0.
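
The consumable-resource approach mentioned in the thread above can be pictured with a small model. The following is only a sketch of the bookkeeping idea, not Grid Engine code: each host advertises a fixed number of GPUs, each job debits that pool when it starts and credits it back when it ends, and a job that cannot be satisfied stays queued. The class and function names (Host, schedule, release) and the host names are made up for illustration. In a real cluster the job would ask for the resource through the scheduler's resource request option (qsub -l), using whatever name the consumable was given.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """One execution host with a fixed pool of a consumable resource."""
    name: str
    gpus_total: int        # assumption: 2 GPUs per node, as in the question
    gpus_in_use: int = 0

    @property
    def gpus_free(self) -> int:
        return self.gpus_total - self.gpus_in_use


def schedule(hosts, gpus_requested):
    """Place a job on the first host with enough free GPUs.

    Returns the chosen host name, or None if the job has to stay queued.
    """
    for host in hosts:
        if host.gpus_free >= gpus_requested:
            host.gpus_in_use += gpus_requested  # debit the consumable
            return host.name
    return None


def release(hosts, host_name, gpus_requested):
    """Credit the GPUs back to the host when a job finishes."""
    for host in hosts:
        if host.name == host_name:
            host.gpus_in_use -= gpus_requested
            return
    raise KeyError(host_name)


if __name__ == "__main__":
    # Hypothetical two-node cluster; five single-GPU jobs are submitted.
    cluster = [Host("node001", 2), Host("node002", 2)]
    print([schedule(cluster, 1) for _ in range(5)])
```

With two hosts of two GPUs each, the fifth single-GPU job gets no placement until an earlier job is released. This is also why the approach works best when nothing uses the GPUs outside the scheduler: the count is pure bookkeeping and is never checked against the hardware.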

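
For the second (dynamic) method in the thread above, the load sensor is a long-running program that the execution daemon talks to over stdin/stdout: it waits for a line of input, exits if that line is "quit", and otherwise prints a report made of "begin", one "host:resource:value" line per value, and "end". The sketch below follows that exchange in Python rather than C. The resource name gpu_free and the count_free_gpus probe are placeholders assumed for illustration; the probe returns a constant instead of querying real devices.

```python
import socket
import sys


def count_free_gpus():
    """Hypothetical probe: number of idle GPUs on this host.

    Placeholder only. A real sensor would query the GPU driver here.
    """
    return 2


def sensor_reports(lines, hostname, probe=count_free_gpus, resource="gpu_free"):
    """Yield one load report per input line until 'quit' is received.

    `lines` is any iterable of text lines (sys.stdin when run for real).
    `resource` is an assumed name; it has to match the complex that is
    defined on the scheduler side.
    """
    for line in lines:
        if line.strip() == "quit":
            return
        yield f"begin\n{hostname}:{resource}:{probe()}\nend\n"


def main():
    """Run the sensor loop against the real stdin/stdout."""
    for report in sensor_reports(sys.stdin, socket.gethostname()):
        sys.stdout.write(report)
        sys.stdout.flush()  # the daemon reads each report as it arrives


# To deploy this as a load sensor script, call main() here.
```

Keeping the loop in a generator that takes any iterable of lines makes it easy to check the output format without a running cluster, for example by passing a list such as ["\n", "quit\n"] and inspecting the single report it yields.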