StarCluster - Mailing List Archive

Re: 100 nodes cluster

From: Paolo Di Tommaso <no email>
Date: Fri, 28 Oct 2011 12:57:18 +0200

Dear All,

I'm still struggling with the problem of large clusters taking a very long time to launch.

I think some improvement is possible through better multithreading, but I'm not a Python guru, so I cannot comment on that in detail.
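
To get a feel for how much thread handling matters, here is a minimal sketch that follows the fake-ssh idea quoted below but does not touch SSH or StarCluster at all. It replaces each remote command with a sleep and runs the same batch through thread pools of different sizes. Everything in it (fake_remote_command, timed_run, the 0.05 s latency) is hypothetical and only for illustration; it uses concurrent.futures from the Python 3 standard library, not StarCluster's own worker pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_remote_command(node, latency=0.05):
    """Stand-in for a slow remote call: just sleeps, then returns the node name."""
    time.sleep(latency)
    return node


def timed_run(nodes, workers, latency=0.05):
    """Run the fake command on every node with `workers` threads.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda n: fake_remote_command(n, latency), nodes))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    nodes = ["node%03d" % i for i in range(1, 21)]
    for workers in (1, 5, 20):
        _, elapsed = timed_run(nodes, workers)
        print("workers=%2d elapsed=%.2fs" % (workers, elapsed))
```

If the elapsed time does not drop as the number of workers grows, the latency is being serialized somewhere, which is the kind of behaviour worth looking for in the real launch code.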

Anyway, I'm looking for a more "radical" approach. My idea is to launch a 2-node cluster, save the master and slave nodes as two separate AMIs, and use these to deploy a cluster of any size without having to install and configure everything from scratch (NFS, SGE, passwordless access, etc.), modifying only what has changed.


So my question is: what is the "delta" in the configuration files between two cluster instances of X and Y nodes?

Knowing this, it should be quite easy to write a StarCluster plugin that applies only these changes, achieving a much faster launch time.
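
To make the question concrete, here is a rough sketch of the host-level part of such a delta. It assumes nodes are named master, node001, node002, ... and that the difference between a cluster of X nodes and one of Y nodes is, to a first approximation, the set of hosts to add or remove. The helper names node_aliases and cluster_delta are hypothetical and not part of StarCluster; which files actually have to be touched for each host (NFS, SGE, SSH) is exactly the open question.

```python
def node_aliases(size):
    """Return the assumed host aliases for a cluster of `size` nodes."""
    if size < 1:
        raise ValueError("a cluster needs at least one node")
    # Assumed naming scheme: one master plus node001, node002, ...
    return ["master"] + ["node%03d" % i for i in range(1, size)]


def cluster_delta(old_size, new_size):
    """Return (added, removed) host aliases when resizing a cluster."""
    old = set(node_aliases(old_size))
    new = set(node_aliases(new_size))
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    added, removed = cluster_delta(2, 5)
    print("hosts to add:   ", added)
    print("hosts to remove:", removed)
```

A plugin along these lines would then only need to apply the per-host configuration for the hosts in the "added" list, instead of reconfiguring the whole cluster.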


Thank you,

Paolo Di Tommaso
Software Engineer
Comparative Bioinformatics Group
Centre de Regulacio Genomica (CRG)
Dr. Aiguader, 88
08003 Barcelona, Spain




On Oct 20, 2011, at 9:48 PM, Rayson Ho wrote:

> ----- Original Message -----
>> However, if one can wrap around the real ssh with a fake ssh script
>> that sleeps 30 seconds and then runs the real ssh, then we can see
>> how good (or bad) the Workerpool handles long latency commands - and
>> we will start from there to optimize the launch performance.
>
> Replying to myself - after quickly reading the code...
>
> StarCluster uses Paramiko instead of executing ssh, so wrapping the real ssh in a long-latency script won't work.
>
> And there are quite a lot of discussions about issues with multithreaded programs that call Paramiko -- just google: Paramiko+multithreading
>
>
> Rayson
>
> =================================
> Grid Engine / Open Grid Scheduler
> http://gridscheduler.sourceforge.net
Received on Fri Oct 28 2011 - 06:58:33 EDT