On Mon, Oct 17, 2011 at 4:48 AM, Luis M. Carril <lmcarril_at_cesga.es> wrote:
> Although I've never tested a deployment so big, I've had a lot of
> problems with 10-20 node deployments. There is always a machine or two
> that hangs while booting or deploying, which is pretty annoying; I can't
> have the cluster deployment completely automated because I have to watch
> it to stop or boot the failing nodes.
> Best regards
> Luis M Carril
> On 14/10/2011 16:46, Paolo Di Tommaso wrote:
>> Hi All,
>> I've tried to set up a cluster of 100 nodes with quite powerful machines (Hi-Mem double extra large configuration), but it ended in total failure.
>> The overall configuration process was extremely slow. Five instances were stuck in the pending state for more than 10 minutes, so I had to terminate them manually.
>> Other machines also returned error codes, for example when mounting /home and with some SGE components.
>> I had to stop the initialization phase manually after more than 30 minutes because it seemed to hang.
>> I'm not blaming StarCluster; it is really a nice piece of software. The problem really seems to be the Amazon infrastructure, which has a lot of latency and unreliable behavior.
>> What is your opinion about that? Is anyone successfully running a "big" cluster using the StarCluster tool?
>> Thank you,
>> Paolo Di Tommaso
>> Software Engineer
>> Comparative Bioinformatics Group
>> Centre de Regulacio Genomica (CRG)
>> Dr. Aiguader, 88
>> 08003 Barcelona, Spain
>> StarCluster mailing list
> Luis M. Carril
> Project Technician
> Galicia Supercomputing Center (CESGA)
> Avda. de Vigo s/n
> 15706 Santiago de Compostela
> Tel: 34-981569810 ext 249
Are you guys running a versioned release or the HEAD on git? I am more
than fairly certain this has been optimized in the repo, iirc a few
Matthew W. Summers
Gentoo Foundation Inc.
Received on Mon Oct 17 2011 - 10:59:00 EDT