StarCluster - Mailing List Archive

100 nodes cluster

From: Paolo Di Tommaso <no email>
Date: Fri, 14 Oct 2011 16:46:18 +0200

Hi All,

I tried to set up a 100-node cluster of quite powerful machines (High-Memory Double Extra Large instances), but it ended in total failure.

The overall configuration process was extremely slow. Five instances were stuck in the pending state for more than 10 minutes, so I had to terminate them manually.

Other machines also returned errors, for example when mounting /home and configuring other SGE components.

I had to stop the initialization phase manually after more than 30 minutes because it seemed to hang.


I'm not blaming StarCluster; it is a really nice piece of software. The problem seems to be the Amazon infrastructure, which has a lot of latency and behaves unreliably.


What is your opinion on this? Is anyone successfully running a "big" cluster with StarCluster?




Thank you,

Paolo Di Tommaso
Software Engineer
Comparative Bioinformatics Group
Centre de Regulacio Genomica (CRG)
Dr. Aiguader, 88
08003 Barcelona, Spain
Received on Fri Oct 14 2011 - 10:46:30 EDT