StarCluster - Mailing List Archive

Re: Terminating a cluster does not remove its security group (UNCLASSIFIED)

From: Oppe, Thomas C ERDC-RDE-ITL-MS Contractor <no email>
Date: Thu, 20 Dec 2012 10:22:06 +0000

Classification: UNCLASSIFIED
Caveats: FOUO

Thank you very much. The paper has many other pointers to additional information on EC2 I/O performance.

I looked at the High I/O instances with SSD storage, but I am not sure if StarCluster allows clusters formed from these types of instances. My smallest size HYCOM job needs 32 8-core nodes (for a 255-process job), so I am not sure that there are even that many High I/O instances (hi1.4xlarge) in a particular region.

I'm looking at using the /mnt local disk of the "master" node and copying input files to each node's /mnt disk for those files that each MPI process needs to read. It's very cumbersome, because I have to log in to each node.
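The per-node copy described above could be scripted from the master node instead of done by hand. The following is only a minimal sketch under stated assumptions: compute nodes use StarCluster's default aliases (node001, node002, ...), passwordless SSH from the master to the nodes is already in place, and the destination directory exists on every node. The function name `build_copy_commands`, the path /mnt/input, and the file name in the example are hypothetical and are not taken from the thread. The function only builds the scp command lines; it does not run them.

```python
import shlex


def build_copy_commands(files, num_nodes, dest_dir="/mnt/input", user="root"):
    """Return one scp command (as an argv list) per compute node.

    Assumption: nodes are reachable under the aliases node001, node002, ...
    and dest_dir (a hypothetical path) already exists on each node.
    """
    commands = []
    for index in range(1, num_nodes + 1):
        node = "node%03d" % index
        target = "%s@%s:%s/" % (user, node, dest_dir)
        commands.append(["scp", "-q"] + list(files) + [target])
    return commands


if __name__ == "__main__":
    # Hypothetical example: a 32-node cluster is the master plus 31 nodes.
    for argv in build_copy_commands(["/mnt/input/forcing.a"], num_nodes=31):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Each argv list could then be passed to `subprocess.check_call` on the master, so that no interactive login to the individual nodes is needed.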

Tom Oppe

From: Paolo Di Tommaso
Sent: Thursday, December 20, 2012 3:57 AM
To: Oppe, Thomas C ERDC-RDE-ITL-MS Contractor
Subject: Re: [StarCluster] Terminating a cluster does not remove its security group (UNCLASSIFIED)

Hi Thomas,

Regarding EBS disk performance, you may be interested in the following post, which reports very detailed EBS benchmarks for different configurations and EC2 instance types.

Also, if you need high I/O performance, I would suggest trying the EC2 High I/O instance types, which provide SSD storage.


On Dec 18, 2012, at 6:32 PM, "Oppe, Thomas C ERDC-RDE-ITL-MS Contractor" wrote:

Classification: UNCLASSIFIED
Caveats: FOUO


Thank you very much. I think the leftover clusters revealed by "starcluster listclusters" must have been the result of incomplete "starcluster terminate <cluster>" commands.

On another issue, do you have a pointer to information about how to create and use striped disks within StarCluster? We have an application that is I/O-intensive and the I/O portion seems to be slowing down the execution time considerably. We have tried the following approaches:

(1) Use standard EBS volumes. This was the slowest for us.
(2) Use provisioned IOPS EBS volumes with IOPS maxed out to 2000. This is somewhat faster than standard EBS volumes but still I/O dominates the computation.
(3) Warm the provisioned IOPS EBS volume by writing to it and filling it up, then erasing all information on it, then run the job on the warmed EBS volume. This was somewhat better.

I think we need to go to striped EBS volumes, but I have many questions. For the striped volumes, do we use standard EBS volumes or provisioned IOPS EBS volumes? How many volumes should we set up for striping? Do you recommend, say, 4-6 volumes of size 100GB or 200GB each? Do we use NFS or XFS as the file system for striped disks? Finally, do we do the striping before starting up a cluster, and if so, how do we declare a striped EBS volume in the StarCluster "config" file? Do we format the disks prior to starting up the cluster or afterwards?

If you have any information on these topics, it would be great to hear it. In my opinion, how to use striped disks should be a part of the StarCluster documentation and perhaps they can add a command to do this automatically for users, with options of the kind of EBS volumes to use, size of each, number of volumes to stripe, NFS or XFS file system (I may be showing my ignorance here), whether to prewarm it or not, and how to access the same striped volume in future clusters.
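For readers with the same striping question, one common approach on Linux is to combine several attached EBS volumes into a software RAID 0 array with mdadm and put a local file system such as XFS on top of it. XFS is a local file system, while NFS is a protocol for sharing an already formatted file system over the network, so the two are not alternatives to each other. The sketch below only builds the command lines. It is not a StarCluster feature, and the device names, the array name /dev/md0, the mount point /data, and the function name `build_stripe_commands` are all assumptions chosen for illustration.

```python
def build_stripe_commands(devices, md_device="/dev/md0", mount_point="/data"):
    """Return the argv lists that create, format, and mount a RAID 0 array.

    Assumption: the EBS volumes are already attached to the instance under
    the given device names, and mount_point already exists.
    """
    if len(devices) < 2:
        raise ValueError("striping needs at least two devices")
    create = [
        "mdadm", "--create", md_device,
        "--level=0",                            # RAID 0 = striping, no redundancy
        "--raid-devices=%d" % len(devices),
    ] + list(devices)
    make_fs = ["mkfs.xfs", md_device]           # local file system on the array
    mount = ["mount", md_device, mount_point]
    return [create, make_fs, mount]


if __name__ == "__main__":
    # Hypothetical device names for four attached volumes.
    volumes = ["/dev/xvdf", "/dev/xvdg", "/dev/xvdh", "/dev/xvdi"]
    for argv in build_stripe_commands(volumes):
        print(" ".join(argv))
```

The commands would have to be run as root on the node that holds the volumes. RAID 0 has no redundancy, so losing any one volume loses the whole array.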

Thank you for your response. It is appreciated.

Tom Oppe

-----Original Message-----
From: Dustin Machi
Sent: Tuesday, December 18, 2012 10:39 AM
To: Oppe, Thomas C ERDC-RDE-ITL-MS Contractor
Subject: Re: [StarCluster] Terminating a cluster does not remove its security group (UNCLASSIFIED)

If the terminate command fails, there will often be things left behind. This often happens if you manipulate the security groups outside of StarCluster, which then fails to delete the security group. For example, if you start a cluster and then manually add its security group to another group's rules, it can't be deleted because it is referenced by that other group. This is just one example; there are several places where terminate could fail. These sometimes require cleanup before the cluster name can be reused.
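The "referenced by another group" case can be checked before cleaning up by hand. The sketch below is only an illustration under assumptions. It works on plain dictionaries laid out like the entries in the "SecurityGroups" list returned by boto3's `describe_security_groups`, which is not the library version StarCluster used in 2012. It looks only at ingress rules ("IpPermissions"). The function name `groups_referencing`, the group IDs, and the group names in the example are hypothetical; the "@sc-" prefix follows StarCluster's naming of cluster security groups.

```python
def groups_referencing(target_group_id, security_groups):
    """Return names of groups whose ingress rules reference target_group_id."""
    blockers = []
    for group in security_groups:
        if group["GroupId"] == target_group_id:
            continue  # rules in the target group itself are skipped here
        for permission in group.get("IpPermissions", []):
            pairs = permission.get("UserIdGroupPairs", [])
            if any(pair.get("GroupId") == target_group_id for pair in pairs):
                blockers.append(group["GroupName"])
                break  # one matching rule is enough to list this group
    return blockers


if __name__ == "__main__":
    # Hypothetical data: a leftover cluster group referenced by another group.
    groups = [
        {"GroupId": "sg-111", "GroupName": "@sc-mycluster", "IpPermissions": []},
        {"GroupId": "sg-222", "GroupName": "database",
         "IpPermissions": [{"UserIdGroupPairs": [{"GroupId": "sg-111"}]}]},
    ]
    print(groups_referencing("sg-111", groups))
```

Any group this returns would need the referencing rule removed before the leftover cluster group can be deleted.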


On 7 Dec 2012, at 2:15, Oppe, Thomas C ERDC-RDE-ITL-MS Contractor wrote:

Classification: UNCLASSIFIED
Caveats: FOUO

Dear Sir:

I have noticed that when I try to start up a cluster with the same
name that was used before, the "starcluster start" command errors out
with some message about there not being a "master" node to run plugins
or something to that effect. So I choose a new name for the cluster
and all works fine with the new name. I think the problem is that
"starcluster terminate" doesn't completely remove all remnants of the
cluster. I did "starcluster listclusters" and saw all these defunct
clusters that were terminated a long time ago. The solution was to
delete the "Security Group" associated with each terminated cluster.
Once the security group is deleted, you can use that cluster's name
again in future "starcluster start" commands. At least I think so.

Tom Oppe

Classification: UNCLASSIFIED
Caveats: FOUO

Received on Thu Dec 20 2012 - 05:22:11 EST
This archive was generated by hypermail 2.3.0.