Re: dealing with EBS volumes
Regarding EBS images: what is the best approach once you have all your software installed and need a new AMI of that setup? I have an EBS volume attached to my cluster. If I go to the AWS console and select the image copy option, will that work? Or do I have to start a 1-instance cluster and reinstall all the software again?
On 06/08/2012, at 13:46, Justin Riley <jtriley_at_mit.edu> wrote:
> Hi Manal,
> My apologies for the extreme delay. I'm still catching up on
> responding to threads.
>> 1. About EBS: I created the image of my modified instance using
>> the ec2-create-image command while an EBS volume was attached. I
>> then changed the AMI ID in the configuration file to the new one
>> and kept the EBS volume mount, so I end up with two EBS volumes
>> attached to the new instances. Is this normal? Should I create the
>> image after detaching the volume first?
> StarCluster uses Amazon's create-image API when creating new AMIs from
> EBS-backed instances. This call will automatically snapshot any
> attached EBS volumes and include them in the new AMI's "block device
> mapping".
> This means anytime you start a new instance with the new AMI a new EBS
> volume will be created per snapshot in the AMI's block device mapping
> and automatically attached to the instance. To prevent extra volumes
> from being included in the AMI you should detach all external EBS
> volumes before creating the AMI.
> If you specify a list of volumes in your default cluster template
> and then use StarCluster to start the image host, the specified
> volumes will be attached to the image host by default. In this case
> I would recommend either temporarily commenting out your volumes
> list in the default template or creating an alternate template and
> using the '-c' option of the start command to specify it, e.g.:
> [cluster image]
> cluster_size = 1
> keyname = mykey
> node_instance_type = m1.small
> node_image_id = ami-asdflkasdf
> $ starcluster start -s 1 -c image -o image_host
> I will add a note to the docs about this caveat with using the 'start'
> command to launch the image host.
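> For reference, once the image host is up and configured, you can
> create the new AMI with StarCluster's ebsimage command (a rough
> sketch; the instance ID and image name below are placeholders):
>
> $ starcluster listinstances       # note the image host's instance ID
> $ starcluster ebsimage i-xxxxxxxx my-new-image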
>> 2. I am having problems detaching the volume while keeping the
>> instance running; I can't find the commands to do this, and when I
>> used ec2-detach-volume, I caused more problems than I solved.
> You can use ec2-detach-volume or the AWS console to detach volumes
> from the image host. You need to make sure to unmount the volume
> before detaching. After detaching you should then wait for the volume
> to be in the 'available' state before creating the AMI.
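> As a rough sketch of that sequence (the mount point and volume ID
> are placeholders for your own values):
>
> # on the image host, unmount the volume first
> $ umount /mydata
> # detach it, then poll until its status is 'available'
> $ ec2-detach-volume vol-xxxxxxxx
> $ ec2-describe-volumes vol-xxxxxxxx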
>> 3. Also, I thought this EBS volume is shared in the sense that it
>> is mounted so that all instances of the same cluster can read and
>> write to it. However, when I create a cluster of 2 instances, each
>> one instantiates its own EBS volume from the starting volume in
>> the configuration file. I am not sure if there is anything that
>> can make this volume itself truly shared. All I can find is that
>> an EBS volume can be attached to only one instance, and sharing is
>> done by taking snapshots; that would be a manual process, or too
>> much programming. I need something like a scratch volume to be
>> shared for an MPI application.
> The only way volumes can be shared is through a network file share.
> StarCluster uses NFS to share all volumes specified in your volumes
> list in the config across the cluster. In your case you're seeing the
> 'extra' volumes being created and attached as a consequence of having
> external EBS volumes mounted when creating your new AMI. These are not
> handled by StarCluster. Only volumes listed in your config will be
> NFS-shared across the cluster.
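> For reference, a volume wired into the config like this (the volume
> ID and mount path are placeholders) is attached to the master and
> NFS-shared to all nodes:
>
> [volume scratch]
> volume_id = vol-xxxxxxxx
> mount_path = /scratch
>
> # and in your cluster template:
> volumes = scratch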
>> 4. Also, I searched for how to make an MPI application work across
>> a number of instances and couldn't locate information about the
>> machine file: whether it exists by default in the EC2
>> configuration, or whether I should build it manually from the
>> instance IDs or other identifiers. If you can send me an example
>> file, that would be great.
> I would recommend using SGE to submit parallel jobs on the cluster.
> You can easily submit a job that requests N processors on the cluster
> without needing a hostfile:
> $ qsub -b y -pe orte 50 /path/to/your/mpi/executable
> See here for more details (please read that section in full)
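> The same request as a submission script, for reference (a sketch
> assuming the cluster's OpenMPI was built with SGE support so that
> mpirun picks up the allocated slots automatically):
>
> #!/bin/bash
> # request 50 slots via the 'orte' parallel environment
> #$ -pe orte 50
> #$ -cwd
> mpirun /path/to/your/mpi/executable
>
> $ qsub ./myjob.sh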
>> thanks again for your support,
> My pleasure :D
>> P.S. I also get this error message, but it doesn't stop me from
>> using ssh or terminating normally
> What is your cluster_user setting in your config? Also, would you
> mind opening $HOME/.starcluster/logs/debug.log, searching for
> 'Creating cluster user', and sending the surrounding lines? This
> will give us more info on what's happening.
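> Something like the following should pull out the relevant context
> (the log path is the default location mentioned above):
>
> $ grep -n -B 2 -A 5 'Creating cluster user' \
>     $HOME/.starcluster/logs/debug.log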
> These lines indicate that something weird is going on when creating
> the cluster user:
> !!! ERROR - command 'groupadd -o -g 1002 ubuntu' failed with status 9
> !!! ERROR - command 'useradd -o -u 1002 -g 1002 -s `which bash` -m
> ubuntu' failed with status 6
> Do you have cluster_user = ubuntu by chance? I need to look into how
> cluster_user could show up as "None" in the log above...
Received on Fri Sep 21 2012 - 20:13:21 EDT