StarCluster - Mailing List Archive

dealing with EBS volumes

From: Manal Helal <no email>
Date: Fri, 8 Jun 2012 04:36:13 +1000

Hi,

I very much appreciate the help I am getting from this mailing list.

I have a few more points of confusion:

1. About EBS: I created an image of my modified instance using the
ec2-create-image command while an EBS volume was attached. I then changed
the AMI in the configuration file to the new AMI ID but kept the EBS volume
mount settings, so I end up with 2 EBS volumes attached to instances
launched from the new image. Is this normal? Should I detach the volume
before creating the image?

2. I am having problems detaching the volume while keeping the instance
running. I can't find a command that does this, and when I used
ec2-detach-volume it caused more problems than it solved.

3. I also thought the EBS volume was shared, in the sense that it is mounted
so that all instances of the same cluster can read from and write to it.
However, when I create a cluster of 2 instances, each one instantiates its
own EBS volume from the starting volume in the configuration file. I am not
sure whether anything can make the volume itself truly shared. All I can
find here:

http://aws.amazon.com/ebs/

is that an EBS volume can be attached to only one instance, and that sharing
is done by taking snapshots. That would be a manual process, or require too
much programming. I need something like a shared scratch volume for an MPI
application.

4. I also searched for how to make an MPI application run across a number of
instances, but couldn't find information about the machine file: whether it
is provided by default in the EC2 configuration, or whether I should build
it manually from the instance IDs or other identifiers. If you could send
me an example file, that would be great.
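
For reference, a machine file is typically just a plain-text list of hostnames, one per line. Below is a minimal sketch, assuming the nodes can be reached by the hostnames `master` and `node001` (hypothetical names used only for illustration) and that the helper `build_machinefile` is my own name, not part of StarCluster or any MPI distribution. The optional ` slots=N` suffix follows the Open MPI hostfile syntax; other MPI implementations may expect a different form.

```python
def build_machinefile(hostnames, slots_per_host=None):
    """Return machine-file text with one host per line.

    If slots_per_host is given, append an Open MPI style ' slots=N'
    suffix to each line; otherwise emit bare hostnames.
    """
    lines = []
    for host in hostnames:
        host = host.strip()
        if not host:
            continue  # skip blank entries
        if slots_per_host is None:
            lines.append(host)
        else:
            lines.append("%s slots=%d" % (host, slots_per_host))
    return "\n".join(lines) + "\n"


# Hypothetical hostnames; replace with the names your nodes actually use.
hosts = ["master", "node001"]

with open("machines", "w") as f:
    f.write(build_machinefile(hosts))
```

The resulting `machines` file could then be passed to the MPI launcher, for example with something like `mpirun -np 2 -machinefile machines ./my_app`, though the exact option name depends on the MPI implementation installed on the image.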


Thanks again for your support,

P.S. I also get this error message, but it doesn't stop me from connecting
via ssh and terminating normally:

$ starcluster start microSFMcluster
StarCluster - (http://web.mit.edu/starcluster) (v. 0.93.3)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster_at_mit.edu

>>> Using default cluster template: microSFMcluster
>>> Validating cluster template settings...
>>> Cluster template settings are valid
>>> Starting cluster...
>>> Launching a 1-node cluster...
>>> Creating security group @sc-microSFMcluster...
Reservation:r-e831c58d
>>> Waiting for cluster to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 1.334 mins
>>> The master node is ec2-67-202-55-20.compute-1.amazonaws.com
>>> Setting up the cluster...
>>> Attaching volume vol-69bd4807 to master node on /dev/sdz ...
>>> Configuring hostnames...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Mounting EBS volume vol-69bd4807 on /home...
>>> Creating cluster user: None (uid: 1002, gid: 1002)
!!! ERROR - command 'groupadd -o -g 1002 ubuntu' failed with status 9 | 0%
!!! ERROR - command 'useradd -o -u 1002 -g 1002 -s `which bash` -m ubuntu' failed with status 6
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s): ubuntu
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Starting NFS server on master
>>> Setting up NFS took 0.074 mins
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for ubuntu
>>> Shutting down threads...
20/20 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
Traceback (most recent call last):
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cli.py", line 255, in main
    sc.execute(args)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/commands/start.py", line 194, in execute
    validate_running=validate_running)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1414, in start
    return self._start(create=create, create_only=create_only)
  File "<string>", line 2, in _start
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/utils.py", line 87, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1437, in _start
    self.setup_cluster()
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1446, in setup_cluster
    self._setup_cluster()
  File "<string>", line 2, in _setup_cluster
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/utils.py", line 87, in wrap_f
    res = func(*arg, **kargs)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/cluster.py", line 1460, in _setup_cluster
    self.cluster_shell, self.volumes)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/clustersetup.py", line 350, in run
    self._setup_passwordless_ssh()
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/clustersetup.py", line 231, in _setup_passwordless_ssh
    auth_conn_key=True)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/node.py", line 411, in generate_key_for_user
    self.ssh.mkdir(ssh_folder)
  File "/Library/Python/2.6/site-packages/StarCluster-0.93.3-py2.6.egg/starcluster/sshutils/__init__.py", line 245, in mkdir
    return self.sftp.mkdir(path, mode)
  File "build/bdist.macosx-10.6-universal/egg/ssh/sftp_client.py", line 303, in mkdir
    self._request(CMD_MKDIR, path, attr)
  File "build/bdist.macosx-10.6-universal/egg/ssh/sftp_client.py", line 635, in _request
    return self._read_response(num)
  File "build/bdist.macosx-10.6-universal/egg/ssh/sftp_client.py", line 682, in _read_response
    self._convert_status(msg)
  File "build/bdist.macosx-10.6-universal/egg/ssh/sftp_client.py", line 708, in _convert_status
    raise IOError(errno.ENOENT, text)
IOError: [Errno 2] No such file

!!! ERROR - Oops! Looks like you've found a bug in StarCluster
!!! ERROR - Crash report written to: /Users/manal/.starcluster/logs/crash-report-2240.txt
!!! ERROR - Please remove any sensitive data from the crash report
!!! ERROR - and submit it to starcluster_at_mit.edu
Received on Thu Jun 07 2012 - 14:36:54 EDT
This archive was generated by hypermail 2.3.0.