StarCluster - Mailing List Archive

Re: StarCluster Development VPC-Starclusters - possible bug relating to "Tag Value exceeds 255 characters"....

From: Justin Riley <no email>
Date: Thu, 12 Dec 2013 12:09:07 -0500

Hi Jennifer,

Sorry you're having issues and thanks for reporting. I've created an
issue on github to track this:

https://github.com/jtriley/StarCluster/issues/348

Would you mind commenting on that issue with a copy of your config so
that I can take a look? Please remove all sensitive parts of your config
first.

Thanks!!

~Justin

On Tue, Dec 10, 2013 at 11:23:19AM -0500, Jennifer Staab wrote:
> I have had limited success getting Starcluster to successfully launch a
> cluster with EC2-VPC nodes under the development version (0.9999). Using a
> certain AMI I can easily launch a Starcluster cluster with EC2-VPC nodes,
> but using a different AMI it fails to launch.  I do set the config
> variables "VPC_ID" and "SUBNET_ID" and the only difference between the two
> cluster templates is the AMI that is used.
> Both AMIs used successfully launch a Starcluster cluster with EC2-classic
> nodes.  The only noted difference between the AMIs is that the one that
> successfully launches a Starcluster cluster with VPC-EC2 nodes is a
> private AMI that is "shared" with the account that I am running my VPC
> within.  The AMI that doesn't work with Starcluster-VPC is a private AMI
> "owned" by the account I am running my VPC within.
> I believe the error I am getting has something to do with the Tags,
> specifically the "@sc-core" tag's value being beyond 255 characters, but I
> could be wrong.  Below I have included an example of the successful
> launch, the failed launch (including error message), and the listed
> clusters after both commands.
> Any suggestions on how to address this issue would be greatly appreciated.
> Thanks in advance for the help,
> -Jennifer
> -------------------------------------------------------------------------------------------------
> ------ Below is what it looks like when I have a successful launch ---
> -------------------------------------------------------------------------------------------------
> (starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcA vpcA
> StarCluster - ([1]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [2]starcluster_at_mit.edu
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 1-node cluster...
> >>> Creating security group @sc-vpcA...
> Reservation:r-2843fa4e
> >>> Waiting for cluster to come up... (updating every 30s)
> >>> Waiting for all nodes to be in a 'running' state...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Waiting for SSH to come up on all nodes...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Waiting for cluster to come up took 1.574 mins
> >>> The master node is
> >>> Configuring cluster...
> >>> Running plugin starcluster.clustersetup.DefaultClusterSetup
> >>> Configuring hostnames...
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Creating cluster user: sgeadmin (uid: 1007, gid: 1000)
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Configuring scratch space for user(s): sgeadmin
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Configuring /etc/hosts on each node
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Starting NFS server on master
> >>> Setting up NFS took 0.113 mins
> >>> Configuring passwordless ssh for root
> >>> Configuring passwordless ssh for sgeadmin
> >>> Running plugin starcluster.plugins.sge.SGEPlugin
> >>> Configuring SGE...
> >>> Setting up NFS took 0.000 mins
> >>> Removing previous SGE installation...
> >>> Installing Sun Grid Engine...
> >>> Creating SGE parallel environment 'orte'
> 1/1 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> 100%
> >>> Adding parallel environment 'orte' to queue 'all.q'
> >>> Configuring cluster took 0.679 mins
> >>> Starting cluster took 2.307 mins
> The cluster is now ready to use. To login to the master node
> as root, run:
>     $ starcluster sshmaster vpcA
> If you're having issues with the cluster you can reboot the
> instances and completely reconfigure the cluster from
> scratch using:
>     $ starcluster restart vpcA
> When you're finished using the cluster and wish to terminate
> it and stop paying for service:
>     $ starcluster terminate vpcA
> Alternatively, if the cluster uses EBS instances, you can
> use the 'stop' command to shutdown all nodes and put them
> into a 'stopped' state preserving the EBS volumes backing
> the nodes:
>     $ starcluster stop vpcA
> WARNING: Any data stored in ephemeral storage (usually /mnt)
> will be lost!
> You can activate a 'stopped' cluster by passing the -x
> option to the 'start' command:
>     $ starcluster start -x vpcA
> This will start all 'stopped' nodes and reconfigure the
> cluster.
> -------------------------------------------------------------------------------------------------
> ------ Below is what it looks like when I have a FAILED launch ---
> -------------------------------------------------------------------------------------------------
> (starcluster)root@xxxxxxxxxxx:~# starcluster start -c testvpcB vpcB
> StarCluster - ([3]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [4]starcluster_at_mit.edu
> >>> Validating cluster template settings...
> >>> Cluster template settings are valid
> >>> Starting cluster...
> >>> Launching a 1-node cluster...
> >>> Creating security group @sc-vpcB...
> !!! ERROR - InvalidParameterValue: Tag value exceeds the maximum length of
> 255 characters
> Traceback (most recent call last):
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/cli.py",
> line 274, in main
>     sc.execute(args)
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/commands/start.py",
> line 220, in execute
>     validate_running=validate_running)
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 1537, in start
>     return self._start(create=create, create_only=create_only)
>   File "<string>", line 2, in _start
>   File "/root/.virtualenvs/starcluster/starcluster/starcluster/utils.py",
> line 111, in wrap_f
>     res = func(*arg, **kargs)
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 1552, in _start
>     self.create_cluster()
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 1066, in create_cluster
>     self._create_flat_rate_cluster()
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 1091, in _create_flat_rate_cluster
>     force_flat=True)[0]
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 859, in create_nodes
>     cluster_sg = self.cluster_group.name
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 657, in cluster_group
>     self._add_tags_to_sg(sg)
>   File
> "/root/.virtualenvs/starcluster/starcluster/starcluster/cluster.py", line
> 698, in _add_tags_to_sg
>     sg.add_tag(static.CORE_TAG, core_settings)
>   File
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/ec2object.py",
> line 82, in add_tag
>     dry_run=dry_run
>   File
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/ec2/connection.py",
> line 4026, in create_tags
>     return self.get_status('CreateTags', params, verb='POST')
>   File
> "/root/.virtualenvs/starcluster/local/lib/python2.7/site-packages/boto-2.19.0-py2.7.egg/boto/connection.py",
> line 1158, in get_status
>     raise self.ResponseError(response.status, response.reason, body)
> EC2ResponseError: EC2ResponseError: 400 Bad Request
> <?xml version="1.0" encoding="UTF-8"?>
> <Response><Errors><Error><Code>InvalidParameterValue</Code><Message>Tag
> value exceeds the maximum length of 255
> characters</Message></Error></Errors><RequestID>1f589605-8f30-472d-8989-22ea120aea14</RequestID></Response>
> -----------------------------------------------------------------------------------------------------------------
> ------ When it FAILS it creates only a security group; see "listclusters"
> below ---
> -----------------------------------------------------------------------------------------------------------------
> (starcluster)root@xxxxxxxxxxx:~# starcluster listclusters
> StarCluster - ([6]http://star.mit.edu/cluster) (v. 0.9999)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to [7]starcluster_at_mit.edu
> -------------------------------
> vpcB (security group: @sc-vpcB)
> -------------------------------
> Launch time: N/A
> Uptime: N/A
> Zone: N/A
> Keypair: N/A
> EBS volumes: N/A
> Cluster nodes: N/A
> -------------------------------
> vpcA (security group: @sc-vpcA)
> -------------------------------
> Launch time: 2013-12-10 14:39:36
> Uptime: 0 days, 00:04:23
> Zone: us-east-1b
> Keypair: Starcluster_VPC
> EBS volumes: N/A
> Cluster nodes:
>      master running i-1d745b65 10.0.0.138
> Total nodes: 1
> (starcluster)root@xxxxxxxxxxx:~#
>
> References
>
> Visible links
> 1. http://star.mit.edu/cluster
> 2. mailto:starcluster_at_mit.edu
> 3. http://star.mit.edu/cluster
> 4. mailto:starcluster_at_mit.edu
> 5. http://self.cluster_group.name/
> 6. http://star.mit.edu/cluster
> 7. mailto:starcluster_at_mit.edu
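
The traceback above fails inside add_tag because the value being written is longer than EC2 allows. One way to stay under that limit would be to check the length of a tag value before writing it and to spread an over-long value across several numbered tags. The following is only a minimal sketch under that assumption, not StarCluster's actual fix: the helper names split_tag_value and join_tag_value and the "-N" suffix scheme are hypothetical, and the 255-character limit is taken from the error message.

```python
MAX_TAG_VALUE_LEN = 255  # limit quoted in the EC2 error message


def split_tag_value(key, value, limit=MAX_TAG_VALUE_LEN):
    """Return a dict of tag name -> value, each value within the limit.

    A value that fits keeps the original key. A longer value is spread
    over key-0, key-1, ... (a hypothetical naming scheme for this sketch).
    """
    if len(value) <= limit:
        return {key: value}
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)]
    return {"%s-%d" % (key, n): chunk for n, chunk in enumerate(chunks)}


def join_tag_value(key, tags):
    """Reassemble a value previously written by split_tag_value."""
    if key in tags:
        return tags[key]
    prefix = key + "-"
    # Collect (index, chunk) pairs for the numbered tags and sort by index.
    parts = sorted(
        (int(name[len(prefix):]), val)
        for name, val in tags.items()
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    )
    return "".join(val for _, val in parts)
```

Each entry of the returned dict could then be written with a separate add_tag call, and join_tag_value applied to the group's tags when reading the settings back.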





Received on Thu Dec 12 2013 - 12:09:10 EST