StarCluster - Mailing List Archive

Re: [Starcluster] cluster start-up hangs when starting Sun Grid Engine

From: Justin Riley <no email>
Date: Fri, 16 Apr 2010 15:01:26 -0400

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Damian,

You're experiencing the same problem Gabriel ran into earlier. The
64-bit StarCluster AMI has a problem with NFS, and that is what you're
running into.
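
One way to confirm the NFS symptom on a worker node is to check whether
the /data mount point from your message shows up as an NFS mount. Below
is a minimal Python sketch, assuming a Linux node where /proc/mounts
lists the active mounts. The function names are made up for illustration
and are not part of StarCluster.

```python
def nfs_mounts(mounts_text):
    """Return {mount_point: source} for NFS entries in /proc/mounts-style text."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each entry reads: source mount_point fstype options dump pass
        if len(fields) < 3:
            continue
        source, mount_point, fstype = fields[0], fields[1], fields[2]
        # Match NFS client mounts only; "nfsd" is the server's control filesystem.
        if fstype in ("nfs", "nfs4"):
            found[mount_point] = source
    return found


def is_nfs_mounted(mount_point, mounts_text):
    """True if mount_point is listed as an NFS mount in mounts_text."""
    return mount_point in nfs_mounts(mounts_text)
```

On a node you would pass in the contents of /proc/mounts, for example
is_nfs_mounted("/data", open("/proc/mounts").read()). A False result on
a worker would match the missing /data share described in your message.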

I just tested my new, not-yet-released StarCluster Ubuntu 9.10 (karmic)
AMIs, and they work fine with c1.xlarge/64-bit and NFS.

Unfortunately, the new AMI will not work with the current version of
StarCluster on PyPI, because it uses the latest Sun Grid Engine, which
has slightly different config parameters.

However, the new AMI should work for you if you use the code on GitHub,
which I'm hoping to release next week anyway. The current
work-in-progress docs for the next release are here:

http://web.mit.edu/stardev/cluster/docs

Let me know if you want to give this a try. I'm going to make those new
9.10 karmic AMIs available today after some testing.

~Justin


On 04/16/2010 02:53 PM, Damian Eads wrote:
> Hi,
>
> I reserved three c1.xlarge instances (24 cores, 3 nodes, 8 cores per
> node), one master and two workers, all in the same availability zone
> (us-east-1a). Installing the Sun Grid Engine hangs for a very long
> time. I terminated and tried again, to no avail. The third time, when
> I tried logging in, my EBS volume with mount point /data was not
> visible over NFS on all of the worker nodes.
>
> eads_at_argentina:~/work/repo/StarCluster$ starcluster start -x mycluster dtest
> /tmp/qqq/lib/python2.6/site-packages/pycrypto-2.0.1-py2.6-linux-i686.egg/Crypto/Hash/SHA.py:6:
> DeprecationWarning: the sha module is deprecated; use the hashlib
> module instead
> /tmp/qqq/lib/python2.6/site-packages/pycrypto-2.0.1-py2.6-linux-i686.egg/Crypto/Hash/MD5.py:6:
> DeprecationWarning: the md5 module is deprecated; use hashlib instead
> /var/lib/python-support/python2.6/IPython/Magic.py:38:
> DeprecationWarning: the sets module is deprecated
> from sets import Set
> StarCluster - (http://web.mit.edu/starcluster)
> Software Tools for Academics and Researchers (STAR)
> Please submit bug reports to starcluster_at_mit.edu
>
> >>> Validating cluster settings...
> >>> Cluster settings are valid
> >>> Starting cluster...
> >>> Waiting for cluster to start...
> >>> The master node is ec2-184-73-86-18.compute-1.amazonaws.com
> >>> Attaching volume vol-c5e85dac to master node...
> >>> Setting up the cluster...
> >>> Mounting EBS volume vol-c5e85dac on /data...
> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
> trying both
> >>> Using private key /home/eads/deadskey.pem (rsa)
> >>> Creating cluster user: sgeadmin
> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
> trying both
> >>> Using private key /home/eads/deadskey.pem (rsa)
> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
> trying both
> >>> Using private key /home/eads/deadskey.pem (rsa)
> >>> Configuring scratch space for user: sgeadmin
> >>> Configuring /etc/hosts on each node
> >>> Configuring NFS...
> >>> Configuring passwordless ssh for root
> >>> Configuring passwordless ssh for user: sgeadmin
> >>> Using existing RSA ssh keys found for user: sgeadmin
> >>> Installing Sun Grid Engine...
>
> It hangs on this step. I've reproduced this three times. Any ideas?
>
> Thanks a lot in advance!
>
> Damian
>
>
>
> -----------------------------------------------------
> Damian Eads Ph.D. Candidate
> University of California Computer Science
> 1156 High Street Machine Learning Lab, E2-489
> Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads
> _______________________________________________
> Starcluster mailing list
> Starcluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvItAYACgkQ4llAkMfDcrm2pwCeLjqFrpMYrEk9GoMWQ9wQZ7Be
/A4An1fXxW33qXZzHTTSuY51+iDOwWNW
=6VCJ
-----END PGP SIGNATURE-----
Received on Fri Apr 16 2010 - 15:01:28 EDT