StarCluster - Mailing List Archive

Re: [Starcluster] cluster start-up hangs when starting Sun Grid Engine

From: Justin Riley <no email>
Date: Fri, 16 Apr 2010 17:26:18 -0400

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Damian,

So if you're willing to test out the github version, here are the new
9.10 karmic AMIs (they're now public):

ami-17b15e7e (32bit)
ami-2941ad40 (64bit)

Again, the work-in-progress docs for the StarCluster github version are here:

http://web.mit.edu/stardev/cluster/docs

Let me know if you have any issues or questions,

Thanks!

~Justin

On 04/16/2010 03:01 PM, Justin Riley wrote:
> Hi Damian,
>
> You're experiencing the same problem Gabriel ran into earlier: the 64bit
> StarCluster AMI has a problem with NFS.
>
> I just tested my new, not yet released, StarCluster Ubuntu 9.10 karmic
> AMIs and they work with c1.xlarge/64bit with NFS just fine.
>
> Unfortunately this AMI will not work with the current version of
> StarCluster on pypi given that it uses the latest Sun Grid Engine which
> has slightly different config parameters.
>
> However, if you want to use the github code which I'm hoping to release
> next week anyway, this should work for you. The current work-in-progress
> docs for the next release are here:
>
> http://web.mit.edu/stardev/cluster/docs
>
> Let me know if you want to give this a try. I'm going to make those new
> 9.10 karmic AMIs available today after some testing.
>
> ~Justin
>
>
> On 04/16/2010 02:53 PM, Damian Eads wrote:
>> Hi,
>
>> I reserved three c1.xlarge instances (24 cores, 3 nodes, 8 cores per
>> node), one master and two workers, all in the same availability zone
>> (us-east-1a). Installing the Sun Grid Engine hangs for a very long
>> time. I terminated and tried again, to no avail. The third time, when I
>> tried logging in, my EBS volume (mount point /data) wasn't visible over
>> NFS on all worker nodes.
>
>> eads_at_argentina:~/work/repo/StarCluster$ starcluster start -x mycluster dtest
>> /tmp/qqq/lib/python2.6/site-packages/pycrypto-2.0.1-py2.6-linux-i686.egg/Crypto/Hash/SHA.py:6:
>> DeprecationWarning: the sha module is deprecated; use the hashlib
>> module instead
>> /tmp/qqq/lib/python2.6/site-packages/pycrypto-2.0.1-py2.6-linux-i686.egg/Crypto/Hash/MD5.py:6:
>> DeprecationWarning: the md5 module is deprecated; use hashlib instead
>> /var/lib/python-support/python2.6/IPython/Magic.py:38:
>> DeprecationWarning: the sets module is deprecated
>> from sets import Set
>> StarCluster - (http://web.mit.edu/starcluster)
>> Software Tools for Academics and Researchers (STAR)
>> Please submit bug reports to starcluster_at_mit.edu
>
>>>>> Validating cluster settings...
>>>>> Cluster settings are valid
>>>>> Starting cluster...
>>>>> Waiting for cluster to start...
>>>>> The master node is ec2-184-73-86-18.compute-1.amazonaws.com
>>>>> Attaching volume vol-c5e85dac to master node...
>>>>> Setting up the cluster...
>>>>> Mounting EBS volume vol-c5e85dac on /data...
>> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
>> trying both
>>>>> Using private key /home/eads/deadskey.pem (rsa)
>>>>> Creating cluster user: sgeadmin
>> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
>> trying both
>>>>> Using private key /home/eads/deadskey.pem (rsa)
>> ssh.py:65 - WARNING - specified key does not end in either rsa or dsa,
>> trying both
>>>>> Using private key /home/eads/deadskey.pem (rsa)
>>>>> Configuring scratch space for user: sgeadmin
>>>>> Configuring /etc/hosts on each node
>>>>> Configuring NFS...
>>>>> Configuring passwordless ssh for root
>>>>> Configuring passwordless ssh for user: sgeadmin
>>>>> Using existing RSA ssh keys found for user: sgeadmin
>>>>> Installing Sun Grid Engine...
>
>> It hangs on this step. I've reproduced this three times. Any ideas?
>
>> Thanks a lot in advance!
>
>> Damian
>
>
>
>> -----------------------------------------------------
>> Damian Eads Ph.D. Candidate
>> University of California Computer Science
>> 1156 High Street Machine Learning Lab, E2-489
>> Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads
>> _______________________________________________
>> Starcluster mailing list
>> Starcluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvI1foACgkQ4llAkMfDcrmAWACfcR845r3WA8GrsVy/+pnE6U+l
wT8An23UcNqRGRgb3zcpfna4Bf+SMDyA
=jlXO
-----END PGP SIGNATURE-----
Received on Fri Apr 16 2010 - 17:26:21 EDT
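
For the symptom reported in this thread (the /data volume not being visible over NFS on the worker nodes), one quick check is to look at the mount table on each worker. The following is only a minimal sketch and not part of StarCluster: the helper names `nfs_mounts` and `is_nfs_mounted` are hypothetical, and the `SAMPLE` text is made-up illustrative data in the Linux /proc/mounts layout, not output captured from a real cluster.

```python
# Sketch only: helper names and SAMPLE data are illustrative assumptions.

def nfs_mounts(mounts_text):
    """Map mount point -> source for each NFS entry in /proc/mounts-style text."""
    found = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: source, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            found[fields[1]] = fields[0]
    return found


def is_nfs_mounted(mount_point, mounts_text):
    """Return True if mount_point appears as an NFS mount in mounts_text."""
    return mount_point in nfs_mounts(mounts_text)


# Made-up example of what a worker's mount table might contain.
SAMPLE = """\
/dev/sda1 / ext3 rw 0 0
master:/home /home nfs rw,addr=10.0.0.1 0 0
master:/data /data nfs rw,addr=10.0.0.1 0 0
"""

print("/data over NFS:", is_nfs_mounted("/data", SAMPLE))
```

On an actual worker node you would pass the contents of /proc/mounts (for example `open("/proc/mounts").read()`) in place of `SAMPLE`. If /data is missing from the result, the NFS share was not mounted on that node.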