StarCluster - Mailing List Archive

SGE issue with hostnames

From: Robert Tomkiewicz <no email>
Date: Mon, 25 Jul 2011 20:25:38 -0700

Hi there,

I started a 4-node EC2 cluster using StarCluster 0.92rc2 and ami-a5c42dcc, the
standard StarCluster 9.04 x64 AMI.

I ran into the following issue while doing some basic SGE setup. At first
qconf worked fine, then a few minutes later it failed:

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qconf -sq all.q
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").
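The error text describes a comparison between two names: the name the client's
IP address resolves to, and the name the client reports for itself. A minimal
sketch of that comparison, run locally, might look like the following. The
function names are hypothetical and not part of SGE, the case-insensitive
comparison is an assumption, and the real check is done inside SGE's commlib,
which may resolve names differently.

```shell
#!/bin/sh
# Sketch only: mimics the kind of check the error message describes.
# All function names here are hypothetical, not part of SGE.

# Print the first name the system resolver returns for an IP address.
reverse_name() {
    getent hosts "$1" | awk '{ print $2; exit }'
}

# Succeed if two host names are equal, ignoring case (an assumption).
names_agree() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$a" = "$b" ]
}

# Compare this machine's own host name with the reverse lookup of an IP.
check_self() {
    resolved=$(reverse_name "$1")
    local_name=$(hostname)
    if names_agree "$resolved" "$local_name"; then
        echo "ok: $1 resolves to \"$resolved\""
    else
        echo "mismatch: $1 resolves to \"$resolved\", host name is \"$local_name\""
        return 1
    fi
}
```

Calling check_self with the master's own IP address while the problem is
occurring would show whether the local resolver and the hostname command
disagree in the same way the error reports.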

After issuing

 root@master: ~# hostname master

I was able to proceed normally and launch my SGE jobs. They were running
normally, as confirmed by the output of qstat.

However, some minutes later, when checking on them with another qstat, I got
the same thing again.

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").

Resetting the hostname was to no avail.

root@master: ~ # hostname
master
root@master: ~ # hostname master
root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"domU-12-31-39-09-80-C1.compute-1.internal". This is not identical to
clients host name "master").

So I tried this

 root@master: ~# hostname domU-12-31-39-09-80-C1.compute-1.internal

which yielded the same error with the two names swapped:

root@master:/mnt/NAS1/simulations/CDI_0615_amazon/logs# qstat -f
error: commlib error: access denied (client IP resolved to host name
"master". This is not identical to clients host name
"domU-12-31-39-09-80-C1.compute-1.internal")

Setting the hostname back to master ("hostname master") at this point yields
correct operation for a few minutes.


It seems clear the problem has to do with doubled hostnames, but where are
they set? Has anyone else had a similar problem?

Thank you,

Robert Tomkiewicz



/etc/hostname is simply

master


/etc/hosts is below:

127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
10.210.135.47 master
10.66.83.219 node001
10.193.155.175 node002
10.206.70.15 node003
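
To see exactly which names a hosts-format file assigns to an address, and
which addresses it assigns to a name, a small lookup like the one below can
help. This is a sketch with hypothetical function names; it reads only the
file it is given and does not consult DNS or any other name service.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers for inspecting a hosts-format file.

# Print every name that the file ($1) assigns to the given IP ($2).
names_for_ip() {
    awk -v ip="$2" '
        { sub(/#.*/, "") }    # drop comments before matching
        $1 == ip { for (i = 2; i <= NF; i++) print $i }
    ' "$1"
}

# Print every IP that the file ($1) assigns to the given name ($2).
ips_for_name() {
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments before matching
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$1"
}
```

With the file shown above, names_for_ip /etc/hosts 10.210.135.47 would print
only master, so the long compute-1.internal name would have to come from
somewhere other than this file.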
Received on Mon Jul 25 2011 - 23:25:38 EDT
This archive was generated by hypermail 2.3.0.
