Twilight Zone: sge_gethostbyname failed

From: Lyn Gerner <no email>
Date: Fri, 27 Dec 2013 12:34:32 -1000

Hi All,

Okay, I'm in the Twilight Zone now. After starting a small cluster on the
23rd, and doing minimal reconfig (qmod -d) to disable the sge_execd on the
master and qconf -mq all.q to change some slot counts -- all of which
worked fine -- I come back a few days later to find an unusable SGE config:

root@AWS-VTMXmaster-w2b ~
# qstat -f
error: sge_gethostbyname failed

/etc/hosts is correct for all its (internal) host addrs:

root@AWS-VTMXmaster-w2b ~
# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.250.65.204 master
10.251.30.12 node001
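
For anyone retracing this check, the sketch below lists which names a hosts file actually covers. It is a minimal Python illustration, not part of SGE: the function names parse_hosts and missing_names are made up here, and the parsing assumes the usual "address name [alias ...]" layout with "#" comments. Whether the name in the shell prompt (AWS-VTMXmaster-w2b) matters for this error is an assumption, not something the post establishes.

```python
def parse_hosts(text):
    """Map each hostname or alias to the list of addresses it is listed under."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        for name in names:
            # Hostnames are case-insensitive, so store them lowercased.
            table.setdefault(name.lower(), []).append(address)
    return table


def missing_names(text, wanted):
    """Return the names from `wanted` that have no entry in the hosts text."""
    table = parse_hosts(text)
    return [name for name in wanted if name.lower() not in table]
```

Feeding it the contents of /etc/hosts together with every name the machine might report for itself shows at a glance which of them the file does not cover.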

The gethostbyname utility works correctly (so does gethostbyaddr):

root@AWS-VTMXmaster-w2b /opt/sge6/default/common/install_logs
# /opt/sge6/utilbin/linux-x64/gethostbyname master
Hostname: master
Aliases:
Host Address(es): 10.250.65.204

root@AWS-VTMXmaster-w2b /opt/sge6/default/common/install_logs
# /opt/sge6/utilbin/linux-x64/gethostbyname node001
Hostname: node001
Aliases:
Host Address(es): 10.251.30.12

root@AWS-VTMXmaster-w2b /opt/sge6/default/common/install_logs
# qstat -f
error: sge_gethostbyname failed
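
The lookups above cover "master" and "node001" only. A rough way to run the same check over a whole list of names, including whatever the machine calls itself, is sketched below in Python. This goes through the system resolver via the standard socket module, not through SGE's own sge_gethostbyname, so a clean result here does not prove SGE will agree; the helper names resolve_all and unresolved are hypothetical.

```python
import socket


def resolve_all(names, resolve=socket.gethostbyname):
    """Return {name: address or None} for each name, using the given resolver."""
    results = {}
    for name in names:
        try:
            results[name] = resolve(name)
        except OSError:
            # socket.gaierror and socket.herror are both OSError subclasses.
            results[name] = None
    return results


def unresolved(names, resolve=socket.gethostbyname):
    """Return only the names that the resolver could not map to an address."""
    return [name for name, addr in resolve_all(names, resolve).items() if addr is None]
```

Calling unresolved(["master", "node001", socket.gethostname()]) on the master would list any of those names the resolver cannot map. The resolve parameter exists so the function can be exercised with a stub instead of live lookups.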

I went so far as to edit the hostname in /etc/sysconfig/network to
"master" and "node001" on the two nodes, respectively. Same error.

I have been all over the 'net looking for solutions, but have found nothing
with a clear resolution. gridengine.sunsource.net is gone. The follow-on
at http://gridengine.org/pipermail/users/ doesn't seem to be searchable,
except on an onerous, month-by-month click-thru basis (which hasn't yielded
anything useful as I slog thru it).

Short of doing a starcluster restart, I'd appreciate anyone's input on what
to try next.

Thanks much,
Lyn
Received on Fri Dec 27 2013 - 17:34:34 EST
