Hi Folks,
I am using 0.94.2 and experimenting with scaling. I started a cluster
with two nodes, using the default names master and node001. I added
another node, node002, then did a removenode of node001. I then attempted
to add another node of the c3.8xlarge type (supported by Rayson's
mod--thanks, Rayson) with the alias c3.8xlarge (-a c3.8xlarge).
Everything went fine until it attempted to install OGS. At that point, it
tried to reference the node being added as node001 instead of by its
alias:
Gerner:.starcluster mary$ sc addnode e1d -a c3.8xlarge -I c3.8xlarge
StarCluster - (http://star.mit.edu/cluster) (v. 0.94.2)
Software Tools for Academics and Researchers (STAR)
Please submit bug reports to starcluster_at_mit.edu
>>> Launching node(s): c3.8xlarge
Reservation:r-7ad16218
>>> Waiting for instances to propagate...
>>> Waiting for node(s) to come up... (updating every 30s)
>>> Waiting for all nodes to be in a 'running' state...
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for SSH to come up on all nodes...
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Waiting for cluster to come up took 2.089 mins
>>> Running plugin starcluster.clustersetup.DefaultClusterSetup
>>> Configuring hostnames...
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring /etc/hosts on each node
3/3 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring NFS exports path(s):
/home /usr/share/jobs/
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.166 mins
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring scratch space for user(s): sgeadmin
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Configuring passwordless ssh for root
>>> Configuring passwordless ssh for sgeadmin
>>> Running plugin starcluster.plugins.sge.SGEPlugin
>>> Adding c3.8xlarge to SGE
>>> Configuring NFS exports path(s):
/opt/sge6
>>> Mounting all NFS export path(s) on 1 worker node(s)
1/1 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 100%
>>> Setting up NFS took 0.128 mins
!!! ERROR - Error occured while running plugin 'starcluster.plugins.sge.SGEPlugin':
!!! ERROR - remote command 'source /etc/profile && cd /opt/sge6 &&
!!! ERROR - TERM=rxvt ./inst_sge -x -noremote -auto ./ec2_sge.conf'
!!! ERROR - failed with status 1:
!!! ERROR - Reading configuration from file ./ec2_sge.conf
!!! ERROR - [H[2J
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
!!! ERROR - error resolving host "node001": can't resolve host name
!!! ERROR - (h_errno = HOST_NOT_FOUND)
Gerner:.starcluster mary$ sc sm e1d
I don't plan on using this approach routinely, but thought you'd want to
know about the error.
Thanks,
Lyn
Received on Thu Nov 21 2013 - 16:50:51 EST