StarCluster - Mailing List Archive

Possible NFS setup error when adding new nodes to a cluster?

From: Paul Koerbitz <no email>
Date: Wed, 18 Jan 2012 20:08:04 +0100

Dear starcluster team,

I tripped over what might be an error with the NFS setup when adding new
nodes to a cluster.

I set up my cluster with only one root node initially, then added one node
and subsequently four more. I noticed that my EBS volume wasn't getting
mounted correctly on the nodes: calling 'df' reported 'stale filehandle' for
/home, /opt/sge6 and /data.

My impression is that as nodes get added, the /etc/exports file, which
controls which hosts are allowed NFS access, gets overwritten. As a result,
only the most recently added node can access the shared file systems.
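
One way to test this impression is to look at which clients the exports file
currently names. The sketch below is a hypothetical helper (the name
list_export_hosts is not part of StarCluster) that prints the distinct client
hosts in an exports-style file. It assumes the simple "path host(options)"
layout shown in this message. On a live master, 'exportfs -v' or
'showmount -e master' reports the active export list.

```shell
#!/bin/sh
# Hypothetical helper: list the distinct client hosts in an exports file.
# Assumes one or more "host(options)" fields after the path on each line.
list_export_hosts() {
    # Strip comments and blank lines, then print every field after the
    # path with its "(options)" suffix removed.
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1" |
        awk '{ for (i = 2; i <= NF; i++) { sub(/\(.*/, "", $i); print $i } }' |
        sort -u
}

# Example: inspect the local exports file if it is readable here.
if [ -r /etc/exports ]; then
    list_export_hosts /etc/exports
fi
```

If only the most recently added node appears in the output, that is consistent
with the file having been overwritten rather than appended to.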

Here is how I resolved the issue. First, I unmounted all the volumes:

root@node001:~# umount -f /data

At this point remounting doesn't work:

root@node001:~# mount -t nfs master:/data /data

mount.nfs: access denied by server while mounting master:/data


I then edited /etc/exports on the master node. Here only the last node was
listed:

/home node005(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
/data node005(async,no_root_squash,no_subtree_check,rw)

I changed this to:

/home node001(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node001(async,no_root_squash,no_subtree_check,rw)
/data node001(async,no_root_squash,no_subtree_check,rw)
/home node002(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node002(async,no_root_squash,no_subtree_check,rw)
/data node002(async,no_root_squash,no_subtree_check,rw)
/home node003(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node003(async,no_root_squash,no_subtree_check,rw)
/data node003(async,no_root_squash,no_subtree_check,rw)
/home node004(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node004(async,no_root_squash,no_subtree_check,rw)
/data node004(async,no_root_squash,no_subtree_check,rw)
/home node005(async,no_root_squash,no_subtree_check,rw)
/opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
/data node005(async,no_root_squash,no_subtree_check,rw)
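
Typing fifteen near-identical lines by hand is error-prone, so the same entries
can be generated with a short loop. This is a minimal sketch, not how
StarCluster itself writes the file. The function name gen_exports is
hypothetical, and the node names, paths and options are copied from the
listing above.

```shell
#!/bin/sh
# Values copied from this message; adjust them for your own cluster.
NODES="node001 node002 node003 node004 node005"
PATHS="/home /opt/sge6 /data"
OPTS="async,no_root_squash,no_subtree_check,rw"

# Hypothetical helper: print one exports entry per (node, path) pair.
gen_exports() {
    for node in $NODES; do
        for path in $PATHS; do
            printf '%s %s(%s)\n' "$path" "$node" "$OPTS"
        done
    done
}

gen_exports
```

The output can be reviewed and then written to /etc/exports on the master.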

I then restarted the NFS server:

$ /etc/init.d/nfs-kernel-server restart

After that, running 'df' on each node showed NFS working correctly.
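
The 'df' check can be scripted so that each shared path is tested in turn. This
is a minimal sketch, assuming the three mount points from this message.
check_mounts is a hypothetical helper name, and it only checks that 'df'
succeeds for each path, not that the path is actually served over NFS.

```shell
#!/bin/sh
# Hypothetical helper: report whether 'df' succeeds for each given path.
# Returns non-zero if any path fails (for example, a stale NFS handle).
check_mounts() {
    status=0
    for mp in "$@"; do
        if df "$mp" >/dev/null 2>&1; then
            echo "ok:     $mp"
        else
            echo "FAILED: $mp"
            status=1
        fi
    done
    return "$status"
}

# Mount points taken from this message; run on each node.
check_mounts /home /opt/sge6 /data || echo "some shared file systems are not usable"
```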

kind regards
Paul
Received on Wed Jan 18 2012 - 14:08:36 EST