StarCluster - Mailing List Archive

Re: Possible NFS setup error when adding new nodes to a cluster?

From: Paul Koerbitz <no email>
Date: Wed, 18 Jan 2012 21:44:00 +0100

Hi Justin,

thanks for the fast response and the great work. I thought about taking a
crack at a fix myself, but I'm not familiar with the codebase and have
very little time right now.

thanks
Paul

On Wed, Jan 18, 2012 at 21:33, Justin Riley <jtriley_at_mit.edu> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Paul,
>
> I just tested for myself and I can confirm that /etc/exports is indeed
> being clobbered when running the 'addnode' command. I'm working on a
> patch release to fix this and other minor things. Should be out tomorrow.
>
> Thanks for reporting!
>
> ~Justin
>
> On 01/18/2012 02:08 PM, Paul Koerbitz wrote:
> > Dear starcluster team,
> >
> > I tripped over what might be an error with the NFS setup when
> > adding new nodes to a cluster.
> >
> > I set up my cluster with initially one root node only and then
> > first added one node and subsequently 4 more nodes. I noticed that
> > my EBS volume wasn't getting mounted correctly on the nodes: calling
> > 'df' reported 'stale filehandle' for /home, /opt/sge6 and /data.
> >
> > My impression is that as nodes get added, the /etc/exports file
> > which is responsible for allowing NFS access gets overwritten.
> > Therefore only the last added node can access the shared file
> > systems.
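
The overwrite-versus-merge difference described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `merge_exports` is a hypothetical helper, not StarCluster's actual code, and the exports text is handled as a plain string rather than read from /etc/exports.

```python
# Hypothetical helper for illustration; not taken from StarCluster.
EXPORT_OPTS = "(async,no_root_squash,no_subtree_check,rw)"


def merge_exports(existing: str, new_entries: list) -> str:
    """Keep every existing exports line and append only missing entries."""
    lines = [ln for ln in existing.splitlines() if ln.strip()]
    seen = set(lines)
    for entry in new_entries:
        if entry not in seen:
            lines.append(entry)
            seen.add(entry)
    return "\n".join(lines) + "\n"


current = f"/data node001{EXPORT_OPTS}\n"
added = [f"/data node002{EXPORT_OPTS}"]

# Overwriting keeps only the newest node, matching the reported symptom.
overwritten = "\n".join(added) + "\n"

# Merging keeps node001 and adds node002.
merged = merge_exports(current, added)
print(merged, end="")
```

Merging is idempotent in this sketch: passing the same entries a second time leaves the text unchanged.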
> >
> > Here is how I resolved the issue. First I unmounted all the
> > volumes:
> >
> > root@node001:~# umount -f /data
> >
> > At this point remounting doesn't work:
> >
> > root@node001:~# mount -t nfs master:/data /data
> >
> > mount.nfs: access denied by server while mounting master:/data
> >
> >
> > I then edited /etc/exports on the master node. Here only the last
> > node was listed:
> >
> > /home node005(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
> > /data node005(async,no_root_squash,no_subtree_check,rw)
> >
> > I changed this to:
> >
> > /home node001(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node001(async,no_root_squash,no_subtree_check,rw)
> > /data node001(async,no_root_squash,no_subtree_check,rw)
> > /home node002(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node002(async,no_root_squash,no_subtree_check,rw)
> > /data node002(async,no_root_squash,no_subtree_check,rw)
> > /home node003(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node003(async,no_root_squash,no_subtree_check,rw)
> > /data node003(async,no_root_squash,no_subtree_check,rw)
> > /home node004(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node004(async,no_root_squash,no_subtree_check,rw)
> > /data node004(async,no_root_squash,no_subtree_check,rw)
> > /home node005(async,no_root_squash,no_subtree_check,rw)
> > /opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
> > /data node005(async,no_root_squash,no_subtree_check,rw)
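
Typing fifteen near-identical lines by hand is error-prone, so a listing like the one above could also be generated. The following is a minimal sketch, assuming the node names node001 to node005 and the three exported paths from this report; `export_lines` is a hypothetical name, and the output would still have to be written to /etc/exports on the master by hand.

```python
# Options copied from the exports entries quoted in this thread.
OPTS = "async,no_root_squash,no_subtree_check,rw"


def export_lines(paths, nodes, opts=OPTS):
    """Return one exports line per (node, path) pair, grouped by node."""
    return [f"{path} {node}({opts})" for node in nodes for path in paths]


paths = ["/home", "/opt/sge6", "/data"]
nodes = [f"node{i:03d}" for i in range(1, 6)]  # node001 .. node005

lines = export_lines(paths, nodes)
print("\n".join(lines))
```

Grouping by node mirrors the order of the hand-edited file; the order of lines should not matter to the NFS server.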
> >
> > Then I restarted the NFS server:
> >
> > $ /etc/init.d/nfs-kernel-server restart
> >
> > After that, running 'df' on each node showed that NFS was now
> > working correctly.
> >
> > kind regards Paul
> >
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8XLJUACgkQ4llAkMfDcrkc3wCgi+vGwbv7fJDYmf3UBLuJp9QP
> 06MAn2QNOt+EFuTwnaiCyemhttM6oTdo
> =jNrz
> -----END PGP SIGNATURE-----
>
Received on Wed Jan 18 2012 - 15:44:32 EST
This archive was generated by hypermail 2.3.0.
