StarCluster - Mailing List Archive

Re: Possible NFS setup error when adding new nodes to a cluster?

From: Paul Koerbitz <no email>
Date: Wed, 18 Jan 2012 23:45:14 +0100

Hi Justin,

OK, great. I have something running right now that I don't want to
interrupt, but I might be able to take a stab at it tomorrow and will
report back then.

cheers
Paul

On Wed, Jan 18, 2012 at 23:17, Justin Riley <jtriley_at_mit.edu> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Paul,
>
> No problem at all and thanks for the kind words. From my limited
> testing I believe this is fixed in the latest github code which will
> be included in tomorrow's patch release:
>
> http://tinyurl.com/8axmckc
>
> If you could test the latest github code and report back whether it
> fixes the issue for you or not that'd be very helpful.
>
> ~Justin
>
> On 01/18/2012 03:44 PM, Paul Koerbitz wrote:
> > Hi Justin,
> >
> > thanks for the fast response and the great work. I thought about
> > taking a crack at a fix myself, but I'm not familiar with the
> > codebase and have very little time right now.
> >
> > thanks Paul
> >
> > On Wed, Jan 18, 2012 at 21:33, Justin Riley <jtriley_at_mit.edu> wrote:
> >
> > Hi Paul,
> >
> > I just tested for myself and I can confirm that /etc/exports is
> > indeed being clobbered when running the 'addnode' command. I'm
> > working on a patch release to fix this and other minor things.
> > Should be out tomorrow.
> >
> > Thanks for reporting!
> >
> > ~Justin
> >
> > On 01/18/2012 02:08 PM, Paul Koerbitz wrote:
> >> Dear starcluster team,
> >
> >> I tripped over what might be an error with the NFS setup when
> >> adding new nodes to a cluster.
> >
> >> I set up my cluster initially with only one root node and then
> >> first added one node and subsequently 4 more nodes. I noticed
> >> that my EBS volume wasn't getting mounted correctly on the nodes:
> >> calling 'df' reported 'stale filehandle' for /home, /opt/sge6 and
> >> /data.
> >
> >> My impression is that as nodes get added, the /etc/exports file
> >> which is responsible for allowing NFS access gets overwritten.
> >> Therefore only the last added node can access the shared file
> >> systems.
> >
> >> Here is how I resolved the issue. First I unmounted all the
> >> volumes:
> >
> >> root@node001:~# umount -f /data
> >
> >> At this point remounting doesn't work:
> >
> >> root@node001:~# mount -t nfs master:/data /data
> >
> >> mount.nfs: access denied by server while mounting master:/data
> >
> >
> >> I then edited /etc/exports on the master node. Here only the
> >> last node was listed:
> >
> >> /home node005(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
> >> /data node005(async,no_root_squash,no_subtree_check,rw)
> >
> >> I changed this to
> >
> >> /home node001(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node001(async,no_root_squash,no_subtree_check,rw)
> >> /data node001(async,no_root_squash,no_subtree_check,rw)
> >> /home node002(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node002(async,no_root_squash,no_subtree_check,rw)
> >> /data node002(async,no_root_squash,no_subtree_check,rw)
> >> /home node003(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node003(async,no_root_squash,no_subtree_check,rw)
> >> /data node003(async,no_root_squash,no_subtree_check,rw)
> >> /home node004(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node004(async,no_root_squash,no_subtree_check,rw)
> >> /data node004(async,no_root_squash,no_subtree_check,rw)
> >> /home node005(async,no_root_squash,no_subtree_check,rw)
> >> /opt/sge6 node005(async,no_root_squash,no_subtree_check,rw)
> >> /data node005(async,no_root_squash,no_subtree_check,rw)
> >
> >> I then restarted the NFS server:
> >
> >> $ /etc/init.d/nfs-kernel-server restart
> >
> >> After that, running 'df' on each node showed NFS working
> >> correctly.
> >
> >> kind regards Paul
> >
> >
> >
> >
> >
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iEYEARECAAYFAk8XRPwACgkQ4llAkMfDcrlJWACgjNwy6KVMywbiP6aVggOgQVqm
> OD8AnA/1fwt04oGIhEtA7i3kq8KLMr0y
> =9mnL
> -----END PGP SIGNATURE-----
> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Wed Jan 18 2012 - 17:45:46 EST
This archive was generated by hypermail 2.3.0.
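
The problem described in this thread (each added node replacing the
contents of /etc/exports instead of extending them) can be illustrated
with a small sketch. This is not the StarCluster patch itself; the
function name merge_exports and the constant EXPORT_OPTIONS are
hypothetical and chosen only for illustration. The sketch assumes one
export entry per line, in the same format as the entries quoted above,
and works purely on strings rather than touching the real file.

```python
# Hypothetical helper, not StarCluster code: shows appending NFS export
# entries for a new node while keeping the entries of earlier nodes.

# Export options copied from the /etc/exports lines quoted in the thread.
EXPORT_OPTIONS = "async,no_root_squash,no_subtree_check,rw"


def merge_exports(existing, paths, node, options=EXPORT_OPTIONS):
    """Return exports content with entries for `node` added.

    Lines already present in `existing` are kept, so nodes that were
    added earlier remain exported. Entries that already exist are not
    duplicated.
    """
    lines = [line for line in existing.splitlines() if line.strip()]
    for path in paths:
        entry = f"{path} {node}({options})"
        if entry not in lines:
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Simulate adding node001..node005 one at a time.
paths = ["/home", "/opt/sge6", "/data"]
exports = ""
for i in range(1, 6):
    exports = merge_exports(exports, paths, f"node{i:03d}")

print(exports, end="")
```

After writing the merged content to /etc/exports on the master, the NFS
server still has to pick up the change, for example by restarting
nfs-kernel-server as described in the original report.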
