Problems with NFS files not being flushed from cache
Wondered if anyone else has seen this. I haven't seen much on Google that matches it.

Detailed Description:

We're experiencing a problem across our SLES 9 hosts (including pre-SP1, as well as SP1 and an SP2 host) where the following occurs:

1. A program on a server opens a file in NFS and keeps that file open, periodically writing output to it as the program runs.
2. On another server, doing something like tail -f on the file in NFS (or trying to less it, ls it, vi it, etc.) will not show any updates after the first update is written out to NFS.
3. The file finally appears to update on other clients when the program exits, when it closes the file, or when a sync is run by hand on the server where the writing application is running.

We have tried playing with the different mount options for NFS (acdirmin/max and acregmin/max), as well as using noac for a short period, which seemed to make the problem go away. But noac is not a viable long-term solution as it kills performance (some jobs would take up to 10x longer to run with noac turned on).

From what I've read in the NFS HOWTO pages at http://nfs.sourceforge.net, it appears to be linked to the new code for close-to-open cache consistency. It definitely seems to be a bug in the NFS kernel implementation, as a job can run for days on end without other NFS clients being able to see the data, even though the file is updated many, many times on the server where the application is running. Clearly the kernel is getting into some state where the NFS layer isn't doing a proper sync to flush the NFS writes/data out to the NFS server.

We can recreate this problem quite easily...

--
Mike Marion-Unix SysAdmin/Staff Engineer-http://www.qualcomm.com
Homer: "Dad says this new guy is a repulsive, obnoxious old billionaire. So let's be extra nice to him!" ==> Simpsons
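
For anyone who wants to try reproducing the scenario described above, here is a minimal sketch of a long-running writer that keeps one file open and appends to it periodically. It is not the application from the original report; the function name append_lines and its parameters are made up for illustration. With force_sync=True it calls os.fsync() after each write, which asks the kernel to commit the file's dirty data to the NFS server without running a system-wide sync by hand. Whether other clients then see the update promptly may still depend on their own attribute caching.

```python
import os
import time


def append_lines(path, count, interval=0.0, force_sync=False):
    """Keep `path` open and append `count` lines, one every `interval` seconds.

    Hypothetical helper for illustration only. Returns the number of
    lines written.
    """
    written = 0
    with open(path, "a") as f:
        for i in range(count):
            f.write(f"update {i}\n")
            # Move data from Python's user-space buffer into the kernel.
            f.flush()
            if force_sync:
                # Ask the kernel to commit this file's dirty data to
                # stable storage (on NFS, to the server).
                os.fsync(f.fileno())
            written += 1
            time.sleep(interval)
    return written
```

To mimic the reported behaviour, one could run something like append_lines("/mnt/nfs/job.log", count=1000, interval=5.0) on one client (the path is a placeholder) and watch the file with tail -f from another, then repeat with force_sync=True to compare.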