On 5/3/2010 at 01:47 PM, in message <20100503194736.GH3470@quack.suse.cz>, Jan Kara <jack@suse.cz> wrote:
Hello,

On Mon 03-05-10 13:12:40, Cameron Seader wrote:
First, we're given an inode 'ainode', which should be the correct inode for the file we're looking at. (If it were incorrect, we would have gotten an error much earlier.)
If we have iget, we call iget. The 2.6.16.60-* kernels lack iget, I believe, so instead we do:

So we are talking about SLE10 based kernels, right? In fact these kernels do have iget() but I guess you do not want to do all the writing by hand and want to use standard write path and thus you need open file descriptor for which you need a dentry...
No, this was my mistake. I thought the lack of an 'iget' symbol in the core meant that it wasn't available, but iget itself is just a static inline function, so it wouldn't be in there. We use iget if it's available, so we are using iget here.
    fid.i32.ino = ainode;
    fid.i32.gen = 0;
    dp = afs_cacheSBp->s_export_op->fh_to_dentry(afs_cacheSBp, &fid, sizeof(fid), FILEID_INO32_GEN);
    filp = dentry_open(dp, mntget(afs_cacheMnt), O_RDWR);

Hmm, so about which kernel are we speaking? fh_to_dentry has been introduced only in 2.6.24...

Yes, sorry, that's my mistake. With iget, we actually call:

    tip = iget(afs_cacheSBp, (u_long) ainode);
    dp = d_alloc_anon(tip);
    tip->i_flags |= MS_NOATIME;
    filp = dentry_open(dp, mntget(afs_cacheMnt), O_RDWR);
<snip>
It's not intended to be written to constantly, but in this case it probably is written to several times successively (due to certain parameters set a bit low, and the high load for these clients).
I believe this function was called about 644203 times in the core I'm looking at, which means that file was written to at least around 644000 times... I'm assuming at least most of those were right after another.

OK, so these 644000 writes succeed and then you start getting ENOMEM? Does the machine have enough free memory? If not, output from /proc/meminfo and /proc/slabinfo could help to tell you where the memory has gone.
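
For anyone wanting to script the /proc/meminfo check suggested above, here is a minimal userspace sketch in C. It assumes the usual `Key:   value kB` layout of that file. The function name `meminfo_kb` is hypothetical and is not part of the kernel or OpenAFS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look up `key` (e.g. "MemFree") in /proc/meminfo-style text and return
 * its value in kB, or -1 if the key is absent or has no number after it.
 * ASSUMPTION: each entry is on its own line as "Key:   <number> kB".
 */
long meminfo_kb(const char *text, const char *key)
{
    size_t klen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    klen = strlen(key);

    while (line && *line) {
        /* The key must start the line and be followed directly by ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long val;

            /* %ld skips the padding spaces before the number. */
            if (sscanf(line + klen + 1, "%ld", &val) == 1)
                return val;
            return -1;
        }
        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

Reading /proc/meminfo into a buffer and calling `meminfo_kb(buf, "MemFree")` or `meminfo_kb(buf, "Slab")` at intervals would show whether free memory shrinks or slab usage grows as the writes accumulate.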
However, there would almost always be several reads of the same file between successive writes. (Again, in an 'open(); read(); close();' fashion) But they are probably all happening very quickly; I assume the cache for the stuff in this file is thrashing.

Well, the cache could be thrashing but still you'll get ENOMEM only if the kernel cannot find enough memory to pull in a page you are writing to. And that should not happen unless the machine has real problems. My personal tip would be that your code leaks some memory (or reference or so) and thus kernel really gets out of memory after enough reading / writing...
To be clear, I mean the OpenAFS cache is thrashing, not kernel memory caches et al... I just meant to say that this particular file is getting written to and read from a lot.

Here is output from kmem -i

crash> kmem -i
              PAGES        TOTAL      PERCENTAGE
 TOTAL MEM  4089940      15.6 GB         ----
      FREE  1587155       6.1 GB   38% of TOTAL MEM
      USED  2502785       9.5 GB   61% of TOTAL MEM
    SHARED  1591925       6.1 GB   38% of TOTAL MEM
   BUFFERS    34087     133.2 MB    0% of TOTAL MEM
    CACHED  1779819       6.8 GB   43% of TOTAL MEM
      SLAB   166166     649.1 MB    4% of TOTAL MEM

TOTAL HIGH        0            0    0% of TOTAL MEM
 FREE HIGH        0            0    0% of TOTAL HIGH
 TOTAL LOW  4089940      15.6 GB  100% of TOTAL MEM
  FREE LOW  1587155       6.1 GB   38% of TOTAL LOW

TOTAL SWAP  8389936        32 GB         ----
 SWAP USED      382       1.5 MB    0% of TOTAL SWAP
 SWAP FREE  8389554        32 GB   99% of TOTAL SWAP

Seems like we have enough memory. Do you know why we could be getting an ENOMEM at all? Is there anything in an ext2/3 write that could require allocating a lot of memory?

Do you know why an ENOMEM could be generated with this much free memory available? I don't know if there's some limit for FS/VFS-related memory that is possible to hit, or perhaps a certain type of memory is needed that is not available... etc.

The way we get the dentry for the file in question is via s_export_op->fh_to_dentry(), we get the file via dentry_open(), and write to it via filp->f_op->write(). Is there anything 'bad' or unsupported that we're doing with that sequence of calls that could contribute to this?

Thanks,
Cameron
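
As a sanity check on the kmem -i figures above, the page counts can be converted to bytes and percentages by hand. The sketch below assumes 4 KiB pages (which is what the TOTAL MEM row implies: 4089940 pages at 4096 bytes is about 15.6 GiB) and assumes the percentages are truncated rather than rounded, which matches the rows shown. The names `ASSUMED_PAGE_SIZE`, `pages_to_bytes` and `pct_of` are hypothetical helpers, not part of crash.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 4 KiB pages, inferred from the TOTAL MEM row. */
#define ASSUMED_PAGE_SIZE 4096ULL

/* Convert a page count from the PAGES column into bytes. */
uint64_t pages_to_bytes(uint64_t pages)
{
    return pages * ASSUMED_PAGE_SIZE;
}

/*
 * Integer percentage of `part` relative to `total`, truncated toward
 * zero (ASSUMPTION: this is how the PERCENTAGE column was derived).
 */
unsigned pct_of(uint64_t part, uint64_t total)
{
    assert(total > 0);
    return (unsigned)(part * 100 / total);
}
```

With these, `pct_of(1587155, 4089940)` gives 38 for the FREE row and `pct_of(2502785, 4089940)` gives 61 for the USED row, so the table is internally consistent and roughly 6 GiB really was free at the time of the dump.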