On Mon 03-05-10 16:05:35, Cameron Seader wrote:
On a standard write path, there isn't much ext2/3-specific code, and I don't see much potential for returning ENOMEM, especially with this much free memory. Looking at the generic code, generic_file_buffered_write has:

	if (unlikely(sigismember(&current->pending.signal, SIGKILL))) {
		/*
		 * Must not hang almost forever in D state in
		 * presence of sigkill and lots of ram/swap
		 * (think during OOM).
		 */
		status = -ENOMEM;
		break;
	}

So maybe this could be the path we are taking?
A look at the core makes it look very much to me like that is what it is (hooray). Can you confirm the following?
crash> print ((struct task_struct*)0xffff8103432a2080)->pending.signal
$5 = {
  sig = {256}
}
SIGKILL is 9, 9-1==8, and (1 & (256 >> 8)) == 1. So, if I'm reading sigismember correctly, yes, we have a SIGKILL pending. A little C test program confirms, but I'd like to get confirmation from someone who's actually used to the Linux kernel code :)

Yes, indeed it seems the process has SIGKILL pending.
I don't suppose there's any way to tell if this is caused by the OOM killer, is there? Any structures or something in the core I can analyze to see if it's been activated for some reason?

You would have messages about the OOM kill in the kernel log. So unless you see messages like "Out of Memory: Kill process ..." in the kernel log, it was not the OOM killer that sent the signal. Given the amount of free memory, I actually seriously doubt it was the OOM killer, but check the log to be sure.
								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR