[Bug 390384] New: Dom0 kernel oops while processing aio operation
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c380514

Summary: Dom0 kernel oops while processing aio operation
Product: openSUSE 11.0
Version: Beta 2
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Xen
AssignedTo: cgriffin@novell.com
ReportedBy: kwolf@novell.com
QAContact: qa@suse.de
Found By: ---

Created an attachment (id=215278) --> (https://bugzilla.novell.com/attachment.cgi?id=215278)
/var/log/messages snippet

Another oops that occurred on my development box while debugging the tap:aio problems. Note that I selected Xen because it's a Dom0 kernel, but I can't exclude that it is a general kernel problem.

This one happened somewhere in the middle of a PV VM installation using tap:aio both for the virtual harddisk and the installation DVD iso. The IO requests of the guest are handled by tapdisk which in turn accesses the image file using Linux aio. Obviously something went wrong with one of these accesses.

This could be even directly related to bug #380514. The usual symptom I get there is an EIO return value from random aio operations (say, up to five failing operations per VM installation). In most cases it worked to simply repeat a failed request in tapdisk. In the logfile the oops is immediately following such a repeated request.

I've not been able to reproduce this yet. A snippet from /var/log/messages containing the stacktrace is attached.

-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=390384
User carnold@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c1
Charles Arnold
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c2
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c3
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c4
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c5
Jan Beulich
From the little analysis we were able to do this looks like a problem not specific to the Xen kernel. And even if it is, we'd need someone with much better AIO/direct-io knowledge to assist here.
https://bugzilla.novell.com/show_bug.cgi?id=390384
Lars Marowsky-Bree
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c6
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c7
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c8
--- Comment #8 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c9
--- Comment #9 from Jan Kara
From what I've understood from reading tap:aio sources do_cow_read() issues a
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c10
--- Comment #10 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c11
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c12
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c13
--- Comment #13 from Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c14
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c15
--- Comment #15 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c16
--- Comment #16 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c17
--- Comment #17 from Jan Beulich
> How have you found out the page is RO (I always like to learn how to find more info from the oops ;)?

Quite early in the oops there is a line like this

    PGD 1559067 PUD 175b067 PMD 18fb067 PTE 70ce165

which says that all upper page table levels have their write bit set, just the leaf entry (PTE) doesn't.
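To make the reading of that oops line concrete, here is a minimal sketch that checks the write bit in each entry. It assumes the standard x86 layout for the low flag bits of a page table entry (bit 0 present, bit 1 writable, bit 2 user, bit 5 accessed, bit 6 dirty, bit 8 global). The helper names are made up for illustration, and how a Xen kernel may reuse individual bits is not covered here.

```python
# Low flag bits of an x86 page table entry (assumed standard bit positions).
PTE_FLAGS = {
    0: "present",
    1: "writable",
    2: "user",
    5: "accessed",
    6: "dirty",
    8: "global",
}


def decode_flags(entry_hex):
    """Return the names of the known flag bits set in a page table entry."""
    value = int(entry_hex, 16)
    return [name for bit, name in PTE_FLAGS.items() if value & (1 << bit)]


def is_writable(entry_hex):
    """True if the write bit (bit 1) is set in the entry."""
    return bool(int(entry_hex, 16) & 0x2)


# Values copied from the oops line quoted in the comment.
oops_line = "PGD 1559067 PUD 175b067 PMD 18fb067 PTE 70ce165"
tokens = oops_line.split()
levels = dict(zip(tokens[0::2], tokens[1::2]))

for level, entry in levels.items():
    state = "writable" if is_writable(entry) else "read-only"
    print(level, entry, state, decode_flags(entry))
```

The three upper levels end in 0x067, where bit 1 is set, while the PTE ends in 0x165, where bit 1 is clear. That is the difference the comment points out.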
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c18
--- Comment #18 from Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c19
--- Comment #19 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c20
--- Comment #20 from Kevin Wolf
> Ah, indeed, I wasn't aware of them mapping pages for write requests readonly. But that shouldn't cause any problems - nothing should try to access these pages other than for reading in that path.

Thinking about it again, this _must_ cause problems in combination with my patch. As I wrote above, it issues a read when the write fails - and then of course we are in the wrong code path. So I think we can forget about the oops...

Anyway, there is still the question why I'm getting EIO in the first place.
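The pitfall described here, a retry that turns a failed write into a read, can be illustrated with a small sketch. This is not tapdisk code and not the patch under discussion. `Request`, `submit_with_retry` and `flaky_submit` are hypothetical names, and the convention of returning a negative errno on failure is an assumption made for the example. The point is only that a retry on EIO should resubmit the same request with its original operation type.

```python
import errno
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical I/O request; immutable so a retry cannot alter it."""
    op: str       # "read" or "write"
    offset: int
    size: int


def submit_with_retry(submit, req, max_retries=1):
    """Call submit(req); on -EIO, resubmit the very same request."""
    attempt = 0
    while True:
        result = submit(req)
        if result != -errno.EIO or attempt >= max_retries:
            return result
        attempt += 1


# Fake backend that fails the first call with EIO, then succeeds.
calls = []


def flaky_submit(req):
    calls.append(req.op)
    return -errno.EIO if len(calls) == 1 else req.size


result = submit_with_retry(flaky_submit, Request("write", 0, 4096))
print(result, calls)
```

Because the request object is passed through unchanged, both attempts are recorded as writes, so the retry never takes the read path against pages that were mapped read-only for a write.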
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c21
--- Comment #21 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c22
--- Comment #22 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c23
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c24
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User jack@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c25
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=390384
User kwolf@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c26
Kevin Wolf
https://bugzilla.novell.com/show_bug.cgi?id=390384
User meissner@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=390384#c27
--- Comment #27 from Marcus Meissner