Mailinglist Archive: opensuse-kernel (81 mails)

[opensuse-kernel] Re: Xen kernel bug: new GENERIC_TIME based code breaks live migration
  • From: "Michael Marineau" <mike@xxxxxxxxxxxx>
  • Date: Wed, 27 Feb 2008 21:18:14 -0800
  • Message-id: <c0526ddf0802272118x411a2660qd957ae331d8ebe3b@xxxxxxxxxxxxxx>
On Wed, Feb 27, 2008 at 9:15 PM, Michael Marineau <mike@xxxxxxxxxxxx> wrote:
Greetings,

So I've been trying to put together a new Xen kernel for Gentoo based on
the SUSE 2.6.22 Xen patches in 2.6.22.17-0.1. So far I've hit two bugs.
The easier of the two is below; I'll post the second bug in a separate
email since they are not related. Note that this is on i386; I haven't
even tried x86_64 yet.

Currently domU live migration between two hosts is broken. The guest's
system clock (implemented in arch/i386/kernel/time-xen.c) is pulled
directly from Xen, so time will typically change significantly when
moving between two Xen instances. In older versions of the Xen patches
this works fine, but these 2.6.22 patches have replaced the old
do_gettimeofday implementation with the GENERIC_TIME implementation,
which gets the time from the new xen_clocksource_read function. The
problem is that xen_clocksource_read makes sure that time never goes
backwards: it returns the new time if it went forwards, or the previous
read if time appears to have gone backwards. So when the system time
jumps backwards during a move to a new physical machine, time suddenly
stops, and it stays stopped until the new host's clock catches up with
the last value read. I've worked around the issue for now by replacing
time-xen.c with an older version from Red Hat's 2.6.21 Xen kernel and
disabling GENERIC_TIME. The attached patch does this for i386, but it
leaves x86_64 broken since I haven't gotten to that yet.
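
To make the failure mode concrete, here is a rough user-space sketch of
the clamping behaviour described above. It is not the code from
time-xen.c; the struct guest_clock, clamped_read and demo_clock_stall
names are made up for illustration, and the "host time" is just a field
that the demo sets by hand to mimic a migration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the guest clock; not taken from the Xen patches. */
struct guest_clock {
    uint64_t host_ns;  /* time as reported by whichever host we run on */
    uint64_t last_ns;  /* last value handed back to a caller */
};

/* Never return less than the previous read (the behaviour described above). */
static uint64_t clamped_read(struct guest_clock *c)
{
    if (c->host_ns < c->last_ns)
        return c->last_ns;      /* time appears to go backwards: repeat old value */

    c->last_ns = c->host_ns;
    return c->last_ns;
}

/* Walk through a migration to a host whose clock is behind the old one. */
static void demo_clock_stall(void)
{
    struct guest_clock c = { .host_ns = 1000, .last_ns = 0 };

    assert(clamped_read(&c) == 1000);   /* normal read on the old host */

    c.host_ns = 400;                    /* migrated: new host is 600 ns behind */
    assert(clamped_read(&c) == 1000);   /* clock is frozen */

    c.host_ns = 900;                    /* new host advances, still behind */
    assert(clamped_read(&c) == 1000);   /* still frozen */

    c.host_ns = 1001;                   /* new host finally passes the old value */
    assert(clamped_read(&c) == 1001);   /* time starts moving again */
}
```

In this toy model the guest sees no progress for as long as the new
host is behind, which matches the "time stops" symptom.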

For kicks I tried letting xen_clocksource_read return a time in the
past, but that caused the kernel to get stuck in an endless loop
somewhere. Perhaps the system time should be kept locally within the
kernel and updated during the timer tick's wall clock update, instead
of being pulled directly from Xen. Then xen_clocksource_read would use
that local time plus the change in TSC since the last tick instead of
Xen's time.
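
A minimal sketch of that idea, again as plain user-space C rather than
kernel code. Everything here is an assumption made for illustration:
the struct local_clock layout, the function names, a constant TSC
frequency given in kHz, and in particular the resync step after
migration (the TSC on the new host is unrelated to the old one, so the
stored TSC base would presumably have to be re-recorded on resume).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical locally kept clock; names and layout are illustrative only. */
struct local_clock {
    uint64_t base_ns;   /* system time recorded at the last timer tick */
    uint64_t base_tsc;  /* TSC value sampled at that same tick */
    uint64_t tsc_khz;   /* assumed constant TSC frequency, in kHz */
};

/* Timer tick: advance the local time and remember the TSC at this point. */
static void tick_update(struct local_clock *c, uint64_t tsc_now, uint64_t tick_ns)
{
    c->base_ns += tick_ns;
    c->base_tsc = tsc_now;
}

/* Read: local time at the last tick plus the TSC delta converted to ns. */
static uint64_t local_read(const struct local_clock *c, uint64_t tsc_now)
{
    assert(c->tsc_khz != 0);

    uint64_t delta_cycles = tsc_now - c->base_tsc;

    /* ns = cycles * 1e6 / kHz; fine for tick-sized deltas, overflows for huge ones */
    return c->base_ns + delta_cycles * 1000000ULL / c->tsc_khz;
}

/* Assumed resume hook: keep the local time, re-anchor to the new host's TSC. */
static void resync_after_migration(struct local_clock *c, uint64_t tsc_now)
{
    c->base_tsc = tsc_now;
}
```

Because the base time lives in the guest and only the TSC anchor is
replaced on resume, a read right after migration returns the same value
as the last tick instead of jumping backwards or freezing.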

And of course I was dumb and didn't attach my workaround patch; here it is.
--
Michael Marineau
Oregon State University
mike@xxxxxxxxxxxx