Currently, domU live migration between two hosts is broken. The guest's system clock (implemented in arch/i386/kernel/time-xen.c) is pulled directly from Xen, so the time will typically change significantly when a guest moves between two Xen instances. This worked fine in older versions of the Xen patches, but the 2.6.22 patches have replaced the old do_gettimeofday implementation with the GENERIC_TIME implementation, which gets the time from the new xen_clocksource_read function. The problem is that xen_clocksource_read makes sure that time never goes backwards: it returns the new time if it went forwards, or the previous reading if time appears to have gone backwards. So when the system time jumps backwards during a move to a new physical machine, time suddenly stops.

I've worked around the issue for now by replacing time-xen.c with an older version from Red Hat's 2.6.21 Xen kernel and disabling GENERIC_TIME. The attached patch does this for i386, but it leaves x86_64 broken since I haven't gotten to that yet.
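To make the failure mode concrete, here is a minimal user-space sketch of the clamping behaviour described above. It is not the actual xen_clocksource_read code; the struct and function names are hypothetical, and the hypervisor time is simply passed in as a parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: the last value handed out to callers, in ns. */
struct clamped_clock {
    uint64_t last_ns;
};

/*
 * Clamped read: return the hypervisor's time if it moved forwards,
 * otherwise repeat the previous reading.
 */
uint64_t clamped_clock_read(struct clamped_clock *c, uint64_t hypervisor_ns)
{
    if (hypervisor_ns < c->last_ns)
        return c->last_ns;          /* time appears to go backwards */

    c->last_ns = hypervisor_ns;
    return hypervisor_ns;
}

/* Usage: a migration to a host whose clock is behind freezes the guest's time. */
void demo_frozen_clock(void)
{
    struct clamped_clock c = { 0 };

    assert(clamped_clock_read(&c, 1000) == 1000);  /* old host */
    assert(clamped_clock_read(&c, 400) == 1000);   /* new host is behind */
    assert(clamped_clock_read(&c, 900) == 1000);   /* still frozen */
    assert(clamped_clock_read(&c, 1001) == 1001);  /* new host caught up */
}
```

In this sketch the returned time stays at 1000 until the new host's clock passes that value, which matches the "time stops" symptom seen after migration.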
Hmm, indeed, that is a case where insisting on monotonicity is a problem, but ...
For kicks I tried letting xen_clocksource_read return a time in the past, but that caused the kernel to get lost in an endless loop somewhere. Perhaps the system time should be kept locally within the kernel instead of being pulled directly from Xen, and updated during the timer tick's wall clock update. Then xen_clocksource_read would use that time plus the change in TSC since the last tick instead of Xen's time.
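The proposal above could look roughly like the following user-space sketch. Everything here is an assumption made for illustration: the struct, the function names, the fixed TSC frequency, and the re-anchoring step after migration are hypothetical and do not correspond to existing kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical locally kept clock, refreshed on every timer tick. */
struct local_clock {
    uint64_t base_ns;   /* system time recorded at the last tick */
    uint64_t base_tsc;  /* TSC value sampled at the same moment */
    uint64_t tsc_khz;   /* assumed constant TSC frequency, in kHz */
};

/* Convert a TSC delta to ns; may overflow for very large deltas. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t tsc_khz)
{
    assert(tsc_khz != 0);
    return cycles * 1000000ULL / tsc_khz;
}

/* Read path: local base plus the TSC change since the last tick. */
uint64_t local_clock_read(const struct local_clock *lc, uint64_t tsc_now)
{
    return lc->base_ns + cycles_to_ns(tsc_now - lc->base_tsc, lc->tsc_khz);
}

/* Tick path: fold the elapsed TSC into the base and restart the delta. */
void local_clock_tick(struct local_clock *lc, uint64_t tsc_now)
{
    lc->base_ns = local_clock_read(lc, tsc_now);
    lc->base_tsc = tsc_now;
}

/*
 * After migration the new host's TSC and frequency are unrelated to the
 * old ones, so re-anchor the TSC reference without touching base_ns.
 */
void local_clock_resume(struct local_clock *lc, uint64_t tsc_now, uint64_t tsc_khz)
{
    lc->base_tsc = tsc_now;
    lc->tsc_khz = tsc_khz;
}
```

With this layout the value seen by readers never depends on the hypervisor's absolute time, so a backwards jump on the new host would not be visible; the cost is that the local clock can drift from Xen's time unless the tick path also corrects it.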
... that is exactly why monotonicity is required: there is a calculation somewhere in generic code (I debugged this a while back, but don't recall the details without in-depth checking) that creates huge positive timeouts when time stamps move a tiny bit backwards. Since, as said, this is in generic code that I understand only halfway at best, it didn't seem reasonable to change the behavior there.

Going back to the non-generic-time handling is not really a solution here. Instead, I'm already feeling quite nervous about GENERIC_CLOCKEVENTS & Co. being suppressed for Xen (though I don't think changing this would help with the issue you raise); I simply haven't had a chance to educate myself enough about this new time infrastructure.

So I think the issue should rather be taken care of in the resume path, where (without having checked the code) I assume generic code is capable of dealing with time moving backwards.

Jan
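As an illustration of how a tiny backwards step can turn into a huge positive value, here is a small sketch of unsigned time-stamp subtraction. The thread does not identify the actual calculation in generic code, so this only shows one plausible mechanism; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical elapsed-time helper using unsigned arithmetic. */
uint64_t elapsed_unsigned_ns(uint64_t now_ns, uint64_t last_ns)
{
    return now_ns - last_ns;    /* wraps around if now_ns < last_ns */
}

/* Same subtraction, reinterpreted as signed so a step back is negative. */
int64_t elapsed_signed_ns(uint64_t now_ns, uint64_t last_ns)
{
    return (int64_t)(now_ns - last_ns);
}

/* Usage: a 1 ns step backwards becomes the largest 64-bit value. */
void demo_backwards_step(void)
{
    assert(elapsed_unsigned_ns(999, 1000) == UINT64_MAX);
    assert(elapsed_signed_ns(999, 1000) == -1);
}
```

Any timeout derived from the unsigned result would be enormous, which would be consistent with the endless-loop symptom reported when the clocksource was allowed to return a time in the past.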