https://bugzilla.novell.com/show_bug.cgi?id=413842
User schueffler@softgarden.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=413842#c25
--- Comment #25 from Stefan Schueffler 2008-08-14 01:56:12 MDT ---
Hi,
in the meantime I have got a working setup by applying a patch to
/usr/src/linux/arch/x86/mm/pageattr-xen.c
Looking further into the details of set_memory_uc(), I found that the problem
is in the file pageattr-xen.c:
in set_memory_uc() {
(line 1105): change_page_attr_set()
in change_page_attr_set() {
(line 1094): change_page_attr_set_clr()
in change_page_attr_set_clr() {
...
(line 1080)
	/*
	 * On success we use clflush, when the CPU supports it to
	 * avoid the wbindv. If the CPU does not support it and in the
	 * error case we fall back to cpa_flush_all (which uses
	 * wbindv):
	 */
	if (!ret && cpu_has_clflush)
		cpa_flush_range(addr, numpages, cache);
	else
		cpa_flush_all(cache);
On my CPU, cpu_has_clflush evaluates to true, so only the affected range of
pages is flushed instead of the whole cache.
-> in cpa_flush_range() {
..
(line 449)
for (i = 0, addr=start; i < numpages; i++..) {
..
if (pte && ...) -> true
(line 456) clflush_cache_range
This works fine for some pages (while loading other drivers etc.), but
eventually (while loading the mpt driver) it stops working. The driver tries
to flush 32 pages of 4096 bytes each, starting at offset 18446675904257851392.
clflush_cache_range flushes in blocks of
64 bytes. For the first and the second page this is ok, but on the third
4096-byte page the CPU IERR occurs and the system hangs.
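To put numbers on the range above, here is a rough sketch in C. It assumes only the figures given in this comment (4096-byte pages, 64-byte flush blocks, 32 pages); the helper names lines_in_range() and page_start() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Figures taken from the description above, not from the kernel source. */
#define PAGE_BYTES 4096u
#define LINE_BYTES 64u	/* block size used by the range flush */

/* Hypothetical helper: number of 64-byte blocks in a range of pages. */
uint64_t lines_in_range(uint64_t numpages)
{
	return numpages * (PAGE_BYTES / LINE_BYTES);
}

/* Hypothetical helper: start address of the n-th page (0-based) of a range. */
uint64_t page_start(uint64_t base, uint64_t page_index)
{
	/* This sketch only handles a base that is aligned to the block size. */
	assert(base % LINE_BYTES == 0);
	return base + page_index * PAGE_BYTES;
}
```

With base = 18446675904257851392, which is 0xffffc200100c0000 in hex, lines_in_range(32) gives 2048 blocks, and page_start(base, 2) gives 0xffffc200100c2000 as the start of the third page, where the hang shows up.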
My workaround now disables the range flushing entirely and always flushes the
whole cache, by just commenting out the "if (!ret && cpu_has_clflush)" and the
corresponding if-branch, and thus always using "cpa_flush_all(cache);"
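The effect of the workaround can be sketched in userspace C as follows. This is not the actual patch: the enum and the two choose_flush_*() functions are hypothetical stand-ins that only model which of the two kernel calls would be reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Which flush path is taken; stands in for the real kernel calls. */
enum flush_path { FLUSH_RANGE, FLUSH_ALL };

/* Original logic as quoted above: range flush only on success with clflush. */
enum flush_path choose_flush_original(int ret, bool has_clflush)
{
	if (!ret && has_clflush)
		return FLUSH_RANGE;	/* cpa_flush_range(addr, numpages, cache) */
	return FLUSH_ALL;		/* cpa_flush_all(cache) */
}

/* Workaround: the condition and its if-branch are gone. */
enum flush_path choose_flush_workaround(int ret, bool has_clflush)
{
	(void)ret;		/* no longer consulted */
	(void)has_clflush;	/* no longer consulted */
	return FLUSH_ALL;	/* always cpa_flush_all(cache) */
}
```

On a CPU with clflush and a successful attribute change, the original variant picks the range flush, while the workaround variant picks the full flush in every case.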
As a result, my setup boots, and RAID hard disk access works.
Of course, the penalty of this patch is that we give up any potential
performance improvement from flushing only the affected pages instead of the
whole cache.
I do not know whether clflush_cache_range is supposed to work with the given
offset (and thus the error is inside the implementation of
clflush_cache_range), or whether the offset given by the mpt driver is invalid
in the first place...
If you have any idea how I can evaluate that, or how I can provide further
info to eliminate this buggy behaviour, just give me a hint on what to look
for.
Do you think that I should nevertheless try the new Xen on the old 10.3 and
vice versa?
Apart from this bug, I now have problems setting up network bonding. During
normal boot, I can set up bonding by editing the sysconfig network config
files, as the YaST bond setup does not work (it just does not show my slave
interfaces), but when booting Xen I cannot get bonding working at all. But I
guess I have to open another bug report...