Comment # 8 on bug 1180917 from
(In reply to LTC BugProxy from comment #7)

That's a whole lot of questions, some of which need more explanation.

> ------- Comment From geraldsc@de.ibm.com 2021-01-15 13:02 EDT-------
> (In reply to comment #9)
> > We are using kernel 5.10.7 in the latest Tumbleweed version.
> > The latest iso image is available under:
> > https://download.opensuse.org/ports/zsystems/tumbleweed/iso/
> 
> Hmm, it says "5.10.5-1-default" in the kernel BUG output. In order to match
> the given line 2144 from "mm/huge_memory.c:2144" and to find the
> corresponding kernel code, a matching kernel source would be needed.

The kernel used for the builds is a special case. It originates from
Tumbleweed, but the build system allows substituting special kernel versions,
so it is not automatically updated to the latest version. The version string
that you see is accurate.
> 
> Is there any other means of kernel source access for openSUSE Tumbleweed,
> ideally a git repo like for SLES? Seems hard to believe that "open"SUSE
> kernel source is harder to find / access than SLES code...

It is not hard to find at all. All you need to know is that the different
kernel flavors all depend on a central package called kernel-source, which
has its own mechanism for integrating patches depending on a variety of
conditions. The source can be found in the package:
http://download.opensuse.org/ports/zsystems/tumbleweed/repo/oss/noarch/kernel-source-5.10.5-1.1.noarch.rpm

I downloaded this in case it gets overwritten and is no longer as easily
available. Note that one can always rebuild an older version of a package,
because OBS does not throw away sources.

> Anyway, SUSE developers surely have such access, and since this is BUG
> statement in common memory management code anyway, I would suggest to let
> one of the corresponding SUSE developers have a look first.

That is the reason why the assignee is the openSUSE Kernel Developers.

> BTW, some information that might help is the fact(?) that THP worked fine on
> s390 with Tumbleweed, at least for some very short time, when verifying the
> other THP fix in LTC bug#184202 / SUSE bug#1163684. IIUC, then it was
> verified there with 5.9.11, but that is not 100% clear to me from the other
> bugzilla. 

Now that is an interesting question as well. We could never reliably reproduce
the behavior; it is more of a statistical experience. My feeling is that the
kernel at least worked for some time.

One thing that is also a little strange is that now only one process leads to
issues: cc1plus. On the other hand, the compile process is one of the largest
processes (in terms of memory) to be found. Often enough, simply restarting
the build makes it succeed.

> Please verify on which kernel version it worked fine the last
> time. Then, with having access to some proper source repo (and not just an
> ISO), one might be able to see what was changed in between and with regard
> to THP, maybe madvise.

The changes can be found in the changelog of the rpm. This is the reliable
source for knowing what has changed and when. The changelog can be shown with
rpm (rpm -q --changelog ...) and is also kept next to the spec file in a file
with a .changes extension.
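As a rough sketch of how the changelog could be narrowed down to THP-related
entries: the helper name thp_changes, the keyword list, and the package name
kernel-default are my assumptions and not part of the original comment, so
adjust them as needed.

```shell
# Hypothetical helper: keep only changelog lines that mention
# THP-related keywords (the keyword list is an assumption).
thp_changes() {
    grep -i -E 'thp|huge|madvise'
}

# Example usage (assumes a package named kernel-default is installed):
#   rpm -q --changelog kernel-default | thp_changes
```

This matches single lines only, so the date and author header of an entry is
not printed along with it.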

With regard to the sources, you can get the sources for the kernel-source
package with the command:

osc co -r 77cf39676446e7f7aa15ea53ef337b64 openSUSE:Factory:zSystems kernel-source

The config is found in the file config.tar.bz2 within. The definition of which
patch is applied in which case is found in the series.conf file.
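To get a quick overview of the patches listed in series.conf, something like
the following might help. It is only a sketch: the helper name
list_series_patches is made up, and it assumes that each entry carries one
patch path per line, optionally preceded by a guard, with "#" starting a
comment.

```shell
# Hypothetical helper: print the patch paths listed in a series.conf file.
# Assumes "#" starts a comment and the patch path is the last field on a line.
list_series_patches() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1" | awk '{print $NF}'
}

# Example usage from within the checked-out package directory:
#   list_series_patches series.conf | grep -i huge
```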

Would it be helpful to temporarily add an extra kernel parameter for testing?
With boo#1163684 it was very helpful to have a reliable build environment. I
know that this kind of issue is hard to debug and hard to find. However, I also
believe that it is vital to find it before it hits customers with enterprise
distributions. This case hits even less often than boo#1163684, but that does
not help those who are hit.

