[Bug 1206359] New: kernel 6.1: random build failures: "error: <random package>: install failed"

newer
[Bug 1205597] ltp:pty03:slcan:...

older
[Bug 1205149] New:...

bugzilla_noreply＠suse.com

13 Dec 2022 13 Dec '22

09:31

https://bugzilla.suse.com/show_bug.cgi?id=1206359 Bug ID: 1206359 Summary: kernel 6.1: random build failures: "error: <random package>: install failed" Classification: openSUSE Product: openSUSE Tumbleweed Version: Current Hardware: Other OS: Other Status: NEW Severity: Normal Priority: P5 - None Component: Kernel Assignee: kernel-bugs@opensuse.org Reporter: jslaby@suse.com QA Contact: qa-bugs@suse.de CC: dimstar@opensuse.org, jack@suse.com, rgoldwyn@suse.com Found By: --- Blocker: --- kernel 6.1 is submitted to openSUSE:Factory:Staging:J. There, a lot of packages randomly fail. It happens usually at the installation phase and looks like (build rpm):

...

[ 12s] libdb-4_8-4.8.30-41.81 ######################################## [ 12s] error: libdb-4_8-4.8.30-41.81.x86_64: install failed [ 12s] perl-5.36.0-5.47 ######################################## [ 13s] liblsan0-12.2.1+git537-2.20 ########################################

Note that the failure is intermittent. Only some package installations fail (here libdb only). Note this can happen later in the loop (samba rpm):

...

error: samba-libs-4.17.3+git.283.2157972742b-3.2.x86_64: install failed ... [ 2288s] failed to install rpms, aborting build ... [ 2288s] [0;1;38;5;185mFailed to write wtmp record, ignoring: Read-only file system[0m [ 2288s] [0;1;38;5;185mFailed to write utmp record: Read-only file system[0m

I'm currently out of ideas. It looks like the VM image becomes RO and then RW randomly. The image is ext4. I cannot reproduce locally with --kvm-type=kvm. Any hint anyone? -- You are receiving this mail because: You are the assignee for the bug.

Attachments:

attachment.html (text/html — 3.5 KB)

Show replies by date

bugzilla_noreply＠suse.com

13 Dec 13 Dec

09:32

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c1 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |needinfo?(dimstar@opensuse. | |org) --- Comment #1 from Jiri Slaby --- And yet, it looks like host-specific. lamb and old-cirrus machines seem to be affected only. I don't know what's their configuration though. Maybe Dominique can shed some light? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

10:25

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c2 Dominique Leuenberger changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |adrian.schroeter@suse.com Flags|needinfo?(dimstar@opensuse. |needinfo?(adrian.schroeter@ |org) |suse.com) --- Comment #2 from Dominique Leuenberger --- (In reply to Jiri Slaby from comment #1)

...

And yet, it looks like host-specific. lamb and old-cirrus machines seem to be affected only. I don't know what's their configuration though. Maybe Dominique can shed some light?

Not really - except if you can do something out of the information provided by

...

osc workerinfo x86_64:lamb02:1

I'm sure Adrian is better suited to answer the original question:

...

And yet, it looks like host-specific. lamb and old-cirrus machines seem to be > affected only. I don't know what's their configuration though. Maybe Dominique > can shed some light?

-- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

13:12

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c3 --- Comment #3 from Jan Kara --- BTW, can we get kernel logs from the build in OBS if there's anything interesting? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

15:56

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c4 Marcus R�ckert changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |mrueckert@suse.com --- Comment #4 from Marcus R�ckert --- Bad run: ``` [ 11s] D: ========== +++ libdb-4_8-4.8.30-41.81 x86_64-linux 0x0 [ 11s] D: install: libdb-4_8-4.8.30-41.81.x86_64 has 2 files [ 11s] D: Plugin: calling hook psm_pre in selinux plugin [ 11s] libdb-4_8-4.8.30-41.81 D: create 100755 1 ( 0, 0)1595712 /usr/lib64/libdb-4.8.so;63989ef2 [ 11s] D: create 100755 1 ( 0, 0)1743256 /usr/lib64/libdb_cxx-4.8.so;63989ef2 [ 11s] D: ========== Directories not explicitly included in package: [ 11s] D: 0 /usr/lib64/ [ 11s] D: ========== [ 11s] ######################################## [ 11s] D: Plugin: calling hook fsm_file_prepare in fsverity plugin [ 11s] D: fsverity no signature for: path /usr/lib64/libdb-4.8.so;63989ef2 dest /usr/lib64/libdb-4.8.so [ 11s] D: Plugin: calling hook fsm_file_prepare in ima plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in selinux plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in fsverity plugin [ 11s] D: fsverity no signature for: path /usr/lib64/libdb_cxx-4.8.so;63989ef2 dest /usr/lib64/libdb_cxx-4.8.so [ 11s] D: Plugin: calling hook fsm_file_prepare in ima plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in selinux plugin [ 11s] D: adding "libdb-4_8" to Name index. [ 11s] D: adding 2 entries to Basenames index. [ 11s] D: adding "System/Libraries" to Group index. [ 11s] D: adding 25 entries to Requirename index. [ 11s] D: adding 5 entries to Providename index. [ 11s] D: adding 1 entries to Dirnames index. [ 11s] D: adding 1 entries to Installtid index. [ 11s] D: adding 1 entries to Sigmd5 index. [ 11s] D: adding "39322b96c5751e2783f0d4fc0307783ae06e355e" to Sha1header index. [ 11s] D: Plugin: calling hook psm_post in syslog plugin [ 11s] error: libdb-4_8-4.8.30-41.81.x86_64: install failed ``` a good package right before this one ``` [ 11s] D: ========== +++ libasan8-12.2.1+git537-2.20 x86_64-linux 0x0 [ 11s] D: install: libasan8-12.2.1+git537-2.20.x86_64 has 2 files [ 11s] D: Plugin: calling hook psm_pre in selinux plugin [ 11s] libasan8-12.2.1+git537-2.20 D: create 120777 1 ( 0, 0) 16 /usr/lib64/libasan.so.8;63989ef2 [ 11s] D: create 100755 1 ( 0, 0)1307120 /usr/lib64/libasan.so.8.0.0;63989ef2 [ 11s] D: ========== Directories not explicitly included in package: [ 11s] D: 0 /usr/lib64/ [ 11s] D: ========== [ 11s] ######################################## [ 11s] D: Plugin: calling hook fsm_file_prepare in fsverity plugin [ 11s] D: fsverity skipping non regular: path /usr/lib64/libasan.so.8;63989ef2 dest /usr/lib64/libasan.so.8 [ 11s] D: Plugin: calling hook fsm_file_prepare in ima plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in selinux plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in fsverity plugin [ 11s] D: fsverity no signature for: path /usr/lib64/libasan.so.8.0.0;63989ef2 dest /usr/lib64/libasan.so.8.0.0 [ 11s] D: Plugin: calling hook fsm_file_prepare in ima plugin [ 11s] D: Plugin: calling hook fsm_file_prepare in selinux plugin [ 11s] D: adding "libasan8" to Name index. [ 11s] D: adding 2 entries to Basenames index. [ 11s] D: adding "Development/Languages/C and C++" to Group index. [ 11s] D: adding 21 entries to Requirename index. [ 11s] D: adding 3 entries to Providename index. [ 11s] D: adding 1 entries to Conflictname index. [ 11s] D: adding 1 entries to Dirnames index. [ 11s] D: adding 1 entries to Installtid index. [ 11s] D: adding 1 entries to Sigmd5 index. [ 11s] D: adding "2973d96fce735d6078efd3998891e1a7d8e1f98a" to Sha1header index. [ 11s] D: %post(libasan8-12.2.1+git537-2.20.x86_64): scriptlet start [ 11s] D: %post(libasan8-12.2.1+git537-2.20.x86_64): execv(/sbin/ldconfig) pid 957 [ 11s] D: Plugin: calling hook scriptlet_fork_post in prioreset plugin [ 11s] D: Plugin: calling hook scriptlet_fork_post in selinux plugin [ 11s] D: %post(libasan8-12.2.1+git537-2.20.x86_64): waitpid(957) rc 957 status 0 [ 11s] D: Plugin: calling hook scriptlet_post in syslog plugin [ 11s] D: Plugin: calling hook psm_post in syslog plugin ``` -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

16:15

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c5 --- Comment #5 from Marcus R�ckert --- so from the logs it looks like on the newer kernel rpm sometimes fails to launch the processes for the post scriptlets. sadly without any useful error. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

17:34

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c6 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |mls@suse.com Flags| |needinfo?(mls@suse.com) --- Comment #6 from Jiri Slaby --- (In reply to Marcus R�ckert from comment #5)

...

so from the logs it looks like on the newer kernel rpm sometimes fails to launch the processes for the post scriptlets. sadly without any useful error.

If post hooks failed, that would dump: Plugin %s: hook psm_post failed But that didn't happen. So post hooks likely finished w/o error. It's something done after post hooks, right? Michael? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

17:40

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c7 --- Comment #7 from Jiri Slaby --- (In reply to Jiri Slaby from comment #6)

...

(In reply to Marcus R�ckert from comment #5)

...
so from the logs it looks like on the newer kernel rpm sometimes fails to launch the processes for the post scriptlets. sadly without any useful error.

If post hooks failed, that would dump: Plugin %s: hook psm_post failed

But that didn't happen. So post hooks likely finished w/o error. It's something done after post hooks, right? Michael?

Or even before post hooks. Post hooks are run despite errors and nor affect the return status from rpmpsmRun(). The question is why rpm does not note anything about the failure in the debug log at all? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

17:44

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c8 --- Comment #8 from Jiri Slaby --- (In reply to Jan Kara from comment #3)

...

BTW, can we get kernel logs from the build in OBS if there's anything interesting?

Adding to this, if we run debug runs, can we increase the kernel debug level? Currently, /usr/lib/build/build-vm does: echo 4 > /proc/sysrq-trigger so "echo 8" might show us more. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

18:00

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c9 --- Comment #9 from Marcus R�ckert --- we ran with 7 already and it didnt give us additional details. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

18:31

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c10 --- Comment #10 from Dominique Leuenberger --- (In reply to Marcus R�ckert from comment #5)

...

so from the logs it looks like on the newer kernel rpm sometimes fails to launch the processes for the post scriptlets. sadly without any useful error.

From a different build: https://build.opensuse.org/package/live_build_log/openSUSE:Factory:Staging:J... [ 18s] python-rpm-macros-20221117.8687578-2.1######################################## [ 18s] error: python-rpm-macros-20221117.8687578-2.10.noarch: install failed python-rpm-macros does not even have scriptlets; so I think we're following a red herring with the scriptlets -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

14 Dec 14 Dec

08:45

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c11 --- Comment #11 from Jiri Slaby --- (In reply to Marcus R�ckert from comment #5)

...

so from the logs it looks like on the newer kernel rpm sometimes fails to launch the processes for the post scriptlets. sadly without any useful error.

Ah. So this (AFAICT) means: dbAdd() failed, because: * a lot of "adding X entries to XX index" happens inside dbAdd(). * the following runFileTriggers() and runImmedFileTriggers() would emit an error in case of error. * the following runInstScript(psm, RPMTAG_POSTIN) was not run for libdb and it should as in libasan8 above (both have "postinstall program: /sbin/ldconfig"). * then only psm_post is run regardless of previous errors. So it might be interesting to see which of these invocations in rpmdbAdd() fails (increases ret): for (int dbix = 0; dbix < db->db_ndbi; dbix++) { ... ret += idxdbPut(dbi, rpmtag, hdrNum, h); -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

08:55

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c12 --- Comment #12 from Jiri Slaby --- (In reply to Jiri Slaby from comment #11)

...

ret += idxdbPut(dbi, rpmtag, hdrNum, h);

One things where the above returns RPMRC_FAIL is when this fails: flock(pkgdb->fd, LOCK_EX); Let me check if there are any changes in file locking in between 6.0 and 6.1. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

09:24

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c13 --- Comment #13 from Jiri Slaby --- (In reply to Jiri Slaby from comment #12)

...

Let me check if there are any changes in file locking in between 6.0 and 6.1.

And there are none. Can anyone test with this patched rpm: https://build.opensuse.org/package/show/home:jirislaby:debug61/rpm ? There is a patch with many more rpmlog(RPMLOG_DEBUG, ...) to narrow what and how actually fails. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

10:19

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c14 --- Comment #14 from Jiri Slaby --- (In reply to Jiri Slaby from comment #13)

...

Can anyone test with this patched rpm: https://build.opensuse.org/package/show/home:jirislaby:debug61/rpm ? There is a patch with many more rpmlog(RPMLOG_DEBUG, ...) to narrow what and how actually fails.

I hit it once in home:jirislaby:debug61:rebuild and it points to rpmidxPutInternal()'s: addnewkey(idxdb, key, keyl, &keyoff) And that does pwrite(), mmap() and mremap() internally. I respinned the debug patch to narrow it further down. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

11:02

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c15 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |vbabka@suse.com Flags|needinfo?(adrian.schroeter@ |needinfo?(vbabka@suse.com) |suse.com), | |needinfo?(mls@suse.com) | --- Comment #15 from Jiri Slaby --- (In reply to Jiri Slaby from comment #14)

...

And that does pwrite(), mmap() and mremap() internally.

And the winner is: [ 30s] error: remapslot (223): mapmem(old=0xffffffffffffffff, osz=1056768, nsz=1060864, fl=3, fd=4, off=868352) failed 12 i.e. mremap fails with ENOMEM. Vlasta, any ideas? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

11:19

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c16 --- Comment #16 from Vlastimil Babka --- (In reply to Jiri Slaby from comment #15)

...

(In reply to Jiri Slaby from comment #14)

...
And that does pwrite(), mmap() and mremap() internally.

And the winner is: [ 30s] error: remapslot (223): mapmem(old=0xffffffffffffffff, osz=1056768, nsz=1060864, fl=3, fd=4, off=868352) failed 12

How exactly is that log related to mremap() syscall and its parameters? Can I assume the address and old/new sizes are 1:1 parameters? if 'fl=3' are flags of mremap() then going by tools/include/uapi/linux/mman.h it would be MREMAP_MAYMOVE | MREMAP_FIXED. With MREMAP_FIXED there should be also fifth parameter saying the new address, which I don't see here. Instead the "fd=4, off=868352" are not mremap() parameters.

...

i.e. mremap fails with ENOMEM. Vlasta, any ideas?

At least old=0xffffffffffffffff looks quite wrong, but it should result in EINVAL rather than ENOMEM. In general 6.1 includes the maple tree, so that would be my first suspect to look, but we should make sure what the mremap() parameters exactly are and ideally get a /proc/pid/smaps snapshot upon the failure. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

11:39

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c17 --- Comment #17 from Jiri Slaby --- (In reply to Vlastimil Babka from comment #16)

...

(In reply to Jiri Slaby from comment #15)

...
(In reply to Jiri Slaby from comment #14)

...
And that does pwrite(), mmap() and mremap() internally.

And the winner is: [ 30s] error: remapslot (223): mapmem(old=0xffffffffffffffff, osz=1056768, nsz=1060864, fl=3, fd=4, off=868352) failed 12

How exactly is that log related to mremap() syscall and its parameters? Can I assume the address and old/new sizes are 1:1 parameters?

No, sorry for being misleading. The source is: https://github.com/rpm-software-management/rpm/blob/master/lib/backend/ndb/r... so it's in fact: mremap(MAP_FAILED, 1056768, 1060864, MREMAP_MAYMOVE);

...

if 'fl=3' are flags of mremap() then going by

Those fl, fd and off are used as prot, fd and off resp. for mmap() -- i.e. not in this case.

...

At least old=0xffffffffffffffff looks quite wrong, but it should result in EINVAL rather than ENOMEM.

Ugh, right. So apparently some previous mmap() failed, i.e. returned -1 == MAP_FAILED.

...

In general 6.1 includes the maple tree, so that would be my first suspect to look, but we should make sure what the mremap() parameters exactly are and ideally get a /proc/pid/smaps snapshot upon the failure.

Let me catch also the *previous* mmap() giving MAP_FAILED which is put to oldaddr. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

12:23

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c18 --- Comment #18 from Jiri Slaby --- Created attachment 863524 --> https://bugzilla.suse.com/attachment.cgi?id=863524&action=edit smaps (In reply to Jiri Slaby from comment #17)

...

...
In general 6.1 includes the maple tree, so that would be my first suspect to look, but we should make sure what the mremap() parameters exactly are and ideally get a /proc/pid/smaps snapshot upon the failure.

Let me catch also the *previous* mmap() giving MAP_FAILED which is put to oldaddr.

smaps dump attached. I don't know where this "-1 as first argument" is from. slot->mapped (the old addr) is never assigned MAP_FAILED or -1. error: remapslot (232): mapped=0xffffffffffffffff shift=0 mapmem(old=0xffffffffffffffff, osz=1056768, nsz=1060864, prot=3, fd=4, off=868352) failed 12 -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

12:24

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c19 --- Comment #19 from Jiri Slaby --- (In reply to Jiri Slaby from comment #18)

...

smaps dump attached. I don't know where this "-1 as first argument" is from. slot->mapped (the old addr) is never assigned MAP_FAILED or -1.

And neither of previous mmap()s fail. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

13:04

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c20 --- Comment #20 from Jiri Slaby --- (In reply to Vlastimil Babka from comment #16)

...

At least old=0xffffffffffffffff looks quite wrong, but it should result in EINVAL rather than ENOMEM. In general 6.1 includes the maple tree, so that would be my first suspect to look

I have to go now, so for now, I came to a suspicion that this is memory corruption. That 0xffffffffffffffff must be written by something else. So could this mmap() kernel rewrite ended up in something like this in rpm: 1) slots = calloc(); (it's 16320B, so likely via sbrk(), I didn't check) 2) do a lot of mremap() and mmap(), some of them simply returns "slots" address from 1) again. 3) slot[X].mapped gets rewritten by a mistake due to 2) by a bunch of 0xff. 4) we see slot->mapped as -1 here. ? -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

13:26

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c21 --- Comment #21 from Michael Schr�der --- But wouldn't that lead to EINVAL because offset_in_page(addr) is not zero? That sounds more like your debug statement does not match what happens. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

13:30

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c22 --- Comment #22 from Jiri Slaby --- (In reply to Michael Schr�der from comment #21)

...

But wouldn't that lead to EINVAL because offset_in_page(addr) is not zero? That sounds more like your debug statement does not match what happens.

I might do some wrong mistake. Feel free to check the patch ;): https://build.opensuse.org/package/view_file/home:jirislaby:debug61/rpm/debu... -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

13:46

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c23 --- Comment #23 from Michael Schr�der --- Your debug statement is wrong because mapped gets overwritten with the mapmem() result. I.e. mapped is always MAP_FAILED and not the old address. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

15:32

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c24 Fabian Vogt changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |fvogt@suse.com --- Comment #24 from Fabian Vogt --- Created attachment 863532 --> https://bugzilla.suse.com/attachment.cgi?id=863532&action=edit straced-rpm.zst I was able to get rpm to fail inside strace -f. Full build log attached. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

19:05

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c25 --- Comment #25 from Jiri Slaby --- (In reply to Fabian Vogt from comment #24)

...

Created attachment 863532 [details] straced-rpm.zst

I was able to get rpm to fail inside strace -f. Full build log attached.

...

mmap2(NULL, 241664, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0xaa000) = 0xf767e000 mremap(0xf767e000, 241664, 245760, MREMAP_MAYMOVE) = 0xf767e000 mremap(0xf767e000, 245760, 249856, MREMAP_MAYMOVE) = 0xf767e000 mremap(0xf767e000, 249856, 253952, MREMAP_MAYMOVE) = 0xf767e000 mremap(0xf767e000, 253952, 258048, MREMAP_MAYMOVE) = 0xf767e000 mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory) mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory) mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory) mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory) mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory) mremap(0xf767e000, 258048, 262144, MREMAP_MAYMOVE) = -1 ENOMEM (Cannot allocate memory)

Looks interesting. -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

15 Dec 15 Dec

06:26

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c26 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #863524|0 |1 is obsolete| | --- Comment #26 from Jiri Slaby --- Created attachment 863539 --> https://bugzilla.suse.com/attachment.cgi?id=863539&action=edit smaps Smaps for this error (I fixed the "mapped" variable dump): error: remapslot (235): mapped=0x7f72f4e78000 shift=0 mapmem(old=0x7f72f4e78000, osz=1056768, nsz=1060864, prot=3, fd=4, off=868352) failed 12 -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

08:30

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c27 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- Attachment #863539|0 |1 is obsolete| | --- Comment #27 from Jiri Slaby --- Created attachment 863541 --> https://bugzilla.suse.com/attachment.cgi?id=863541&action=edit full log incl. smaps From another run (I dumped even more), IIUC:

...

mapmem(130): map=0x7ff717f7f000 (old=(nil) osz=0 nsz=1056768 prot=3 fd=4 off=868352 )

off in hex=0x0d4000 nsz in hex=0x102000 So mmap mapped this part of the file (Index.db): 0x0d4000-0x1d6000

...

mapmem(130): remap=0xffffffffffffffff (old=0x7ff717f7f000 osz=1056768 nsz=1060864 prot=3 fd=4 off=868352 )

Now it wants: 0x0d4000-0x1d7000 but fails:

...

error: remapslot (240): mapped=0x7ff717f7f000 shift=0 mapmem(old=0x7ff717f7f000, osz=1056768, nsz=1060864, prot=3, fd=4, off=868352) failed 12

The original mapping (mmap above):

...

7ff717f56000-7ff718081000 rw-s 000ab000 fd:00 656385 /usr/lib/sysimage/rpm/Index.db

This mapping provides: 0x0ab000-0x1d6000 BTW, this was requested earlier by another mmap:

...

mapmem(130): map=0x7ff717f56000 (old=(nil) osz=0 nsz=167936 prot=3 fd=4 off=700416 )

and was not released until now. The above mapping cannot be enlarged, because this vma follows:

...

7ff718082000-7ff718087000 rw-s 00047000 fd:00 656385 /usr/lib/sysimage/rpm/Index.db

So the question is why remap did not apply the MAYMOVE rule. Provided it's shared, it should create a new mapping (and preserve the old for the older mmap) and return it. Or not? (I don't know the rules here.) -- You are receiving this mail because: You are the assignee for the bug.

bugzilla_noreply＠suse.com

09:46

New subject: [Bug 1206359] kernel 6.1: random build failures: "error: <random package>: install failed"

https://bugzilla.suse.com/show_bug.cgi?id=1206359 https://bugzilla.suse.com/show_bug.cgi?id=1206359#c28 Jiri Slaby changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|kernel-bugs@opensuse.org |vbabka@suse.com --- Comment #28 from Jiri Slaby --- So I reverted: commit ca3d76b0aa808a06997297d123b66d17b81e5285 Author: Jakub Mat��na Date: Fri Jun 3 16:57:19 2022 +0200 mm: add merging after mremap resize in my home:jirislaby:debug61 tesbed. And it seems to fix the problem. I submitted to Staging to verify. The problem is I don't understand why it helps as vma merge in mremap happens only in the "if (vma_expandable(vma, new_len - old_len)) {" case. So the remap we have here should not be affected. Leaving up to Vlasta, I believe he knows far better than me. -- You are receiving this mail because: You are the assignee for the bug.

521

Age (days ago)

523

Last active (days ago)

List overview

Download

28 comments

1 participants

participants (1)

bugzilla_noreply＠suse.com

[Bug 1206359] New: kernel 6.1: random build failures: "error: <random package>: install failed"

tags

participants (1)