[Bug 989176] New: Kernel 4.1.28 (from kernel:openSUSE-42.1 standard) iptables/iptabkes-batch hangs (SuSEfirewall2)
http://bugzilla.opensuse.org/show_bug.cgi?id=989176

Bug ID: 989176
Summary: Kernel 4.1.28 (from kernel:openSUSE-42.1 standard) iptables/iptabkes-batch hangs (SuSEfirewall2)
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Hardware: x86
OS: openSUSE 13.1
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: AxelKoellhofer@web.de
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Now I know that my installation/system is certainly not standard, so if this can not be reproduced on Leap 42.1, just ignore/close this report. However, if this problem is also present with openSUSE Leap 42.1 (or 13.2) at least there is some report for other users to add comments.

I am running 13.1 with kernel 4.1.X from Kernel:openSUSE-42.1/standard. After upgrading to latest release (4.1.28-1.1) the system became very slow and "top" showed a process "iptables-batch" eating up most of the available CPU.

So I disabled SuSEfirewall2 and rebooted the machine to investigate a little further. As expected, the problem was gone and starting SuSEfirewall2 manually hung at "SuSEfirewall2: Setting up rules from /etc/sysconfig/SuSEfirewall2 ..." with an iptables-batch process using 99% CPU.
As the changelog entries show a lot of changes to netfilter

patches.fixes/netfilter-arp_tables-simplify-translate_compat_table.patch
patches.fixes/netfilter-ip6_tables-simplify-translate_compat_table.patch
patches.fixes/netfilter-ip_tables-simplify-translate_compat_table-.patch
patches.fixes/netfilter-x_tables-add-and-use-xt_check_entry_offset.patch
patches.fixes/netfilter-x_tables-add-compat-version-of-xt_check_en.patch
patches.fixes/netfilter-x_tables-assert-minimum-target-size.patch
patches.fixes/netfilter-x_tables-check-for-bogus-target-offset.patch
patches.fixes/netfilter-x_tables-check-standard-target-size-too.patch
patches.fixes/netfilter-x_tables-do-compat-validation-via-translat.patch
patches.fixes/netfilter-x_tables-don-t-move-to-non-existent-next-r.patch
patches.fixes/netfilter-x_tables-don-t-reject-valid-target-size-on.patch
patches.fixes/netfilter-x_tables-fix-unconditional-helper.patch
patches.fixes/netfilter-x_tables-kill-check_entry-helper.patch
patches.fixes/netfilter-x_tables-make-sure-e-next_offset-covers-re.patch
patches.fixes/netfilter-x_tables-validate-all-offsets-and-sizes-in.patch
patches.fixes/netfilter-x_tables-validate-e-target_offset-early.patch
patches.fixes/netfilter-x_tables-validate-targets-of-jumps.patch
patches.fixes/netfilter-x_tables-xt_compat_match_from_user-doesn-t.patch

I tried to disable all entries in /etc/sysconfig/SuSEfirewall2, practically having a "no options set" file, but this did not change anything. Even just using an empty /etc/sysconfig/SuSEfirewall2 still hung SuSEfirewall2 start.

I am also running kernel 4.6.4 from Kernel:stable/standard without this problem and (as expected) the previous kernel 4.1.27 from Kernel:openSUSE-42.1/standard is also not affected.

Now I know that it might be a problem with the older versions of iptables/SuSEfirewall2 from 13.1, but as said before, if other users of 4.1.28 on newer versions of openSUSE experience the same behaviour, they might find this report here.

AK

P.S. I will try to install the same kernel on my other machine running 42.1 this evening and report back if the same problem also exists there.

--
You are receiving this mail because:
You are on the CC list for the bug.
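The netfilter entries listed in the report above can be pulled out of a kernel changelog mechanically. The following is a minimal sketch, not something posted in the bug: the function name `netfilter_patches` is hypothetical, and it assumes the changelog names patches in the `patches.fixes/netfilter-*.patch` form seen in this report.

```shell
#!/bin/sh
# Hypothetical helper: list the netfilter-related patch names mentioned in
# a kernel changelog. Reads changelog text on stdin and prints each
# matching patch path once, sorted.
netfilter_patches() {
    # -o prints only the matching part of each line, one match per line.
    grep -o 'patches\.fixes/netfilter-[^[:space:]]*\.patch' | sort -u
}

# Typical use (assumes the kernel-default package is installed):
#   rpm -q --changelog kernel-default | netfilter_patches
```

Comparing this list between a working and a hanging kernel package is one way to narrow down which changes to look at first.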
http://bugzilla.opensuse.org/show_bug.cgi?id=989176

Axel Köllhofer <AxelKoellhofer@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Kernel 4.1.28 (from         |Kernel 4.1.28 (from
                   |kernel:openSUSE-42.1        |kernel:openSUSE-42.1
                   |standard)                   |standard)
                   |iptables/iptabkes-batch     |iptables/iptables-batch
                   |hangs (SuSEfirewall2)       |hangs (SuSEfirewall2)
http://bugzilla.opensuse.org/show_bug.cgi?id=989176

Axel Köllhofer <AxelKoellhofer@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Kernel                      |Kernel
            Version|Final                       |Leap 42.1
            Product|openSUSE 13.1               |openSUSE Distribution
   Target Milestone|---                         |Leap 42.1
                 OS|openSUSE 13.1               |openSUSE 42.1
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c1

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tiwai@suse.com

--- Comment #1 from Takashi Iwai <tiwai@suse.com> ---
Yes, 4.1.28 seems to contain a few regressions. A test kernel with a partial revert is being built on OBS home:tiwai:bnc989084-2 repo. You can try this kernel once the build finishes.

Other than that, keep using the previous 4.1.27 kernel from Leap 42.1 for now.
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c2

--- Comment #2 from Axel Köllhofer <AxelKoellhofer@web.de> ---
I could reproduce the problem on openSUSE Leap 42.1.

The only additional thing I tried was to set

FW_USE_IPTABLES_BATCH="no"

in order to see if (by some stroke of luck) only the batch processing of iptables might be affected. Unfortunately, that only changed the process hanging from "iptables-batch" to "iptables" (no surprises there).

The respective iptables/iptables-batch process is completely stuck and can not be killed even with "kill -5" or "kill -9".

AK

P.S. Changed "Product" and "Version" to 42.1.
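A process that ignores "kill -9" is typically stuck inside the kernel, since signals are only acted on when the task returns to user space. As a rough sketch (not part of the original report), the scheduler state of such a process can be read from /proc/PID/status. The function name `proc_state` and its optional second argument, an alternative proc root for dry runs, are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the one-letter scheduler state of a process
# (e.g. R = running, D = uninterruptible sleep, S = interruptible sleep).
# Usage: proc_state PID [PROC_ROOT]
proc_state() {
    pid=$1
    proc_root=${2:-/proc}
    # The relevant line looks like "State:<TAB>R (running)".
    awk '/^State:/ { print $2; exit }' "$proc_root/$pid/status"
}
```

For example, `proc_state "$(pidof iptables-batch)"` would be expected to print R for a task that is spinning at 99% CPU in kernel code, assuming only one such process exists.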
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c3

--- Comment #3 from Axel Köllhofer <AxelKoellhofer@web.de> ---
(In reply to Takashi Iwai from comment #1)
> Yes, 4.1.28 seems to contain a few regressions. A test kernel with a
> partial revert is being built on OBS home:tiwai:bnc989084-2 repo. You can
> try this kernel once the build finishes.

Will do and report back ASAP.

AK
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c4

--- Comment #4 from Axel Köllhofer <AxelKoellhofer@web.de> ---
(In reply to Takashi Iwai from comment #1)
> partial revert is being built on OBS home:tiwai:bnc989084-2 repo. You can
> try this kernel once the build finishes.

I just downloaded that kernel

kernel-default-4.1.28-1.1.gae3ccbc.x86_64.rpm

read the bug report at

https://bugzilla.opensuse.org/show_bug.cgi?id=989084

and had a look at the changelog. Although the latter did not indicate any changes related to netfilter, I gave it a try. Unfortunately (and most likely not surprisingly to you) the problem is still there.

However, I am volunteering to test any packages you may provide in order to fix the netfilter-related regressions. Just give me a quick heads-up and I will download/install/test them.

Greetings,

AK
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c6

--- Comment #6 from Axel Köllhofer <AxelKoellhofer@web.de> ---
First of all, I used kernel-default-4.1.28-1.1.gcdbda6b.x86_64 from Kernel:openSUSE-42.1/standard on openSUSE Leap 42.1 x86_64 to reproduce the issue.

(In reply to Michal Kubeček from comment #5)
> When you reproduce the issue, could you try getting the contents of
> /proc/PID/stack (with "PID" replaced by PID of the stuck iptables or
> iptables-batch process) few times (5-10) and attach it here
I hope three times is enough, actually I do not only have to reboot after every test, the system also hangs on shut down and I have to do a hard reset.

Here are the respective outputs of

cat /proc/$(pidof iptables-batch)/stack

after trying to start SuSEfirewall2:

-----------------------------------------------------------------
[<ffffffff816660d9>] retint_kernel+0x1b/0x1d
[<ffffffff811a7e49>] vmap_page_range_noflush+0x279/0x390
[<ffffffffa0776160>] translate_table+0x720/0x820 [ip_tables]
[<ffffffffa07760fd>] translate_table+0x6bd/0x820 [ip_tables]
[<ffffffff811aa4ad>] vmalloc_node+0x4d/0x60
[<ffffffffa11e55ee>] xt_alloc_table_info+0xde/0x124 [x_tables]
[<ffffffffa11e55bd>] xt_alloc_table_info+0xad/0x124 [x_tables]
[<ffffffffa0776e91>] do_ipt_set_ctl+0x121/0x1df [ip_tables]
[<ffffffff8159b62e>] nf_setsockopt+0x3e/0x60
[<ffffffff815a9fcf>] ip_setsockopt+0x7f/0xa0
[<ffffffff8154c5af>] SyS_setsockopt+0x6f/0xd0
[<ffffffff816654f2>] system_call_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff
-----------------------------------------------------------------
[<ffffffff816660d9>] retint_kernel+0x1b/0x1d
[<ffffffff811a7e49>] vmap_page_range_noflush+0x279/0x390
[<ffffffffa067b16c>] translate_table+0x72c/0x820 [ip_tables]
[<ffffffffa067b0fd>] translate_table+0x6bd/0x820 [ip_tables]
[<ffffffff811aa4ad>] vmalloc_node+0x4d/0x60
[<ffffffffa11f85ee>] xt_alloc_table_info+0xde/0x124 [x_tables]
[<ffffffffa11f85bd>] xt_alloc_table_info+0xad/0x124 [x_tables]
[<ffffffffa067be91>] do_ipt_set_ctl+0x121/0x1df [ip_tables]
[<ffffffff8159b62e>] nf_setsockopt+0x3e/0x60
[<ffffffff815a9fcf>] ip_setsockopt+0x7f/0xa0
[<ffffffff8154c5af>] SyS_setsockopt+0x6f/0xd0
[<ffffffff816654f2>] system_call_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff
-----------------------------------------------------------------
[<ffffffff816660d9>] retint_kernel+0x1b/0x1d
[<ffffffff811a7e49>] vmap_page_range_noflush+0x279/0x390
[<ffffffffa067b160>] translate_table+0x720/0x820 [ip_tables]
[<ffffffffa067b0fd>] translate_table+0x6bd/0x820 [ip_tables]
[<ffffffff811aa4ad>] vmalloc_node+0x4d/0x60
[<ffffffffa11f85ee>] xt_alloc_table_info+0xde/0x124 [x_tables]
[<ffffffffa11f85bd>] xt_alloc_table_info+0xad/0x124 [x_tables]
[<ffffffffa067be91>] do_ipt_set_ctl+0x121/0x1df [ip_tables]
[<ffffffff8159b62e>] nf_setsockopt+0x3e/0x60
[<ffffffff815a9fcf>] ip_setsockopt+0x7f/0xa0
[<ffffffff8154c5af>] SyS_setsockopt+0x6f/0xd0
[<ffffffff816654f2>] system_call_fastpath+0x16/0x75
[<ffffffffffffffff>] 0xffffffffffffffff

> Also, what does "rpm -q iptables" say?

iptables-1.4.21-4.1.x86_64

AK
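The repeated /proc/PID/stack sampling requested in comment #5 could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `sample_stack`, its defaults (5 samples, 1 second apart) and the optional fourth argument (an alternative proc root for dry runs) are made up for illustration, and reading /proc/PID/stack normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: dump the kernel stack of a task several times.
# Usage: sample_stack PID [COUNT] [INTERVAL_SECONDS] [PROC_ROOT]
sample_stack() {
    pid=$1
    count=${2:-5}
    interval=${3:-1}
    proc_root=${4:-/proc}
    i=1
    while [ "$i" -le "$count" ]; do
        echo "----- sample $i of $count (PID $pid) -----"
        # Stop with an error if the stack file cannot be read.
        cat "$proc_root/$pid/stack" || return 1
        i=$((i + 1))
        # Pause between samples, but not after the last one.
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

Something like `sample_stack "$(pidof iptables-batch)" 5 1 > stack.txt`, run as root while the process is stuck, would collect all samples in one file for attaching to the bug.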
http://bugzilla.opensuse.org/show_bug.cgi?id=989176

Neil Rickert <nwr10cst-oslnx@yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nwr10cst-oslnx@yahoo.com
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c17

Axel Köllhofer <AxelKoellhofer@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(AxelKoellhofer@we |
                   |b.de)                       |

--- Comment #17 from Axel Köllhofer <AxelKoellhofer@web.de> ---
(In reply to Michal Kubeček from comment #16)
> Test kernel build has finished, packages can be retrieved from OBS
> repository home:mkubecek:bsc989176 (the name in comment 12 was wrong) or
> downloaded at
>
> http://download.opensuse.org/repositories/home:/mkubecek:/bsc989176/openSUSE_Leap_42.1/
>
> Axel, please check if it resolves the issue on your system.

Just downloaded, installed and tested kernel-default-4.1.28-1.1.x86_64 from home:mkubecek:989176 and it seems to fix the issue.

Thanks.

AK
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c18

--- Comment #18 from Axel Köllhofer <AxelKoellhofer@web.de> ---
Just to give some feedback ASAP: The packages are not published yet but

osc getbinaries

is a nice way of getting rpms, so I gave it a shot.

I just installed kernel-default-4.1.28-3.1.gd509193.x86_64 from Kernel:openSUSE-42.1:standard including the fix mentioned above and it works as expected.

AK
http://bugzilla.opensuse.org/show_bug.cgi?id=989176
http://bugzilla.opensuse.org/show_bug.cgi?id=989176#c20

Axel Köllhofer <AxelKoellhofer@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|IN_PROGRESS                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #20 from Axel Köllhofer <AxelKoellhofer@web.de> ---
Changed status to RESOLVED FIXED. I hope this is OK.

AK