[Bug 1188327] New: kernel 5.13.1 is crashing on boot on armv7 due to 'Unhandled fault: alignment exception' in ecdsa_set_pub_key
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 Bug ID: 1188327 Summary: kernel 5.13.1 is crashing on boot on armv7 due to 'Unhandled fault: alignment exception' in ecdsa_set_pub_key Classification: openSUSE Product: openSUSE Tumbleweed Version: Current Hardware: armv7 URL: https://openqa.opensuse.org/tests/1841539/modules/welc ome/steps/6 OS: Other Status: NEW Severity: Normal Priority: P5 - None Component: Kernel Assignee: kernel-bugs@opensuse.org Reporter: guillaume.gardet@arm.com QA Contact: qa-bugs@suse.de CC: afaerber@suse.com, dmueller@suse.com, mbrugger@suse.com Found By: openQA Blocker: Yes Created attachment 851007 --> http://bugzilla.opensuse.org/attachment.cgi?id=851007&action=edit kernel traces ## Observation openQA test in scenario opensuse-Tumbleweed-NET-arm-create_hdd_minimalx@aarch32 fails in [welcome](https://openqa.opensuse.org/tests/1841539/modules/welcome/steps/6) Since update to kernel 5.13.1 [0], kernel is crashing on boot on armv7 due to 'Unhandled fault: alignment exception' See attachment for the full log. [0]: https://build.opensuse.org/request/show/905770 ## Test suite description ## Reproducible Fails since (at least) Build [20210714](https://openqa.opensuse.org/tests/1841078) ## Expected result Last good: [20210712](https://openqa.opensuse.org/tests/1838553) (or more recent) ## Further details Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=arm&distri=opensuse&flavor=NET&machine=aarch32&test=create_hdd_minimalx&version=Tumbleweed) -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 Guillaume GARDET <guillaume.gardet@arm.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|Normal |Critical -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c1 Mian Yousaf Kaukab <yousaf.kaukab@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |matz@suse.com, | |yousaf.kaukab@suse.com --- Comment #1 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- ecdsa_set_pub_key() is making a u64 pointer at 1 byte offset of the key. const unsigned char *d = key; const u64 *digits = (const u64 *)&d[1]; So digits will mostly be unaligned. ecc_swap_digits() is then using ldm instruction on this unaligned pointer which is causing the exception. There are two instances of ecc_swap_digits(). First instance is implemented by two ldr instructions. As ldr can handle unaligned addresses, it works. The second instance is implemented by an ldm instruction which can't handle unaligned addresses and causes the exception. Here is the disassembly of the code: /* First instance */ ../crypto/ecc.h: 54 for (i = 0; i < ndigits; i++) 0xc07c26f0 <+132>: cmp r3, #0 0xc07c26f4 <+136>: beq 0xc07c276c <ecdsa_set_pub_key+256> 0xc07c26f8 <+140>: add r1, r5, #1 0xc07c26fc <+144>: sub r6, r3, #1 0xc07c2700 <+148>: lsl r5, r3, #3 0xc07c2704 <+152>: add r12, r4, #72 ; 0x48 0xc07c2708 <+156>: add r3, r1, r3, lsl #3 55 out[i] = be64_to_cpu(src[ndigits - 1 - i]); 0xc07c270c <+160>: ldr r2, [r3, #-8]! //<== First load == 0xc07c2710 <+164>: ldr r0, [r3, #4] //<== Second load == 0xc07c2714 <+168>: rev r2, r2 0xc07c2718 <+172>: rev r0, r0 0xc07c271c <+176>: str r0, [r12, #8]! 0xc07c2720 <+180>: cmp r3, r1 0xc07c2724 <+184>: str r2, [r12, #4] /* Second instance */ ../crypto/ecdsa.c: 246 ecc_swap_digits(&digits[ndigits], ctx->pub_key.y, ndigits); 0xc07c272c <+192>: ldr r0, [r4, #212] ; 0xd4 ../crypto/ecc.h: 54 for (i = 0; i < ndigits; i++) 0xc07c2730 <+196>: bic lr, lr, #7 0xc07c2734 <+200>: add lr, lr, r6, lsl #3 0xc07c2738 <+204>: sub r5, r5, #8 0xc07c273c <+208>: sub r2, r0, #8 0xc07c2740 <+212>: add r3, r3, lr 0xc07c2744 <+216>: add r0, r0, r5 55 out[i] = be64_to_cpu(src[ndigits - 1 - i]); 0xc07c2748 <+220>: ldm r3, {r1, r12} //<== ldm thus alignment exception 0xc07c274c <+224>: rev r12, r12 0xc07c2750 <+228>: str r12, [r2, #8]! 0xc07c2754 <+232>: rev r1, r1 0xc07c2758 <+236>: cmp r0, r2 0xc07c275c <+240>: sub r3, r3, #8 0xc07c2760 <+244>: str r1, [r2, #4] -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c2 --- Comment #2 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- Following is the disassembly of ecdsa_set_pub_key from v5.13.1 compiled with the toolchain from arm: arm-none-linux-gnueabihf-gcc (GNU Toolchain for the A-profile Architecture 10.2-2020.11 (arm-10.16)) 10.2.1 20201103 Both instances of ecc_swap_digits() are using ldr instruction. crypto/ecc.h: 54 for (i = 0; i < ndigits; i++) 0xc07bf174 <+132>: cmp r3, #0 0xc07bf178 <+136>: beq 0xc07bf1ec <ecdsa_set_pub_key+252> 0xc07bf17c <+140>: add lr, r5, #1 0xc07bf180 <+144>: add r0, r4, #72 ; 0x48 0xc07bf184 <+148>: sub r5, r3, #1 0xc07bf188 <+152>: add r3, lr, r3, lsl #3 55 out[i] = be64_to_cpu(src[ndigits - 1 - i]); 0xc07bf18c <+156>: ldr r2, [r3, #-8]! //<== First load == 0xc07bf190 <+160>: rev r2, r2 0xc07bf194 <+164>: cmp r3, lr 0xc07bf198 <+168>: ldr r1, [r3, #4] //<== Second load == 0xc07bf19c <+172>: rev r1, r1 0xc07bf1a0 <+176>: str r1, [r0, #8]! 0xc07bf1a4 <+180>: str r2, [r0, #4] 54 for (i = 0; i < ndigits; i++) 0xc07bf1a8 <+184>: bne 0xc07bf18c <ecdsa_set_pub_key+156> crypto/ecdsa.c: 246 ecc_swap_digits(&digits[ndigits], ctx->pub_key.y, ndigits); 0xc07bf1ac <+188>: bic r12, r12, #7 crypto/ecc.h: 54 for (i = 0; i < ndigits; i++) 0xc07bf1b0 <+192>: ldr r0, [r4, #212] ; 0xd4 0xc07bf1b4 <+196>: add r2, r12, #8 0xc07bf1b8 <+200>: add r12, r3, r12 0xc07bf1bc <+204>: add r5, r2, r5, lsl #3 0xc07bf1c0 <+208>: sub r0, r0, #8 0xc07bf1c4 <+212>: add r3, r3, r5 55 out[i] = be64_to_cpu(src[ndigits - 1 - i]); 0xc07bf1c8 <+216>: ldr r2, [r3, #-8]! //<== First load == 0xc07bf1cc <+220>: rev r2, r2 0xc07bf1d0 <+224>: cmp r12, r3 0xc07bf1d4 <+228>: ldr r1, [r3, #4] //<== Second load == 0xc07bf1d8 <+232>: rev r1, r1 0xc07bf1dc <+236>: str r1, [r0, #8]! 0xc07bf1e0 <+240>: str r2, [r0, #4] -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c3 Mian Yousaf Kaukab <yousaf.kaukab@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |needinfo?(matz@suse.com) --- Comment #3 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- Michael, any thoughts on this? -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c4 Richard Biener <rguenther@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |INVALID Flags|needinfo?(matz@suse.com) | --- Comment #4 from Richard Biener <rguenther@suse.com> --- const u64 *digits = (const u64 *)&d[1]; and then accessing *digits is bogus since the pointer is not aligned according to its type. A fix would be to do u64 digits_tem; memcpy (&digits_tem, digits, sizeof (u64)); instead of reading u64 from digits. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c5 Richard Biener <rguenther@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED Resolution|INVALID |--- --- Comment #5 from Richard Biener <rguenther@suse.com> --- Or rather, not invalid but a kernel bug. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c6 Takashi Iwai <tiwai@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |tiwai@suse.com --- Comment #6 from Takashi Iwai <tiwai@suse.com> --- FWIW, ecc_swap_digits() handles the u64 array, so copying would be needed there instead of the caller. (And there are two calls of ecc_swap_digits() in ecdsa_set_pub_key()). If performance matters, we'd need two versions of ecc_swap_digits() for aligned and unaligned. But all calls of ecc_swap_digits() in ecdsa.c look suspicious wrt alignment. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c7 --- Comment #7 from Richard Biener <rguenther@suse.com> --- (In reply to Takashi Iwai from comment #6)
FWIW, ecc_swap_digits() handles the u64 array, so copying would be needed there instead of the caller. (And there are two calls of ecc_swap_digits() in ecdsa_set_pub_key()).
If performance matters, we'd need two versions of ecc_swap_digits() for aligned and unaligned. But all calls of ecc_swap_digits() in ecdsa.c look suspicious wrt alignment.
Instead of a memcpy you can also use typedef u64 unaligned_u64 __attribute__((aligned(1))); *(unaligned_u64 *)digits that avoids the memcpy (which should be similarly optimized in most cases, but you'd need to check). It might of course be slow when the target cannot do unaligned accesses as it will to shifting/masking/etc. to produce the u64 value which might not actually be needed as 'u64' (without knowing the code in question) -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c8 Mian Yousaf Kaukab <yousaf.kaukab@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |rguenther@suse.com Flags| |needinfo?(rguenther@suse.co | |m) --- Comment #8 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- Please confirm that compiler inferring LDM instruction instead of LDR is correct for ecc_swap_digits(). In armv7, LDR instruction can handle unaligned address whereas LDM instruction can't handle them [1]. [1]: https://developer.arm.com/documentation/ddi0406/c/Application-Level-Architec... -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c9 --- Comment #9 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- (In reply to Richard Biener from comment #7)
(In reply to Takashi Iwai from comment #6)
FWIW, ecc_swap_digits() handles the u64 array, so copying would be needed there instead of the caller. (And there are two calls of ecc_swap_digits() in ecdsa_set_pub_key()).
If performance matters, we'd need two versions of ecc_swap_digits() for aligned and unaligned. But all calls of ecc_swap_digits() in ecdsa.c look suspicious wrt alignment.
Instead of a memcpy you can also use
typedef u64 unaligned_u64 __attribute__((aligned(1)));
*(unaligned_u64 *)digits
that avoids the memcpy (which should be similarly optimized in most cases, but you'd need to check). It might of course be slow when the target cannot do unaligned accesses as it will to shifting/masking/etc. to produce the u64 value which might not actually be needed as 'u64' (without knowing the code in question)
Or may be use get_unaligned(). -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c10 --- Comment #10 from Takashi Iwai <tiwai@suse.com> --- There is get_unaligned_be64() and this will do both jobs done there :) -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c11 --- Comment #11 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- (In reply to Takashi Iwai from comment #10)
There is get_unaligned_be64() and this will do both jobs done there :) Yes, I wasn't explicit :)
-- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c12 Richard Biener <rguenther@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags|needinfo?(rguenther@suse.co | |m) | --- Comment #12 from Richard Biener <rguenther@suse.com> --- (In reply to Mian Yousaf Kaukab from comment #8)
Please confirm that compiler inferring LDM instruction instead of LDR is correct for ecc_swap_digits().
In armv7, LDR instruction can handle unaligned address whereas LDM instruction can't handle them [1].
[1]: https://developer.arm.com/documentation/ddi0406/c/Application-Level- Architecture/Application-Level-Memory-Model/Alignment-support/Unaligned-data- access
I can find static inline void ecc_swap_digits(const u64 *in, u64 *out, unsigned int ndigits) { int i; for (i = 0; i < ndigits; i++) out[i] = __swab64(in[ndigits - 1 - i]); } where yes, the compiler is correctly infering 64bit alignment for the load in[ndigits - 1 - 1] because in is of type u64 *. The instance where it doesn't likely knows better from where it is inlined to (we don't try to actively break things when we're faced with a 100% sure unaligned access). -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c14 --- Comment #14 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- I will send the following patch upstream shortly: diff --git a/crypto/ecc.h b/crypto/ecc.h index a006132646a4..5e2d9045208c 100644 --- a/crypto/ecc.h +++ b/crypto/ecc.h @@ -27,6 +27,7 @@ #define _CRYPTO_ECC_H #include <crypto/ecc_curve.h> +#include <asm/unaligned.h> /* One digit is u64 qword. */ #define ECC_CURVE_NIST_P192_DIGITS 3 @@ -52,7 +53,7 @@ static inline void ecc_swap_digits(const u64 *in, u64 *out, unsigned int ndigits int i; for (i = 0; i < ndigits; i++) - out[i] = be64_to_cpu(src[ndigits - 1 - i]); + out[i] = get_unaligned_be64(&src[ndigits - 1 - i]); } -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c15 --- Comment #15 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- https://lkml.org/lkml/2021/7/21/190 -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c16 --- Comment #16 from Guillaume GARDET <guillaume.gardet@arm.com> --- (In reply to Mian Yousaf Kaukab from comment #15)
Thanks! Did you send it to stable as well? I think this would need to be backported to stable releases. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c17 --- Comment #17 from Mian Yousaf Kaukab <yousaf.kaukab@suse.com> --- (In reply to Guillaume GARDET from comment #16)
(In reply to Mian Yousaf Kaukab from comment #15)
Thanks!
Did you send it to stable as well? I think this would need to be backported to stable releases. No, but I can send it once it is accepted.
-- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c22 Guillaume GARDET <guillaume.gardet@arm.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |REOPENED Resolution|FIXED |--- --- Comment #22 from Guillaume GARDET <guillaume.gardet@arm.com> --- The problem occurs again with kernel 5.14.0-1-lpae: See: https://build.opensuse.org/package/live_build_log/openSUSE:Factory:ARM/kerne... -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c23 --- Comment #23 from Guillaume GARDET <guillaume.gardet@arm.com> --- After a quick check, the upstream commit is not included in kernels 5.14.0 and 5.14.1. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 http://bugzilla.opensuse.org/show_bug.cgi?id=1188327#c24 --- Comment #24 from Takashi Iwai <tiwai@suse.com> --- Also don't forget SLE15-SP4 branch. Currently armv7hl build is still broken there, but it'll be available in near future. -- You are receiving this mail because: You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1188327 Matthias Brugger <mbrugger@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|kernel-bugs@opensuse.org |mbrugger@suse.com -- You are receiving this mail because: You are the assignee for the bug.
participants (1)
-
bugzilla_noreply@suse.com