(In reply to Kurt Garloff from comment #16)
> this is how a cleaner approach to handling memory input/output would look
> like from my PoV.

Yes, this is cleaner (more fine-grained) than the memory-clobber patch (which I only suggested for troubleshooting / proving that the issue is due to asm constraints, not for upstreaming).

> So whenever we read memory, we use a "m"(*ptr) input, for written memory, a
> "="(*ptr) in/output.

Yes, but what we are declaring as read/written here is *ptr, and ptr is typically a char*. However, in the assembly I see things like:

" ld1 {v1.4s, v2.4s}, [%[rk]], #32 \n"

If I understand that correctly, this is a 32-byte-wide load. So we are declaring 1 byte but reading 32 bytes. The compiler may happen not to miscompile that, but there is no guarantee. We'd have to declare the full width of the memory block we're accessing, or fall back to the "memory" clobber.

> This goes beyond just arm_aes64.c.
> It does pass testing, but as I could not reproduce the issue before, your
> testing is appreciated.

It passes testing, thanks!
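
To illustrate what "declaring the full width" could look like: a minimal sketch, assuming GCC/Clang extended asm, where the memory operand is cast to a pointer to a 32-byte array so the compiler knows that all 32 bytes are read or written. The helper name copy32 is hypothetical and not taken from the patch, and the memcpy fallback is only there so the sketch builds on non-AArch64 targets.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): copies exactly 32 bytes. */
static void copy32(uint8_t *dst, const uint8_t *src)
{
#if defined(__aarch64__) && defined(__GNUC__)
    __asm__(
        "ld1 {v0.4s, v1.4s}, [%[s]]\n\t"   /* 32-byte load  */
        "st1 {v0.4s, v1.4s}, [%[d]]\n\t"   /* 32-byte store */
        /* Output: the whole 32-byte block at dst is written,
         * not just the first byte as "=m"(*dst) would declare. */
        : "=m"(*(uint8_t (*)[32])dst)
        /* Inputs: the two addresses, plus the whole 32-byte block
         * at src as a memory operand that is read. */
        : [s] "r"(src), [d] "r"(dst),
          "m"(*(const uint8_t (*)[32])src)
        /* The vector registers used as scratch. */
        : "v0", "v1");
#else
    /* Assumption: portable fallback so the sketch compiles elsewhere. */
    memcpy(dst, src, 32);
#endif
}
```

With "m"(*src) on a char pointer, only one byte is declared as read, so the compiler would be free to move stores to src[1..31] past the asm statement. The array cast avoids that without the cost of a full "memory" clobber.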