Re: which configuration changes for x86_64-v2+ necessary?

28 Sep 2022

On Wednesday 2022-09-28 02:20, Aaron Puchert wrote:
...
Am 27.09.22 um 15:03 schrieb Richard Biener:
...
On Tue, 27 Sep 2022, Jan Engelhardt wrote:
Stefan Seyfried wrote:
...
...
...
...
But does the compiler even use these [POPCNT, AVX, etc.] often for
"normal" software?
Jan Engelhardt wrote:
...
...
...
If the compiler sees an opportunity, sure.
unsigned int popcnt_for_the_poor(unsigned int v){...}
Richard Biener wrote:
...
...
We only handle
int fancy_popcnt (long b) {...}
While gcc did not manage to make a POPCNT insn out of poor_popcnt,
gcc did turn poor_popcnt into a loopless set of vector instructions,
therefore showing that "normal" software indeed can suddenly gain
"special" instructions.

Aaron Puchert wrote:
...
Same for LLVM [1], though I'm beginning to think we should support the "very
poor" variant as it naturally comes up when using e.g. <algorithm> on an
std::bitset or std::vector<bool>.
After all, if you can write the "b &= b - 1" loop, you can also
write __builtin_popcount [..] so we're essentially just doing this
to optimize legacy code. It's not really "necessary".
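
For readers without the elided snippets at hand, here is a minimal sketch of the kind of code being discussed. The function names are hypothetical and not taken from the thread. The bit-by-bit loop is only my guess at what a "very poor" variant looks like, and the samples are arbitrary. It assumes gcc or clang, since it relies on __builtin_popcountl.

```c
#include <assert.h>

/* Hypothetical name. Clears the lowest set bit once per iteration,
 * so the loop runs once per set bit ("b &= b - 1"). */
static unsigned popcount_clear_lowest(unsigned long b)
{
    unsigned n = 0;
    while (b != 0) {
        b &= b - 1;
        n++;
    }
    return n;
}

/* Hypothetical name. Tests one bit per iteration; an assumed example of
 * what counting over a bitset element by element boils down to. */
static unsigned popcount_bit_by_bit(unsigned long b)
{
    unsigned n = 0;
    for (; b != 0; b >>= 1)
        n += (unsigned)(b & 1ul);
    return n;
}

/* The explicit form: a gcc/clang builtin, not ISO C. */
static unsigned popcount_builtin(unsigned long b)
{
    return (unsigned)__builtin_popcountl(b);
}

/* All three must agree; only the generated code may differ. */
static void check_popcount_variants_agree(void)
{
    const unsigned long samples[] = { 0ul, 1ul, 0xF0ul, 0xFFFFFFFFul, ~0ul };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        unsigned expected = popcount_builtin(samples[i]);
        assert(popcount_clear_lowest(samples[i]) == expected);
        assert(popcount_bit_by_bit(samples[i]) == expected);
    }
}
```

Whether a particular gcc or clang release turns either loop into a POPCNT insn depends on the version and on flags such as -mpopcnt or -march=x86-64-v2, so comparing the output of -O2 -S for each function is the reliable way to find out.
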
An original software provider may choose not to use
__builtin_popcount (I do not think it is available in MSVC under
that exact name), and packagers and users downstream do not normally
go through the code and manually replace such loops with __builtin_xyz.
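
To illustrate the portability burden, this is roughly what an upstream has to carry to use the instruction explicitly across compilers. It is a sketch under assumptions, not anything from the thread: portable_popcount32 is a hypothetical name, and the MSVC branch assumes the __popcnt intrinsic from <intrin.h>, which as far as I know is limited to x86/x64 targets and emits POPCNT without checking that the CPU supports it.

```c
#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>   /* assumed home of MSVC's __popcnt */
#endif

/* Hypothetical wrapper: picks whatever the compiler offers and falls
 * back to a plain loop elsewhere. */
static unsigned portable_popcount32(unsigned int v)
{
#if defined(__GNUC__) || defined(__clang__)
    /* gcc and clang: builtin, lowered to POPCNT when the target allows it */
    return (unsigned)__builtin_popcount(v);
#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    /* MSVC on x86/x64: same operation under a different name */
    return (unsigned)__popcnt(v);
#else
    /* Anything else: the legacy loop, one iteration per set bit */
    unsigned n = 0;
    while (v != 0) {
        v &= v - 1;
        n++;
    }
    return n;
#endif
}
```

A call such as portable_popcount32(0xFFu) returns 8 on every branch. The point is that each project has to write and maintain this by hand, which is exactly what most legacy code never did.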

So it is fair to demand that compilers like gcc and clang
recognize poor_popcnt patterns. We will always have legacy
code, so we might as well deal with it.