On Wed, 7 Dec 2022, Aaron Puchert wrote:
> On 05.12.22 at 09:21, Martin Jambor wrote:
>> The fact is that a -v3 baseline means better floating point performance out of the box.
>
> Are the VEX-prefixed single-data instructions really faster, i.e. addsd vs vaddsd or mulsd vs vmulsd? I thought compilers prefer to emit the latter when available only to avoid the need for vzeroupper and not because they're faster, but I admit I haven't benchmarked this.
No, they are not faster, unless the upper portions of the destination register are not cleared (as you say, vzeroupper is needed to avoid this situation). But they are bigger.

Richard.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
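
For anyone who wants to look at the two encodings side by side, here is a minimal sketch of a scalar test case. The function name scale_and_add and the file name scalar.c are hypothetical, chosen only for illustration, and the exact instructions emitted will depend on the compiler and its version.

```c
/* scalar.c (hypothetical file name)
 *
 * Purely scalar double-precision arithmetic, nothing to vectorize.
 * Built for plain x86-64, a compiler would be expected to use the
 * legacy SSE2 forms (mulsd, addsd); built with AVX enabled, it would
 * be expected to use the VEX-encoded forms (vmulsd, vaddsd).
 */
double scale_and_add(double a, double b, double c)
{
    return a * b + c;
}
```

Assuming GCC, comparing the output of `gcc -O2 -S scalar.c` with that of `gcc -O2 -mavx -S scalar.c` should show the legacy and the VEX-prefixed mnemonics respectively. With `-march=x86-64-v3` the multiply and add may instead be contracted into a single fused multiply-add, since that level also enables FMA. Instruction sizes can be inspected by assembling to an object file and running `objdump -d` on it.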