Mailinglist Archive: opensuse (3666 mails)

What is the proper way to optimize compilations?
  • From: "Carlos E. R." <robin1.listas@xxxxxxxxxx>
  • Date: Sun, 13 Mar 2005 13:58:51 +0100 (CET)
  • Message-id: <Pine.LNX.4.58.0503130326150.8009@xxxxxxxxxxxxxxxx>

Hi,

I have a P-IV, and when I compile something I like to have it optimized
for my CPU. The proper switch to give gcc would be "-march=pentium4".
Note that "-mcpu" does not optimize as far:

| While picking a specific CPU-TYPE will schedule things
| appropriately for that particular chip, the compiler will not
| generate any code that does not run on the i386 without the
| `-march=CPU-TYPE' option being used.

Therefore, "-march" should produce the code best suited to my CPU.


Now, when it comes to the "configure" script, things get more
difficult. For example, I'm compiling xine-lib. By default, my machine is
detected as "i686-pc-linux-gnu", which is probably derived from the output
of the "uname" command:

| Linux nimrodel 2.6.5-7.147-default #1 Thu Jan 27 09:19:29 UTC 2005 i686 i686 i386 GNU/Linux

If I look at the "Makefile", I can see what options are going to be used:

CFLAGS = -mcpu=pentiumpro -O3 -pipe -fomit-frame-pointer
-falign-functions=4 -falign-loops=4 -falign-jumps=4
-mpreferred-stack-boundary=2 -fexpensive-optimizations -fschedule-insns2
-fno-strict-aliasing -ffast-math -funroll-loops -finline-functions -Wall
-DNDEBUG -D_REENTRANT -D_FILE_OFFSET_BITS=64 -DXINE_COMPILE
$(MULTIPASS_CFLAGS) -Wnested-externs -Wcast-align -Wchar-subscripts
-Wmissing-declarations -Wmissing-prototypes

The "pentiumpro" value comes from this code in the configure script,
because the auto detection says the machine is an i686:

pentiumpro-* | pentium2-* | i686-*)
archopt_val="pentiumpro"


Now, as I said, I want the compilation to be for a pentium4. There are
three switches I can use:

System types:
--build=BUILD configure for building on BUILD [guessed]
--host=HOST cross-compile to build programs to run on HOST [BUILD]
--target=TARGET configure for building compilers for TARGET [HOST]

I'm not sure which one I should use to get output compiled for the
pentium4, i.e., to override the auto detection as "i686". I have tried
all of them, and I don't see any "-mcpu=pentium4" reaching the compiler
anywhere.

In fact, the trick I have used for a long time is to edit the configure
script and hack it. In this case, I have changed two places:

#CFLAGS="-O3 -pipe -fomit-frame-pointer $f_af $f_al $f_aj $m_wm $m_psb -fexpensive-optimizations $f_si $f_nsa -ffast-math -funroll-loops -finline-functions $CFLAGS"
#Cer
CFLAGS="-O3 -march=pentium4 -pipe -fomit-frame-pointer $f_af $f_al $f_aj $m_wm $m_psb -fexpensive-optimizations $f_si $f_nsa -ffast-math -funroll-loops -finline-functions $CFLAGS"

I.e., I forced my option in there. A dirty hack. I also made this other
change (this one ends up in -mcpu):

pentiumpro-* | pentium2-* | i686-*)
archopt_val="pentiumpro"
#Cer hack,
archopt_val="pentium4"

So now I'm getting:

CFLAGS = -mcpu=pentium4 -O3 -march=pentium4 -pipe -fomit-frame-pointer ...

It is a hack, and I have to remember it is there the next time I download
a new version. I don't like it, although it produces the expected result.
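
One detail of the CFLAGS line edited above: it ends in $CFLAGS, so
whatever CFLAGS holds when the script reaches that line is appended after
the built-in options. A minimal sketch of that composition follows. The
function name compose_cflags is hypothetical, the $f_* and $m_* variables
are left out, and I have not verified that the real script keeps a CFLAGS
value from the environment intact up to that point.

```shell
# Hypothetical stand-in for the CFLAGS line in configure.
# Same shape as the original: fixed options first, then the existing $CFLAGS.
compose_cflags() {
    echo "-O3 -pipe -fomit-frame-pointer -ffast-math -funroll-loops -finline-functions $CFLAGS"
}

# A value set beforehand ends up at the end of the composed line.
CFLAGS="-march=pentium4"
compose_cflags
```
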

Can you suggest a better, and general, method?

--
Cheers,
Carlos Robinson
