Mailinglist Archive: opensuse-amd64 (126 mails)
Re: [suse-amd64] Installation of SUSE 9.0 AMD 64 Version on latest motherboards
- From: John McCorquodale <mcq1@xxxxxxxxxxxxxxxxxxxx>
- Date: Sun, 9 Nov 2003 01:40:42 +0000 (UTC)
- Message-id: <20031109014036.GA30717@xxxxxxxxxxxxxxxxxxxx>
> > I'm still working on something that'll
> > dump out the northbridge, 8131 and 8151 configuration registers
>
> You can just dump them with lspci -vxxx as root.
Yeah, I'm writing something that'll decode the register data into human-
readable form so I can learn something from it (a bunch of hex is just not
so handy). I'm doing it as a fake kernel module again rather than a
user-space thing just because I find that easier and can turn off interrupts
if I want to do a benchmark.
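
For illustration only, here is a minimal user-space sketch of the decoding idea: it parses the hex rows that lspci -xxx prints and decodes a few fields of the standard PCI configuration header. This is not the kernel module described above. The function names and the sample dump are hypothetical, and chip-specific registers (8131/8151) are deliberately left out.

```python
import re

# One hex row of an lspci -xxx dump, e.g. "00: 22 10 00 11 ..."
HEX_LINE = re.compile(r"^([0-9a-f]{2,3}):((?: [0-9a-f]{2})+)\s*$")


def parse_config_dump(text):
    """Collect the config-space bytes from lspci -xxx style hex rows."""
    config = bytearray()
    for line in text.splitlines():
        m = HEX_LINE.match(line.strip())
        if not m:
            continue  # skip the device description and other non-hex lines
        offset = int(m.group(1), 16)
        row = bytes.fromhex(m.group(2).replace(" ", ""))
        if len(config) < offset + len(row):
            config.extend(b"\x00" * (offset + len(row) - len(config)))
        config[offset:offset + len(row)] = row
    return bytes(config)


def decode_header(config):
    """Decode a few fields of the standard PCI configuration header."""
    def u16(off):
        return int.from_bytes(config[off:off + 2], "little")

    command = u16(0x04)
    return {
        "vendor_id": u16(0x00),
        "device_id": u16(0x02),
        "io_space": bool(command & 0x1),      # command register bit 0
        "memory_space": bool(command & 0x2),  # command register bit 1
        "bus_master": bool(command & 0x4),    # command register bit 2
        "class_code": int.from_bytes(config[0x09:0x0C], "little"),
        "multifunction": bool(config[0x0E] & 0x80),
    }


# Made-up sample row, not taken from a real S2885.
SAMPLE = """00:18.0 Host bridge: (hypothetical sample)
00: 22 10 00 11 06 00 10 00 00 00 00 06 00 00 80 00
"""

if __name__ == "__main__":
    for name, value in decode_header(parse_config_dump(SAMPLE)).items():
        print(name, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

Feeding it real output from lspci -vxxx should work the same way, since the verbose text lines do not match the hex-row pattern and are skipped.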
> > I wrote a kernel module that maps the framebuffer, sets up write combining
> > and sets AGP fast writes, then does a tight-loop microbenchmark. It's still
> > available at:
>
> We usually used testgart for this, which just benchmarks the AGP aperture
> as seen by the CPU. We even have a package for it, but I am not sure it is
> on the 9.0 DVD. If not google should be able to find it. testgart sets the
> needed WC MTRR by itself.
testgart measures write performance through the GART into AGP memory (which
is located on the main memory modules of the machine). The bandwidth I am
interested in is the bandwidth from the machine to the AGP card across the
AGP connector, which is not what testgart measures. This can be measured as
the bandwidth of a DMA from AGP memory to the card (again translated through
the gart), or similarly by mapping the card's framebuffer through PCI space
and doing programmed writes (AGP fast writes) from the CPU. These numbers
should be comparable.
The bandwidth through the AGP connector is the one that affects performance;
the bandwidth measured by testgart is not particularly useful to know, other
than, as the name says, to be a test that the gart itself does address
translation as advertised.
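
To show the shape of the programmed-write measurement, here is a rough user-space sketch: write a fixed chunk across a buffer in a tight loop, time it, and divide bytes by seconds. It assumes an ordinary in-memory buffer, so the number it produces says nothing about AGP; the real measurement needs the card's framebuffer mapped with write combining, which is what the kernel module does. The function name and parameters are hypothetical.

```python
import time


def measure_write_bandwidth(target, chunk, passes=8):
    """Time repeated writes of `chunk` across `target`.

    Returns MB/s, with 1 MB taken as 10**6 bytes. `target` is any writable
    buffer (a bytearray here, standing in for a mapped framebuffer).
    """
    if not chunk:
        raise ValueError("chunk must not be empty")
    view = memoryview(target)
    n = len(chunk)
    usable = len(view) - len(view) % n  # whole chunks only
    start = time.perf_counter()
    for _ in range(passes):
        for off in range(0, usable, n):
            view[off:off + n] = chunk
    elapsed = time.perf_counter() - start
    return usable * passes / elapsed / 1e6


if __name__ == "__main__":
    buf = bytearray(4 * 1024 * 1024)
    print("%.1f MB/s" % measure_write_bandwidth(buf, b"\xaa" * 64))
```

In Python the loop overhead dominates, so treat this as a description of the method and not as a usable benchmark.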
The kernel module I referred to earlier measures fast write bandwidth (which
is unusually bad on the S2885). The testgart program measures b/w of GART-
redirected writes to main memory, which behaves well on the S2885 (2GB/s).
Adding AGP DMA code to my framebuffer driver is not a hurdle I wanted to
tackle right now, but may just be what I need to do to verify that the
bandwidth observed doing that is the same as the bandwidth of programmed
fast writes. If ATI (or nvidia for that matter) would publicly release
specs for their cards this would be easy, but as it is I have to recover the
documentation by reading the X DRI drivers (yuck).
> Actually it is the 8131 that is limited to 600Mt links, 8151 should be fine.
It could be both. It's at least the 8151. See AMD pub 25912, the AMD-8151
HyperTransport AGP3.0 Graphics Tunnel Revision Guide, page 9:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25912.pdf
The motherboard design guides appear to be available only under NDA, so I
don't know if Tyan obeyed the recommended physics hacks in the board design.
If you are correct and they have to run the 8131 at 600MHz anyway, there'd
be little point in doing the recommended physics hacks.
> It could be still some memory attribute issue. If you have no
> write-combining performance will be very bad on anything AGP.
If I pull out 6GB of RAM (leaving me with 2GB), then BIOS doesn't set up the
weird MTRR and I can successfully turn on write combining. Doing this, my
AGP v2 4x fast-write b/w jumps to 270MB/s. But that should still be 1000 MB/s,
so something is still quite wrong (I need >400MB/s for my application).
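
For reference, the arithmetic behind those numbers can be sketched as follows. It assumes the usual AGP parameters (a 66.67 MHz base clock and a 32-bit data path), which give a theoretical peak a little above the round 1000 MB/s figure. The 270 and 400 MB/s values come from the paragraph above; the names are hypothetical.

```python
AGP_BASE_CLOCK_HZ = 66_666_667  # ~66.67 MHz AGP base clock (assumed)
AGP_BUS_BYTES = 4               # 32-bit data path


def agp_peak_mb_per_s(multiplier):
    """Theoretical peak AGP bandwidth in MB/s for a 1x/2x/4x/8x transfer rate."""
    return AGP_BASE_CLOCK_HZ * multiplier * AGP_BUS_BYTES / 1e6


OBSERVED_MB_PER_S = 270  # fast-write b/w measured with write combining on
REQUIRED_MB_PER_S = 400  # what the application needs

peak_4x = agp_peak_mb_per_s(4)
observed_fraction = OBSERVED_MB_PER_S / peak_4x
required_fraction = REQUIRED_MB_PER_S / peak_4x

if __name__ == "__main__":
    print("4x peak: %.0f MB/s" % peak_4x)
    print("observed: %.0f%% of peak" % (100 * observed_fraction))
    print("required: %.0f%% of peak" % (100 * required_fraction))
```

So the observed rate is roughly a quarter of the theoretical peak, and the application needs well under half of it.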
Where did you get the idea that the 8151 is behind the 8131? I still don't
have my HT walker to the point where it can dump the hypertransport graph, and
I'm curious to poke around wherever you got that idea for inspiration.
Thanks,
-mcq