Frank Steiner <fsteiner-mail1@bio.ifi.lmu.de> wrote:
Sorry if I jump into that discussion as an absolute newbie to the 64-bit world :-) I've just bought 2 AMD64 systems with Asus A8V boards and 4 1GB DIMMs each. ... Sorry for asking so much, but I would really like to understand the background of these 64-bit secrets :-)
No problem. It's a long and convoluted history that most people don't know much about.
Indeed I stumbled over that memory problem and now I almost understand what the problem is. But let me ask two things:
1) If that reserved memory is required for PCI devices, why don't 32-bit systems have that problem? On all my systems with 2GB RAM and video cards with 256MB of video RAM, the full 2GB is available running Linux. Why do only the 64-bit systems need this explicit reservation between 3.5 and 4GB?
It's not a 64-bit-specific thing per se (as I think Andi Kleen commented on briefly in another response); it's about the 4GB physical addressing limitation of the 32-bit x86 PC architecture. That isn't quite true for some modern processors, which have 36-bit or even 40-bit physical addressing limits instead of 32-bit, but desktop chipsets still generally don't support more than 32 bits, and the OS support isn't there either.

So the PC architecture, before 64-bit PCs came about, looked like:

  0 ==== 640K - 1GB ======================= (MAXMEM) --- PCI Hole --- 4GB

The problem is that if MAXMEM extends into the PCI hole, you lose some RAM; it just becomes inaccessible. If you had less than (4GB - PCI hole size) of RAM, this was not a problem.

Now that the number of PCs with 4GB or more of RAM is increasing, and 64-bit OSes and drivers are available for desktop machines, new features to do memory remapping (sometimes called "hoisting") have come about. The remapping takes the RAM hidden behind the PCI hole and makes it appear at the end of the address space. New picture for a machine with 4GB of RAM total:

  0 ==== 640K - 1GB ==== (PCI hole start) --- 4GB === (4GB + pci hole size)

In the 64-bit world it's the same, except the remapped RAM appears at the end of the normal RAM, so for example my 8GB machine looked like this:

  0 ==== 640K - 1GB ==== (PCI hole start) --- 4GB === 8GB

...and with remapping looks like this:

  0 ==== 640K - 1GB ==== (PCI hole start) --- 4GB === (8GB + pci hole size)

NOTE: Even on machines running only 64-bit OSes, the PCI hole pretty much has to stay in the same place, because many PCI add-in cards only support being mapped into 32-bit addressable locations, so they must be placed below 4GB in the address space. This is also the reason the IOMMU exists: so that RAM above the 4GB limit can be accessed by 32-bit PCI devices.
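If it helps to see the arithmetic, here is a small C sketch of it. The numbers are assumptions for illustration (8GB of RAM, PCI hole starting at 3.5GB), not read from any real board:

#include <stdio.h>

#define GB (1ULL << 30)

int main(void)
{
    unsigned long long ram        = 8 * GB;         /* assumed total installed RAM    */
    unsigned long long hole_start = 0xE0000000ULL;  /* assumed PCI hole start (3.5GB) */
    unsigned long long hole_size  = 4 * GB - hole_start;

    /* Without remapping, the RAM sitting behind the PCI hole is simply hidden. */
    printf("no remap : RAM at 0-%#llx and %#llx-%#llx, %lluMB lost\n",
           hole_start, 4 * GB, ram, hole_size >> 20);

    /* With remapping ("hoisting"), that RAM reappears above the old top of RAM. */
    printf("remapped : RAM at 0-%#llx and %#llx-%#llx, nothing lost\n",
           hole_start, 4 * GB, ram + hole_size);
    return 0;
}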
2) The Asus BIOS allows for software *and* hardware memory remapping. In both cases I have almost all 4GB of RAM available. However, I no longer have 3D acceleration, because the fglrx driver complains about the uncacheable MTRR space between 3.5 and 4GB. Andi wrote something about that problem in an earlier thread that I found in the archives. What I don't understand is: how can that memory suddenly be available when using the BIOS option to remap the memory hole, if it must be reserved for PCI devices? Or is it indeed no longer reserved when the remapping is done, and that's why the fglrx driver cannot access it?
NOTE: On Athlon64/Opteron systems, pre-rev-E parts only support software remapping, but rev-E parts support both.

I'm not sure exactly what you mean by "suddenly"... it's a BIOS feature that had to be rolled out after lots of testing, so it's not surprising it took a while to appear even though the hardware could do it. The BIOS could be punting and not trying to map the spaces correctly because of the complexity of the MTRR overlap rules. Before, it only had cacheable memory below the PCI hole; now it has it on either side (above and below), and this could be making it punt and just mark everything as uncacheable. I strongly suspect it's just a bug when the remapping is enabled.
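If you want to see exactly what the BIOS programmed, the easiest check is to compare the contents of /proc/mtrr with and without the remapping option turned on (this needs MTRR support in the kernel; a plain "cat /proc/mtrr" works just as well as this little dumper):

#include <stdio.h>

int main(void)
{
    /* Print the MTRR entries the BIOS/kernel set up, one per line. */
    FILE *f = fopen("/proc/mtrr", "r");
    char line[256];

    if (!f) {
        perror("/proc/mtrr");
        return 1;
    }
    while (fgets(line, sizeof line, f))
        fputs(line, stdout);
    fclose(f);
    return 0;
}

If a big uncacheable entry suddenly covers part of your RAM when remapping is on, that would confirm the "BIOS punting" theory.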
Andi wrote something in that other thread about manually creating a write-combined MTRR space from the uncacheable space (which I haven't tried yet), but I don't understand why this should be possible: if I can easily create a write-combined space inside the uncacheable space, why can't I just turn all the uncacheable MTRR space into write-combined so that all the problems are gone?
There are (generally) multiple MTRRs available on modern x86 hardware to set memory types. You would want to take one of the available MTRRs and set the range of memory for the video card to be write-combining. Setting all uncacheable memory to write-combining would likely crash your system either instantly or nearly so: "write-combining" means individual writes may not be separable, and most hardware drivers require individual writes, at least to control registers, to be guaranteed as separate operations. It's OK to make a video frame buffer write-combining because it's generally irrelevant whether you write half a pixel, a whole pixel, or two pixels as one operation. Video drivers are also written to account for this, and it lets them get better performance since multiple writes can be combined into one.
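As a concrete (illustrative, untested on your board) example of what Andi suggested: you add a single write-combining entry covering just the video card's memory aperture through the /proc/mtrr write interface described in Documentation/mtrr.txt. The base and size below are made up; take the real values for your card from "lspci -v" or the X server log, keep the size a power of two with the base aligned to it, and run it as root:

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/mtrr", "w");

    if (!f) {
        perror("/proc/mtrr");
        return 1;
    }
    /* Hypothetical 256MB video aperture at 0xd0000000; substitute the
     * real base/size of your card before trying this.               */
    fprintf(f, "base=0xd0000000 size=0x10000000 type=write-combining\n");
    fclose(f);
    return 0;
}

If it makes things worse, the entry can be removed again by writing "disable=<register number>" to /proc/mtrr.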
What's the relation between that "reserved for PCI" memory and this uncacheable MTRR range between 3.5 and 4GB?
PCI devices in general need their memory addresses to be marked as uncacheable, so I'm sure that's what the BIOS does by default. That range between 3.5 and 4GB is simply where they sit in the physical address space.

--
    Erich Stefan Boleyn     <erich@uruk.org>     http://www.uruk.org/
  "Reality is truly stranger than fiction; Probably why fiction is so popular"