Hi all,
I've finished testing for the moment. No luck. Summary below.

On Sunday 11 April 2004 10:05, Paul C. Leopardi wrote:
I'll also try Option "NvAgp" "1" in /etc/X11/XF86Config as it states at http://minion.de/ and finally, I will try contacting zander@mail.minion.de.
Not yet. Since "iommu=noagp" does not prevent crashes even in runlevel 3 with no nvidia driver loaded and with X not running, I need to try "agp=off" first. See separate email for results.
Here is a brief summary showing what happened when I tried "agp=off" with
various configurations. I had quite a number of other messages, most of which
I don't understand, and have kept more extensive logs.
Bottom line is that I have found no configuration stable enough to prevent gdb
from segfaulting or gpfing.
I noticed that the general protection fault did not happen the first time I ran
the program after booting, but often happened the second time, if the first
test was large enough to go into swap, and swap was non-zero when I started
the second test. Otherwise the fault happened the 3rd, 4th or 5th time.
I'm not sure what my next step is, but I could try more testing with more
complete logging of configuration changes. If I were to report a bug to NVIDIA
or to Gigabyte Technology, I'm still not sure what I would be reporting.
Perhaps the next best thing would be to try to reproduce the problem on the
UNSW School of Mathematics cluster, which runs SuSE Enterprise Linux 8 on
dual Opteron nodes. The kernel and toolchain differ, as well as the hardware,
but the architecture is still AMD64, and I can run the same test program,
with more main memory and no possible interference from X11.
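
For the "more complete logging" idea, here is a minimal sketch of what I might
record before each test run: the kernel command line, the MTRR layout, and how
much swap is already in use (since the faults seemed to depend on swap being
non-zero). This is only an illustration, not something I have run on the
failing machine. The function names and the log layout are my own invention,
and the regular expression assumes /proc/mtrr lines in the MB-sized form shown
in the summaries below.

```python
import re
from pathlib import Path

# Matches /proc/mtrr lines such as:
#   reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
# Assumption: sizes are reported in MB, as in the output quoted in this email.
MTRR_LINE = re.compile(
    r"reg(\d+): base=(0x[0-9a-fA-F]+) \(\s*(\d+)MB\), "
    r"size=\s*(\d+)MB: ([\w-]+), count=(\d+)"
)


def parse_mtrr(text):
    """Return one dict per MTRR register found in the given text."""
    regions = []
    for line in text.splitlines():
        m = MTRR_LINE.search(line)
        if m:
            regions.append({
                "reg": int(m.group(1)),
                "base": int(m.group(2), 16),
                "size_mb": int(m.group(4)),
                "type": m.group(5),
            })
    return regions


def swap_used_kb(meminfo_text):
    """Compute swap in use (kB) from the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


def read_or_note(path):
    """Read a /proc file, or return a note if it cannot be read."""
    try:
        return Path(path).read_text()
    except OSError as exc:
        return "(could not read %s: %s)\n" % (path, exc)


def snapshot(label):
    """Build a text record of the configuration just before a test run."""
    cmdline = read_or_note("/proc/cmdline")
    mtrr = read_or_note("/proc/mtrr")
    meminfo = read_or_note("/proc/meminfo")
    lines = ["=== %s ===" % label, "cmdline: " + cmdline.strip()]
    for region in parse_mtrr(mtrr):
        lines.append("mtrr reg%02d: base=0x%08x size=%dMB %s" % (
            region["reg"], region["base"], region["size_mb"], region["type"]))
    try:
        lines.append("swap used: %d kB" % swap_used_kb(meminfo))
    except KeyError:
        lines.append("swap used: unknown")
    return "\n".join(lines) + "\n"
```

Calling snapshot("run 2, gfft_test 11") before each run and appending the
result to a log file would at least show whether swap was non-zero when a
fault occurred.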
Best regards
1) 2.6.4-5 runlevel 5 with "nvidia" driver
apm=off acpi=off noapic agp=off pci=noacpi vga=0x31a desktop
hdc=ide-scsi hdclun=0 3 console=tty0
*** cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
reg02: base=0xd0000000 (3328MB), size= 16MB: write-combining, count=1
init: Switching to runlevel: 5
nvidia: module license 'NVIDIA' taints kernel.
nvidia: loading NVIDIA Linux x86_64 NVIDIA Kernel Module
1.0-5332 Fri Jan 9 12:42:32 PST 2004
NVRM: AGPGART: unable to retrieve symbol table
NVRM: not using NVAGP, kernel was compiled with GART_IOMMU support!!
*** glucat-0.1.5 gfft_test 11 gave general protection and segfaults
2) 2.6.4-5 Failsafe runlevel 3
showopts ide=nodma apm=off acpi=off noapic agp=off vga=normal
iommu=noforce maxcpus=0 hdc=ide-scsi hdclun=0 3
*** cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
*** nvidia kernel module was not loaded.
*** glucat-0.1.5 gfft_test 11 gave general protection and segfaults
*** gdb exited prematurely.
3) 2.4.21-209 Failsafe runlevel 2
showopts ide=nodma apm=off acpi=off noapic agp=off vga=normal
iommu=noforce maxcpus=0 hdc=ide-scsi hdclun=0 3
*** cat /proc/mtrr
reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
reg01: base=0xe0000000 (3584MB), size= 128MB: write-combining, count=1
*** nvidia kernel module was not loaded.
*** glucat-0.1.5 gfft_test 10 gave segfaults when run in gdb
Program received signal SIGSEGV, Segmentation fault.
0x0000002a956f9797 in
std::__default_alloc_template