On Tuesday 10 May 2005 13:12, Andi Kleen wrote:
> > So that explains why nobody sees this problem. But the TSC-based fallback timekeeping is still broken on SMP systems with PowerNow and distributed IRQ handling, a combination that seems to be rare enough ;-).
> There is a patch pending for the TSC problem: it uses the pmtimer instead in this case.
> But the distributed timer interrupt problem is weird. It should not happen. Are you sure it was IRQ 0 that was duplicated and not "LOC"?
Yes. Only one CPU actually gets and handles the timer interrupt, but which one is somewhat random (for about 10 seconds, it's the same CPU, then it switches over).
> When you watch -n1 cat /proc/interrupts, does the rate roughly match up to 1000Hz?
Yes, and this is confirmed over a longer time:

# grep timer /proc/interrupts; uptime
  0:   40347440   40582285    IO-APIC-edge  timer
  1:26pm  an 22:28,  1 user,  Durchschnittslast: 0,00, 0,01, 0,04
# echo $[(3600*22+28*60)*1000] $[40347440+40582285]
80880000 80929725

Given that uptime is only accurate to the minute, this sounds very reasonable. The distribution is also close to 50:50. That's (almost) true for all interrupt sources:

# cat /proc/interrupts
           CPU0       CPU1
  0:   40523846   40753939    IO-APIC-edge  timer
  1:          3        189    IO-APIC-edge  i8042
  8:        261        280    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
 15:     364369     364479    IO-APIC-edge  ide1
169:      59195      55498   IO-APIC-level  3w-9xxx
177:     618198     604643   IO-APIC-level  3w-9xxx
185:    8195891    8147619   IO-APIC-level  aic79xx, eth1
193:          0         30   IO-APIC-level  aic79xx
201:          0          0   IO-APIC-level  ohci_hcd, ohci_hcd
NMI:       1184       1013
LOC:   81273966   81271958
ERR:          0
MIS:          0

-- 
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
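
For anyone who wants to repeat the cross-check above on their own box, here is a minimal sketch in plain POSIX shell arithmetic. The function names are made up for illustration and are not part of any tool; the sketch assumes HZ=1000 as discussed in this thread and a shell with 64-bit integer arithmetic. It only reproduces the calculation shown above (expected ticks from uptime versus the summed per-CPU counts) and says nothing about where the interrupts are routed.

```shell
# Hypothetical helpers (names are illustrative, not from the thread).
# Integer arithmetic only; assumes the shell uses 64-bit integers.

# expected_ticks HOURS MINUTES HZ
# Timer interrupts expected after the given uptime at HZ ticks per second.
expected_ticks() {
    echo $(( ($1 * 3600 + $2 * 60) * $3 ))
}

# tick_excess_seconds EXPECTED OBSERVED HZ
# How many seconds' worth of ticks the observed count is ahead of the
# expected count (truncated to whole seconds).
tick_excess_seconds() {
    echo $(( ($2 - $1) / $3 ))
}

# cpu0_share_permille CPU0_COUNT CPU1_COUNT
# Share of the interrupts handled by CPU0, in tenths of a percent.
cpu0_share_permille() {
    echo $(( $1 * 1000 / ($1 + $2) ))
}
```

With the numbers quoted above, "expected_ticks 22 28 1000" gives 80880000, and "tick_excess_seconds 80880000 80929725 1000" gives 49, i.e. the observed total is less than a minute ahead, which fits the one-minute granularity of uptime. "cpu0_share_permille 40347440 40582285" gives 498, so CPU0 handled about 49.8% of the timer interrupts.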