[Bug 664210] New: VME interrupts only received if both ACPI and APIC disabled either in BIOS or on kernel command line
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c0

           Summary: VME interrupts only received if both ACPI and APIC disabled either in BIOS or on kernel command line
    Classification: openSUSE
           Product: openSUSE 11.1
           Version: Final
          Platform: i586
        OS/Version: openSUSE 11.3
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: amb@jb.man.ac.uk
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.15) Gecko/2009102100 SUSE/3.0.15-0.1.2 Firefox/3.0.15

The systems use the Tundra Universe PCI to VME Bridge chip. VME interrupts are only received if:
1) ACPI is disabled either in the BIOS or using acpi=off on the kernel command line
and
2) APIC is disabled either in the BIOS or using noapic on the kernel command line

Reproducible: Always

Steps to Reproduce:
1. Enable ACPI and APIC in the BIOS
2. Reboot
3. Load VME device driver(s)
4. Reboot with acpi=off and noapic on kernel command line
5. Load VME device driver(s)
6. Disable ACPI and APIC in the BIOS
7. Reboot
8. Load VME device driver(s)

Actual Results:
After step 3 above, interrupts from the VME backplane are not received.
After steps 5 and 8, interrupts from the VME backplane are received.

Expected Results:
Interrupts should be received if ACPI and APIC are enabled.

The relevant processors are described in
http://www.xembedded.com/content/vme/processors/xvme-6200.php
and
http://www.xembedded.com/content/vme/processors/xvme-690.php

Please note that the XVME-690 is on 11.1 and APIC is not applicable. (See Bug 558740 https://bugzilla.novell.com/show_bug.cgi?id=558740 )

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c
Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c1
--- Comment #1 from Angela Bayley
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.0.15) Gecko/2009102100 SUSE/3.0.15-0.1.2 Firefox/3.0.15
Please note that the XVME-690 is on 11.1 and APIC is not applicable. (See Bug 558740 https://bugzilla.novell.com/show_bug.cgi?id=558740 )
Sorry, this is incorrect. APIC is applicable to both systems and must be disabled for VME interrupts to work, in addition to ACPI.
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c2
Jiri Slaby
1) ACPI is disabled either in the BIOS or using acpi=off on the kernel command line
and
2) APIC is disabled either in the BIOS or using noapic on the kernel command line
Ok, let me repeat my question. Does it work with both "acpi=noirq noapic" kernel parameters (with ACPI and APIC enabled in the BIOS)? And could you attach dmesg of a kernel booted that way?
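When comparing boots with and without acpi=noirq and noapic, it can help to confirm which parameters the running kernel actually received (they are listed in /proc/cmdline). The following is only a minimal sketch and not part of the bug report; the function name cmdline_has_param is made up for illustration. It matches whole space-separated tokens, so a longer parameter that merely contains "noapic" is not mistaken for it.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `param` appears as a whole space-separated token in
 * `cmdline` (for example the contents of /proc/cmdline).
 * A plain strstr() is not enough, because the parameter could also be
 * a substring of a longer, unrelated token.
 */
bool cmdline_has_param(const char *cmdline, const char *param)
{
    size_t plen = strlen(param);
    const char *p = cmdline;

    if (plen == 0)
        return false;

    while ((p = strstr(p, param)) != NULL) {
        /* The match must begin at the start of the line or after a space. */
        bool starts_token = (p == cmdline) || (p[-1] == ' ');
        /* The match must be followed by a space, a newline, or the end. */
        char end = p[plen];
        bool ends_token = (end == '\0' || end == ' ' || end == '\n');

        if (starts_token && ends_token)
            return true;
        p++;    /* keep searching after this partial match */
    }
    return false;
}
```

For the boot asked about here, both cmdline_has_param(buf, "acpi=noirq") and cmdline_has_param(buf, "noapic") would be expected to return true, where buf holds the text read from /proc/cmdline.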
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c3
Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c4
Jiri Slaby
VME interrupts are received correctly with ACPI and APIC enabled in the BIOS and
I'm afraid the XIP-2480 module has incorrectly implemented irq setup and handling. Do you have sources of that? Where does it come from?
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c5
Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c6
--- Comment #6 from Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c7
Jiri Slaby
It (and other drivers) has worked correctly for over 10 years until the introduction of later Xycom processor boards.
I would be grateful if you could explain how its irq setup and handling are wrong.
There are more issues in the driver, not only in the irq setup and handling.

The (pci_ioaddr & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO test is wrong (and useless). See __pci_read_base. It should be pci_resource_flags(uni_dev, 0) & IORESOURCE_IO.

You should not read PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN. They are invalid in case of ACPI routing. You have to use uni_dev->irq for request_irq.

You cannot call pci_dev_put at the end of the function. You need a reference. You have to do it in the exit function.

You don't handle wait_event_interruptible and copy_to_user retvals.

You shouldn't need disable_irq/enable_irq. Spinlocks should be used properly if needed.

The read_register et al. should be switched to ioread32 et al. Then you don't need the conditionals. You will then remap via pci_iomap.

IRQ handler doesn't have pt_regs parameter anymore. What I don't understand is the return value 200092 reported in dmesg from comment #3. Did you change the irq handler return values/paths recently?

IRQF_DISABLED is nop and should not be used.

So if you use uni_dev->irq, does it work with ACPI and APICs enabled?
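The first point above can be illustrated with a small userspace model. This is a sketch under assumptions, not the driver's code: the driver source is not shown in this thread, so it is assumed here that pci_ioaddr held an address the PCI core had already decoded. The constants use the values from the kernel's pci_regs.h and ioport.h; decode_bar and old_test_says_io are hypothetical names, and decode_bar is a simplified stand-in for what __pci_read_base does with a 32-bit BAR.

```c
#include <stdint.h>

/* Values as defined in the kernel headers pci_regs.h and ioport.h. */
#define PCI_BASE_ADDRESS_SPACE     0x01u
#define PCI_BASE_ADDRESS_SPACE_IO  0x01u
#define PCI_BASE_ADDRESS_IO_MASK   (~0x03u)
#define PCI_BASE_ADDRESS_MEM_MASK  (~0x0fu)
#define IORESOURCE_IO              0x00000100u
#define IORESOURCE_MEM             0x00000200u

struct bar_info {
    uint32_t start;  /* address with the type bits masked off */
    uint32_t flags;  /* IORESOURCE_IO or IORESOURCE_MEM */
};

/* Simplified model: split a raw 32-bit BAR into address and type flag. */
struct bar_info decode_bar(uint32_t raw)
{
    struct bar_info b;

    if ((raw & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO) {
        b.flags = IORESOURCE_IO;
        b.start = raw & PCI_BASE_ADDRESS_IO_MASK;
    } else {
        b.flags = IORESOURCE_MEM;
        b.start = raw & PCI_BASE_ADDRESS_MEM_MASK;
    }
    return b;
}

/* The questioned test, applied to an already-decoded address. */
int old_test_says_io(uint32_t pci_ioaddr)
{
    return (pci_ioaddr & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO;
}
```

In this model the type bit is consumed during decoding, so testing bit 0 of the decoded address can no longer tell I/O from memory, while the flags field still can. That is the role pci_resource_flags(uni_dev, 0) & IORESOURCE_IO plays in a real driver.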
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c8
Angela Bayley
IRQ handler doesn't have pt_regs parameter anymore. What I don't understand is the return value 200092 reported in dmesg from comment #3. Did you change the irq handler return values/paths recently?
The bad return value seems to have been caused by the ISR being entered immediately following request_irq. I didn't understand (and still don't) how it could have exited the ISR without returning either IRQ_NONE or IRQ_HANDLED. The compiler warned that the end of the routine was reached without a return value. I concluded that somehow it 'dropped through' and added a catchall return (IRQ_NONE) at the end of the ISR which cured the bad returns. I also tightened up device handling to try and eliminate unexpected interrupts.
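For background: in C, reaching the closing brace of a non-void function and then using its result is undefined behaviour, so the caller typically sees whatever value happened to be in the return register. That would be consistent with the odd value reported earlier. The following userspace sketch shows a handler shape in which every path returns a defined value. The enum mirrors the kernel's IRQ_NONE and IRQ_HANDLED values; struct demo_dev, DEMO_IRQ_PENDING and demo_isr are hypothetical and do not come from the XIP-2480 driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's irqreturn_t values. */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;

/* Hypothetical status bit meaning "this device raised the interrupt". */
#define DEMO_IRQ_PENDING 0x1u

struct demo_dev {
    uint32_t status;        /* hypothetical interrupt status register */
    unsigned long handled;  /* number of interrupts serviced */
};

irqreturn_t demo_isr(int irq, void *dev_id)
{
    struct demo_dev *dev = dev_id;

    (void)irq;  /* unused in this sketch */

    if (dev == NULL)
        return IRQ_NONE;

    /* Not ours: a shared line, or entry before the device is set up. */
    if (!(dev->status & DEMO_IRQ_PENDING))
        return IRQ_NONE;

    dev->status &= ~DEMO_IRQ_PENDING;  /* acknowledge the interrupt */
    dev->handled++;
    return IRQ_HANDLED;
}
```

Returning IRQ_NONE for the "not ours" case also covers a handler that is entered right after request_irq, before the device has raised anything.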
IRQF_DISABLED is nop and should not be used.
OK removed.
So if you use uni_dev->irq, does it work with ACPI and APICs enabled?
Well the answer is that interrupts now work on the XVME-690 Pentium with ACPI and APIC enabled (interrupt 20 - previously 10):

42ftctl:~ # cat /proc/interrupts
           CPU0
  0:         97   IO-APIC-edge      timer
  1:         10   IO-APIC-edge      i8042
  2:          0   XT-PIC-XT         cascade
  3:          2   IO-APIC-edge
  4:          4   IO-APIC-edge
  8:          0   IO-APIC-edge      rtc0
 12:        103   IO-APIC-edge      i8042
 14:      37596   IO-APIC-edge      ide0
 15:          0   IO-APIC-edge      ide1
 16:          0   IO-APIC-fasteoi   uhci_hcd:usb1
 17:        253   IO-APIC-fasteoi   Intel 6300ESB
 18:          0   IO-APIC-fasteoi   ata_piix
 19:          0   IO-APIC-fasteoi   uhci_hcd:usb2
 20:      70603   IO-APIC-fasteoi   XIP-2480
 23:          0   IO-APIC-fasteoi   ehci_hcd:usb3
 24:      11048   IO-APIC-fasteoi   eth0
NMI:          0   Non-maskable interrupts
LOC:     584275   Local timer interrupts
RES:          0   Rescheduling interrupts
CAL:          0   function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0
MIS:          0

but not on the XVME-6200 Intel Core Duo (interrupt 11):

ltctl:~/Documents # cat /proc/interrupts
           CPU0       CPU1
  0:         45     112070   IO-APIC-edge      timer
  1:          0          8   IO-APIC-edge      i8042
  3:          0          1   IO-APIC-edge
  4:          0          2   IO-APIC-edge
  6:          0          5   IO-APIC-edge      floppy
  7:          0          0   IO-APIC-edge      parport0
  8:          0          1   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 11:          0          0   IO-APIC-edge      XIP-2480
 12:          0        121   IO-APIC-edge      i8042
 16:          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0      13099   IO-APIC-fasteoi   ata_piix, ata_piix
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 23:          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
 48:          0      49636   PCI-MSI-edge      eth0
 49:          0        631   PCI-MSI-edge      eth1
NMI:          0          0   Non-maskable interrupts
LOC:      54325      14673   Local timer interrupts
SPU:          0          0   Spurious interrupts
CNT:          0          0   Performance counter interrupts
PND:          0          0   Performance pending work
RES:      15466       8110   Rescheduling interrupts
CAL:        141         24   Function call interrupts
TLB:        811       1005   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          4          4   Machine check polls
ERR:          1
MIS:          0

Note interrupt number 11 unchanged from previous value.
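Checking whether the XIP-2480 line is counting can be automated. The helper below is a minimal sketch, not something used in this thread; irq_line_total is a made-up name. It sums the per-CPU counters on a single /proc/interrupts line and assumes the layout shown above: a label, a colon, one counter per CPU, then the controller name.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Returns -1 if the line has no "<label>:" prefix.
 */
long irq_line_total(const char *line)
{
    const char *p = strchr(line, ':');
    long total = 0;

    if (p == NULL)
        return -1;
    p++;    /* step past the colon */

    for (;;) {
        char *end;
        long n;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;  /* reached the controller name or end of line */

        n = strtol(p, &end, 10);
        /* A counter is followed by whitespace or the end of the line. */
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;
        total += n;
        p = end;
    }
    return total;
}
```

A caller could read /proc/interrupts line by line, pick the line containing "XIP-2480" with strstr, and compare irq_line_total before and after generating a VME interrupt. A total that stays at 0, as on the XVME-6200 above, means no interrupts are arriving on that line.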
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c9
--- Comment #9 from Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c10
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c11
Jiri Slaby
(In reply to comment #7) I concluded that somehow it 'dropped through' and added a catchall return (IRQ_NONE) at the end of the ISR which cured the bad returns.
Then it makes sense...
So if you use uni_dev->irq, does it work with ACPI and APICs enabled?
but not on the XVME-6200 Intel Core Duo (interrupt 11):
And what helps here now? Only noapic, only acpi=noirq or you need to use both?
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c12
Angela Bayley
So if you use uni_dev->irq, does it work with ACPI and APICs enabled?
but not on the XVME-6200 Intel Core Duo (interrupt 11):
And what helps here now? Only noapic, only acpi=noirq or you need to use both?
Both noapic and acpi=noirq are needed. If only one is given no interrupts are received.
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c13
--- Comment #13 from Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c14
--- Comment #14 from Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c
Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c
Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c15
Jiri Slaby
Updated Universe source code incorporating suggestions from Comment #7 and Comment #10
You should feed the sources through <linux-src>/scripts/checkpatch.pl and fix the reported issues. Then you should send them to linux-kernel@vger.kernel.org for review as patches (see Documentation/SubmittingDrivers and Documentation/SubmittingPatches). Or to devel@linuxdriverproject.org, where you can merge the driver (with a TODO file) to a temporary location, work on it while reviews come in and then move it to an appropriate place in the kernel.

Just a thought, can't the timer driver be converted to the clocksource API like:
arch/x86/kernel/hpet.c
arch/x86/kernel/tsc.c
do?

Regarding the issue, could you attach here an acpidump?
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c16
Angela Bayley
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c17
--- Comment #17 from Angela Bayley
Just a thought, can't the timer driver be converted to the clocksource API like: arch/x86/kernel/hpet.c arch/x86/kernel/tsc.c do?
The suggested clock source does not provide sufficient precision for this application.
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c18
Jiri Slaby
Created an attachment (id=423144) --> (http://bugzilla.novell.com/attachment.cgi?id=423144) [details] Output from acpidump
I forgot to add from which system I want the output. Is it from the defunct XVME-6200 Intel Core Duo machine?
ltctl:~ # acpidump >acpidump2.log Wrong checksum for DSDT!
That's normal.
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c19
--- Comment #19 from Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c20
Angela Bayley
(In reply to comment #16)
Created an attachment (id=423144) --> (http://bugzilla.novell.com/attachment.cgi?id=423144) [details] [details] Output from acpidump
I forgot to add from which system I want the output. Is it from the defunct XVME-6200 Intel Core Duo machine?
Yes the acpidump was from the XVME-6200 Intel Core Duo (which still has the problem).
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c21
--- Comment #21 from Angela Bayley
Well, you don't call pci_enable_device from either of the drivers. That one sets up the interrupt.
Yep! that's cured it. Thanks. I use Linux Device Drivers for my info. No mention of pci_enable_device there. Can you suggest a better source to keep up with kernel changes?
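The ordering behind this fix can be modelled in plain userspace C. This is only an illustration of the idea that the irq field should be read after the device has been enabled. The types and functions below are hypothetical stand-ins, not the kernel's struct pci_dev, pci_enable_device or request_irq, and the numbers 10 and 20 are borrowed from the XVME-690 output earlier in the thread purely as sample values.

```c
#include <stddef.h>

/* Userspace model only: NOT the kernel's types or functions. */
struct model_pci_dev {
    int irq;         /* starts as the legacy value left by the BIOS */
    int routed_irq;  /* what the platform's interrupt routing would assign */
    int enabled;
};

/* Models pci_enable_device(): enables the device and sets up its IRQ. */
int model_pci_enable_device(struct model_pci_dev *dev)
{
    if (dev == NULL)
        return -1;
    dev->enabled = 1;
    dev->irq = dev->routed_irq;
    return 0;
}

/*
 * Models a probe routine. Returns the IRQ number the handler would be
 * registered on, or -1 on error. A real driver would call
 * request_irq(dev->irq, ...) at the point where this returns.
 */
int model_probe(struct model_pci_dev *dev, int call_enable)
{
    if (dev == NULL)
        return -1;
    if (call_enable && model_pci_enable_device(dev) != 0)
        return -1;
    return dev->irq;
}
```

In the model, skipping the enable step leaves the handler attached to the stale legacy line, which resembles the symptom seen earlier where the interrupt number stayed unchanged and its counters stayed at zero.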
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c22
Jiri Slaby
(In reply to comment #19)
Well, you don't call pci_enable_device from either of the drivers. That one sets up the interrupt.
Yep! that's cured it. Thanks.
I use Linux Device Drivers for my info. No mention of pci_enable_device there.
Which version? ldd3[1] describes this in chapter 12, page 314.
[1] http://lwn.net/Kernel/LDD3/
Can you suggest a better source to keep up with kernel changes?
Also Documentation/PCI/pci.txt in the linux sources.

Now, you'll submit it upstream so that others review that and prune such bugs too, right?
https://bugzilla.novell.com/show_bug.cgi?id=664210
https://bugzilla.novell.com/show_bug.cgi?id=664210#c23
--- Comment #23 from Angela Bayley
(In reply to comment #21)
I use Linux Device Drivers for my info. No mention of pci_enable_device there.
Which version? ldd3[1] describes this in chapter 12, page 314.
Yes, sorry I was wrong. It is there.
Can you suggest a better source to keep up with kernel changes?
Also Documentation/PCI/pci.txt in the linux sources.
Now, you'll submit it upstream so that others review that and prune such bugs too, right?
Yes will do. Thanks again!