[Bug 679277] New: Adaptec aic7xxx driver does endless resets with kernel 2.6.37
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c0

Summary: Adaptec aic7xxx driver does endless resets with kernel 2.6.37
Classification: openSUSE
Product: openSUSE 11.4
Version: RC 2
Platform: i686
OS/Version: openSUSE 11.3
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: werner@novell.com
QAContact: qa@suse.de
CC: coolo@novell.com
Found By: Development
Blocker: ---

I tried to update my old system at home, which has an AHA-2940U2W Adaptec controller (from 2004) with two hard disks, one cdrom, one cdrecorder, one zip drive, and one tape drive attached. I was able to install the system, but I'm not able to boot from it. At the point where, in the initrd, udevd processes the data from the worker kernel threads, the aic7xxx driver is caused to reset its SCSI bus.

I was able to access the system with the rescue boot option of the DVD from openSUSE 11.3, but even the kernel of the 11.3 shows exactly the same problem if booted from the disk. Using the rescue option of the DVD from openSUSE 11.4 causes the same resets as booting from disk.

Wild guess: the aic7xxx driver does not accept any access to the tagged data queue for the devices on its SCSI bus during the hardware scan. It looks like some locks are missing between the aic7xxx driver and the upper SCSI transport layer. This may cause the resets in the aic7xxx driver during the hardware scan.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c1
--- Comment #1 from James Bottomley
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c2
James Bottomley
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c3
--- Comment #3 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c4
--- Comment #4 from James Bottomley
For a log I have to have access to the system (: ... maybe I should try to add an error shell to the initrd to be able to make a copy of the boot messages.
OK, that's what I need to take this forwards.
IMHO the problem seems to be the coldplug/udevd used in the initrd of the 11.4. The rescue option of the DVD of the 11.3 does not show the problem, whereas the kernel of the 11.3 installed in `rescue mode' also shows the same misbehaviour after udev is started.
Yesterday evening I tried some patches from
https://bugzilla.kernel.org/show_bug.cgi?id=5921
but it seems that only the bus reset works a bit faster, and I've seen that the driver now indeed waits 15 seconds after its own scan of the attached hardware. The question is how I can prevent coldplug/udev from scanning all 16 slots of the SCSI bus of the AHA-2940U2W. It makes no sense to scan up to scsi0:A:0:0 as there is nothing there; also, the zip drive on scsi0:5:0:0 does not understand all SCSI commands. Clearly everything is well terminated, that is, every attached device, and active termination is also enabled in the BIOS of the AHA-2940U2W.
It's not really that appropriate to use a bug to ask general SCSI questions. However, SPI isn't a hotplug bus, it has to be scanned and the driver has no way of knowing where devices will be on the bus. If you have a device on the BUS that's not SCSI compliant, then either take it off or work out what the problem is and see if it can be properly blacklisted.
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c5
--- Comment #5 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c6
--- Comment #6 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c7
--- Comment #7 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c8
--- Comment #8 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c9
--- Comment #9 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c10
--- Comment #10 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c11
--- Comment #11 from James Bottomley
Just to note: judging from its BIOS, the Traxdata seems to be a TEAC CD-R55S, but compared with linux/drivers/scsi/scsi_devinfo.c the former is not blacklisted, and most old ZIP drives are missing as well. I suggest adding
{"Traxdata", "CDR4120", NULL, BLIST_NOLUN}, /* locks up */
This looks fine: care to code up a patch and send it to linux-scsi@vger.kernel.org?
{"IOMEGA", "ZIP", NULL, BLIST_NOLUN}, /* locks up */
Unless you can verify it, I don't think this is necessary. We already have a line for the older zip drives and it's
{"iomega", "jaz 1GB", "J.86", BLIST_NOTQ | BLIST_NOLUN},
Incidentally, just make sure you hammer the Traxdata to verify it doesn't need a BLIST_NOTQ as well (that's the flag that says the drive falls over if we try to queue multiple commands to it).
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c12
--- Comment #12 from Dr. Werner Fink
{"iomega", "jaz 1GB", "J.86", BLIST_NOTQ | BLIST_NOLUN},
IMHO jaz or even JAZ is not identical to ZIP. It seems to make a difference at least in scsi_get_device_flags() of drivers/scsi/scsi_devinfo.c. And indeed I'm using BLIST_NOTQ aka 0x20 for the ZIP on the kernel's command line in attachment #419889
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c13
--- Comment #13 from James Bottomley
(In reply to comment #11)
{"iomega", "jaz 1GB", "J.86", BLIST_NOTQ | BLIST_NOLUN},
IMHO jaz or even JAZ is not identical to ZIP. It seems to make a difference at least in scsi_get_device_flags() of drivers/scsi/scsi_devinfo.c
And indeed I'm using BLIST_NOTQ aka 0x20 for the ZIP on the kernel's command line in attachment #419889
Right, sorry, misread the inquiry list. The two lines plus the BLIST_NOTQ for the iomega would be fine. Thanks
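The exchange above turns on how a table entry is matched against the vendor and model strings a device reports. The following is a minimal C sketch of that idea, not the kernel's implementation: it assumes a plain case-sensitive prefix comparison on vendor and model and does not consult the revision field. BLIST_NOTQ as 0x20 is taken from comment #12; the value 0x01 for BLIST_NOLUN, the names lookup_device_flags, existing_table and proposed_table, and the sample model string "ZIP 100" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* BLIST_NOTQ = 0x20 is quoted in the thread; BLIST_NOLUN = 0x01 is an assumption. */
#define BLIST_NOLUN 0x01u
#define BLIST_NOTQ  0x20u

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

struct devinfo_entry {
    const char *vendor;
    const char *model;
    const char *revision;   /* carried along, but not consulted in this sketch */
    unsigned int flags;
};

/* Hypothetical table holding only the entry quoted in comment #11. */
const struct devinfo_entry existing_table[] = {
    {"iomega", "jaz 1GB", "J.86", BLIST_NOTQ | BLIST_NOLUN},
};

/* Hypothetical table with the two lines agreed on in comment #13 added. */
const struct devinfo_entry proposed_table[] = {
    {"iomega", "jaz 1GB", "J.86", BLIST_NOTQ | BLIST_NOLUN},
    {"Traxdata", "CDR4120", NULL, BLIST_NOLUN},
    {"IOMEGA", "ZIP", NULL, BLIST_NOTQ | BLIST_NOLUN},
};

/* Skip blank padding that may precede a reported string. */
static const char *skip_spaces(const char *s)
{
    while (*s == ' ')
        s++;
    return s;
}

/*
 * Return the flags of the first entry whose vendor and model are both
 * case-sensitive prefixes of the reported strings, or 0 if none match.
 */
unsigned int lookup_device_flags(const struct devinfo_entry *table, size_t n,
                                 const char *vendor, const char *model)
{
    size_t i;

    assert(table != NULL && vendor != NULL && model != NULL);
    vendor = skip_spaces(vendor);
    model = skip_spaces(model);

    for (i = 0; i < n; i++) {
        const struct devinfo_entry *e = &table[i];

        if (strncmp(vendor, e->vendor, strlen(e->vendor)) != 0)
            continue;
        if (strncmp(model, e->model, strlen(e->model)) != 0)
            continue;
        return e->flags;
    }
    return 0;
}
```

Under these assumptions, a device reporting vendor IOMEGA and model ZIP 100 gets no flags from existing_table, because neither "iomega" nor "jaz 1GB" is a prefix of what it reports. With proposed_table the same device gets BLIST_NOTQ | BLIST_NOLUN.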
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c14
--- Comment #14 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c15
--- Comment #15 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c16
--- Comment #16 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c17
--- Comment #17 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c18
--- Comment #18 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c19
James Bottomley
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c20
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c21
--- Comment #21 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c22
Kay Sievers
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c23
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c24
--- Comment #24 from Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Dr. Werner Fink
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c25
--- Comment #25 from Kay Sievers
The code
ATTRS{scsi_level}=="[6-9]*"
works flawlessly. I can boot the system *without* triggering a reset on the SCSI bus.
Great. I will add that. Thanks!
Btw: Does udev have something like modifiers for the content of the attribute `scsi_level', let's say:
ATTRS{scsi_level:d}
to be able to use
ATTRS{scsi_level:d}>=6
For file names that include a colon, this may require escaping the colon. On the other hand, the percent sign `%' could be used instead of a colon to set a modifier for the key of an attribute.
It's all just simple string and fnmatch().
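Because the match is a glob on the attribute's string value and not a numeric comparison, the behaviour of the pattern can be checked directly with POSIX fnmatch(3). This is a minimal sketch; the helper name scsi_level_matches and the sample values are assumptions, and udev's own matching code may differ in detail.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the string value matches the glob "[6-9]*", i.e. its
 * first character is one of 6, 7, 8 or 9, followed by anything.
 */
bool scsi_level_matches(const char *value)
{
    assert(value != NULL);
    return fnmatch("[6-9]*", value, 0) == 0;
}
```

In this sketch "6" and "7" match and "3" does not. A value such as "10" does not match either, since the glob only looks at characters: it is not equivalent to a numeric >= 6 test.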
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c26
Kay Sievers
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c27
--- Comment #27 from Kay Sievers
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c28
Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c29
--- Comment #29 from Matthias Andree
I guess you want this released as an openSUSE update?
That depends on whether or not you want to scare away the remaining openSUSE users with older SCSI CD-ROMs. Of course you want to have that fix backported to affected and supported distros (AFAIR 11.3 and 11.4).
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Matthias Andree
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c30
--- Comment #30 from Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c31
Christian Dengler
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c32
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c33
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=679277
https://bugzilla.novell.com/show_bug.cgi?id=679277#c34
Christian Dengler