Alan Ott changed bug 922405

What        Removed     Added
Status      RESOLVED    REOPENED
CC                      alan@softiron.co.uk
Resolution  NORESPONSE  ---

Comment # 15 on bug 922405 from
Hi guys,

I know what the problem is, as I have seen this issue and chased it down.

The symptom:
Plug in a specific thumb drive, and it is unusable (e.g., cannot be mounted)
for 60 seconds. dmesg shows that after 60 seconds the drive is reset, after
which it becomes usable (e.g., can be mounted).

Using a USB analyzer, I determined that the host is sending a SCSI INQUIRY
request with the EVPD bit set, asking for page 0x80, which holds the unit
serial number. The device then incorrectly returns a non-zero value in a
reserved field, which the host incorrectly treats as part of a length field.
This causes the host to re-send the INQUIRY with a much larger length, which
hangs the thumb drive. Make no mistake, this is a bug in the thumb drive, but
we need to work around it, like we always do.
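
For reference, here is a minimal Python sketch of the 6-byte INQUIRY CDB
involved. It assumes the layout described in section 3.6.1 of [1]: opcode 0x12,
EVPD in bit 0 of byte 1, the page code in byte 2, and a big-endian 16-bit
allocation length in bytes 3 and 4. The helper name is made up for
illustration; this is not code from sg3_utils or the kernel.

```python
import struct

INQUIRY_OPCODE = 0x12


def build_inquiry_cdb(page_code: int, alloc_len: int, evpd: bool = True) -> bytes:
    """Build a 6-byte INQUIRY CDB (hypothetical helper, illustration only)."""
    # Byte 0: opcode; byte 1: EVPD in bit 0; byte 2: page code;
    # bytes 3-4: allocation length (big-endian); byte 5: control.
    return struct.pack(">BBBHB", INQUIRY_OPCODE, 1 if evpd else 0,
                       page_code, alloc_len, 0x00)


# The same page 0x80 request with the two lengths quoted in this report:
# the length that should have been used, and the inflated one.
intended = build_inquiry_cdb(0x80, 0xFC)
inflated = build_inquiry_cdb(0x80, 0x6FC)
print(intended.hex(), inflated.hex())
```

The only difference between the two requests is byte 3 of the CDB, which is
where the stray 0x6 from the device ends up.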

See SCSI documentation from [1]. The section numbers below come from this
document.

Summary of events:
1. device is connected
2. host enumerates device
3. host sends SCSI INQUIRY with EVPD bit set, requesting page 0x80 (serial
numbers). See 3.6.1, Table 45, and information on EVPD in the same section.
4. The device returns the contents of page 0x80, but incorrectly puts a 0x6 in
byte 2, which is supposed to be a reserved field. See 4.4.10. This is a bug in
the thumb drive.
5. The host interprets bytes 2 and 3 together as a 16-bit length value, relying
on the device to have zeroed byte 2 (as required), which this thumb drive
incorrectly does not do.
6. With the length returned by the device being much larger than expected, the
host will then re-send the INQUIRY with the new, larger, incorrect length (in
this case 0x6fc instead of just 0xfc).
7. The thumb drive, for whatever reason, can't handle this request and hangs.
8. After 60 seconds, the request times out and the device is reset. After the
reset, the host does NOT issue the EVPD INQUIRY for page 0x80 again, and the
device works fine (e.g., can be mounted).
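
Steps 4 to 6 come down to how the 4-byte page header is parsed. The sketch
below is illustrative Python, not the sg_inq source. The header bytes are an
assumption reconstructed from the values in this report (0x6 in byte 2, and
0xfc assumed for byte 3), and the real tool may adjust the length further
(e.g., for the header size).

```python
def page_length_16(header: bytes) -> int:
    """Treat bytes 2 and 3 as one big-endian 16-bit page length."""
    return int.from_bytes(header[2:4], "big")


def page_length_8(header: bytes) -> int:
    """Use byte 3 only, ignoring whatever the device put in byte 2."""
    return header[3]


# Assumed VPD page 0x80 header from the broken drive:
# peripheral byte, page code 0x80, stray 0x06 in byte 2, length 0xfc in byte 3.
header = bytes([0x00, 0x80, 0x06, 0xFC])

print(hex(page_length_16(header)))  # length the host ends up requesting
print(hex(page_length_8(header)))   # length if byte 2 were ignored
```

With a well-behaved device (byte 2 zeroed) both functions agree, which is why
the problem only shows up on hardware that gets the reserved byte wrong.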

I went looking for where in the kernel this EVPD INQUIRY originates, and
eventually determined that it does not come from the kernel at all. It is the
result of an ioctl() issued from user space by the sg_inq process, which is
part of the sg3_utils package. sg_inq is invoked by a udev rule[2], also part
of sg3_utils, and sends the SCSI commands directly to the device.

It's worth pointing out here that the Linux kernel's own code, which can read
these VPD pages, will _not_ issue these EVPD INQUIRY requests to USB devices.
drivers/usb/storage/scsiglue.c contains this code:

                /* Some devices don't handle VPD pages correctly */
                sdev->skip_vpd_pages = 1;

Indeed, this device does not handle VPD correctly. There's a reason the kernel
doesn't issue these requests on USB devices, so they shouldn't be issued from
user space either.

1. Is there a reason to need this sg3_utils udev file which pokes at the drive
from user space when it's attached?
2. If the answer to the above is yes, can it be modified to _not_ be run for
USB devices?
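
On question 2, one possible user-space guard is to check whether the device
sits behind a USB bus before sending any EVPD INQUIRY. The Python sketch below
is only an illustration of that idea. The function name is hypothetical, and
it assumes that the resolved sysfs path of a USB-attached disk contains a
usbN component, which is a heuristic and not something sg3_utils does today.

```python
import re
from pathlib import PurePosixPath

# A path component such as "usb1" marks a USB root hub in sysfs (assumption).
_USB_BUS = re.compile(r"^usb\d+$")


def is_usb_attached(sysfs_path: str) -> bool:
    """Return True if the resolved sysfs device path passes through a USB bus."""
    return any(_USB_BUS.match(part) for part in PurePosixPath(sysfs_path).parts)


# Example paths, made up for illustration; on a live system the path could be
# obtained with os.path.realpath("/sys/block/sdb").
usb_disk = ("/sys/devices/pci0000:00/0000:00:14.0/usb1/1-2/1-2:1.0/"
            "host6/target6:0:0/6:0:0:0/block/sdb")
sata_disk = ("/sys/devices/pci0000:00/0000:00:1f.2/ata1/"
             "host0/target0:0:0/0:0:0:0/block/sda")

print(is_usb_attached(usb_disk), is_usb_attached(sata_disk))
```

A check like this would mirror, in user space, the blanket decision the kernel
already makes with skip_vpd_pages for USB storage.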

My recommendation is to remove the udev rule at [2] completely. The kernel is
very good at working around broken hardware and giving each device the
interactions that it needs in order to work properly. This udev rule cuts
through all those checks and workarounds and issues SCSI commands directly from
user space, re-exposing the same hardware issues that the kernel is very
careful to work around.

I'm running Leap 42.1 on aarch64, with sg3_utils version 1.41. The latest
sg3_utils source[3] still has the issue. Again though, the larger issue here is
not that sg3_utils is doing something incorrectly (even though it is); the
issue is that we are sending SCSI commands from user space to devices where
they are not appropriate (e.g., USB devices).

Thanks!

Alan.

[1] http://www.seagate.com/staticfiles/support/disc/manuals/scsi/100293068a.pdf
[2] /usr/lib/udev/rules.d/55-scsi-sg3_id.rules
[3] https://github.com/hreinecke/sg3_utils/blob/master/src/sg_inq.c#L3058

