[opensuse-kernel] Race in usb/core/file.c?
Hi, I am experiencing a race in the usb stack when using different drivers for different hardware sharing the same major number. When trying to hunting down the problem I encountered the function usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver) in drivers/usb/core/file.c There is a call to init_usb_class() which access static struct usb_class { struct kref kref; struct class *class; } *usb_class; without explicit access protection. To me it looks like I can observe a race in usb_register_dev() when doing a coldplug during bootup of the system. (OpenSUSE 11.3) In my test setup I have 3 different but similar devices which all call usb_register_dev() from file.c concurrently as seen in the debug output. Within usb_register_dev() there is a call to device_create(usb_class->class, &intf->dev, MKDEV(USB_MAJOR, minor), class_driver, "%s", temp). Within device_create() there is a call to device_create_vargs() which returns ENODEV (19) struct device * device_create_vargs(class, parent, devt, drvdata, fmt, vargs) The callstack then looks like usb_register_dev() device_create() device_create_vargs() <-- returns ENODEV in case of the race condition I am now wondering if my observations and thoughts are reasonable. Assuming that there is indeed such a race I would like to know if the following trivial patch is the correct approach? --- linux-2.6.34.orig/drivers/usb/core/file.c.orig 2011-09-14 14:15:28.000000000 +0200 +++ linux-2.6.34/drivers/usb/core/file.c 2011-09-14 14:26:02.000000000 +0200 @@ -179,13 +179,17 @@ if (intf->minor >= 0) return -EADDRINUSE; + down_write(&minor_rwsem); + retval = init_usb_class(); - if (retval) + if (retval) { + up_write(&minor_rwsem); return retval; + } dev_dbg(&intf->dev, "looking for a minor, starting at %d", minor_base); - down_write(&minor_rwsem); + /*down_write(&minor_rwsem);*/ for (minor = minor_base; minor < MAX_USB_MINORS; ++minor) { if (usb_minors[minor]) continue; @@ -194,9 +198,11 @@ intf->minor = minor; break; } - up_write(&minor_rwsem); - if (intf->minor < 0) + /*up_write(&minor_rwsem);*/ + if (intf->minor < 0) { + up_write(&minor_rwsem); return -EXFULL; + } /* create a usb class device for this usb interface */ snprintf(name, sizeof(name), class_driver->name, minor - minor_base); @@ -208,6 +214,8 @@ intf->usb_dev = device_create(usb_class->class, &intf->dev, MKDEV(USB_MAJOR, minor), class_driver, "%s", temp); + up_write(&minor_rwsem); + if (IS_ERR(intf->usb_dev)) { down_write(&minor_rwsem); usb_minors[minor] = NULL; Yours, -- martin -- To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-kernel+help@opensuse.org
On Wed, Sep 14, 2011 at 04:27:39PM +0200, EXTERNAL Konold Martin (Firma, RtP2/TEF72) wrote:
Hi,
I am experiencing a race in the usb stack when using different drivers for different hardware sharing the same major number.
What drivers would that be? Any pointers to them? We don't usually allow this at all.
When trying to hunting down the problem I encountered the function
usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver)
in drivers/usb/core/file.c
There is a call to init_usb_class() which access
static struct usb_class { struct kref kref; struct class *class; } *usb_class;
without explicit access protection.
To me it looks like I can observe a race in usb_register_dev() when doing a coldplug during bootup of the system. (OpenSUSE 11.3)
In my test setup I have 3 different but similar devices which all call usb_register_dev() from file.c concurrently as seen in the debug output.
As the USB subsystem init path is serialized, I don't see how this can be happening. What is creating these calls in a parallel manner? And does this also happen with the 11.4 kernel, and the 3.0 kernel in Kernel:stable? The 11.3 kernel is quite old at this point in time from a development point-of-view.
Within usb_register_dev() there is a call to device_create(usb_class->class, &intf->dev, MKDEV(USB_MAJOR, minor), class_driver, "%s", temp). Within device_create() there is a call to device_create_vargs() which returns ENODEV (19)
struct device * device_create_vargs(class, parent, devt, drvdata, fmt, vargs)
The callstack then looks like usb_register_dev() device_create() device_create_vargs() <-- returns ENODEV in case of the race condition
I am now wondering if my observations and thoughts are reasonable. Assuming that there is indeed such a race I would like to know if the following trivial patch is the correct approach?
Actually, in looking at your patch, I think this might be correct. Care to refresh it against the 3.0 kernel tree, and provide the proper Signed-off-by: line and cc: it to the linux-usb@vger.kernel.org mailing list and me, and we can take it from there? thanks, greg k-h -- To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-kernel+help@opensuse.org
Am Donnerstag, 15. September 2011, 08:51:16 schrieb Greg KH:
Within usb_register_dev() there is a call to device_create(usb_class->class, &intf->dev, MKDEV(USB_MAJOR, minor), class_driver, "%s", temp). Within device_create() there is a call to device_create_vargs() which returns ENODEV (19)
struct device * device_create_vargs(class, parent, devt, drvdata, fmt, vargs)
The callstack then looks like usb_register_dev() device_create() device_create_vargs() <-- returns ENODEV in case of the race condition
This is the default error return of device_create_vargs(). Do you know which error case happens?
I am now wondering if my observations and thoughts are reasonable. Assuming that there is indeed such a race I would like to know if the following trivial patch is the correct approach?
This code depends on serialization static int init_usb_class(void) { int result = 0; if (usb_class != NULL) { kref_get(&usb_class->kref); goto exit; } usb_class = kmalloc(sizeof(*usb_class), GFP_KERNEL); if (!usb_class) { result = -ENOMEM; goto exit; } Very strage things could happen if you are preempted at the wrong time here. We probably should create usb_class much earlier. But after that it boils down to kref_get() which mustn't need locking.
Care to refresh it against the 3.0 kernel tree, and provide the proper Signed-off-by: line and cc: it to the linux-usb@vger.kernel.org mailing list and me, and we can take it from there?
But you also want to extend the lock down to device_create(). It should be possible to create two devices with a different major:minor at the same time. As far as locking is needed it should be provided by the driver core, not usbcore. So this approach seems wrong. Please take this to the list. Regards Oliver -- To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org For additional commands, e-mail: opensuse-kernel+help@opensuse.org
participants (3)
-
EXTERNAL Konold Martin (Firma, RtP2/TEF72)
-
Greg KH
-
Oliver Neukum