[opensuse] USB Docking Station Brought Tumbleweed Tumbling Down a Rocky Slope

16 Jan 2016

Hi,

I'm using dd on openSUSE TW 20160113 to zero out (from /dev/zero) two 
SATA hard drives, both of which I plugged into a generic USB docking 
station. That dock has always allowed me to read SMART data from any 
drive and has never had any problems transferring terabytes of data. I 
wanted to zero both drives out because they were given to me as possibly 
faulty, and in my research I've found that zeroing out a disk can make 
bad sectors surface and then show up in the SMART data, which is exactly 
what I want.
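
For anyone wanting to reproduce the wipe, it amounts to roughly the 
following. This is only a minimal sketch, not the exact command line 
used: /dev/sdX is a placeholder, the zero_target helper name and the 
1 MiB block size are illustrative assumptions, and status=progress 
needs a reasonably recent GNU dd.

```shell
# Hypothetical helper (name and defaults are assumptions, not from the post).
# Usage: zero_target TARGET [COUNT_MIB]
# Double-check the device name first (e.g. with lsblk): dd overwrites
# whatever it is pointed at.
zero_target() {
    target="$1"   # e.g. /dev/sdX (placeholder), or a scratch file for a dry run
    count="$2"    # optional: number of 1 MiB blocks; empty = write until full
    # conv=fsync makes dd flush the written data before it exits.
    # Without a count, dd stops with "No space left on device" once the
    # whole target has been overwritten, so a non-zero exit is expected then.
    dd if=/dev/zero of="$target" bs=1M ${count:+count="$count"} \
        conv=fsync status=progress
}

# Example (destructive, run as root): zero_target /dev/sdX
```

Passing a count and a scratch file instead of a device is a harmless 
way to try the helper before pointing it at a real disk.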

The issue I just had is that with the first disk (WD Blue 500 GB), after 
dd ran for some time it locked my system up and I couldn't log out of or 
restart Plasma. I switched to a different TTY and issued the "reboot" 
command as root, and even then systemd kept giving me the "a start job 
is running" message, which appeared about 3 times for different 
processes. One notable failure was that my home partition wasn't properly 
unmounted. I then just hit the reset switch because I wasn't going to 
sit there all night. The UEFI BIOS then hung until I yanked the USB 
cable out of the USB port. Before powering the SATA docking station 
down, I had a Windows laptop nearby which I plugged the docking station 
into, and it hung up on installing the USB drivers, and just sat there 
with a spinning circle. I then powered the docking station down and then 
back up, and drivers were instantly found. I then plugged the SATA 
docking station's USB cable into my openSUSE PC and it didn't hang on 
boot and everything was normal again. When I ran smartctl on the drive, 
a new error came up:
----------
Read SMART Log Directory failed: scsi error medium or hardware error (serious)

Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
SMART Error Log Version: 1
No Errors Logged
----------

From then on, that error came up whenever I ran smartctl on that drive, 
and smartctl was very slow to display information in the terminal. So I 
figured the drive was pretty much toast, even though there wasn't 
anything else interesting in the SMART logs read with smartmontools. I 
checked smartmontools' website and it says the software does support 
reading SMART data over USB.
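
As a rough illustration of reading SMART data through a USB dock, 
something along these lines can be used. It is a sketch under 
assumptions: /dev/sdX is a placeholder, the smart_check helper and the 
SMART_RUN variable are made up for this example, and "-d sat" only works 
if the dock's bridge chip supports SAT pass-through (other bridges may 
need a different -d type or none at all).

```shell
# Hypothetical helper (not part of smartmontools).
# By default it only prints the smartctl commands; set SMART_RUN=sudo
# to actually execute them.
smart_check() {
    dev="$1"                      # e.g. /dev/sdX (placeholder)
    run="${SMART_RUN:-echo}"      # dry run unless SMART_RUN is set
    $run smartctl -d sat -H "$dev"        # overall health self-assessment
    $run smartctl -d sat -A "$dev"        # vendor attributes (reallocated/pending sectors)
    $run smartctl -d sat -l error "$dev"  # ATA error log
}

# Example: SMART_RUN=sudo smart_check /dev/sdX
```

Running the same checks before and after a wipe makes it easier to see 
whether any attribute actually changed.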

It looks as though possibly a bad logic board on the drive temporarily 
messed up firmware in the USB docking station, which in turn wreaked 
havoc with Linux. I'm getting the feeling that Linux with its monolithic 
kernel is very unforgiving if you have a bad USB device, or, in my case 
(if my guess is right) a bad logic board on an HDD which in turn messed 
up the firmware on the docking station. The exact same thing happened 
with the second drive, a Hitachi 750 GB HDD. I inserted the drive into 
the USB docking station and ran dd while periodically checking SMART 
data in another terminal window, and eventually Plasma started choking 
and I had to reboot from another TTY. This time there was no problem 
unmounting /home, but the "a start job is running" message came up for a 
couple of processes. I believe this time I did unplug the USB cable 
before issuing the reboot command. Maybe if I hadn't done that, the same 
thing would have happened as before: /home on the internal SATA HDD in 
my computer would have failed to unmount and the shutdown would have 
hung, forcing me to press reset.

Once again, I couldn't boot the system with the USB docking station 
plugged into a USB port. The UEFI graphical BIOS would hang at "B4", 
which is shown in a little readout in the lower right-hand corner that 
reports the status of the POST process. As soon as I unplugged the 
docking station, 
it stopped hanging at "B4" and was fine. So obviously the USB docking 
station can be brought to its knees by a bad drive or is faulty, but I 
doubt the latter because I've never had this issue with it. Again, as 
before with the previous drive, as soon as I power cycled the docking 
station with the Hitachi plugged in, there were no hangs on boot and 
everything went back to normal.

Currently I'm running dd on a known good drive plugged into the same 
docking station, and monitoring its progress to see whether the problem 
lies with the docking station rather than with the two drives. However, 
it looks as though it's going to make it through the full wipe. No 
freezes yet, and it's almost finished. I may test with a live USB Linux 
image on my laptop and do the dd wipe there, since I can't afford to 
have my main desktop cut down to a stump again and have everything go to 
complete crap.

I would like some input as to how something like what I described can 
crap up the whole system bus and make Linux an unhappy camper. Is this 
one of the weaker areas of the kernel, a bug, or something else? Maybe 
it wouldn't have happened on Leap, and has something to do with the 
kernel on TW?

Update: the known good drive finished the dd wipe process without 
locking up my system.

SDM
