Fwd: [opensuse] identifying device in dmesg output

24 Nov 2016

Resending with the list in copy.  Sorry about the private reply.

---------- Forwarded message ----------
From: Greg Freemyer 
Date: Thu, Nov 24, 2016 at 8:39 AM
Subject: [opensuse] identifying device in dmesg output
To: John Andersen 

On Thursday, November 24, 2016, John Andersen  wrote:
...
On November 23, 2016 9:10:19 PM PST, Greg Freemyer  wrote:
...
On Wed, Nov 23, 2016 at 11:13 AM, John Andersen 
wrote:
...
On November 23, 2016 7:49:57 AM PST, Malcolm
 wrote:
...
On Wed 23 Nov 2016 04:42:15 PM CST, Istvan Gabor wrote:
...
Knurpht - Gertjan Lettink wrote:
...
On Wednesday 23 November 2016 15:25:24 CET, Istvan Gabor wrote:
> ata1.00: device reported invalid CHS sector 0
A quick search indicates a drive failure / degrading condition. To
find out which entry in /dev is meant by ata1 run
ls -l /sys/block/sd* | sed 's/.*\(sd.*\) -.*\(ata.*\)\/h.*/\2 => \1/'
Thanks, running this gives:
ls -l /sys/block/sd* | sed 's/.*\(sd.*\) -.*\(ata.*\)\/h.*/\2 => \1/'
ata1 => sda
ata3 => sdb
ata4 => sdc
ls for these devices gives:
ls -gG /sys/block/
lrwxrwxrwx 1 0 Nov 23 11:04 sda -> ../devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda
lrwxrwxrwx 1 0 Nov 23 11:04 sdb -> ../devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/block/sdb
lrwxrwxrwx 1 0 Nov 23 11:04 sdc -> ../devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/block/sdc
Here there's no .00 in the names like in the dmesg message.
What is then ata1.00 exactly? Why is .00 attached to the device name
in dmesg? What does it mean?
Thanks,
Istvan
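
The sysfs paths listed above already contain the port name, so the same
mapping can be read straight from the symlink targets instead of going
through the sed expression. What follows is only a minimal sketch,
assuming a Linux shell with GNU readlink; the function names ata_port_of
and list_ata_disks are made up for illustration. On the .00 suffix: as
far as I know, libata prints devices as ataN.DD, where N is the port and
DD is the device number on that port, so ata1.00 would be the first (and
here only) device on port ata1. Treat that as a best understanding, not
something confirmed in this thread.

```shell
# Print the ataN component of a sysfs block-device path.
# ata_port_of is a hypothetical helper name, not an existing tool.
ata_port_of() {
    printf '%s\n' "$1" | tr '/' '\n' | grep -E '^ata[0-9]+$'
}

# List every sd* disk together with its ATA port, e.g. "ata1 => sda".
# Disks without an ataN path component (USB, for instance) are skipped.
list_ata_disks() {
    for link in /sys/block/sd*; do
        [ -e "$link" ] || continue        # glob matched nothing
        target=$(readlink -f "$link")     # resolve the sysfs symlink
        port=$(ata_port_of "$target")
        if [ -n "$port" ]; then
            echo "$port => ${link##*/}"
        fi
    done
}
```

Calling list_ata_disks on the machine in question should print the same
kind of "ataN => sdX" lines as the sed one-liner.
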
Hi
The hwinfo --disk command should give you all the info (device files,
etc.), which should help clarify things.
More to the point he already knows which drive is ata1 and it's time
to order a new drive.  It's failing.
Smartctl might show some info, but to what purpose?
He needs to start moving that data now.
If smartctl doesn't show historical drive errors, it probably isn't
the drive.  I've had lots of sata cables fail over the years.
Unreliable communication causes strange error reports.  Drives tend
to fail totally or develop bad sectors.
If this is a bad drive, I'd say it's a rare failure mode.
Greg
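
One way to tell the two cases apart with smartctl is to compare SMART
attributes: on many drives attribute 199 (UDMA_CRC_Error_Count) counts
link-level CRC errors, which usually point at cabling, while attribute 5
(Reallocated_Sector_Ct) counts sectors the drive itself has remapped.
The sketch below assumes smartmontools is installed, that the drive
reports these attributes (not all do), and that "smartctl -A" prints its
usual ten-column table with the raw value last. The helper name
smart_raw is hypothetical.

```shell
# Read `smartctl -A` output on stdin and print the raw value
# (10th column) of the attribute whose numeric ID is given as $1.
# smart_raw is a hypothetical helper name.
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Example use (needs root and a real device):
#   smartctl -A /dev/sda | smart_raw 199   # UDMA_CRC_Error_Count
#   smartctl -A /dev/sda | smart_raw 5     # Reallocated_Sector_Ct
```

A CRC count that keeps climbing while the reallocated count stays at
zero would fit the bad-cable explanation; the reverse would point at
the drive. "smartctl -l error /dev/sda" shows the drive's own error log
if it keeps one.
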
Wait, what?
Drive failure is rare, but cable failure is common?
Greg, what bizarro world do you live in?
All drives fail.
John,

I used to be part of the QA team at an end-of-life PC recycler (I was
an outside consultant who did quarterly audits of their process).
Used (end-of-life) PCs were tested and resold.  30,000 PCs a month.  If
an overall PC failed and wasn't easily repaired, the good parts were
pulled and stocked for repairs of other PCs.

Every drive was wiped with at least a 7 pass wipe.  Any that had
hardware issues were pulled and run through a mechanical shredder.
The 7 pass wipe was done with a Linux boot and shred was invoked by a
custom written control program that documented the process.  The Linux
log files were scanned and if the drive had issues it was tagged for
physical destruction.
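
The control program itself is not described here, so the following is
only a rough sketch of the log-scan step. It assumes the kernel log has
been saved to a text file and that libata messages look like the
"ata1.00: ..." line quoted earlier in the thread. The function name
triage_drive, the "destroy"/"ok" tags, and the keyword list are
illustrative assumptions, not the recycler's actual rules.

```shell
# Decide a drive's fate from a saved kernel log.
#   $1 = log file, $2 = ATA port name such as "ata1"
# Prints "destroy" if the log has error-looking lines for that port,
# otherwise "ok". The keyword list is an assumption for illustration.
triage_drive() {
    if grep -Eq "(^|[^[:alnum:]])$2(\.[0-9]+)?: .*(error|failed|invalid)" "$1"; then
        echo destroy
    else
        echo ok
    fi
}

# The wipe itself might be something like the line below. It is
# destructive and /dev/sdX is a placeholder, so it is left commented out.
#   shred -v -n 7 /dev/sdX
```

A wrapper could save the output of dmesg to a file after the wipe and
call triage_drive on it with the port of the drive under test.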

Data security (destruction) was the primary selling feature of this
company, and it was my job to ensure no data got through the wiping
process.

I saw the wiping results and the error reports.

In your world, the facility would be buying hard drives to replace all
the failed drives arriving in the otherwise good computers they were
recycling.

In mine, they scrapped so many PCs that they had a hard drive stock
room with thousands of drives in it that got pulled from the PCs that
weren't worth repairing.  They sold off excess used hard drive stock.

SATA cables they did NOT try to re-use/salvage; simply not reliable
enough. If a PC came in with a bad cable, they put in a new one.

Greg
--
Greg Freemyer
