[opensuse-factory] i/o error on sdb shown in strg+alt+f10
Hi,

for about two weeks I have had random crashes of different apps on one of my Tumbleweed machines. Strg+Alt+F10 shows I/O errors on /dev/sdb after such an app crash. sdb is an SSD where root and home have lived happily together for the past 4 years.

Could this mean that my SSD is dying?

And where is the output of Strg+Alt+F10 stored? I can't find the I/O error messages in /var/log/messages, dmesg, xsession-errors or boot.log. Are there other important log files I missed?

Which diagnostic program do you use to check your SSD with root mounted?

Or is someone else seeing such errors, so that it might be a TW bug?

Thank you.

--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
10.10.2015 10:24, Thomas Langkamp wrote:
Hi,
since about 2 weeks I have random crashes of different apps on one of my Tumbleweed machines. Strg+Alt+F10 shows i/o errors on /dev/sdb after such an app crash. sdb is an SSD where root and home lived happily together for the past 4 years.
Could this mean that my SSD dies?
Possibly. It can also be controller or motherboard.
And where is the output of Strg+Alt+F10 stored? I can't find the i/o error messages in /var/log/messages, dmesg, xsession-errors or boot.log. Are there other important log-files I missed?
Try journalctl
On 10/10/2015 09:59 AM, Andrei Borzenkov wrote:
10.10.2015 10:24, Thomas Langkamp wrote:
Could this mean that my SSD dies?
Possibly. It can also be controller or motherboard.
or the cheapest possibility: cables.

Have a nice day,
Berny
On 2015-10-10 09:24, Thomas Langkamp wrote:
Could this mean that my SSD dies?
Might be, but first try replacing the cable. I really do hope SSDs last longer than four years, unless they run full time.
And where is the output of Strg+Alt+F10 stored?
In the system log, /if/ the system doesn't crash before it can write to the disk.
I can't find the i/o error messages in /var/log/messages, dmesg, xsession-errors or boot.log. Are there other important log-files I missed?
dmesg would show everything, if the system is still running after the crash. boot.log would show nothing. "/var/log/messages" should have the same as dmesg shows, /if/ it is enabled, because in 13.2 it is not. Instead, you have to query "journalctl".
Which diag program do you use to check your SSD with mounted root?
smartctl :-) See the man page, it has examples. The test runs while the system runs, in the disk hardware, without involving the system cpu. When it finishes, you just use "smartctl -a /dev/sdX" to query the result. I don't use SSD, but rotating disks, so there will be differences in the output. There should be a life estimation.

--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Minas Tirith))
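As a rough illustration of reading the result, the Python sketch below picks a few values out of the attribute table that "smartctl -A" prints. The sample text is made up for illustration and is not output from the drive in this thread. Attribute names vary between vendors, so the names used here (Power_On_Hours, UDMA_CRC_Error_Count) are assumptions that may not match a given SSD. A rising CRC error count is commonly associated with cable or connector problems rather than with the flash itself.

```python
# Minimal sketch: pull raw values out of a "smartctl -A" attribute table.
# SAMPLE is invented for illustration; real output differs per drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   000    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       6000
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       12
"""


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value_string} for each table row."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = " ".join(fields[9:])
    return attrs


def first_int(raw):
    """Leading integer of a raw value such as '6000' or '31 (Min/Max 20/45)'."""
    parts = raw.split()
    if not parts or not parts[0].isdigit():
        return None
    return int(parts[0])


attrs = parse_smart_attributes(SAMPLE)
hours = first_int(attrs["Power_On_Hours"])
crc_errors = first_int(attrs["UDMA_CRC_Error_Count"])
print(f"power-on hours: {hours}, CRC errors: {crc_errors}")
```

On a real system the text would come from running "smartctl -A /dev/sdX" as root instead of from SAMPLE.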
Thanks Carlos and the others, good information :) Replies inline.

On 10.10.2015 at 14:08, Carlos E. R. wrote:
On 2015-10-10 09:24, Thomas Langkamp wrote:
Could this mean that my SSD dies?
Might be, but first try replacing the cable. I really do hope SSDs last longer than four years, unless it runs full time.
Uptime 6000 hours. I did a smartctl -t short, which showed no errors. Then I saw that smartctl tells me that tomshardware/crucial suggests a firmware update, because some of those drives are known to hang every hour after 5000 hours of uptime. If that were the cause, however, my SSD would be 1000 hours late with this known bug. I updated the firmware nevertheless AND also changed the cable, why not. Now I will have to wait and see whether the error comes back.
And where is the output of Strg+Alt+F10 stored?
In the system log, /if/ the system doesn't crash before it can write to the disk.
My system log seems to be non-persistent. However, I grabbed pen and paper when it showed up some minutes ago. There are many repeating lines like this:

"date+time kernel: [long-number] blk_update_request: I/O error /dev/sdb2 sector $changing-number"

sdb2 is root, ext4, on the SSD, latest TW x64.
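On systemd-based systems the journal survives a reboot only if it is stored on disk, which may explain a system log that appears non-persistent. A minimal Python sketch of that check follows, assuming journald runs with its default Storage=auto setting: logs are then kept persistently only when the directory /var/log/journal exists, and otherwise go to /run/log/journal, which is lost at shutdown. The function name and the root parameter are hypothetical and exist only to make the check easy to try out.

```python
import os


def journal_storage(root="/"):
    """Guess where journald keeps its logs, assuming Storage=auto."""
    if os.path.isdir(os.path.join(root, "var/log/journal")):
        return "persistent"   # written to disk, survives reboots
    if os.path.isdir(os.path.join(root, "run/log/journal")):
        return "volatile"     # kept under /run, gone after a reboot
    return "unknown"


print(journal_storage())
```

If this reports "volatile", creating /var/log/journal as root and restarting systemd-journald is the usual way to keep messages across reboots; "journalctl -k -b -1" then shows the kernel messages of the previous boot.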
On Sat, Oct 10, 2015 at 5:04 PM, Thomas Langkamp <thomas.lassdiesonnerein@gmx.de> wrote:
"date+time kernel: [long-number] blk_update_request: I/O error /dev/sdb2 sector $changing-number"
I will lay odds your SSD is fine. First it is not a "media error" that is being reported and second that changing number indicates the problem is elsewhere. I'd give it at least 90% odds it's the cable.

Greg
--
Greg Freemyer
www.IntelligentAvatar.net
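The reasoning above (errors scattered over ever-changing sectors point away from the media) can be sketched as a small Python script. The log lines are invented and written in the form kernels of that era typically print ("blk_update_request: I/O error, dev sdb, sector N"), which differs slightly from the hand-copied line quoted above. The regular expression, the function names and the threshold are assumptions for illustration, not a diagnostic tool.

```python
import re
from collections import defaultdict

# Assumed message shape; the exact wording varies between kernel versions.
IO_ERROR = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")

# Invented sample lines, not taken from the machine in this thread.
SAMPLE_LOG = """\
Oct 10 22:41:03 host kernel: [81234.101] blk_update_request: I/O error, dev sdb, sector 1953128
Oct 10 22:41:03 host kernel: [81234.102] blk_update_request: I/O error, dev sdb, sector 40211456
Oct 10 22:41:04 host kernel: [81234.250] blk_update_request: I/O error, dev sdb, sector 7730192
Oct 10 22:41:04 host kernel: [81234.251] blk_update_request: I/O error, dev sdb, sector 1953128
"""


def sectors_by_device(log_text):
    """Map each device name to the list of failing sectors, in log order."""
    found = defaultdict(list)
    for match in IO_ERROR.finditer(log_text):
        found[match.group(1)].append(int(match.group(2)))
    return dict(found)


def looks_scattered(sectors, min_distinct=3):
    """Heuristic only: many distinct sectors suggest the fault is not one bad spot."""
    return len(set(sectors)) >= min_distinct


for dev, sectors in sectors_by_device(SAMPLE_LOG).items():
    verdict = "scattered" if looks_scattered(sectors) else "concentrated"
    print(f"{dev}: {len(sectors)} errors, {len(set(sectors))} distinct sectors ({verdict})")
```

Real input could be the saved output of "journalctl -k" or "dmesg"; a scattered result is only a hint to check the cable and controller first, not proof that the drive is healthy.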
On 2015-10-10 23:04, Thomas Langkamp wrote:
uptime 6000 hours. I did a smartctl -t short which showed no errors
Try the long test, too.

--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" (Minas Tirith))
On Sat, Oct 10, 2015 at 3:24 AM, Thomas Langkamp <thomas.lassdiesonnerein@gmx.de> wrote:
Hi,
since about 2 weeks I have random crashes of different apps on one of my Tumbleweed machines. Strg+Alt+F10 shows i/o errors on /dev/sdb after such an app crash. sdb is an SSD where root and home lived happily together for the past 4 years.
Could this mean that my SSD dies?
Post a few sample errors and it may be possible to give an educated guess. I find SATA cables have to be replaced occasionally if they are flexed or connected/disconnected often. The rated life of a typical SATA cable is very few insertions (500, I think). They are cheap to buy. I would try that as my first troubleshooting step.
And where is the output of Strg+Alt+F10 stored? I can't find the i/o error messages in /var/log/messages, dmesg, xsession-errors or boot.log.
If they are "i/o errors" and not "media errors" then I suspect the cables even more.

Greg
On Saturday 10 October 2015 13.52:17 Greg Freemyer wrote:
If they are "i/o errors" and not "media errors" then I suspect the cables even more.
Greg
In exceptional cases it is not the cable but rather the internals of the SSD that have gone crazy. I've seen that with Samsung (a lot), a bit with Corsair, and not with Crucial (yet?); my Crucial M550 has only 4817 power-on hours...

The usual remedy when this kind of weird thing happens is to use what is called secure erase. This blanks the SSD completely and puts the RAM/chips back in order.

With the Corsair used under encrypted LVM ext4 (with 12.3/13.1) this appeared once every 2 years. I'm now using the standard kernel and a tricked LVM LUKS setup to enable TRIM on them, and I don't have to redo the whole secure erase on the Corsair... But it has not yet reached the lifetime at which it crashed before (that should be around Christmas).

What is painful is backing everything up and restoring. Fortunately, big SSDs and USB3 are no longer that expensive, and they are fast ;-)

Also, yes, be sure to have the latest firmware installed.

--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member & Board, fsfe fellowship
GPG KEY : D5C9B751C4653227
irc: tigerfoot
participants (6)
- Andrei Borzenkov
- Bernhard Voelker
- Bruno Friedmann
- Carlos E. R.
- Greg Freemyer
- Thomas Langkamp