Re: [opensuse] Help! RO File System Lock down OpenSUSE 10.2 # PART SOLUTION #

1 Jul 2007

On Wednesday 20 June 2007 00:56:00 Darryl Gregorash wrote:
...
You'll need to give us a lot more information about your system hardware
(including the modules that are loaded for hard drive i/o), plus
information from /var/log/messages about what is happening when the
filesystem goes RO.
OK. I will give as much as I can. The mail is therefore a bit long ...

I have partly solved the problem by keeping to one FS type per drive, as suggested by Carl Hartung. Thanks, Carl.

On Tuesday 19 June 2007 23:47:43 Carl Hartung wrote:
...
On Tue June 19 2007 17:11, LLLActive@GMX.Net wrote:
<snip>
...
... Can using different FS's in one system cause such problems?
Theoretically, no, but in actual fact there are circumstances where conflicts 
*can* arise.
In my case... with this specific chipset and corresponding kernel IDE 
controller module... cache buffering is enabled or disabled on a per drive 
basis. Running disparate filesystem types in adjacent partitions on the same 
drive (i.e. reiserfs + ext3) triggered errors comparable to those you're 
experiencing now.
I ultimately coaxed those errors away permanently by standardizing my 
installations to using only one journaling filesystem type per drive.
The system is much more stable.

################################

Last night, however, it happened again!

I connected my mobile phone to the USB port and left it there to charge the battery, thinking nothing of it.
Only when I accessed it did the files disappear after the directory listing. The phone was detached automatically from the USB hub. Again thinking nothing of it, I attached it directly to a USB port on the motherboard.
I then wanted to install from a DVD mounted as /dev/hdd and did a lot of disk access; the system went to a RO FS again.
I went to bed ....

On Wednesday 20 June 2007 00:56:00 Darryl Gregorash wrote:
...
I tend to doubt that the specific filesystem(s) in use have anything at
all to do with this, but the high disk access probably does. There is a
thread on Dell about problems with the MegaRAID sas driver (module name
megasas) --
http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029974.html
-- but you have not given enough information for anyone to know if this
is relevant to your problem. Grep /var/log/messages for "megasas".
sudo more /var/log/messages | grep "megasys"  reports nothing
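
Note that the string suggested above is "megasas", while the command shown searched for "megasys". A minimal sketch of a case-insensitive count of matching log lines follows; count_log_hits is a hypothetical helper name, and /var/log/messages is assumed to be the relevant log file:

```shell
# Count the lines in a log file that mention a given string (case-insensitive).
# $1 = search string, $2 = log file
count_log_hits() {
    # grep -c prints the count; "|| true" hides grep's non-zero exit status
    # when the count is 0, so the function is safe under "set -e".
    grep -i -c -- "$1" "$2" || true
}

# Usage on the system from this thread (reading the log needs root):
# sudo sh -c 'grep -i -c megasas /var/log/messages'
```

A count of 0 here would only show that the log never mentions the driver; it says nothing about other controller modules.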

Looking at the logs again this morning, I noticed these entries for the SCSI device /dev/sda (partition /dev/sda1):

sico@sico:~> sudo more /var/log/messages | grep "sda"
Jun 30 22:43:28 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:43:28 sico kernel: sda: Write Protect is off
Jun 30 22:43:28 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:43:28 sico kernel: sda: assuming drive cache: write through
Jun 30 22:43:28 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:43:28 sico kernel: sda: Write Protect is off
Jun 30 22:43:28 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:43:28 sico kernel: sda: assuming drive cache: write through
Jun 30 22:43:28 sico kernel:  sda: sda1
Jun 30 22:43:28 sico kernel: sd 0:0:0:0: Attached scsi removable disk sda
Jun 30 22:43:30 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:47:57 sico kernel: sda: Current: sense key: No Sense
...  (repeated many times) ...
Jun 30 22:47:58 sico kernel: end_request: I/O error, dev sda, sector 14464
Jun 30 22:47:59 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel:  sda: sda1
Jun 30 22:47:59 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:47:59 sico kernel: sda: Write Protect is off
Jun 30 22:47:59 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:47:59 sico kernel: sda: assuming drive cache: write through
Jun 30 22:47:59 sico kernel:  sda: sda1
Jun 30 22:48:01 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:26 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 9152
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 9152
...  (repeated many times) ...
Jun 30 22:48:27 sico kernel: end_request: I/O error, dev sda, sector 19328
Jun 30 22:48:27 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:29 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:29 sico kernel: sda: Write Protect is off
Jun 30 22:48:29 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:29 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:29 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:29 sico kernel: sda: Write Protect is off
Jun 30 22:48:29 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:29 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:29 sico kernel:  sda: sda1
Jun 30 22:48:31 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:37 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:37 sico kernel: end_request: I/O error, dev sda, sector 40320
...  (repeated many times) ...
Jun 30 22:48:38 sico kernel: end_request: I/O error, dev sda, sector 46144
Jun 30 22:48:38 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:40 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:40 sico kernel: sda: Write Protect is off
Jun 30 22:48:40 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:40 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:40 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:40 sico kernel: sda: Write Protect is off
Jun 30 22:48:40 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:40 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:40 sico kernel:  sda: sda1
Jun 30 22:48:41 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:45 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:45 sico kernel: end_request: I/O error, dev sda, sector 41280
Jun 30 22:48:45 sico kernel: end_request: I/O error, dev sda, sector 41280
...  (repeated many times) ...
Jun 30 22:48:46 sico kernel: end_request: I/O error, dev sda, sector 73344
Jun 30 22:48:47 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:47 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:47 sico kernel: sda: Write Protect is off
Jun 30 22:48:47 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:47 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:47 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:47 sico kernel: sda: Write Protect is off
Jun 30 22:48:47 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:47 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:47 sico kernel:  sda: sda1
Jun 30 22:48:48 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:50 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:50 sico kernel: end_request: I/O error, dev sda, sector 1985
Jun 30 22:48:50 sico kernel: end_request: I/O error, dev sda, sector 41088
...  (repeated many times) ...
Jun 30 22:48:51 sico kernel: end_request: I/O error, dev sda, sector 50624
Jun 30 22:48:51 sico hald: unmounted /dev/sda1 from '/media/disk' on behalf of uid 0
Jun 30 22:48:53 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:53 sico kernel: sda: Write Protect is off
Jun 30 22:48:53 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:53 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:53 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jun 30 22:48:53 sico kernel: sda: Write Protect is off
Jun 30 22:48:53 sico kernel: sda: Mode Sense: 00 6a 00 00
Jun 30 22:48:53 sico kernel: sda: assuming drive cache: write through
Jun 30 22:48:53 sico kernel:  sda: sda1
Jun 30 22:48:54 sico hald: mounted /dev/sda1 on behalf of uid 1000
Jun 30 22:48:59 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:59 sico kernel: end_request: I/O error, dev sda, sector 41088
...  (repeated many times) ...
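
The excerpt above shows a repeating cycle: hald mounts /dev/sda1, the kernel reports I/O errors, hald unmounts it, and the device is detected again. A rough way to quantify that from the log, as a sketch only (summarise_device_log is a hypothetical helper name, and the match strings are copied from the log lines shown above):

```shell
# Summarise a syslog-style file for one block device: number of kernel
# I/O errors and number of hald mount / unmount events.
# $1 = device name (e.g. sda), $2 = log file
summarise_device_log() {
    dev=$1
    log=$2
    awk -v dev="$dev" '
        index($0, "end_request: I/O error, dev " dev ",") { errors++ }
        index($0, "hald: mounted /dev/" dev)              { mounts++ }
        index($0, "hald: unmounted /dev/" dev)            { unmounts++ }
        END {
            # Unset counters print as 0 with %d.
            printf "errors=%d mounts=%d unmounts=%d\n", errors, mounts, unmounts
        }
    ' "$log"
}

# Usage (run as root so the log is readable):
# summarise_device_log sda /var/log/messages
```

Comparing the counts before and after a configuration change gives a simple check on whether the errors have really stopped.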
...
One writer in that thread (on Dell) writes "the problem is that the
Linux kernel's SCSI layer insists on a single timeout for all SCSI
requests, and doesn't tolerate high variances in command completion
times. If any single command times out, it resets the whole bus, even if
there is still significant activity." This suggests that the problem is
more widespread than just a RAID issue. This is that writer's message --
http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029982.html
-- and it contains a suggestion that may be of use to you.
I found Joe Malicki's mail on this topic (http://lists.us.dell.com/pipermail/linux-poweredge/2007-March/029982.html) and changed the SCSI timeout:

sico@sico:~> more /sys/block/sda/device/timeout
60
sico@sico:~> sudo echo 120 > /sys/block/sda/device/timeout
bash: /sys/block/sda/device/timeout: Permission denied
sico@sico:~> su -
Password:
sico:~ # echo 120 > /sys/block/sda/device/timeout
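
The "Permission denied" in the transcript above arises because the redirection in "sudo echo 120 > file" is carried out by the calling, unprivileged shell; only echo itself runs as root. A minimal sketch of an alternative that avoids a root login, assuming sudo and tee are available (set_timeout is a hypothetical helper name):

```shell
# Write a timeout value (in seconds) into a sysfs-style attribute file.
# $1 = attribute file, $2 = timeout in seconds
set_timeout() {
    file=$1
    seconds=$2
    if [ -w "$file" ]; then
        # Already writable (e.g. running as root): a plain redirect works.
        printf '%s\n' "$seconds" > "$file"
    else
        # Let the privileged tee process open the file, not our shell.
        printf '%s\n' "$seconds" | sudo tee "$file" > /dev/null
    fi
}

# Usage with the device from this thread:
# set_timeout /sys/block/sda/device/timeout 120
# cat /sys/block/sda/device/timeout
```

As far as I know, values written under /sys are runtime settings only, so the timeout would have to be set again after a reboot or after the device is re-attached.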

Current state:
...
Jul  1 14:47:35 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jul  1 14:47:35 sico kernel: sda: Write Protect is off
Jul  1 14:47:35 sico kernel: sda: Mode Sense: 00 6a 00 00
Jul  1 14:47:35 sico kernel: sda: assuming drive cache: write through
Jul  1 14:47:35 sico kernel: SCSI device sda: 3903488 512-byte hdwr sectors (1999 MB)
Jul  1 14:47:35 sico kernel: sda: Write Protect is off
Jul  1 14:47:35 sico kernel: sda: Mode Sense: 00 6a 00 00
Jul  1 14:47:35 sico kernel: sda: assuming drive cache: write through
Jul  1 14:47:35 sico kernel:  sda: sda1
Jul  1 14:47:51 sico hald: mounted /dev/sda1 on behalf of uid 1000

The lines:
Jun 30 22:48:59 sico kernel: sda: Current: sense key: No Sense
Jun 30 22:48:59 sico kernel: end_request: I/O error, dev sda, sector 41088
...  (repeated many times) ...

no longer seem to appear after extensive disk access, as they did before.

################################

I am not sure what to make of the read-only (RO) comments in the last lines of /var/log/messages. Could it be that they just report that the DVD is RO?

Jul  1 18:18:29 sico sudo:   sico : TTY=pts/1 ; PWD=/home/sico ; USER=root ; COMMAND=/bin/more /var/log/messages
Jul  1 18:20:29 sico kernel: ISO 9660 Extensions: Microsoft Joliet Level 3
Jul  1 18:20:29 sico kernel: ISO 9660 Extensions: RRIP_1991A
Jul  1 18:20:29 sico hald: mounted /dev/hdd on behalf of uid 1000
Jul  1 18:21:53 sico gconfd (sico-5635): GConf server is not in use, shutting down.
Jul  1 18:21:53 sico gconfd (sico-5635): Exiting
Jul  1 18:26:43 sico gconfd (sico-18750): starting (version 2.14.0), pid 18750 user 'sico'
Jul  1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Jul  1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readwrite:/home/sico/.gconf" to a writable configuration source at position 1
Jul  1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
Jul  1 18:26:43 sico gconfd (sico-18750): Resolved address "xml:readonly:/etc/opt/gnome/gconf/gconf.xml.schemas" to a read-only configuration source at position 3
Jul  1 18:27:13 sico gconfd (sico-18750): GConf server is not in use, shutting down.
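
The gconfd lines above mention a "read-only configuration source", which looks like GConf describing its XML configuration backends rather than a mounted file system. To see which file systems the kernel itself currently has mounted read-only, one option is to inspect the mount options in /proc/mounts. A minimal sketch (list_ro_mounts is a hypothetical helper name; a DVD mounted as iso9660 would be expected to show up as ro in any case):

```shell
# Print "device mountpoint" for every file system whose mount options
# include "ro".
# $1 = mounts table to read; defaults to /proc/mounts
list_ro_mounts() {
    awk '{
        # Field 4 holds the comma-separated mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $1, $2; break }
    }' "${1:-/proc/mounts}"
}

# Usage:
# list_ro_mounts
```

If a hard-disk partition that should be writable appears in this list, that would confirm the RO state independently of the log messages.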

################################

Is it normal for USB to use the SCSI layer? Can the SCSI layer be avoided? Can it be changed to IDE like /dev/hde?

:-)
Al

LLLActive＠GMX.Net