[opensuse] btrfs problem - how to run btrfsck on boot
Hello,

I've got a problem with a machine here.

First question: how do I run btrfsck on boot? Adding fsck.mode=force as a kernel parameter seems to check only the boot partition. The rebooted system fails again after a few minutes (see the log at the end of this post).

Second question: what happened? The system ran for weeks without problems; five days ago it was rebooted because of updates. Then a reset on the SATA bus occurred, but the btrfs is running on a level 1 mdraid. btrfs failed and the volumes were remounted read-only. This is what I found in the logs:

[15233.444359] BTRFS info (device md2): relocating block group 62373494784 flags 34
[15234.994089] BTRFS info (device md2): relocating block group 63480791040 flags 34
[15236.060747] BTRFS info (device md2): relocating block group 63514345472 flags 34
[15237.072850] BTRFS info (device md2): relocating block group 63547899904 flags 34
[15238.148345] BTRFS info (device md2): relocating block group 63581454336 flags 34
[393168.772025] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[393168.772030] ata3.00: irq_stat 0x00400000, PHY RDY changed
[393168.772032] ata3: SError: { PHYRdyChg LinkSeq TrStaTrns }
[393168.772036] ata3.00: failed command: FLUSH CACHE EXT
[393168.772041] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 19 res 40/00:94:c0:26:9c/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
[393168.772043] ata3.00: status: { DRDY }
[393168.772046] ata3: hard resetting link
[393172.792055] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[393172.793691] ata3.00: configured for UDMA/100
[393172.793695] ata3.00: retrying FLUSH 0xea Emask 0x10
[393172.793889] ata3: EH complete
[394038.449156] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
[394038.463093] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
[394038.463105] ------------[ cut here ]------------
[394038.463159] WARNING: CPU: 1 PID: 491 at ../fs/btrfs/extent-tree.c:2927 btrfs_run_delayed_refs+0x27a/0x2f0 [btrfs]()
[394038.463159] BTRFS: Transaction aborted (error -5)
[394038.463200] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache af_packet iscsi_ibft iscsi_boot_sysfs snd_hda_codec_realtek snd_hda_codec_generic ext4 crc16 jbd2 mbcache snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore pcspkr acpi_cpufreq fjes i2c_nforce2 coretemp forcedeth shpchp processor btrfs xor raid6_pq raid1 md_mod uas usb_storage sr_mod cdrom sd_mod ata_generic nouveau firewire_ohci ahci ohci_pci libahci pata_amd mxm_wmi video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm serio_raw drm firewire_core crc_itu_t ehci_pci ohci_hcd ehci_hcd usbcore libata usb_common wmi button sg scsi_mod autofs4
[394038.463203] CPU: 1 PID: 491 Comm: btrfs-transacti Not tainted 4.4.36-8-default #1
[394038.463203] Hardware name: PACKARD BELL BV iMedia X1082/MCP73, BIOS PBDIEGMB.P15 07/10/2008
[394038.463206] 0000000000000000 ffffffff81327b17 ffff880035fd3d78 ffffffffa0595699
[394038.463208] ffffffff8107e841 ffff880095f649a0 ffff880035fd3dc8 ffff88002a160f80
[394038.463209] 0000000000000000 ffff88002a160e10 ffffffff8107e8bc ffffffffa0598470
[394038.463210] Call Trace:
[394038.463223] [<ffffffff81019ea9>] dump_trace+0x59/0x320
[394038.463226] [<ffffffff8101a26a>] show_stack_log_lvl+0xfa/0x180
[394038.463229] [<ffffffff8101b011>] show_stack+0x21/0x40
[394038.463233] [<ffffffff81327b17>] dump_stack+0x5c/0x85
[394038.463237] [<ffffffff8107e841>] warn_slowpath_common+0x81/0xb0
[394038.463240] [<ffffffff8107e8bc>] warn_slowpath_fmt+0x4c/0x50
[394038.463256] [<ffffffffa04fffea>] btrfs_run_delayed_refs+0x27a/0x2f0 [btrfs]
[394038.463282] [<ffffffffa0515b20>] btrfs_commit_transaction+0x40/0xaf0 [btrfs]
[394038.463301] [<ffffffffa051062b>] transaction_kthread+0x21b/0x280 [btrfs]
[394038.463305] [<ffffffff8109d308>] kthread+0xc8/0xe0
[394038.463311] [<ffffffff8160ac8f>] ret_from_fork+0x3f/0x70
[394038.464013] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[394038.464013] Leftover inexact backtrace:
[394038.464013] [<ffffffff8109d240>] ? kthread_park+0x50/0x50
[394038.465233] ---[ end trace 558b3d028338b98f ]---
[394038.465237] BTRFS: error (device md2) in btrfs_run_delayed_refs:2927: errno=-5 IO failure
[394038.465239] BTRFS info (device md2): forced readonly
[394038.465490] pending csums is 4096

What confuses me twice is this:

parent transid verify failed [...] 132076 found 86207

The difference between these transaction IDs is huge?!

This is what happens after a reboot:

[ 329.623911] BTRFS error (device md2): parent transid verify failed on 51072483328 wanted 132076 found 117819
[ 329.629476] BTRFS error (device md2): parent transid verify failed on 51072483328 wanted 132076 found 117819
[ 329.629493] ------------[ cut here ]------------
[ 329.629560] WARNING: CPU: 0 PID: 303 at ../fs/btrfs/extent-tree.c:2927 btrfs_run_delayed_refs+0x27a/0x2f0 [btrfs]()
[ 329.629561] BTRFS: Transaction aborted (error -5)
[ 329.629602] Modules linked in: joydev rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache snd_hda_codec_realtek snd_hda_codec_generic af_packet iscsi_ibft iscsi_boot_sysfs ext4 crc16 jbd2 mbcache snd_hda_codec_hdmi coretemp snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep pcspkr snd_pcm snd_timer acpi_cpufreq snd forcedeth fjes shpchp soundcore processor i2c_nforce2 btrfs xor raid6_pq raid1 md_mod uas usb_storage sd_mod sr_mod cdrom ata_generic ahci libahci pata_amd firewire_ohci nouveau ohci_pci mxm_wmi video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm firewire_core serio_raw crc_itu_t libata ehci_pci ohci_hcd ehci_hcd usbcore usb_common wmi button sg scsi_mod autofs4
[ 329.629605] CPU: 0 PID: 303 Comm: kworker/u8:3 Not tainted 4.4.36-8-default #1
[ 329.629606] Hardware name: PACKARD BELL BV iMedia X1082/MCP73, BIOS PBDIEGMB.P15 07/10/2008
[ 329.629625] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[ 329.629628] 0000000000000000 ffffffff81327b17 ffff8800351f3d28 ffffffffa05a9699
[ 329.629630] ffffffff8107e841 ffff8800bf5b0870 ffff8800351f3d78 ffff8800ba5611c0
[ 329.629631] 0000000000000020 ffff8800ba561050 ffffffff8107e8bc ffffffffa05ac470
[ 329.629632] Call Trace:
[ 329.629651] [<ffffffff81019ea9>] dump_trace+0x59/0x320
[ 329.629655] [<ffffffff8101a26a>] show_stack_log_lvl+0xfa/0x180
[ 329.629658] [<ffffffff8101b011>] show_stack+0x21/0x40
[ 329.629662] [<ffffffff81327b17>] dump_stack+0x5c/0x85
[ 329.629667] [<ffffffff8107e841>] warn_slowpath_common+0x81/0xb0
[ 329.629670] [<ffffffff8107e8bc>] warn_slowpath_fmt+0x4c/0x50
[ 329.629686] [<ffffffffa0513fea>] btrfs_run_delayed_refs+0x27a/0x2f0 [btrfs]
[ 329.629709] [<ffffffffa0514092>] delayed_ref_async_start+0x32/0x80 [btrfs]
[ 329.629728] [<ffffffffa055a633>] normal_work_helper+0xc3/0x320 [btrfs]
[ 329.629734] [<ffffffff810971e5>] process_one_work+0x155/0x440
[ 329.629737] [<ffffffff81097d26>] worker_thread+0x116/0x4b0
[ 329.629740] [<ffffffff8109d308>] kthread+0xc8/0xe0
[ 329.629746] [<ffffffff8160ac8f>] ret_from_fork+0x3f/0x70
[ 329.631628] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[ 329.631629] Leftover inexact backtrace:
[ 329.631636] [<ffffffff8109d240>] ? kthread_park+0x50/0x50
[ 329.631638] ---[ end trace 92c71c247ee8d7b4 ]---
[ 329.631644] BTRFS: error (device md2) in btrfs_run_delayed_refs:2927: errno=-5 IO failure
[ 329.631646] BTRFS info (device md2): forced readonly

I am just looking for a blank DVD to start a live/rescue system. With other filesystems there was no such trouble; fsck was forced after unclean unmounts.

Thanks a lot and happy Christmas
Paul

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
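For readers following the transid question: in these kernel messages, "wanted" is the generation the parent tree node expects for the block and "found" is the generation stored in the block that was actually read, so a lower "found" suggests a stale block on disk. The following is only a minimal sketch for pulling those numbers out of a saved dmesg log and computing the gap; the function name `transid_mismatches` and the `SAMPLE` text are illustrative assumptions, not part of any btrfs tool.

```python
import re

# Matches the "parent transid verify failed" message format seen in the logs above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def transid_mismatches(log_text):
    """Return unique (bytenr, wanted, found, gap) tuples in order of first appearance."""
    seen = []
    for m in TRANSID_RE.finditer(log_text):
        bytenr = int(m.group("bytenr"))
        wanted = int(m.group("wanted"))
        found = int(m.group("found"))
        entry = (bytenr, wanted, found, wanted - found)
        if entry not in seen:  # the kernel often logs the same failure twice
            seen.append(entry)
    return seen


# Hypothetical input: three lines copied from the two logs in this post.
SAMPLE = """\
[394038.449156] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
[394038.463093] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
[ 329.623911] BTRFS error (device md2): parent transid verify failed on 51072483328 wanted 132076 found 117819
"""

for bytenr, wanted, found, gap in transid_mismatches(SAMPLE):
    print(f"block {bytenr}: wanted {wanted}, found {found}, {gap} generations behind")
```

On the sample lines this reports two distinct blocks, both expected at generation 132076, which fits the picture of metadata writes that never reached the disk rather than a single bad sector.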
On 12/23/2016 01:26 PM, Paul Neuwirth wrote:
Hello, got a problem with a machine here. First question: how to run btrfsck on boot? [...]
What version of openSUSE are you using, and which kernel?

Back in the early 13.2 days I ran btrfs for about 6 months. After the second corruption with data loss I went back to ext4 and xfs, and swore off btrfs for the next 5 years.

I had to boot to a rescue CD and run a btrfs check, and when that showed the actual errors, follow up with a btrfs check --repair.

As soon as I got the system back, I backed up all the data to another drive and restored it to the above-mentioned file systems.

Now I've been told that btrfs is MUCH MUCH better now and I should go back to using it, but I don't see that happening until there is a pressing need for some feature btrfs provides, and no such thing has caught my eye.

--
After all is said and done, more is said than done.
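The two-step sequence described above (a check that only reports, then a repair pass if needed) can be sketched as follows. This is a hedged illustration, not a recommended recovery procedure: it only builds and prints the command lines, it assumes a btrfs-progs version that accepts `--readonly`, `--repair` and `--progress`, and it assumes the filesystem is unmounted and reachable as /dev/md2 from a rescue system. The helper name `btrfs_check_plan` is made up for this example.

```python
import shlex


def btrfs_check_plan(device, repair=False, progress=False):
    """Build the command lines for an offline btrfs check of an unmounted device."""
    base = ["btrfs", "check"]
    if progress:
        base.append("--progress")
    # The first pass never writes to the device.
    plan = [base + ["--readonly", device]]
    if repair:
        # The second pass modifies the filesystem; it is added only on explicit request.
        plan.append(base + ["--repair", device])
    return plan


# Print the commands instead of running them, so nothing touches the disk.
for cmd in btrfs_check_plan("/dev/md2", repair=True, progress=True):
    print(shlex.join(cmd))
```

Keeping the repair step behind an explicit flag mirrors the order used in the message: look at what the read-only check reports, ideally with a backup or image in hand, before letting anything write to the device.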
On Friday 2016-12-23 22:37, John Andersen wrote:
Date: Fri, 23 Dec 2016 22:37:37 From: John Andersen <jsamyth@gmail.com> To: opensuse@opensuse.org Subject: Re: [opensuse] btrfs problem - how to run btrfsck on boot
What version of Opensuse are you using, and what kernel.
Sorry, I forgot to mention this important info: it's openSUSE Leap 42.2 with the recent kernel 4.4.36-8-default.
On 12/23/2016 03:37 PM, John Andersen wrote:
Now I've been told that btrfs is MUCH MUCH better now and I should go back to using it but I don't see that happening until there is a pressing need for some feature btrfs provides, and no such thing has caught my eye.
Leap 42.2 install: partitioning -> expert options, wipe the suggested layout, create an extended partition spanning the full disk, then /boot, /, /home (all ext4) and swap, and no complaints whatsoever. I've watched the list, albeit irregularly, since 13.1 regarding btrfs. There are still far too many "btrfs problem" posts to trust it with a production machine. I'll try it on a spare drive, but honestly, I have no complaint with ext4 performance, journal size, etc., and I've not lost a bit of data to a filesystem issue since SuSE 7.0 Pro (Air) with reiserfs, ext2, ext3, and now ext4.

I'm still of the mind that the installer should default to ext4/xfs and let people choose btrfs as an "option", rather than the current approach. openSUSE has had a dubious history of foisting less-than-proven software as the release 'default' (some of which has later been shown to probably not have been the wisest choice at the time). Anyone recall the wonderful 'Release' of KDE 4.0.4 in May 2008?

I'm not saying btrfs is KDE 4.0.4 (nobody's install would run for more than 5 minutes without segfaulting, and that is certainly not the case here). But when a simple search of the list turns up hundreds of posts regarding btrfs problems (many including the words 'data loss'), I prefer to stick with the bullet-proof, tried-and-true option, even if it may be a tad less efficient space-wise, or a tad slower.

--
David C. Rankin, J.D.,P.E.
On 12/31/2016 03:41 PM, David C. Rankin wrote:
Anyone recall the wonderful 'Release' of KDE 4.0.4 in May 2008?
I was determined not to go there, because it's still a sore point with many.

--
After all is said and done, more is said than done.
On 2016-12-23 22:26, Paul Neuwirth wrote:
Hello, got a problem with a machine here. First question: how to run btrfsck on boot? adding fsck.mode=force as kernel parameter seems only to check the boot partition. The rebooted system fails again after some minutes (see log at end of post)
As far as I know, it is not possible. Use a rescue system.
[393168.772025] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen [393168.772030] ata3.00: irq_stat 0x00400000, PHY RDY changed [393168.772032] ata3: SError: { PHYRdyChg LinkSeq TrStaTrns } [393168.772036] ata3.00: failed command: FLUSH CACHE EXT [393168.772041] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 19 res 40/00:94:c0:26:9c/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
I would check the hardware. Run the long SMART test on both hard disks.
Thanks a lot and happy christmas
Same. Just remember to trim the logs out of your responses, there is no need to send them again.

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
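As background to the hardware suggestion: the SErr value in the ata3 exception quoted above is a bitmask, and decoding it reproduces the names the kernel printed in its "SError: { ... }" line. The sketch below assumes the SATA SError register bit layout and the short names libata uses; only the three names that appear in the log are confirmed by this thread, and the rest of the table is listed from memory and should be treated as an assumption. The helper name `decode_serror` is made up for this example.

```python
# Bit position -> short name. Only PHYRdyChg, LinkSeq and TrStaTrns are
# confirmed by the log in this thread; the other entries are assumptions.
SERROR_BITS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


print(decode_serror(0x1810000))
```

All three set bits sit in the upper, link-level half of the register, which is consistent with the "PHY RDY changed" line and points more toward cabling, power or the port than toward the platters, though a long SMART self-test is still a reasonable first step.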
On Friday 2016-12-23 22:56, Carlos E. R. wrote:
Date: Fri, 23 Dec 2016 22:56:24 From: Carlos E. R. <robin.listas@telefonica.net> To: opensuse@opensuse.org Subject: Re: [opensuse] btrfs problem - how to run btrfsck on boot
On 2016-12-23 22:26, Paul Neuwirth wrote:
Hello, got a problem with a machine here. First question: how to run btrfsck on boot? adding fsck.mode=force as kernel parameter seems only to check the boot partition. The rebooted system fails again after some minutes (see log at end of post)
As far as I know, it is not possible. Use a rescue system.
Maybe another reason to switch back to ext4. btrfsck failed, IIRC with exit code 234. I ran btrfsck --repair /dev/md2. Now I have started btrfsck --repair -p /dev/md2 and am waiting for the result (tons of lines).
[393168.772025] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen [393168.772030] ata3.00: irq_stat 0x00400000, PHY RDY changed [393168.772032] ata3: SError: { PHYRdyChg LinkSeq TrStaTrns } [393168.772036] ata3.00: failed command: FLUSH CACHE EXT [393168.772041] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 19 res 40/00:94:c0:26:9c/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
I would check the hardware. Run the long SMART test on both hard disks.
I also forgot to mention: mdadm is not reporting errors, the long self-tests (run the night before) finished without error, and the SMART data also looks OK.
Thanks a lot and happy christmas
Same.
Just remember to trim the logs out of your responses, there is no need to send them again.
thanks for the hint
-- Cheers / Saludos,
Carlos E. R.
On 2016-12-23 23:04, Paul Neuwirth wrote:
On Friday 2016-12-23 22:56, Carlos E. R. wrote:
On 2016-12-23 22:26, Paul Neuwirth wrote:
Hello, got a problem with a machine here. First question: how to run btrfsck on boot? adding fsck.mode=force as kernel parameter seems only to check the boot partition. The rebooted system fails again after some minutes (see log at end of post)
As far as I know, it is not possible. Use a rescue system.
maybe another reason to switch back to ext4.
The same thing happens with XFS: it cannot be repaired from within the running system. The problem for us is that openSUSE no longer provides a live rescue system.
40/00:94:c0:26:9c/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
I would check the hardware. Run the long SMART test on both hard disks.
I also forgot to mention, mdadm is not reporting errors, (last night before) long selftests ran without error, SMART data looks also ok.
Ah, good. Then I don't know more, sorry. I personally do not use btrfs.

--
Cheers / Saludos,
Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
maybe another reason to switch back to ext4. btrfsck failed IIRC with exit code 234. ran btrfsck --repair /dev/md2 . now I started btrfsck --repair -p /dev/md2 and waiting for result (tons of lines).
The chances of btrfs failing in RAID 1 are very slim, like near 0, unless a controller issue or problems with the drive(s) themselves. mdadm is software RAID and it's not surprising people have issues with it as software RAID is not production-level RAID. Software RAID (poor man's RAID) just isn't that good. If you want real RAID, then you need a true controller card in a server that has ECC ram, with something like a PERC RAID controller card. That has a true battery backup should the server or workstation lose power, so whatever is in the buffer that didn't get written to disk gets written the next time it's powered up.
It's easy to point the finger at the filesystem when in actuality it's shoddy hardware or just because the parity bits get all messed up because of software RAID. The more recent problems with btrfs and RAID had to do with RAID 5/6, and supposedly the issues have been mostly fixed. btrfs does have numerous advantages over EXT4, one being checksumming and another being snapshots. This is just my opinion, but with btrfs and any type of RAID, I will only run it with a true RAID controller card. mdadm and btrfs RAID (btrfs has built-in software RAID) I believe users end up having more problems with in the long run.
On 2016-12-23 23:49, sdm wrote:
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
maybe another reason to switch back to ext4. btrfsck failed IIRC with exit code 234. ran btrfsck --repair /dev/md2 . now I started btrfsck --repair -p /dev/md2 and waiting for result (tons of lines).
The chances of btrfs failing in RAID 1 are very slim, like near 0, unless a controller issue or problems with the drive(s) themselves.
Not true.
mdadm is software RAID and it's not surprising people have issues with it as software RAID is not production-level RAID.
That's not true, either.
It's easy to point the finger at the filesystem when in actuality it's shoddy hardware or just because the parity bits get all messed up because of software RAID. The more recent problems with btrfs and RAID had to do with RAID 5/6, and supposedly the issues have been mostly fixed.
Those are problems with btrfs' own version of raid, which has nothing to do with traditional Linux software raid, which is what the OP is using.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 23/12/16 22:49, sdm wrote:
It's easy to point the finger at the filesystem when in actuality it's shoddy hardware or just because the parity bits get all messed up because of software RAID. The more recent problems with btrfs and RAID had to do with RAID 5/6, and supposedly the issues have been mostly fixed. btrfs does have numerous advantages over EXT4, one being checksumming and another being snapshots. This is just my opinion, but with btrfs and any type of RAID, I will only run it with a true RAID controller card. mdadm and btrfs RAID (btrfs has built-in software RAID) I believe users end up having more problems with in the long run.
Until your raid controller fails, and you can't source an identical replacement. Hardware raid may (or may not) be better than reliable software raid, but hardware raids don't seem to be standardised and losing the card or mobo often seems to translate to losing the entire array.
Cheers, Wol
On Friday 2016-12-23 23:49, sdm wrote:
The chances of btrfs failing in RAID 1 are very slim, like near 0, unless a controller issue or problems with the drive(s) themselves. mdadm is software RAID and it's not surprising people have issues with it as software RAID is not production-level RAID. Software RAID (poor man's RAID) just isn't that good. If you want real RAID, then you need a true controller card in a server that has ECC ram, with something like a PERC RAID controller card. That has a true battery backup should the server or workstation lose power, so whatever is in the buffer that didn't get written to disk gets written the next time it's powered up.
Another reason why I also use mdadm RAID on some production machines is more redundancy: two HDD controllers, with half of the disks on controller 1 and the other half on the second controller. ECC RAM is present. Maybe related to firmware or module bugs, I had one controller failing sometimes (the problem did not recur after some kernel updates), but the system did not go down unscheduled. The spare drive was instantly included into the array. I am wondering if hot spares are supported by btrfs' raid; a quick web search gave no clear result.
On 01/01/17 20:31, Paul Neuwirth wrote:
The spare drive was instantly included into the array. I am wondering if hot spares are supported by btrfs' raid; a quick web search gave no clear result.
As I understand btrfs, "hot spare" is a meaningless term. Given that btrfs understands the concept of a filesystem spanning multiple drives, if a drive fails underneath it any files which have been declared as "mirrored" and have a copy on the failed drive, will have suddenly become "single copy" so the file system needs to replicate them onto another drive.
And I'm guessing losing a drive would immediately trigger some sort of filesystem scan and rebalance - be weird if it didn't.
Cheers, Wol
On 01.01.2017 23:38, Wols Lists wrote:
On 01/01/17 20:31, Paul Neuwirth wrote:
The spare drive was instantly included into the array. I am wondering if hot spares are supported by btrfs' raid; a quick web search gave no clear result.
As I understand btrfs, "hot spare" is a meaningless term. Given that btrfs understands the concept of a filesystem spanning multiple drives, if a drive fails underneath it any files which have been declared as "mirrored" and have a copy on the failed drive, will have suddenly become "single copy" so the file system needs to replicate them onto another drive.
Which is exact definition of "hot spare". Drive that is used to restore redundancy *automatically*, without involving manual actions. I do not see what magic in btrfs suddenly makes it "meaningless".
And I'm guessing losing a drive would immediately trigger some sort of filesystem scan and rebalance - be weird if it didn't.
You cannot even mount degraded btrfs without manual steps. As long as it is not mounted, there can be no re-balance.
You need to mount degraded RAID1 btrfs with option "-o degraded" - and you can do it exactly once. If you failed to restore redundancy while doing it (i.e. - add drive to make it RAID1 again), you cannot mount this filesystem for writing anymore. You can still mount it with "-o degraded,ro", but then you can neither replace failed drive (btrfs replace start silently succeeds but does nothing) nor remove/add new device (because filesystem is mounted read-only).
And no, there is no automatic scan and re-balance in btrfs. You could simply test instead of guessing.
On 01/01/17 22:38, Andrei Borzenkov wrote:
As I understand btrfs, "hot spare" is a meaningless term. Given that btrfs understands the concept of a filesystem spanning multiple drives, if a drive fails underneath it any files which have been declared as "mirrored" and have a copy on the failed drive, will have suddenly become "single copy" so the file system needs to replicate them onto another drive.
Which is exact definition of "hot spare". Drive that is used to restore redundancy *automatically*, without involving manual actions. I do not see what magic in btrfs suddenly makes it "meaningless".
Which actually it is most definitely *N*O*T* the definition of a hot spare. Where in my definition does it even imply the existence of a SPARE disk, let alone a hot spare? Which is why I said the concept of a hot spare in the context of btrfs is meaningless. There is spare space on a hot drive. There is no spare drive which is hot.
Let's explain, nice and simple. We'll assume we have a btrfs file system spanning three drives, A, B, and C. We have a root directory a, two subdirectories b and c, and four files b1, b2, c1, c2. The structure is such that a is recursively mirrored.
That means that a copy of a exists on A and B. b exists on C and A, c exists on B and C, etc etc.
Now which drive is the hot spare? Yet the entire directory structure is mirrored across the three drives, such that any single failure can be recovered from.
(Oh, and by the way, this exact sort of structure is one of the raid structures supported by mdraid, iirc.)
Cheers, Wol
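The three-drive example can be written down as a small toy model. The Python sketch below is only an illustration of the placement as described in the post (a on A and B, b on C and A, c on B and C); the names PLACEMENT, survivors and so on are made up for this example, and it does not model how btrfs really allocates mirrored data.

```python
# Toy model of the placement described above: every object lives on two
# of the three drives. All names here are illustrative assumptions.
PLACEMENT = {
    "a": {"A", "B"},
    "b": {"C", "A"},
    "c": {"B", "C"},
}
DRIVES = {"A", "B", "C"}


def survivors(placement, failed):
    """Per object, the drives that still hold a copy after `failed` dies."""
    return {name: copies - {failed} for name, copies in placement.items()}


def tolerates_any_single_failure(placement, drives):
    """True if every object keeps at least one copy whichever drive fails."""
    return all(
        all(remaining for remaining in survivors(placement, drive).values())
        for drive in drives
    )


def degraded_objects(placement, failed):
    """Objects that dropped to a single copy and would need re-mirroring."""
    return sorted(name for name, copies in placement.items() if failed in copies)


if __name__ == "__main__":
    print(tolerates_any_single_failure(PLACEMENT, DRIVES))
    print(degraded_objects(PLACEMENT, "A"))
```

In this model no drive is set aside as a spare, yet any single failure leaves a copy of everything; what remains open is who restores the second copy, which is what the rest of the thread argues about.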
On Monday 2017-01-02 00:55, Wols Lists wrote:
As I understand btrfs, "hot spare" is a meaningless term. Given that btrfs understands the concept of a filesystem spanning multiple drives, if a drive fails underneath it any files which have been declared as "mirrored" and have a copy on the failed drive, will have suddenly become "single copy" so the file system needs to replicate them onto another drive.
Which is exact definition of "hot spare". Drive that is used to restore redundancy *automatically*, without involving manual actions. I do not see what magic in btrfs suddenly makes it "meaningless".
Which actually it is most definitely *N*O*T* the definition of a hot spare. Where in my definition does it even imply the existence of a SPARE disk, let alone a hot spare? Which is why I said the concept of a hot spare in the context of btrfs is meaningless. There is spare space on a hot drive. There is no spare drive which is hot.
Let's explain, nice and simple. We'll assume we have a btrfs file system spanning three drives, A, B, and C. We have a root directory a, two subdirectories b and c, and four files b1, b2, c1, c2. The structure is such that a is recursively mirrored.
That means that a copy of a exists on A and B. b exists on C and A, c exists on B and C, etc etc.
Now which drive is the hot spare? Yet the entire directory structure is mirrored across the three drives, such that any single failure can be recovered from.
(Oh, and by the way, this exact sort of structure is one of the raid structures supported by mdraid, iirc.)
I think that depends on the raid level. Hot spares in this meaning are useful on e.g. raid 10 systems where each disk has (at least) one exact mirror. If there are four devices A, B, C, D and a spare E, with A=C and B=D, and one drive fails, there is no redundancy for its data anymore, so the hot spare gets involved and redundancy is recreated. No use on RAID 0, 1, 5 or so.
On 02/01/17 00:10, Paul Neuwirth wrote:
(Oh, and by the way, this exact sort of structure is one of the raid structures supported by mdraid, iirc.)
I think that depends on the raid level. Hot spares in this meaning are useful on e.g. raid 10 systems where each disk has (at least) one exact mirror. If there are four devices A, B, C, D and a spare E, with A=C and B=D, and one drive fails, there is no redundancy for its data anymore, so the hot spare gets involved and redundancy is recreated. No use on RAID 0, 1, 5 or so.
I've been thinking about this since I made that last post. This is actually linux raid ten iirc. And please note that while "raid one plus zero" is often referred to as raid ten, they are actually very different beasts. Which is why, in true raid-10 arrays, you can have an ODD number of drives, NONE of which is an exact mirror of any of the others :-) While in raid-1+0 you need an even number of drives, minimum 4.
Cheers, Wol
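To make the odd-number-of-drives point concrete, here is a simplified Python model of a "near"-style layout with two copies per chunk, where the copies are simply laid out one after another across the devices. It is a sketch under that assumption, not a description of the md driver's code; the function names are invented for the example.

```python
def near_layout(n_devices, n_chunks, copies=2):
    """Map each chunk number to the (device, row) slots holding its copies.

    Copies are placed consecutively across the devices, wrapping to the
    next row when the end of a row is reached (simplified model).
    """
    layout = {}
    for chunk in range(n_chunks):
        slots = []
        for copy in range(copies):
            position = chunk * copies + copy
            slots.append((position % n_devices, position // n_devices))
        layout[chunk] = slots
    return layout


def copies_on_distinct_devices(layout):
    """True if no chunk has two of its copies on the same device."""
    return all(
        len({device for device, _ in slots}) == len(slots)
        for slots in layout.values()
    )


def has_exact_mirror_pair(layout, n_devices):
    """True if two devices end up holding exactly the same set of chunks."""
    contents = [set() for _ in range(n_devices)]
    for chunk, slots in layout.items():
        for device, _ in slots:
            contents[device].add(chunk)
    return any(
        contents[i] == contents[j]
        for i in range(n_devices)
        for j in range(i + 1, n_devices)
    )


if __name__ == "__main__":
    for n in (3, 4):
        layout = near_layout(n, 6)
        print(n, copies_on_distinct_devices(layout), has_exact_mirror_pair(layout, n))
```

With three devices every chunk still has two copies on different drives, but no drive is a full mirror of another; with four devices the same rule degenerates into two mirrored pairs.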
On 01/01/2017 05:38 PM, Andrei Borzenkov wrote:
Which is exact definition of "hot spare". Drive that is used to restore redundancy *automatically*, without involving manual actions. I do not see what magic in btrfs suddenly makes it "meaningless".
+ BIGNUM ! ! Indeed. "Hot Spare" that way pre-dates BtrFS! It pre-dates Linux, it pre-dates the PC, it pre-dates UNIX, it pre-dates the mainframe, it pre-dates digital and analog computers. It probably pre-dates steam engines. I'm pretty sure that if I dig I can find examples of ancient Greek or Roman imaginary (the Romans were good, inventive engineers) that automatically tripped over to a standby if the primary failed. "Mirror standby" is remarkably easy to engineer in many situations.
There is a philosophical argument that what causes the failure of the primary may be a design or implementation flaw that means the mirror will fail exactly the same way at the same time, and we've touched on that here a few times (e.g. having drives from the same production batch). Savvy engineers usually foresee this possibility and navigate around it.
Andrei is right! Whatever advantages BtrFS offers, it does not make simple good engineering practices irrelevant!
--
A: Yes.
> Q: Are you sure?
>> A: Because it reverses the logical flow of conversation.
>>> Q: Why is top posting frowned upon?
On 12/23/2016 01:56 PM, Carlos E. R. wrote:
As far as I know, it is not possible. Use a rescue system.
Yes it is Carlos, as long as that rescue CD has btrfs of at least the same version available, OR the / directory of the problem system is mountable. In fact this is the recommended way, and the way I had to do it. -- After all is said and done, more is said than done.
On 2016-12-23 23:11, John Andersen wrote:
On 12/23/2016 01:56 PM, Carlos E. R. wrote:
As far as I know, it is not possible. Use a rescue system.
Yes it is Carlos, as long as that rescue CD has btrfs of at least the same version available, OR the / directory of the problem system is mountable.
In fact this is the recommended way, and the way I had to do it.
Read again what I wrote. I said that you can not check automatically during boot a btrfs filesystem; instead you have to use a rescue system exactly as you describe.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On Friday 2016-12-23 23:19, Carlos E. R. wrote:
On 2016-12-23 23:11, John Andersen wrote:
On 12/23/2016 01:56 PM, Carlos E. R. wrote:
As far as I know, it is not possible. Use a rescue system.
Yes it is Carlos, as long as that rescue CD has btrfs of at least the same version available, OR the / directory of the problem system is mountable.
In fact this is the recommended way, and the way I had to do it.
Read again what I wrote. I said that you can not check automatically during boot a btrfs filesystem; instead you have to use a rescue system exactly as you describe.
why is it possible for ext[234] and not for XFS and btrfs, why not just put the suitable fsck into the initrd?
On 2016-12-23 23:22, Paul Neuwirth wrote:
On Friday 2016-12-23 23:19, Carlos E. R. wrote:
why is it possible for ext[234] and not for XFS and btrfs, why not just put the suitable fsck into the initrd?
In the case of xfs, there is no... wait, look for yourself:
minas-tirith:~ # file /usr/sbin/fsck.xfs
/usr/sbin/fsck.xfs: POSIX shell script, ASCII text executable
and that script simply bails out and tells you to use something else, manually. If you try to do an automatic check, which is what the init would do, it bails out with success status.
/usr/sbin/fsck.btrfs is another script, which does even less:
minas-tirith:~ # cat /usr/sbin/fsck.btrfs
#!/bin/sh
exit 0
minas-tirith:~ #
So the conclusion is that it is impossible. Why the filesystems devs do not make it possible, I do not know. I expect the "mount" to do a quick check and bail out in the case of problems, with a message.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
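The observation about the helper scripts can be checked mechanically. The Python sketch below (the function name is made up for illustration) treats a helper as a no-op if, apart from comments and the shebang line, it contains nothing but `exit 0`, which is what the fsck.btrfs shown above looks like. It only inspects text and runs nothing; fsck helpers on other versions or distributions may contain more than this.

```python
def is_noop_fsck_helper(script_text):
    """True if a shell script consists only of comments and `exit 0`."""
    commands = [
        line.strip()
        for line in script_text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    return commands == ["exit 0"]


# Contents of /usr/sbin/fsck.btrfs as shown in the post above.
FSCK_BTRFS_FROM_POST = "#!/bin/sh\nexit 0\n"

if __name__ == "__main__":
    print(is_noop_fsck_helper(FSCK_BTRFS_FROM_POST))
```

A forced check at boot ends up calling such a helper, which reports success without looking at the filesystem, so a real check has to be started by hand with the filesystem unmounted, for example from a rescue system.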
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
maybe another reason to switch back to ext4. btrfsck failed IIRC with exit code 234. ran btrfsck --repair /dev/md2 . now I started btrfsck --repair -p /dev/md2 and waiting for result (tons of lines).
Did you read: https://btrfs.wiki.kernel.org/index.php/Btrfsck?
*"Deprecated* The tool btrfsck functionality has been merged to 'btrfs check' command. See Manpage/btrfs-check <https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check>."
On Friday 2016-12-23 23:31, sdm wrote:
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
maybe another reason to switch back to ext4. btrfsck failed IIRC with exit code 234. ran btrfsck --repair /dev/md2 . now I started btrfsck --repair -p /dev/md2 and waiting for result (tons of lines). Did you read: https://btrfs.wiki.kernel.org/index.php/Btrfsck? *"Deprecated* The tool btrfsck functionality has been merged to 'btrfs check' command. See Manpage/btrfs-check <https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-check>."
Interesting information, I should have read that before. I am new to btrfs, that's for sure. But: /usr/sbin/btrfsck is a symbolic link to btrfs. Still, I should have checked the other options mentioned in the article. btrfs check is still running "checking fs roots"; I think each inode is being mentioned in the output. I am really considering switching back to (or keeping) ext4 as the standard file system. I never had problems with ext3/4, but I did with reiser and now btrfs, and in particular I do not have a clue what happened.
On Friday 2016-12-23 23:52, sdm wrote:
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
maybe another reason to switch back to ext4. btrfsck failed IIRC with exit code 234. ran btrfsck --repair /dev/md2 . now I started btrfsck --repair -p /dev/md2 and waiting for result (tons of lines).
The chances of btrfs failing in RAID 1 are very slim, like near 0, unless a controller issue or problems with the drive(s) themselves. mdadm is software RAID and it's not surprising people have issues with it as software RAID is not production-level RAID. Software RAID (poor man's RAID) just isn't that good. If you want real RAID, then you need a true controller card in a server that has ECC ram, with something like a PERC RAID controller card. That has a true battery backup should the server or workstation lose power, so whatever is in the buffer that didn't get written to disk gets written the next time it's powered up.
It's easy to point the finger at the filesystem when in actuality it's shoddy hardware or just because the parity bits get all messed up because of software RAID. The more recent problems with btrfs and RAID had to do with RAID 5/6, and supposedly the issues have been mostly fixed. btrfs does have numerous advantages over EXT4, one being checksumming and another being snapshots. This is just my opinion, but with btrfs and any type of RAID, I will only run it with a true RAID controller card. mdadm and btrfs RAID (btrfs has built-in software RAID) I believe users end up having more problems with in the long run.
The question is what a suitable combination is if you only have unreliable consumer hardware (in this case a workstation). Software RAID is better than no RAID. With ext4 I recently restored data nearly completely from a dead RAID 10 (3 out of 4 disks failed); actually the only usable disks had failed at different points. Maybe I should read more about btrfs' software raid for this kind of setup. I am glad I do nightly backups.
Now btrfs check failed with exit code 1.
On 2016-12-23 23:52, sdm wrote:
On 12/23/2016 02:04 PM, Paul Neuwirth wrote:
The chances of btrfs failing in RAID 1 are very slim, like near 0,
Not true.
mdadm is software RAID and it's not surprising people have issues with it as software RAID is not production-level RAID.
Not true, either. It is decades old software and very much tested and stable. Production level by all means.
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 12/23/2016 04:33 PM, Carlos E. R. wrote:
Not true, either. It is decades old software and very much tested and stable. Production level by all means.
Right... https://youtu.be/hquOIFJU3og?t=819
On 2016-12-24 01:46, sdm wrote:
On 12/23/2016 04:33 PM, Carlos E. R. wrote:
Not true, either. It is decades old software and very much tested and stable. Production level by all means. Right... https://youtu.be/hquOIFJU3og?t=819
RAID Part 1: RAID can fail and lead to data loss
So? your point being...?
-- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On Saturday 2016-12-24 02:19, Carlos E. R. wrote:
On 2016-12-24 01:46, sdm wrote:
On 12/23/2016 04:33 PM, Carlos E. R. wrote:
Not true, either. It is decades old software and very much tested and stable. Production level by all means. Right... https://youtu.be/hquOIFJU3og?t=819
RAID Part 1: RAID can fail and lead to data loss
So? your point being...?
After resuming to solve my problem, I noticed during comparing of outputs of btrfs checks, that each run reported quite random errors. mdadm never ever reported any problem, there was no unclean stop/unmount or shutdown at all. But running a check of the RAID 1 array, there are mismatches (/sys/block/md2/md/mismatch_cnt shows around 15k at 20% progress). smartctl not showing any problems, repeated selftests are ok.
What can have caused this? I am still wondering if this "flush" can cause this:
[393168.772025] ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
[393168.772030] ata3.00: irq_stat 0x00400000, PHY RDY changed
[393168.772032] ata3: SError: { PHYRdyChg LinkSeq TrStaTrns }
[393168.772036] ata3.00: failed command: FLUSH CACHE EXT
[393168.772041] ata3.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 19 res 40/00:94:c0:26:9c/00:00:08:00:00/40 Emask 0x10 (ATA bus error)
[393168.772043] ata3.00: status: { DRDY }
[393168.772046] ata3: hard resetting link
[393172.792055] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[393172.793691] ata3.00: configured for UDMA/100
[393172.793695] ata3.00: retrying FLUSH 0xea Emask 0x10
[393172.793889] ata3: EH complete
[394038.449156] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
[394038.463093] BTRFS error (device md2): parent transid verify failed on 51072286720 wanted 132076 found 86207
the difference between the transactions is quite big > 45k - never been written physically to one mirror device?
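The "> 45k" figure can be reproduced from the two BTRFS error lines. The following Python sketch pulls the block address and the wanted/found transaction ids out of such lines and computes the difference; the regular expression and the function name are assumptions made for this example and only cover the message format shown in this thread.

```python
import re

# Matches the message format seen in the log above (an assumption: other
# kernel versions may word this message differently).
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def transid_gaps(log_text):
    """Return (block, wanted, found, wanted - found) for each matching line."""
    return [
        (int(block), int(wanted), int(found), int(wanted) - int(found))
        for block, wanted, found in TRANSID_RE.findall(log_text)
    ]


# The two error lines quoted in the post.
SAMPLE_LOG = (
    "[394038.449156] BTRFS error (device md2): parent transid verify "
    "failed on 51072286720 wanted 132076 found 86207\n"
    "[394038.463093] BTRFS error (device md2): parent transid verify "
    "failed on 51072286720 wanted 132076 found 86207\n"
)

if __name__ == "__main__":
    for block, wanted, found, gap in transid_gaps(SAMPLE_LOG):
        print(block, wanted, found, gap)
```

For the lines above the gap is 132076 - 86207 = 45869, so the block that was read back is that many transactions older than the one the filesystem expected at that address.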
On 2016-12-28 15:41, Paul Neuwirth wrote:
After resuming to solve my problem, I noticed during comparing of outputs of btrfs checks, that each run reported quite random errors. mdadm never ever reported any problem, there was no unclean stop/unmount or shutdown at all. But running a check of the RAID 1 array, there are mismatches (/sys/block/md2/md/mismatch_cnt shows around 15k at 20% progress). smartctl not showing any problems, repeated selftests are ok.
I suppose that mdadm doesn't report a problem because it is working on it. It is not an issue yet. maybe btrfs is more sensitive.
What can have caused this? I am still wondering if this "flush" can cause this:
Maybe. I don't know. Some people say that consumer class hard disks are not really raid ready. Consider this post, for instance: https://lists.opensuse.org/opensuse/2016-12/msg01097.html There have been several threads related to raid this month. Maybe you should have a look for ideas. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 12/23/2016 05:19 PM, Carlos E. R. wrote:
RAID Part 1: RAID can fail and lead to data loss
So? your point being...?
My point is that mdadm does nothing for the write hole problem, and people end up losing data because of it on unclean shutdowns. The video I linked to explains it, here again: https://youtu.be/hquOIFJU3og?t=1300
"One more option to avoid a write hole is to use a ZFS which is a hybrid of a filesystem and a RAID. ZFS uses "copy-on-write" to provide write atomicity. However, this technology requires a special type of RAID (RAID-Z) which cannot be reduced to a combination of common RAID types (RAID 0, RAID 1, or RAID 5)." - http://www.raid-recovery-guide.com/raid5-write-hole.aspx
"Yes, unfortunately btrfs RAID5/6 still suffers from the write hole (10/2015). The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID "write hole", where a partial write as a result of a power failure may result in inconsistent parity data." - http://superuser.com/questions/701111/is-btrfs-vulnerable-to-raid-write-hole...
Write hole is behaviour mdadm can't get around and it does the best it can considering the implementation. btrfs and write atomicity still appears to be a work in progress. OP's problem most likely isn't that btrfs is the culprit. Sounds more like mdadm and an unclean shutdown (power loss). If he can't afford a hardware RAID controller (A real one, not a $30 one) and ECC memory, then if it were me, I would opt for btrfs built-in software RAID 1 over mdadm RAID 1+btrfs.
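The quoted passages describe the write hole for parity RAID. A minimal Python sketch of the effect, using XOR parity over two small data blocks, is given below; the block values and function names are invented for illustration, and it models a single stripe rather than any real RAID implementation.

```python
def parity(blocks):
    """XOR parity over a list of integer data blocks."""
    result = 0
    for block in blocks:
        result ^= block
    return result


def reconstruct(blocks, parity_block, lost_index):
    """Rebuild the block at lost_index from the other blocks plus parity."""
    value = parity_block
    for index, block in enumerate(blocks):
        if index != lost_index:
            value ^= block
    return value


# Consistent stripe: parity matches the data.
DATA_BEFORE = [0b1010, 0b0110]
PARITY_BEFORE = parity(DATA_BEFORE)

# Interrupted update: block 0 reached the disk, the new parity did not.
DATA_AFTER = [0b1111, 0b0110]
STALE_PARITY = PARITY_BEFORE

if __name__ == "__main__":
    # With matching parity, a lost block 1 is rebuilt correctly.
    print(bin(reconstruct(DATA_BEFORE, PARITY_BEFORE, 1)))
    # With stale parity, rebuilding block 1 silently yields wrong data.
    print(bin(reconstruct(DATA_AFTER, STALE_PARITY, 1)))
```

Note that the damage only shows up later, when a block has to be rebuilt from parity. A plain two-way mirror has no parity to go stale; after an interrupted write its two copies can simply differ.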
On 2016-12-28 17:32, sdm wrote:
On 12/23/2016 05:19 PM, Carlos E. R. wrote:
RAID Part 1: RAID can fail and lead to data loss
So? your point being...? My point is that mdadm does nothing for the write hole problem, and people end up losing data because of it on unclean shutdowns. The video I linked to explains it, here again: https://youtu.be/hquOIFJU3og?t=1300
"One more option to avoid a write hole is to use a ZFS which is a hybrid of a filesystem and a RAID. ZFS uses "copy-on-write" to provide write atomicity. However, this technology requires a special type of RAID (RAID-Z) which cannot be reduced to a combination of common RAID types (RAID 0, RAID 1, or RAID 5)." - http://www.raid-recovery-guide.com/raid5-write-hole.aspx
"Yes, unfortunately btrfs RAID5/6 still suffers from the write hole (10/2015). The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID "write hole", where a partial write as a result of a power failure may result in inconsistent parity data." - http://superuser.com/questions/701111/is-btrfs-vulnerable-to-raid-write-hole...
Write hole is behaviour mdadm can't get around and it does the best it can considering the implementation. btrfs and write atomicity still appears to be a work in progress. OP's problem most likely isn't that btrfs is the culprit. Sounds more like mdadm and an unclean shutdown (power loss). If he can't afford a hardware RAID controller (A real one, not a $30 one) and ECC memory, then if it were me, I would opt for btrfs built-in software RAID 1 over mdadm RAID 1+btrfs.
Thus software raid is as reliable as motherboard raid, at least, and both are considered production ready. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" (Minas Tirith))
On 28/12/16 16:32, sdm wrote:
"Yes, unfortunately btrfs RAID5/6 still suffers from the write hole (10/2015). The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID "write hole", where a partial write as a result of a power failure may result in inconsistent parity data." - http://superuser.com/questions/701111/is-btrfs-vulnerable-to-raid-write-hole...
Write hole is behaviour mdadm can't get around and it does the best it can considering the implementation. btrfs and write atomicity still appears to be a work in progress. OP's problem most likely isn't that btrfs is the culprit. Sounds more like mdadm and an unclean shutdown (power loss). If he can't afford a hardware RAID controller (A real one, not a $30 one) and ECC memory, then if it were me, I would opt for btrfs built-in software RAID 1 over mdadm RAID 1+btrfs.
This will - hopefully - soon be a fixed problem as far as mdadm/raid is concerned. Dunno which kernel it will end up in, but there's a new feature on its way: "PPL" or "Partial Parity Log". If switched on, it writes a log for a stripe being updated, so it knows when the stripe didn't write successfully and can recover it. I don't understand the details; you'll have to read the linux-raid list for that. Cheers, Wol
On Wednesday 2016-12-28 17:32, sdm wrote:
then if it were me, I would opt for btrfs built-in software RAID 1 over mdadm RAID 1+btrfs.
I think it's time to give up: after repairing the mdadm raid, btrfs check segfaults trying to repair the volume. Luckily I have a backup... How would you proceed to create a new setup with btrfs built-in software RAID 1? I would like to clone the subvolume layout. My first idea is to do a clean new install (does the YaST setup let me set up this btrfs RAID?), start a rescue system, restore the backup to the new install, clean up /etc/fstab, chroot, run dracut, and reinstall the bootloader. Or is there a faster way than a fresh install? (I don't know how to clone the btrfs layout and set up btrfs raid manually.) Any literature? Thanks a lot
On Fri, 30 Dec 2016, Paul Neuwirth wrote:
On Wednesday 2016-12-28 17:32, sdm wrote:
then if it were me, I would opt for btrfs built-in software RAID 1 over mdadm RAID 1+btrfs.
I think it's time to give up, after repairing the mdadm raid, btrfs check segfaults trying to repair the volume. luckily I have a backup... How would you proceed, to create a new setup with btrfs with built-in software RAID 1. I would like to clone the subvolume layout.
first idea is to do a clean new Install (does Yast setup let me set up this btrfs RAID?) - start a rescue system, restore the backup to the new install, clean up /etc/fstab, chroot, dracut, reinstall bootloader.
or is there a faster way (i don't know how to clone the btrfs layout and set up btrfs raid manually) instead of a fresh install. Any literature?
thanks a lot
I recently wrote up how to clone the root sub-volume structure here: https://forums.opensuse.org/showthread.php/521277-LEAP-42-2-btrfs-root-files... But I have no experience with btrfs RAID, so I cannot help there. I suspect it may be easier to completely reinstall a minimum install and then restore over the top of it. But I guess this may result in other problems. You should probably practice in a virtualbox. I don't think the available btrfs docs and google-results have sufficient details to be safe without practising beforehand. If you follow the link provided, you will also see that I gave up on btrfs and returned to ext4 because, like yourself, I have recovery processes and procedures that would need quite a bit of rethinking for btrfs. Cheers, Michael
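For the layout-cloning part of the question, one possible starting point is to generate the `btrfs subvolume create` commands from the output of `btrfs subvolume list` on the old filesystem. This is a minimal sketch and is not taken from the forum write-up linked above: the function name `subvol_create_cmds`, the target mount point `/mnt/new`, and the decision to skip `.snapshots` paths are assumptions, and the function only prints commands for review instead of running them. It also assumes the list shows parent subvolumes before their children.

```shell
#!/bin/sh
# Sketch only: turn `btrfs subvolume list <mountpoint>` output (read from stdin)
# into commands that recreate the same subvolume paths under a new mount point.
# Assumes input lines of the form: ID 257 gen 30 top level 5 path @/home
# The function name and the .snapshots filter are illustrative assumptions.
subvol_create_cmds() {
    target=$1
    # Keep only the text after " path ", drop snapshot subvolumes, emit commands.
    sed -n 's/^ID [0-9]* gen [0-9]* top level [0-9]* path //p' |
    grep -v '\.snapshots' |
    while IFS= read -r path; do
        parent=$(dirname "$path")
        # The parent directory must exist before the subvolume can be created.
        [ "$parent" != "." ] && printf 'mkdir -p "%s/%s"\n' "$target" "$parent"
        printf 'btrfs subvolume create "%s/%s"\n' "$target" "$path"
    done
}
```

A typical use would be `btrfs subvolume list /mnt/old | subvol_create_cmds /mnt/new > recreate.sh`, followed by reading `recreate.sh` before running it. Mount options, the default subvolume and /etc/fstab entries are not covered by this sketch.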
On Thursday 2016-12-29 20:39, Michael Hamilton wrote:
I recently wrote up how to clone the root sub-volume structure here:
https://forums.opensuse.org/showthread.php/521277-LEAP-42-2-btrfs-root-files...
But I have no experience with btrfs RAID, so I cannot help there.
I suspect it may be easier to completely reinstall a minimum install and then restore over the top of it. But I guess this may result in other problems. You should probably practice in a virtualbox. I don't think the available btrfs docs and google-results have sufficient details to be safe without practising beforehand.
If you follow the link provided, you will also see that I gave up on btrfs and returned to ext4 because, like yourself, I have recovery processes and procedures that would need quite a bit of rethinking for btrfs.
Cheers, Michael
a) Solution, finally
I did a clean install (btrfs RAID options are not supported by the installer) on a partition of the first hard disk, then created an identical partition on the second hard disk, added that partition to the btrfs filesystem, and converted to the RAID1 layout, following this manual: https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices
The restore was a bit trickier, because there is no recent openSUSE live system available for download anymore. I needed dar to restore the backup, and I do not think it is available in the install system. So I used a Debian live image, mounted all subvolumes under /mnt, restored all files with dar from an NFS mount, and assembled a new fstab. Creating a new initrd and installing the bootloader did not work in the chroot environment (Debian is maybe too different). Conveniently, though, I could boot manually through the grub2 console using an initrd from the btrfs snapshot (the version from before the restore). Very useful, these snapshots, in this case ;) They could have saved lots of time in the past.
b) Open issue: btrfs RAID1
My only question: how can I install the generic boot code to the MBR of both disks, not only the first one? There's no "custom root partition" or anything like it in the YaST module. Also, the configuration for the serial console seems to be gone :(
c) Bugs found
I also found some bugs (maybe):
1) Setting the keyboard layout using the images from software.opensuse.org does not work (language set to English, but keyboard layout de-ch); reproducible.
2) In YaST (ncurses), Shift+Tab did the same as Tab alone in the newly installed OS (without additional repos). Very annoying ;)
Thank you all for your help, I learned a lot about btrfs.
d) Problem continued
I just wanted to send this, but somehow I have now destroyed the bootloader and only get into rescue mode. What I did: I booted into the system with the kernel/initrd from the install, but the filesystem contents were from the old install. That worked fine; I created a new initrd and installed the bootloader. When I booted using the new kernel/initrd, many kernel modules were not loaded. The messages were:
systemd-modules-load: Failed to lookup alias 'sg': Function not implemented
systemd: Unit systemd-modules-load.service entered failed state.
Maybe this is related to changed hardware (added a SCSI card, 2 scanners, 1 tape drive, 1 FDD)? I tried again to install the bootloader in this broken environment. Afterwards grub2 only gets into rescue mode; it looks for data in the old md volume, which does not exist anymore (error: disk 'mduuid/...' not found.). It seems I missed some configuration...?
e) And solved
No progress was possible using grub rescue, because btrfs support is missing there ("unknown filesystem"). Using the installer image, rescue mode worked to boot into the system with the current kernel. I did not find anything referencing the old md volume that grub2 wanted in any config in /etc/grub.d/, /etc/sysconfig/ or /etc/default/. I installed new updates/patches, including a new kernel. The newly generated grub2 config and the dracut output looked OK. Reboot: the grub2 menu loads and looks OK. Boot: all modules came up, and systemctl does not show any failed unit.
Finally done :)
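The conversion described under a), plus one common way to approach the open question under b), can be sketched as a short command plan. This is a hedged sketch, not what was actually run in the thread: the device names `/dev/sda`, `/dev/sdb` and `/dev/sdb2` are placeholders, the script only prints the commands, and running `grub2-install` once per disk is an assumption about a BIOS/MBR setup rather than the "generic boot code" option that YaST offers.

```shell
#!/bin/sh
# Sketch only: print the commands for turning a single-device btrfs root into
# btrfs RAID1 and for putting GRUB on both disks. Nothing is executed.
# All device names below are placeholder assumptions.
DISK1=${DISK1:-/dev/sda}        # disk holding the freshly installed system
DISK2=${DISK2:-/dev/sdb}        # second disk
NEWPART=${NEWPART:-/dev/sdb2}   # partition on DISK2, same size as the root partition

# Line 1 adds the second partition to the mounted root filesystem.
# Line 2 rewrites data (-d) and metadata (-m) block groups as RAID1.
# Line 3 shows the resulting profiles so the conversion can be checked.
# Lines 4 and 5 install the GRUB boot code on each disk in turn.
raid1_plan() {
    cat <<EOF
btrfs device add $NEWPART /
btrfs balance start -dconvert=raid1 -mconvert=raid1 /
btrfs filesystem df /
grub2-install $DISK1
grub2-install $DISK2
EOF
}

raid1_plan
```

The balance can take a long time on a populated filesystem, so it is worth checking the `btrfs filesystem df /` output for remaining "single" block groups before relying on the mirror.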
a bit more tricky was the restore, because there is no recent openSuSE live system downloadable anymore.
There are live images for Argon and Krypton (and there might be one for Tumbleweed).
Top-posting because I believe the information I'm about to share supersedes the majority of the other posts:
I'm shocked and somewhat disappointed that while someone linked https://btrfs.wiki.kernel.org/index.php/Btrfsck I cannot see that anyone cited the most important part:
In a nutshell, you should look at:
- btrfs scrub to detect issues on live filesystems
- look at btrfs detected errors in syslog (look at Marc's blog above on how to use sec.pl to do this)
- mount -o ro,recovery to mount a filesystem with issues
- btrfs-zero-log might help in specific cases. Go read Btrfs-zero-log
- btrfs restore will help you copy data off a broken btrfs filesystem. See its page: Restore
- btrfs check --repair, aka btrfsck is your last option if the ones above have not worked.
Did you do all of the above before trying btrfs check --repair?
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs, and those issues that do remain need to be filed as bugs in bugzilla.opensuse.org so our Kernel team can fix them - they're very good at that, every btrfs issue I've had over the last 3 years has been robustly resolved, even bizarre edge cases where you could justifiably turn around to me and say "well don't do that, it's stupid" ;)
Hope this helps
On 23 December 2016 at 22:26, Paul Neuwirth <mail@paul-neuwirth.nl> wrote:
[...]
On Saturday 2016-12-31 11:00, Richard Brown wrote:
Date: Sat, 31 Dec 2016 11:00:31 From: Richard Brown <RBrownCCB@opensuse.org> To: Paul Neuwirth <mail@paul-neuwirth.nl> Cc: SuSE Linux <opensuse@opensuse.org> Subject: Re: [opensuse] btrfs problem - how to run btrfsck on boot
Top-posting because I believe the information I'm about to share supersedes the majority of the other posts:
I'm shocked and somewhat disappointed that while someone linked https://btrfs.wiki.kernel.org/index.php/Btrfsck I cannot see that anyone cited the most important part:
In a nutshell, you should look at:
- btrfs scrub to detect issues on live filesystems
- look at btrfs detected errors in syslog (look at Marc's blog above on how to use sec.pl to do this)
- mount -o ro,recovery to mount a filesystem with issues
- btrfs-zero-log might help in specific cases. Go read Btrfs-zero-log
- btrfs restore will help you copy data off a broken btrfs filesystem. See its page: Restore
- btrfs check --repair, aka btrfsck is your last option if the ones above have not worked.
Did you do all of the above before trying btrfs check --repair?
Unfortunately not. But this is good to know for the future. I am used to ext2-4, where my first option always was fsck, and I almost never had problems with it.
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs, and those issues that do remain need to be filed as bugs in bugzilla.opensuse.org so our Kernel team can fix them - they're very good at that, every btrfs issue I've had over the last 3 years has been robustly resolved, even bizarre edge cases where you could justifiably turn around to me and say "well don't do that, it's stupid"
;)
Hope this helps
I believe in my case btrfs did not cause these problems. Data on one of the mdadm RAID devices had actually not been written for a long time. After a bus reset and cache flush, the data on the mirrors mismatched. Instead of repairing the array, I finally destroyed the fs using btrfs check --repair. Now I am giving btrfs with its built-in RAID1 a try. 30 / 266 GiB currently in use.
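The escalation order quoted from the wiki can be written down as a commented command list, so the destructive step is visibly last. This is only a sketch under assumptions: `/dev/md2` is the device from the logs in this thread, `/mnt` and `/srv/rescue` are placeholder paths, step 0 (the md mirror check) is not part of the wiki list and is added because of the mirror mismatch described above, and `btrfs rescue zero-log` is used as the newer spelling of `btrfs-zero-log`. Newer kernels name the `recovery` mount option `usebackuproot`. The script prints the list and executes nothing.

```shell
#!/bin/sh
# Sketch only: print the escalation order as a commented command list.
# DEV, MNT and RESCUE are placeholder assumptions; nothing is executed here.
DEV=${DEV:-/dev/md2}
MNT=${MNT:-/mnt}
RESCUE=${RESCUE:-/srv/rescue}   # hypothetical directory on a separate, healthy disk

recovery_ladder() {
    md=$(basename "$DEV")
    cat <<EOF
# 0. Not in the wiki list: have md compare the mirrors, read the count when done
echo check > /sys/block/$md/md/sync_action
cat /sys/block/$md/md/mismatch_cnt
# 1. Detect errors on a mounted filesystem (-B waits for completion)
btrfs scrub start -B $MNT
# 2. Look for btrfs errors in the kernel log
dmesg | grep -i btrfs
# 3. Mount read-only with the recovery option
mount -o ro,recovery $DEV $MNT
# 4. Clear the log tree, only in the specific cases its page describes
btrfs rescue zero-log $DEV
# 5. Copy data off the unmounted, broken filesystem
btrfs restore $DEV $RESCUE
# 6. Last resort
btrfs check --repair $DEV
EOF
}

recovery_ladder
```

Printing rather than running is deliberate here: each step depends on what the previous one reported, and steps 4 and 6 modify the filesystem.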
On 2016-12-31 11:00, Richard Brown wrote:
Top-posting because I believe the information I'm about to share supersedes the majority of the other posts:
You can simply post without any quoted material ;-)
I'm shocked and somewhat disappointed that while someone linked https://btrfs.wiki.kernel.org/index.php/Btrfsck I cannot see that anyone cited the most important part:
In a nutshell, you should look at:
- btrfs scrub to detect issues on live filesystems
- look at btrfs detected errors in syslog (look at Marc's blog above on how to use sec.pl to do this)
- mount -o ro,recovery to mount a filesystem with issues
- btrfs-zero-log might help in specific cases. Go read Btrfs-zero-log
- btrfs restore will help you copy data off a broken btrfs filesystem. See its page: Restore
- btrfs check --repair, aka btrfsck is your last option if the ones above have not worked.
Did you do all of the above before trying btrfs check --repair?
Well, I find that procedure too complex. One of the reasons I refuse to use btrfs. IMO, btrfs should have a single tool that automatically analyzes a btrfs filesystem and automatically decides on the best course of action, perhaps asking the user some simple questions. You mention, for instance:
mount -o ro,recovery to mount a filesystem with issues
But you need another running system to do that... and Leap doesn't have a rescue image anymore. The OP mentions having to use a Debian live image instead.
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs,
Well, you cannot expect users to be knowledgeable about filesystem nuances, especially with btrfs being that advanced and complex. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
Well, I find that procedure too complex. One of the reasons I refuse to use btrfs.
Too complex for a low-occurrence event? Compared to what in the rest of the Linux ecosystem?
The OP mentions having to use a Debian live image instead.
See the comment regarding Argon and Krypton.
Comparisons to ext thus far seem to be on the scope of ext etc., not on the extra features of btrfs. ext will always be easier to use; btrfs can never win based on the metrics you assume. Placed into the *context* of cost vs benefits, I would be extremely surprised if the cost of root/update screwups etc. is not greater than the problems with btrfs for the typical user. btrfs itself and the openSUSE defaults have improved rapidly this year. Not sure an anecdotal review and pointing to a minor problem with recent quotas is enough to condemn the system. Facebook doesn't seem to think so.
On 12/31/2016 10:33 AM, nicholas wrote:
comparisons to ext thus far seem to be on the scope of ext etc, not on the extra features of btrfs. ext will always be easier to use. btrfs can never win based on the metrics you assume.
placed into the *context* of cost vs benefits, i would be extremely surprised if the cost of root/update screwups etc is not greater than problems with btrfs to the typical user.
btrfs itself and opensuse defaults have improved rapidly this year.
not sure anecdotal review and pointing to a minor problem with recent quotas is enough to condemn the system. facebook doesn't seem to think so.
I'm skeptical about btrfs myself. What are the benefits that will outweigh the complexity-induced risks? I tried btrfs a few years ago on a large hardware RAID6 array and experienced filesystem failures and loss of data when writing more than 16 TB. That experience and the negative points we've been seeing here convinced me to continue to use ext4 on root and xfs everywhere else. I've never experienced root/update screwups, so where is the value for me at this point in time? Is it time for me to try it again? If so, why? Regards, Lew
On 12/31/2016 11:16 AM, Lew Wolfgang wrote:
I tried btrfs a few years ago on a large hardware RAID6 array and experienced filesystem failures and loss of data when writing more than 16-TB.
Meh. 16TB... I tried it when 13.2 came out, and had two data-loss events (one severe) within 6 months on a mere 500 gig drive that was only a quarter full. If it had happened once, I might have chalked it up to my ignorance. After the second event, I wiped the drive and went to something else. (In spite of that rude awakening, I came away with the impression that 13.2 was one of the best openSUSE versions in a long time.) Sans btrfs. Joe Home User does not need anything btrfs has to offer, and is in a particularly poor position to maintain it and recover from problems. I don't think it was invented for the needs of the home-user. -- After all is said and done, more is said than done.
On 2016-12-31 21:03, John Andersen wrote:
On 12/31/2016 11:16 AM, Lew Wolfgang wrote:
Joe Home User does not need anything btrfs has to offer, and is in a particularly poor position to maintain it and recover from problems. I don't think it was invented for the needs of the home-user.
Well, I have seen novices recover from disaster (remove too many packages, say the whole of KDE, by accident) just by booting the previous btrfs snapshot. This feature was designed for Joe Home user. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 12/31/2016 03:03 PM, John Andersen wrote:
Joe Home User does not need anything btrfs has to offer, and is in a particularly poor position to maintain it and recover from problems. I don't think it was invented for the needs of the home-user.
+1 -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 2016-12-31 20:16, Lew Wolfgang wrote:
On 12/31/2016 10:33 AM, nicholas wrote:
comparisons to ext thus far seem to be on the scope of ext etc, not on the extra features of btrfs. ext will always be easier to use. btrfs can never win based on the metrics you assume.
placed into the *context* of cost vs benefits, i would be extremely surprised if the cost of root/update screwups etc is not greater than problems with btrfs to the typical user.
btrfs itself and opensuse defaults have improved rapidly this year.
not sure anecdotal review and pointing to a minor problem with recent quotas is enough to condemn the system. facebook doesn't seem to think so.
I'm skeptical about btrfs myself. What are the benefits that will outweigh the complexity-induced risks? I tried btrfs a few years ago on a large hardware RAID6 array and experienced filesystem failures and loss of data when writing more than 16-TB. That experience and the negative points we've been seeing here convinced me to continue to use ext-4 on root and xfs everywhere else. I've never experienced root/update screwups, so where is the value for me at this point in time? Is it time for me to try it again? If so, why?
Yes, my points exactly. It is not exactly anecdotal: many more issues come up here than with ext4. I know how to solve those on my own, but not with btrfs. Another issue: there is no known way to rebuild the btrfs root structure needed for any version of openSUSE from scratch, in order to restore from backup media or clone the installation. One needs to install again. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31 December 2016 at 21:14, Carlos E. R. <robin.listas@telefonica.net> wrote:
Another issue: there is no known way to rebuild the btrfs root structure needed for any version of openSUSE from scratch, in order to restore from backup media. or clone the installation. One needs to install again.
Sure there is, and no, you don't need to reinstall.
dd works fine with btrfs, but if you'd prefer to use native tools: https://btrfs.wiki.kernel.org/index.php/Restore
Or if you want to rebuild the btrfs root structure without any data: https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-image
Or maybe you'd be interested in btrfs send/receive for something more incremental? https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
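The send/receive approach mentioned last can be sketched as a small planner that prints the snapshot and transfer commands for either a first full run or a later incremental run. It is a minimal sketch under assumptions: the function name `incremental_backup_plan` and all paths and snapshot names are hypothetical, the snapshot directory must live on the same btrfs filesystem as the source subvolume, and for an incremental run the previous snapshot must still exist on both the source and the destination. Nothing is executed.

```shell
#!/bin/sh
# Sketch only: print a snapshot + send/receive sequence for incremental backup.
# All paths and names are placeholder assumptions; nothing is executed.
incremental_backup_plan() {
    src=$1      # subvolume to back up, e.g. /home
    snapdir=$2  # directory on the same filesystem holding read-only snapshots
    dest=$3     # mounted btrfs filesystem that receives the backups
    prev=$4     # name of the previous snapshot, empty for the first full run
    new=$5      # name for the new snapshot

    # btrfs send only accepts read-only snapshots, hence -r.
    echo "btrfs subvolume snapshot -r $src $snapdir/$new"
    if [ -z "$prev" ]; then
        # First run: send the whole snapshot.
        echo "btrfs send $snapdir/$new | btrfs receive $dest"
    else
        # Later runs: send only the difference against the parent snapshot.
        echo "btrfs send -p $snapdir/$prev $snapdir/$new | btrfs receive $dest"
    fi
}
```

For example, `incremental_backup_plan /home /home/.snap /backup "" 2016-12-31` prints the full-send variant, and passing `2016-12-31` as the fourth argument on the next run prints the incremental one.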
On Saturday 2016-12-31 21:29, Richard Brown wrote:
On 31 December 2016 at 21:14, Carlos E. R. <robin.listas@telefonica.net> wrote:
Another issue: there is no known way to rebuild the btrfs root structure needed for any version of openSUSE from scratch, in order to restore from backup media. or clone the installation. One needs to install again.
Sure there is and no you don't need to reinstall again
dd works fine with btrfs, but if you'd prefer to use native tools
https://btrfs.wiki.kernel.org/index.php/Restore
Or if you want to rebuild the btrfs root structure without any data
https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-image
Or maybe you'd be interested in the btrfs send/receive for something more incremental?
None of these tools worked in my case, on the corrupted btrfs volume. Some btrfs tools even just dumped core (version from the Leap 42.2 install image). I think the YaST partitioner/installer should have a better btrfs implementation (CoW option? skinny extents? RAID? ...), or even just an option to add the standard subvolumes to an existing or manually created btrfs volume would be helpful. P.S.: This incremental backup approach sounds interesting; a good implementation with dar, or something else, would be interesting. I currently use dar for incremental backups, which is very time consuming; files being changed at the time of backup sometimes fail to back up, and there are problems if cdate goes backwards...
On 2016-12-31 21:29, Richard Brown wrote:
On 31 December 2016 at 21:14, Carlos E. R. <robin.listas@telefonica.net> wrote:
Another issue: there is no known way to rebuild the btrfs root structure needed for any version of openSUSE from scratch, in order to restore from backup media or clone the installation. One needs to install again.
Sure there is, and no, you don't need to reinstall again.
dd works fine with btrfs,
If you have an image. Not if you have an rsync or other type of standard backup.
but if you'd prefer to use native tools
There was a recent thread about this and the conclusion was there were no tools. Just a list of tools and a sequence of things to do, error prone. No single tool to create a root in one single command. Perhaps you'd like to join that thread and discuss it there. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 2016-12-31 19:33, nicholas wrote:
Well, I find that procedure too complex. One of the reasons I refuse to use btrfs. Too complex for a low-occurrence event? Compared to what in the rest of the Linux ecosystem?
Compared to what one needs to do on, say, ext4, xfs, or reiserfs for recovery.
The OP mentions having to use a Debian live image instead. See the comment regarding Argon and Krypton.
I saw them. No mention of a rescue image on https://software.opensuse.org/ Not found here: https://software.opensuse.org/search?utf8=%E2%9C%93&q=argon&search_devel=false&search_unsupported=false&baseproject=openSUSE%3ALeap%3A42.2 Hard to find: https://en.opensuse.org/Special:Search --> https://en.opensuse.org/Derivatives Packages built for KDE Git using stable and tested openSUSE technologies to track the latest development state of KDE software. Argon is a live installable image based on openSUSE Leap and Krypton is a live installable image based on openSUSE Tumbleweed. A KDE image... to track the latest development state. Hardly what I expect of a rescue image: tools like gparted, yast, mc, rsync, backup/restore tools, etc. And all those tools for filesystem work. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31 December 2016 at 15:42, Carlos E. R. <robin.listas@telefonica.net> wrote:
Did you do all of the above before trying btrfs check --repair?
Well, I find that procedure too complex. One of the reasons I refuse to use btrfs.
While I understand that, I see the problem as the price of progress. The same way that the correct management of an mdraid or LVM volume group requires more tooling and expertise than a traditional partition, btrfs (which is a filesystem with functionality matching and exceeding in some areas those of LVM and mdraid) has a similar requirement for more tooling and expertise when resolving issues on it. I think if people use btrfs and expect a 'basic' ext4-like experience, then they're going to be sorely disappointed. But on the flipside, the additional features of btrfs enable very broad features with a system-wide impact, such as snapshot and restore, or transactional updates. Clinging to 'traditional' filesystems indefinitely holds you back from experiencing modern capabilities you can't get on Linux any other way. And before people point out that ZFS already has the features that enable this stuff with btrfs - yes, you're right. But ZFS has licensing issues. And even if it didn't, ZFS has a practically identical level of tooling and expertise required - you cannot just go doing a fsck on zfs and expecting it to work either. And before people point out that XFS is on the way to having some of those features - sure, and when it does, expect the tooling and expertise problem to rise even higher - XFS already has more concerns with its repair than traditional ext, which is why fsck for xfs does nothing at all and, just like btrfs, you need to use different tools.
IMO, btrfs should have a single tool that automatically analyzes a btrfs filesystem and automatically decides on the best course of action, perhaps asking the user some simple questions.
That's a good idea, +1 from me, know anyone who could write it?
You mention, for instance:
mount -o ro,recovery to mount a filesystem with issues
But you need another running system to do that... and Leap doesn't have a rescue image anymore. The OP mentions having to use a Debian live image instead.
A Leap system that fails to boot its btrfs root automatically goes into its rescue system - which has the tooling required. And we have a Tumbleweed Live Rescue Image. So this is not a problem, and you're straying into territory that makes it sound like you're just searching for things to complain about :)
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs,
Well, you cannot expect users to be knowledgeable about filesystem nuances, especially with btrfs being that advanced and complex.
Sure I can - there is lots of good documentation about it out there these days: https://www.suse.com/documentation/sles-12/stor_admin/data/sec_filesystems_m...
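To make the single-tool idea discussed in this message a bit more concrete, here is a minimal dry-run sketch in Python. It only builds and prints an ordered list of commands and executes nothing. The ordering (least invasive first, `btrfs check --repair` last) is an assumption based on the advice given in this thread, not an official procedure. The function name `recovery_plan`, the device `/dev/md2` (taken from the log in the first post) and the paths `/mnt` and `/srv/rescue` are placeholders. Also note that on newer kernels the `recovery` mount option is deprecated in favour of `usebackuproot`.

```python
def recovery_plan(device, mountpoint="/mnt", rescue_dir="/srv/rescue"):
    """Return (description, command) pairs, least invasive first.

    Nothing is executed; the caller decides what to run, if anything.
    """
    return [
        ("Try a read-only mount using the recovery option",
         ["mount", "-o", "ro,recovery", device, mountpoint]),
        ("Copy files off the unmounted device without writing to it",
         ["btrfs", "restore", device, rescue_dir]),
        ("Check the unmounted filesystem (read-only by default)",
         ["btrfs", "check", device]),
        ("Last resort, only after the steps above and with a backup",
         ["btrfs", "check", "--repair", device]),
    ]


if __name__ == "__main__":
    # Hypothetical device name; adjust to the affected system.
    for number, (why, cmd) in enumerate(recovery_plan("/dev/md2"), start=1):
        print(f"{number}. {why}:")
        print("   " + " ".join(cmd))
```

A real tool would have to inspect the results of each step before suggesting the next one; the sketch only fixes the order so that the repair step cannot come first.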
On 2016-12-31 20:18, Richard Brown wrote:
On 31 December 2016 at 15:42, Carlos E. R. <robin.listas@telefonica.net> wrote:
Did you do all of the above before trying btrfs check --repair?
Well, I find that procedure too complex. One of the reasons I refuse to use btrfs.
While I understand that, I see the problem as the price of progress
The same way that the correct management of an mdraid or LVM volume group requires more tooling and expertise than a traditional partition, btrfs (which is a filesystem with functionality matching and exceeding in some areas those of LVM and mdraid) has a similar requirement for more tooling and expertise when resolving issues on it
I am also moving away from LVM... unneeded complexity, difficulty of disaster recovery on my own. RAID does not justify itself for most uses, IMO. A backup strategy is better.
I think if people use btrfs and expect a 'basic' ext4-like experience, then they're going to be sorely disappointed.
Well, I expect a filesystem to "just work".
But on the flipside, the additional features of btrfs enable very broad features with a system-wide impact, such as snapshot and restore, or transactional updates.
I recognize and want the features of btrfs. But I don't trust it, from experience, so I have to say "not for me". I don't actively discourage others from trying, even novices. No. But if they do ask my opinion, then I have to give pros and cons honestly.
And before people point out that XFS is on the way to having some of those features - sure, and when it does, expect the tooling and expertise problem to rise even higher - XFS already has more concerns with its repair than traditional ext, which is why fsck for xfs does nothing at all and, just like btrfs, you need to use different tools.
XFS has a well designed tool set, and a mail list where the developers kindly help people with disasters. They helped me more than once, and I helped them track a few bugs.
IMO, btrfs should have a single tool that automatically analyzes a btrfs filesystem and automatically decides on the best course of action, perhaps asking the user some simple questions.
That's a good idea, +1 from me, know anyone who could write it?
You ask me? Obviously that's for filesystem devs. A very specialized kind of dev.
You mention, for instance:
mount -o ro,recovery to mount a filesystem with issues
But you need another running system to do that... and Leap doesn't have a rescue image anymore. The OP mentions having to use a Debian live image instead.
A Leap system that fails to boot its btrfs root automatically goes into its rescue system - which has the tooling required
CLI and small. You only have the tools in the initrd. Insufficient. There is a bit more in the install image, but barely so.
And we have a Tumbleweed Live Rescue Image
Yes, but not a Leap one.
So this is not a problem and you're straying into territory that makes it sound like you're just searching for things to complain about :)
Sorry if it sounds like that, but you have to expect more complaints than congratulations in a public place like this ;-)
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs,
Well, you cannot expect users to be knowledgeable about filesystem nuances, especially with btrfs being that advanced and complex.
Sure I can - there is lots of good documentation about it out these days: https://www.suse.com/documentation/sles-12/stor_admin/data/sec_filesystems_m...
SLE... Better in the openSUSE book. Can't search myself this instant, I have to go. New year celebrations etcetera. :-) -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 31/12/2016 at 21:32, Carlos E. R. wrote:
Well, I expect a filesystem to "just work".
most people simply don't know what a file system is... (many people don't even know what an application is :-())
I recognize and want the features of btrfs. But I don't trust it, from experience, so I have to say "not for me".
I have the same experience, but at the same time, when a new openSUSE is coming one can expect it to be better, and at some point btrfs will certainly overtake ext4, but when? For example, in a 20 GB root, booted once in a while, an update is (was?) probably going to kill a default btrfs install, simply because of lack of space. I had such a problem even with a 50 GB root. But in the near future, with 500 GB roots (and 5 TB home), will this still happen? Probably not...
XFS has a well designed tool set, and a mail list where the developers kindly help people with disasters. They helped me more than once, and I helped them track a few bugs.
But it can't shrink the filesystem, the only thing I needed (to balance with root). jdd
On 2016-12-31 22:38, jdd wrote:
On 31/12/2016 at 21:32, Carlos E. R. wrote:
Well, I expect a filesystem to "just work".
most people simply don't know what a file system is... (many people don't even know what an application is :-())
Yes.
I recognize and want the features of btrfs. But I don't trust it, from experience, so I have to say "not for me".
I have the same experience, but at the same time, when a new openSUSE is coming one can expect it to be better, and at some point btrfs will certainly overtake ext4, but when?
For example, in a 20 GB root, booted once in a while, an update is (was?) probably going to kill a default btrfs install, simply because of lack of space. I had such a problem even with a 50 GB root.
Ah, yes, there is that. For me btrfs means repartition and reinstall. My root is spread on three disks, using different partitions for parts of the filesystem. The "/" itself is too small (30GB). Then my strategy is to upgrade the system, from SuSE 5.2 till Leap 42.2. I would have to break that and install fresh, because there is no method to create an empty btrfs root.
But in the near future, with 500 GB roots (and 5 TB home), will this still happen? Probably not...
XFS has a well designed tool set, and a mail list where the developers kindly help people with disasters. They helped me more than once, and I helped them track a few bugs.
but it can't shrink the filesystem, the only thing I needed (to balance with root)
Ah. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 01/01/2017 at 14:54, Carlos E. R. wrote:
On 2016-12-31 22:38, jdd wrote:
For example, in a 20 GB root, booted once in a while, an update is (was?) probably going to kill a default btrfs install, simply because of lack of space. I had such a problem even with a 50 GB root.
Ah, yes, there is that. For me btrfs means repartition and reinstall. My root is spread on three disks, using different partitions for parts of the filesystem. The "/" itself is too small (30GB).
Then my strategy is to upgrade the system, from SuSE 5.2 till Leap 42.2.
I don't speak of upgrade, simply of update. When this is done after, say, two months inactive, the number of files is large and the snapshots big. This then takes large amounts of disk space. It's what seems to happen - no problem anymore after removing all the snapshots (and forgoing their use). But maybe this problem is solved now, how can I say? But as I never had a problem with ext4, at present I use ext4 on all my installs. Until I have time to test better. jdd
On 2017-01-01 17:30, jdd wrote:
On 01/01/2017 at 14:54, Carlos E. R. wrote:
On 2016-12-31 22:38, jdd wrote:
For example, in a 20 GB root, booted once in a while, an update is (was?) probably going to kill a default btrfs install, simply because of lack of space. I had such a problem even with a 50 GB root.
Ah, yes, there is that. For me btrfs means repartition and reinstall. My root is spread on three disks, using different partitions for parts of the filesystem. The "/" itself is too small (30GB).
Then my strategy is to upgrade the system, from SuSE 5.2 till Leap 42.2.
I don't speak of upgrade, simply of update. When this is done after, say, two months inactive, the number of files is large and the snapshots big. This then takes large amounts of disk space.
I meant that it is impossible for me to replace the existing ext4 root with a btrfs one, even if I wanted, which I do not. I clarify:

cer@Telcontar:~> df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb8        30G   11G   18G  37% /
/dev/sdc8        20G   19G  2,0G  91% /usr
/dev/sdb5      1011M   44M  917M   5% /boot
/dev/sdc9       9,1G  1,5G  7,6G  17% /opt
/dev/sdd12       25G  6,9G   18G  29% /usr/gamedata
/dev/sdd13       20G  1,2G   19G   6% /var/spool/news
/dev/sdd7        20G  3,2G   17G  17% /usr/local
/dev/sdd6        20G  3,3G   17G  17% /usr/src
cer@Telcontar:~>

Used: 11+19+1.5+6.9+1.2+3.2+3.3 = 191 G

So I would need at least a 600 GB btrfs partition for root. A 1 TB to be safe. And of course, repartition all my hard disks to redistribute. Not going to happen. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
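As a quick check of the addition in this message, here is a minimal Python sketch that sums the "Used" column of the df listing. The values are copied from the post; the dictionary name `used_g` is only for illustration, and the 44M on /boot is left out, as it is in the sum written in the post. Taken as listed, the terms come to roughly 46 G, which does not match the total stated in the message, so the partition size estimated from it may be worth revisiting.

```python
# "Used" figures in G as shown by df -h in the post; /boot (44M) is omitted,
# matching the terms of the sum written in the message.
used_g = {
    "/": 11,
    "/usr": 19,
    "/opt": 1.5,
    "/usr/gamedata": 6.9,
    "/var/spool/news": 1.2,
    "/usr/local": 3.2,
    "/usr/src": 3.3,
}

total = sum(used_g.values())
print(f"Total used: {total:.1f} G")
```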
On Sat, 31 Dec 2016 22:38:22 +0100 jdd <jdd@dodin.org> wrote:
most people simply don't know what a file system is... (many people don't even know what an application is :-())
It's a bit like an app isn't it? HNY (Happy New Year) Dave And I too prefer filesystems that just work. Even if that means they don't have the latest bells and whistles.
On 12/31/2016 02:00 AM, Richard Brown wrote:
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs,
Misuse / Abuse? Seriously, who sets out to abuse a file system on which they depend? That is the most absurd claim I've seen in years. Here's how it works in the real world: You install your OS, select a file system, and put all your stuff there. You seldom give much thought to the selection because it's mostly esoteric anyway and one is pretty much as good as another for any given purpose. Over the subsequent years you don't expect to tip-toe around your file system fearing it will break. And you are astounded if it is ever necessary to do anything beyond the normal boot up fsck. It NEVER rises to the level of a horror story, let alone a "lot of horror stories". Then you meet btrfs.... You should have run away from her when you saw all the piercings and tattoos and those large mesh stockings, but your friends at the distro recommended her when you installed. And that's when the fight started... -- After all is said and done, more is said than done.
On 31 December 2016 at 18:49, John Andersen <jsamyth@gmail.com> wrote:
On 12/31/2016 02:00 AM, Richard Brown wrote:
A lot of the horror stories I've seen regarding btrfs are due to misuse / abuse of btrfs far more than actual issues or bugs,
Misuse / Abuse? Seriously, who sets out to abuse a file system on which they depend? That is the most absurd claim I've seen in years.
Here's how it works in the real world: You install your OS, select a file system, and put all your stuff there. You seldom give much thought to the selection because it's mostly esoteric anyway and one is pretty much as good as another for any given purpose.
Over the subsequent years you don't expect to tip-toe around your file system fearing it will break. And you are astounded if it is ever necessary to do anything beyond the normal boot up fsck.
It NEVER rises to the level of a horror story, let alone a "lot of horror stories".
Then you meet btrfs....
You should have run away from her when you saw all the piercings and tattoos and those large mesh stockings, but your friends at the distro recommended her when you installed.
And that's when the fight started...
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately. I do not expect them to continue using them the way they have been used for many years, just because they used to work in a certain way. We're a software project; software changes over time, and a reasonable amount of re-education over time is not only healthy, but mandatory. btrfs has been available in openSUSE since 11.3 (released in 2010, over 6 years ago). It's been the default in openSUSE since 13.2 (released over 2 years ago). It's not like it's snuck up on people without time for them to start learning... but when they go ahead and run fsck before doing any of the other steps that are documented to be done before a repair, yes, that's misuse and abuse. Period.
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations: (1) it's a new install - the person is likely to want to default everything and just get a running system so they can decide quickly whether they like linux/opensuse/the latest release. If you/we are lucky they will decide they do like the system, and they will then continue to use it and become dependent on it to do whatever it is they use their systems for. Only when something goes wrong will they have to learn any more about how it works and what the pros and cons of various install-time questions were. or (2) it's an upgrade - the person wants to see whether their system keeps on working as well or better than the existing installation was working. If there are any problems, then lesson learned, don't upgrade! (FWIW, I just stopped using a 9.3 system last month)
I do not expect them to continue using them the way they have been used for many years, just because they used to work in a certain way
Why on earth not? Do you live in a superinsulated passivhaus? Why not?
We're a software project, software changes over time, a reasonable amount of re-education over time is not only healthy, but mandatory
btrfs has been available in openSUSE since 11.3 (released in 2010, over 6 years ago). It's been the default in openSUSE since 13.2 (released over 2 years ago).
It's not like it's snuck up on people without time for them to start learning... but when they go ahead and run fsck before doing any of the other steps that are documented to be done before a repair, yes, that's misuse and abuse. Period.
No. Whoever designed an fsck program that can't safely be run without precursors should be shot. Programs should always be programmed defensively and expecting people to read documentation before running them is certifiably insane. Let alone understand the documentation and act upon it. HNY, Dave
On 01/01/17 18:13, Dave Howorth wrote:
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations:
I see a third. The wife who abdicates responsibility for filling the car up to her husband, and then wonders why she gets stranded on a long journey because the car's run out of petrol (and she doesn't have a clue where the filler cap is). It is only reasonable to expect users to have SOME understanding of the technology they are using, not least because ignoramuses have a habit of breaking things they don't understand. Don't forget - every time the engineers invent a better foolproof gadget, nature invents a better fool. To expect users to use technology without understanding, is to expect engineers to fix things that should never have got broken. (And I curse blue bloody murder at home when all my stuff gets broken because the family "expect it to work" and have no desire to know how to make it work properly. A computer is a lot more complicated than most home appliances ...) Cheers, Wol
On Mon, 02 Jan 2017, Wols Lists wrote:
On 01/01/17 18:13, Dave Howorth wrote:
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations:
I see a third. The wife who abdicates responsibility for filling the car up to her husband, and then wonders why she gets stranded on a long journey because the car's run out of petrol (and she doesn't have a clue where the filler cap is).
It is only reasonable to expect users to have SOME understanding of the technology they are using, not least because ignoramuses have a habit of breaking things they don't understand.
Don't forget - every time the engineers invent a better foolproof gadget, nature invents a better fool.
To expect users to use technology without understanding, is to expect engineers to fix things that should never have got broken. (And I curse blue bloody murder at home when all my stuff gets broken because the family "expect it to work" and have no desire to know how to make it work properly. A computer is a lot more complicated than most home appliances ...)
Cheers, Wol
In respect to the car analogy: in the btrfs car we have old fashioned du/df fuel gauges and a new set of btrfs du/df fuel gauges. Both sets of gauges still function, but they can disagree. I'm not sure that would be acceptable in a car (hybrids perhaps?). Richard used the word "reasonable." That's the key here: what is a reasonable expectation of a user's responsibility to help themselves? I appreciate the efforts that have gone into trying to make openSUSE a more feature-rich and sophisticated tool. I feel part of the process toward improvement should be to encourage some feedback from users. Feedback concerning issues and problems is particularly valuable as it may flag areas that impact the take up of the improvements or of the OS as a whole. I see issues with the reasonableness of the openSUSE btrfs rootfs. Previous implementations of the rootfs had far fewer dimensions to understand and manipulate; it was quite approachable for a home DIY setup. The btrfs rootfs comes together from a far more complex accumulated layering of tools and conventions: some of which are visible in the filesystem, some of which are only visible in various places in the filesystem metadata, and some of which are only visible in the documentation or from Googling. I feel that the tooling, documentation, and maybe even the structure still need work to make them more approachable to the "average" home or desktop user (if that's who's being targeted). If you read through my Unreviewed Howto on the root filesystem subvolume structure you can pick up an underlying critique of some aspects of the btrfs rootfs: https://forums.opensuse.org/showthread.php/521277-LEAP-42-2-btrfs-root-files... My write-up hints at where new tools and documentation might be needed. I explicitly state what made me reluctant to switch as well as stating some of the things I have concerns with. I feel I've made reasonable efforts at coming to grips with these changes. I've made some reasonable effort to provide feedback. I feel I've made a reasonable decision to stay with ext4. I don't think it's reasonable to dump on those raising usability issues. I don't think it's reasonable to dump on those attempting to move openSUSE forward either. But it is completely reasonable to discuss actual usage experiences and problems and how they might be resolved. If openSUSE is built/revised to solve real world usage issues it will be much the better for it. Cheers, Michael
On Sun, 1 Jan 2017 18:44:37 +0000 Wols Lists <antlists@youngman.org.uk> wrote:
On 01/01/17 18:13, Dave Howorth wrote:
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations:
I see a third. The wife who abdicates responsibility for filling the car up to her husband, and then wonders why she gets stranded on a long journey because the car's run out of petrol (and she doesn't have a clue where the filler cap is).
It is only reasonable to expect users to have SOME understanding of the technology they are using, not least because ignoramuses have a habit of breaking things they don't understand.
You appear to have no understanding of how businesses work. Or indeed wives; mine makes sure I fill the car at every opportunity so she will never run out. :)
Dave Howorth wrote:
On Sun, 1 Jan 2017 18:44:37 +0000 Wols Lists <antlists@youngman.org.uk> wrote:
On 01/01/17 18:13, Dave Howorth wrote:
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations:
I see a third. The wife who abdicates responsibility for filling the car up to her husband, and then wonders why she gets stranded on a long journey because the car's run out of petrol (and she doesn't have a clue where the filler cap is).
It is only reasonable to expect users to have SOME understanding of the technology they are using, not least because ignoramuses have a habit of breaking things they don't understand.
You appear to have no understanding of how businesses work. Or indeed wives; mine makes sure I fill the car at every opportunity so she will never run out. :)
I make sure I fill up her car - if it's empty, it will always be _my_ fault, and nobody else's. Otherwise I agree with Dave, completely. I have always considered btrfs a poorly chosen default for openSUSE. -- Per Jessen, Zürich (-1.1°C) http://www.cloudsuisse.com/ - your owncloud, hosted in Switzerland.
On 01/02/2017 04:39 AM, Per Jessen wrote:
I make sure I fill up her car - if it's empty, it will always be _my_ fault, and nobody else's.
Indeed. We can generalise that. Many car owners may be smart enough to keep their tanks full and little else. They take their cars in for oil changes every 4 or 6 months because that's what they're told to do, change to winter or summer tyres because the local law requires that. But when they take their car in for an oil change or wheel change they expect the mechanics to check the 17,438 other things that mechanics do, pretty much without thinking about it, and take care of them, again pretty much without thinking about it, unless there's something that's going to involve a serious cost or need an authorization to go ahead. Even in that case they aren't going to give the real technical explanation when they explain the necessity. "It's broken. It needs fixing." In such cases, as the law has determined, it's always the mechanic's responsibility. The user is not required to know or have responsibility for "what goes on under the hood". There's a good reason for that. Once you start down that road where do you stop? The user needs to know about VLSI design and fabrication? -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 2017-01-03 15:02, Anton Aylward wrote:
On 01/02/2017 04:39 AM, Per Jessen wrote:
I make sure I fill up her car - if it's empty, it will always be _my_ fault, and nobody else's.
Indeed. We can generalise that.
Many car owners may be smart enough to keep their tanks full and little else. They take their cars in for oil changes every 4 or 6 months because that's what they're told to do, change to winter or summer tyres because the local law requires that. But when they take their car in for an oil change or wheel change they expect the mechanics to check the 17,438 other things that mechanics do, pretty much without thinking about it, and take care of them, again pretty much without thinking about it, unless there's something that's going to involve a serious cost or need an authorization to go ahead. Even in that case they aren't going to give the real technical explanation when they explain the necessity. "It's broken. It needs fixing."
In such cases, as the law has determined, it's always the mechanic's responsibility.
The user is not required to know or have responsibility for "what goes on under the hood".
Agreed. I know a bit about mechanics. But I'm unable to maintain my own car. Diagnostics depends on computers, parts are complicated to get, there are too many complex systems, many more than 25 years ago. Special tools are needed. Even if I know what is broken, I can not do the job. Even replacing a bulb on the headlights or rearlights can be close to impossible. I know, I tried recently. So no, drivers are not required to have mechanical training to handle a car. They just need to know when to take the car to the garage shop. Bad choice of comparisons. -- Cheers / Saludos, Carlos E. R. (from 42.2 x86_64 "Malachite" at Telcontar)
On 01/03/2017 10:10 AM, Carlos E. R. wrote:
Even replacing a bulb on the headlights or rearlights can be close to impossible. I know, I tried recently.
As a kid, I could maintain my father's Ford. I once even did a complete top-end overhaul. In his book "The Inmates Are Running the Asylum" Alan Cooper points out that once you put a computer into something, a car, a camera, a plane, whatever, it stops being the 'whatever' and becomes a computer. Your skills at maintaining the car, the camera, the plane are now useless. It's a computer. Even if you know how to maintain a computer you probably can't maintain THIS computer, as it's a very specific design, integrated in a very specific way and using very specific software. The "Oh this is UNIX, I know how to use UNIX" of Lex Murphy, John Hammond's granddaughter, played by Ariana Richards in the 1993 movie "Jurassic Park", only made marginal sense then; it was about moving icons around on a GUI, had little to do with UNIX, and was all pretty obvious for the benefit of moviegoers. Real life industrial control systems may be clear and obvious to people who know about that specific industrial setup, but are unlikely to be so to strangers.
So no, drivers are not required to have mechanical training to handle a car. They just need to know when to take the car to the garage shop.
Once upon a time a mechanic could fix just about any car, since the mechanic understood mechanical principles and that was all there was. It was all pretty obvious to them. Now they have to go on training courses specific to not just a manufacturer but a model or model line, and have diagnostic equipment and tools that are very specific. -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
On 03/01/17 15:49, Anton Aylward wrote:
So no, drivers are not required to have mechanical training to handle a car. They just need to know when to take the car to the garage shop.
Once upon a time a mechanic could fix just about any car since the mechanic understood mechanical principles and that was all there was. it was all pretty obvious to them. Now they have to go on training courses specific to not just a manufacturer but a model or model line and have diagnostic equipment and tools that are very specific.
But you're missing a vital point. I'll quote Carlos again ... "They just need to know when to take the car to the garage shop." My daughter wrecks cars. She has no clue, and cannot spot the signs of impending doom. It drives me nuts. What's the saying - "a stitch in time saves nine"? Even if I can't FIX the problem, if I have the *knowledge* to *recognise* that "something is wrong", whatever the definition of "wrong" may be, I can take it to the garage to get it fixed. My daughter's definition of "something is wrong" starts with "it's died and stranded me by the side of the road". I've almost never had a car die on me - the only one that has done so in the last 30 years or so was booked in for a service the following day :-( Although I have been left stranded because I called the RAC with "something isn't right here" and their response was "we're calling a tow truck, you shouldn't be driving it". Cheers, Wol
On 01/03/2017 11:22 AM, Anthony Youngman wrote:
But you're missing a vital point. I'll quote Carlos again ... "They just need to know when to take the car to the garage shop." My daughter wrecks cars. She has no clue, and cannot spot the signs of impending doom. It drives me nuts. What's the saying - "a stitch in time saves nine"?
To refer to Alan Cooper[1] once again, filtered through my own experience: before cars were computerized the "signs of impending doom" were all too apparent. Strange noises, a "wobbly feeling", smoke coming out of the exhaust, an unwarranted increase in fuel consumption. Now we drive computers and the computers are in charge of everything: the engine, the steering, the suspension, the lighting. The computers 'optimise' everything, for some value of 'optimise' that the designers thought best. They optimise out all those little warning signs. Often the mechanics, even with the best tools, don't recognise the meaning of the 'adjustments'. So you end up with things going wrong. One day my engine tried shaking itself out of the car. It happened when idling and got worse as the revs increased, even when the car was stationary. I called my mechanic. Well, it _might_ have been a plug misfiring or not firing, but this was excessive. It's possible the computer had compensated right up until the point that the plug was unusable. I can check for that, and did, and that wasn't the case. Next up was all the standard stuff: hoses, fluid, loose bolts. NIX to all that. It's a transverse-mounted engine, so it needs shock absorbers on the mounts to absorb the 'kick' as it starts and as the revs go up suddenly. He thought that was the problem. "But we replaced those 4 months ago!" It turned out that there is a sort-of flywheel on the crankshaft called a "harmonic balancer" that smooths out its running. It's a largish metal and rubber device. It had worn out, fallen apart. The computer had 'compensated' for this. As it failed it did a lot of damage to the crankshaft and hence the cylinder joints and cylinder alignment. It's something that would have been apparent early on in a non-computer car. Should it show up in the logs? Yes, and after the event it's possible to figure out what the warning signs were, but they weren't obvious a priori.
When the mechanic reported this upstream to GM the reaction was denial, denial, then "oh?". He tells me there was no bulletin issued about this and nothing in later models' releases or logs or diagnostic procedures about this. The computer compensates and hides the signs of impending doom. It drives me nuts. It drives my mechanic nuts. Perhaps it drives us nuts because we're old enough to remember how it was before computers. A new generation of mechanics has grown up, and car users too, who never knew 'how it used to be', or how to look out for the warning signs themselves. "Well it *must* be in the logs", they say, "the designers must know about all this and code it in there somewhere". The designers and coders are infallible, then. Perhaps you or I would not be able to spot the signs of impending doom in a computerized car like your daughter drives. Yes, it drives a lot of people nuts. But software is like that. [1] Alan Cooper, "The Inmates are running the asylum", SAMS 1999 ISBN 0-672-31649-8 -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 03/01/17 18:44, Anton Aylward wrote:
Its a transverse mounted engine so needs shock absorbers on the mounts to absorb the 'kick' as it starts and as the revs go up suddenly. He thought that was the problem. "but we replaced those 4 months ago!". It turned out that there is a sort-of flywheel on the crankshaft called a "harmonic balancer" that smooths out its running. Its a largish metal and rubber device. It had worn out, fallen apart. The computer had 'compensated' for this. As it failed it did a lot of damage to the crankshaft and hence the cylinder joints and cylinder alignment. Its something that would have been apparent early on in a non-computer car.
Should it show up in the logs? Yes, and after the event its possible to figure out what the warning signs were, but they weren't obvious a-priori. When the mechanic reported this upstream to GM the reaction was denial, denial, then "oh?". He tells me there was no bulletin issued about this and nothing in later model's releases or logs or diagnostic procedures about this.
:-) That car that let us down was also a GM ... When initially released, the cambelt service life was quoted at - iirc - 87,000 miles. After a bunch of failures, this was reduced to 54,000, then 27,000. I guess our car (originally a lease car) must have fallen through the cracks. The day before it was due to go in for its service - at about 50,000 miles - the cambelt fell off. Literally fell off. I was doing 70 on the motorway at the time. Fortunately, it was late at night, so I could safely coast in to the hard shoulder with a dead engine. Our mechanic rebuilt the engine, but a dealer service shop would almost certainly have written it off. We liked the car too much to do that, but ended up selling it shortly afterwards, anyway ... And I do take your point about computers, but my experience is of far too many people burying their heads in the sand at signs that things are wrong, or simply refusing to be educated to the fact that their actions are actively making things worse, not better. Cheers, Wol
On 01/03/2017 02:22 PM, Wols Lists wrote:
And I do take your point about computers, but my experience is far too much people burying their heads in the sand at signs that things are wrong, or simply refusing to be educated to the fact that their actions are actively making things worse, not better.
Indeed. There was a time when things like TVs had assemblies which were hand-soldered and used valve technology. When things went wrong, hitting the TV on the side could correct many problems, and the concept entered into our culture. However, it no longer makes sense. But people still do things like that, even now that we have all solid-state, wave-soldered, no-moving-parts electronics. There's probably not even a transformer that will delaminate and give off an annoying buzz. Now hitting the TV is more likely to *cause* damage. This is an extreme physical example, but there are probably many more. The "service engine soon" light goes out after a few miles and so can be ignored for another day ... and another day ... And anyway, it probably only means that it's time for an oil change ... isn't it? If it were anything more serious it would say so, wouldn't it? So drive on ... for another day. But why won't my computer boot? It booted OK yesterday? -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 01/03/2017 08:22 AM, Anthony Youngman wrote:
But you're missing a vital point. I'll quote Carlos again ... "They just need to know when to take the car to the garage shop." My daughter wrecks cars. She has no clue, and cannot spot the signs of impending doom. It drives me nuts. What's the saying - "a stitch in time saves nine"?
There was a time, between old-school cars and modern cars, when your daughter could be forgiven for being clueless. That time is past. Modern cars warn you right on the dash about oil changes, service needs, tire pressure, door ajar, parking brake on, and a whole host of other things lumped under "Service Engine Soon". There should be no reason for me or anyone else to check up on BTRFS. -- After all is said and done, more is said than done.
On 01/01/2017 01:13 PM, Dave Howorth wrote:
On Sat, 31 Dec 2016 20:23:48 +0100 Richard Brown <RBrownCCB@opensuse.org> wrote:
In the real world I expect users to spend some time to understand the tools they have chosen to use and to use them appropriately
I'm sorry but that is completely unrealistic, IMHO. I see two situations:
(1) it's a new install - the person is likely to want to default everything and just get a running system so they can decide quickly whether they like linux/opensuse/the latest release.
I can see a further bifurcation. Is that a home user or a commercial/industrial user? Or asked another way, is it about desktop or server? Or asked another way, is it someone with a PC/Windows background or someone with a Big Iron (even if it's AIX or HP/UX or Solaris) background? I keep saying "Context is everything", and that applies here too. Having used those Big Iron systems, things like LVM and Btrfs's snapshotting were not strange to me, just a matter of learning a new CLI. But they are the sort of thing that is totally alien to a home Windows user. So, does the installer present them and get a "DUH? what am I supposed to do with this?" reaction from the Windows user, or hide them and get a "Well this is lame and incapable!" reaction from the evaluator used to Big Iron?
[snip]
I do not expect them to continue using them the way they have been used for many years, just because they used to work in a certain way
Why on earth not? Do you live in a superinsulated passivhaus? Why not?
Explain, please, why Ford, GM, Toyota and others produce automobiles that use a steering wheel and pedals and not reins and spurs? Personally I think part of the problem is in terminology. When we stopped calling them "horseless carriages" we let go of the concept of tying carriages to horses. "Automobile" is a descriptive generic; "car" is a new term even if it is an abbreviation of 'carriage'. We're going through the same thing with 'smartphones'. All the apps on your 'phone, well, better than 90% of what's there, have nothing to do with making POTS-like voice communication. For many people it's a camera first and foremost. Many SF stories recognise the utility, and that it will evolve into portable 'smarts', possibly with a personal AI UI to the calendar and your 'life manager'. And we stop calling it a phone, but what do we call it? Even when I use my Samsung/iPhone as a voice communicator I don't use a dial-pad, and certainly not a rotary dialler; I simply say "Call Sandy". "Evolve or Die".
No. Whoever designed an fsck program that can't safely be run without precursors should be shot. Programs should always be programmed defensively and expecting people to read documentation before running them is certifiably insane. Let alone understand the documentation and act upon it.
Again, changes go through phases. When "horseless carriages" and the early automobiles were introduced they were expensive and limited to hobbyists and the Very Rich. The latter employed enthusiasts and specialists who knew how it all worked. Yes, computers followed that same path, but Bill Gates did what Hank Ford had done and made the technology into a consumer item by simplifying it down for people who weren't enthusiasts and who didn't have the time or inclination to read the manual. It took a few iterations. If a time traveller had taken a workstation from the early 2000s back to a PC user of the 1980s, it would have been like that scene in the Star Trek movie where Scotty travels back in time and you see him picking up the mouse as if it were a microphone and trying to give the computer commands by voice. To a PC/AT user a mouse, a 23" screen, 250G of hard drive, hi-res graphics, pop-up context-sensitive menus and more were innovations yet to come. Most computer users today no more read the operating and technical manuals than car drivers read the operating manuals and technical manuals of their cars. And the car designers are savvy enough to realise this and design the cars to accommodate such. Our problem isn't that the software designers are geeks. Techies are like that in all industries. It's that they are in command. Part of the success of Apple is that human-factors designers got the upper hand; they worked to design a computer that didn't need manuals - something they boasted of. That was the difference between Bill Gates and Steve Jobs. "Context is Everything". -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
Le 03/01/2017 à 14:51, Anton Aylward a écrit :
it is an abbreviation of 'carriage'.
Carriage Return? several meanings... * type Enter... * the return of the horse carriage ... jdd
On 01/03/2017 08:59 AM, jdd wrote:
Carriage Return?
Yes, it says so right there in the Carriage Rental Agreement that you signed. With the horses fed and groomed. And you're responsible for any dings and scratches to the bodywork. -- A: Yes. > Q: Are you sure? >> A: Because it reverses the logical flow of conversation. >>> Q: Why is top posting frowned upon?
On 03/01/17 13:51, Anton Aylward wrote:
Most computer users today no more read the operating and technical manuals than car drivers read the operating manuals and technical manuals of their cars. And the car designers are savvy enough to realise this and design the cars to accommodate such.
Our problem isn't that the software designers are geeks. Techies are like that in all industries. its that they are in command. Part of the success of Apple is that Human factor designers got an upper hands, they worked to design a computer that didn't need manuals - something they boasted of. That was the difference between Bill Gates and Steve Jobs.
Actually, it's NOT the geeks who are in command. It's the marketeers. And the geeks are trying to compensate for idiocies like megahurtz, buffer bloat, desktop grade hard drives, etc etc. Who remembers the megahurtz wars, when your Intel processor was hot enough to heat your study, while the AMD chip, although slower, actually got more work done? How many of you still suffer your internet connection collapsing as soon as two people try to stream a video over your connection? I know my connection regularly achieves 0Mb over an allegedly "normally achieves 17Mb" copper wire. And I know from watching the linux raid mailing list that the biggest pain in the arse there is desktop drives. Let me suggest a simple scenario, which is probably typical for your enthusiast who knows enough to be dangerous ... Let's say I set up a fancy high powered system, five drives, 4-drive raid-6 with a hot spare. How many disks does it take to fail, to knock out that array, if they're all desktop grade? ONE! Cheers, Wol
On 01/03/2017 08:51 AM, Anton Aylward wrote:
but Bill gates did what Hank Ford had done and made the technology into a consumer item by simplifying it down for people who weren't enthusiasts and who didn't have the time or inclination to read the manual.
Actually, that was Steve Jobs & Steve Wozniak. What Bill Gates did was sell IBM DOS, before he bought it from Seattle Computer Products, which in turn was based on and contained code from CP/M. It was IBM making it an almost completely open system that generated the clone market and mass acceptance. There were also clones of the Apple computers.
On 04/01/17 03:25, James Knott wrote:
On 01/03/2017 08:51 AM, Anton Aylward wrote:
but Bill gates did what Hank Ford had done and made the technology into a consumer item by simplifying it down for people who weren't enthusiasts and who didn't have the time or inclination to read the manual.
Actually, that was Steve Jobs & Steve Wozniak. What Bill Gates did was sell IBM DOS, before he bought it from Seattle Computer Products, which in turn was based on and contained code from CP/M. It was IBM making it and almost completely open system that generated the clone market and mass acceptance. There were also clones of the Apple computers.
It wouldn't surprise me if it was neither. Still, we do like re-writing history in our favour, and as the guy who seized control of the computer I'm not surprised that it was re-written to favour Bill Gates. Bit like my pet moan - Everyone knows Edison invented the light bulb. His patent that claims that is dated TWO YEARS AFTER he visited a factory making light bulbs in Birmingham (the original, not the one in Alabama). Cheers, Wol
On 01/04/2017 01:57 PM, Wols Lists wrote:
Actually, that was Steve Jobs & Steve Wozniak. What Bill Gates did was sell IBM DOS, before he bought it from Seattle Computer Products, which in turn was based on and contained code from CP/M. It was IBM making it and almost completely open system that generated the clone market and mass acceptance. There were also clones of the Apple computers. It wouldn't surprise me if it was neither. Still, we do like re-writing history in our favour, and as the guy who seized control of the computer I'm not surprised that it was re-written to favour Bill Gates.
Well, I recall those days well. I bought my first computer, an IMSAI 8080 over 30 years ago. I also have every print edition of Byte magazine on the shelves here. In fact, I bought the first 3 issues in person from the original publisher of Byte, Wayne Greene, at the 1975 Radio Society of Ontario convention in Ottawa, Ont. So, I've been involved long enough to remember those days. BTW, in one of those Byte issues, there's an article by Gary Paterson, of Seattle Computer, who developed what he called Q-DOS, and intended it to be just a development system, while waiting for CP/M-86 to be released. This is what Bill Gates sold to IBM, before he actually owned it.
Quoting James Knott <james.knott@rogers.com>:
On 01/04/2017 01:57 PM, Wols Lists wrote:
Actually, that was Steve Jobs & Steve Wozniak. What Bill Gates did was sell IBM DOS, before he bought it from Seattle Computer Products, which in turn was based on and contained code from CP/M. It was IBM making it and almost completely open system that generated the clone market and mass acceptance. There were also clones of the Apple computers. It wouldn't surprise me if it was neither. Still, we do like re-writing history in our favour, and as the guy who seized control of the computer I'm not surprised that it was re-written to favour Bill Gates.
Well, I recall those days well. I bought my first computer, an IMSAI 8080 over 30 years ago. I also have every print edition of Byte magazine on the shelves here. In fact, I bought the first 3 issues in person from the original publisher of Byte, Wayne Greene, at the 1975 Radio Society of Ontario convention in Ottawa, Ont. So, I've been involved long enough to remember those days.
BTW, in one of those Byte issues, there's an article by Gary Paterson, of Seattle Computer, who developed what he called Q-DOS, and intended it to be just a development system, while waiting for CP/M-86 to be released. This is what Bill Gates sold to IBM, before he actually owned it.
My understanding is that Q-DOS stood for Quick and Dirty Operating System. It was a 16-bit CP/M lookalike. Ugly. It was something so he could sell hardware.
On 01/04/2017 10:32 PM, Jeffrey L. Taylor wrote:
BTW, in one of those Byte issues, there's an article by Gary Paterson, of Seattle Computer, who developed what he called Q-DOS, and intended it to be just a development system, while waiting for CP/M-86 to be released. This is what Bill Gates sold to IBM, before he actually owned it.
My understanding is the Q-DOS stood for Quick and Dirty Operating System. It was a 16 bit CP/M look similar. Ugly. It was an something so he could sell hardware.
As I mentioned, it was originally intended to give SCP something to test with, while waiting for CP/M-86. It looked like CP/M for that reason. In fact, in a lawsuit against Microsoft, Gary Kildall proved in court that MS-DOS contained CP/M code. Incidentally, Q-DOS originally stood for "Quick and Dirty Operating System".
On 01/04/2017 06:05 PM, James Knott wrote:
BTW, in one of those Byte issues, there's an article by Gary Paterson, of Seattle Computer, who developed what he called Q-DOS, and intended it to be just a development system, while waiting for CP/M-86 to be released. This is what Bill Gates sold to IBM, before he actually owned it.
I also recall that Q-DOS stands for Quick & Dirty Operating System. Wasn't there something about the IBM people going to Gary Kildall (CP/M) first, but Gary kept them cooling their heels in the waiting room? They then went to see Bill Gates instead. Wow on the Byte collection! I remember the issue introducing UNIX with a drawing of a tool box on the cover, the drawers being all the ancillary programs (cat, grep, etc). I still think of that cover when talk here wanders into the realm of large monolithic programs that attempt to do everything. Where would we be now if Kildall had actually welcomed the IBM guys? BTW, Jerry Pournelle is still alive. He was a regular on Leo Laporte's TWIT podcast, but I haven't heard him for a couple of years. Regards, Lew
On 01/04/2017 10:57 PM, Lew Wolfgang wrote:
I also recall that Q-DOS stands for Quick & Dirty Operating System. Wasn't there something about the IBM people going to Gary Kindall (CP/M) first, but Gary kept them cooling their heels in the waiting room? They then went to see Bill Gates instead.
Actually, if you get into the history, you'll find that is incorrect. Gary Kildall's wife was reluctant to accept some of IBM's terms.
On 01/05/2017 04:41 AM, James Knott wrote:
On 01/04/2017 10:57 PM, Lew Wolfgang wrote:
I also recall that Q-DOS stands for Quick & Dirty Operating System. Wasn't there something about the IBM people going to Gary Kindall (CP/M) first, but Gary kept them cooling their heels in the waiting room? They then went to see Bill Gates instead. Actually, if you get into the history, you'll find that is incorrect. Gary Kildall's wife was reluctant to accept some of IBM's terms.
That's interesting. I wonder what terms she objected to? And was Gates offered the same terms? Did he accept? Regards, Lew
On 01/04/2017 09:05 PM, James Knott wrote:
BTW, in one of those Byte issues, there's an article by Gary Paterson, of Seattle Computer, who developed what he called Q-DOS
Sorry, that should have been Tim Paterson.
participants (18)
- Andrei Borzenkov
- Anthony Youngman
- Anton Aylward
- Carlos E. R.
- Dave Howorth
- David C. Rankin
- James Knott
- jdd
- Jeffrey L. Taylor
- John Andersen
- Lew Wolfgang
- Michael Hamilton
- nicholas
- Paul Neuwirth
- Per Jessen
- Richard Brown
- sdm
- Wols Lists