[Bug 1172566] New: problems with LVM cache
http://bugzilla.suse.com/show_bug.cgi?id=1172566

            Bug ID: 1172566
           Summary: problems with LVM cache
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
          Assignee: screening-team-bugs@suse.de
          Reporter: aschnell@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

I was experimenting with LVM cache and somehow managed to trigger a bug. Here
is the command that failed:

# lvconvert --type cache --cachevol giant-cache --cachemode writeback --chunksize 192 system/giant
Erase all existing data on system/giant-cache? [y/n]: y
  device-mapper: resume ioctl on (254:3) failed: No space left on device
  Unable to resume system-giant (254:3).
  Problem reactivating logical volume system/giant.
  Releasing activation in critical section.
  libdevmapper exiting with 1 device(s) still suspended

In dmesg this triggered:

[1387212.915349] device-mapper: space map metadata: unable to allocate new metadata block
[1387212.915352] device-mapper: cache: 254:3: could not resize cache metadata
[1387212.915354] device-mapper: cache: 254:3: metadata operation 'dm_cache_resize' failed: error = -28
[1387212.915356] device-mapper: cache: 254:3: aborting current metadata transaction
[1387212.940222] device-mapper: cache: 254:3: switching cache to read-only mode
[1387212.940227] device-mapper: table: 254:3: cache: preresume failed, error = -28
[1387243.006881] device-mapper: cache: 254:3: unable to switch cache to write mode until repaired.
[1387243.006883] device-mapper: cache: 254:3: switching cache to read-only mode
[1387243.021274] device-mapper: table: 254:3: cache: Unable to get write access to metadata, please check/repair metadata.
[1387243.021277] device-mapper: ioctl: error adding target to table

Now lvs reports:

# lvs -a
  LV            VG     Attr       LSize   Pool          Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  abuild        system -wi-ao----  50.00g
  arvin         system -wi-ao---- 100.00g
  giant         system Cwi-CoC-M- 512.00g [giant-cache] [giant_corig] 0.00   0.68            0.00
  [giant-cache] system Cwi-aoC--- 128.00g
  [giant_corig] system owi-aoC--- 512.00g
  root          system -wi-ao----  50.00g
  swap          system -wi-ao----   2.00g

As can be seen, giant needs a metadata check (even though its 'C' attribute
marks a cache LV, not a thin pool). Unmounting the filesystem does not work
(it hangs).

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.suse.com/show_bug.cgi?id=1172566

Arvin Schnell <aschnell@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                CC|                            |aschnell@suse.com
          Found By|---                         |Development
http://bugzilla.suse.com/show_bug.cgi?id=1172566

Alynx Zhou <alynx.zhou@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                CC|                            |alynx.zhou@suse.com
          Assignee|screening-team-bugs@suse.de |deanraccoon@gmail.com
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c1

--- Comment #1 from Arvin Schnell <aschnell@suse.com> ---
Created attachment 838609
  --> http://bugzilla.suse.com/attachment.cgi?id=838609&action=edit
steps to reproduce problem
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c2

Arvin Schnell <aschnell@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
          Severity|Normal  |Critical

--- Comment #2 from Arvin Schnell <aschnell@suse.com> ---
Setting to critical due to data loss (I did not manage to access the logical
volume again).
http://bugzilla.suse.com/show_bug.cgi?id=1172566

Arvin Schnell <aschnell@suse.com> changed:

           What    |Removed                 |Added
----------------------------------------------------------------------------
           Summary|problems with LVM cache |problem with LVM cache
                   |                        |(data loss)
http://bugzilla.suse.com/show_bug.cgi?id=1172566

Arvin Schnell <aschnell@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
            Blocks|        |1172696
http://bugzilla.suse.com/show_bug.cgi?id=1172566

Arvin Schnell <aschnell@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
          See Also|        |http://bugzilla.suse.com/show_bug.cgi?id=1172696
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c4

heming zhao <heming.zhao@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                CC|        |heming.zhao@suse.com

--- Comment #4 from heming zhao <heming.zhao@suse.com> ---
The error in comment #0 is:

  device-mapper: resume ioctl on (254:3) failed: No space left on device

Could you make sure there is enough space for (254:3)?

I pasted my steps in bsc#1172696; I cannot reproduce this issue.
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c5

--- Comment #5 from Arvin Schnell <aschnell@suse.com> ---
Yes, there is enough space.

# lvs -a -o lv_name,vg_name,lv_attr,lv_size
  LV            VG   Attr       LSize
  giant         test Cwi-C-C-M- 512.00g
  [giant-cache] test Cwi-aoC--- 128.00g
  [giant_corig] test owi-aoC--- 512.00g

# pvs
  PV       VG   Fmt  Attr PSize    PFree
  /dev/sdb test lvm2 a--   512.00g 384.00g
  /dev/sdc test lvm2 a--  1024.00g 512.00g

Please try to reproduce the problem with the exact sizes I have used.
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c6

--- Comment #6 from heming zhao <heming.zhao@suse.com> ---
In my KVM/QEMU guest, with two 1000 GB SCSI disks, I still cannot reproduce
it.

```
tb-base40g:~ # pvcreate /dev/sdb
  Physical volume "/dev/sdb" successfully created.
tb-base40g:~ # pvcreate /dev/sda
  Physical volume "/dev/sda" successfully created.
tb-base40g:~ # vgcreate test /dev/sdb /dev/sda
  Volume group "test" successfully created
tb-base40g:~ # lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0 1000G  0 disk
sdb      8:16   0 1000G  0 disk
sr0     11:0    1  4.3G  0 rom
vda    253:0    0   40G  0 disk
├─vda1 253:1    0    8M  0 part
├─vda2 253:2    0   38G  0 part /
└─vda3 253:3    0    2G  0 part [SWAP]
vdb    253:16   0    1G  0 disk
└─vdb1 253:17   0 1023M  0 part
vdc    253:32   0    1G  0 disk
└─vdc1 253:33   0 1023M  0 part
tb-base40g:~ # lvcreate --size 512g --name giant test /dev/sda
  Logical volume "giant" created.
tb-base40g:~ # lvcreate --size 96g --name giant-cache test /dev/sdb
  Logical volume "giant-cache" created.
tb-base40g:~ # lvconvert --type cache --cachevol giant-cache --cachemode writeback --chunksize 128 test/giant
Erase all existing data on test/giant-cache? [y/n]: y
  WARNING: repairing a damaged cachevol is not yet possible.
  WARNING: cache mode writethrough is suggested for safe operation.
Continue using writeback without repair? y
  Logical volume test/giant is now cached.
tb-base40g:~ # lvconvert --splitcache test/giant
  Flushing 0 blocks for cache test/giant.
  Logical volume test/giant is not cached and test/giant-cache is unused.
tb-base40g:~ # lvresize --size 128g test/giant-cache
  Size of logical volume test/giant-cache changed from 96.00 GiB (24576 extents) to 128.00 GiB (32768 extents).
  Logical volume test/giant-cache successfully resized.
tb-base40g:~ # lvconvert --type cache --cachevol giant-cache --cachemode writeback --chunksize 256 test/giant
Erase all existing data on test/giant-cache? [y/n]: y
  WARNING: repairing a damaged cachevol is not yet possible.
  WARNING: cache mode writethrough is suggested for safe operation.
Continue using writeback without repair? y
  Logical volume test/giant is now cached.
tb-base40g:~ # lvs -a
  LV                 VG   Attr       LSize   Pool               Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  giant              test Cwi-a-C--- 512.00g [giant-cache_cvol] [giant_corig] 0.01   14.86           0.00
  [giant-cache_cvol] test Cwi-aoC--- 128.00g
  [giant_corig]      test owi-aoC--- 512.00g
tb-base40g:~ # rpm -qa | grep lvm2
lvm2-2.03.05-8.1.x86_64
lvm2-testsuite-2.03.05-7.2.x86_64
liblvm2cmd2_03-2.03.05-8.1.x86_64
```
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c7

--- Comment #7 from Arvin Schnell <aschnell@suse.com> ---
I have different versions:

# rpm -qa | grep lvm2
liblvm2cmd2_03-2.03.05-11.1.x86_64
lvm2-2.03.05-11.1.x86_64
# cat /etc/os-release
NAME="openSUSE Tumbleweed"
# VERSION="20200605"
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c8

--- Comment #8 from heming zhao <heming.zhao@suse.com> ---
Very interesting: I reproduced this issue with 2.03.05-11.1. However, looking
at the lvm2 rpm change log and comparing with 2.03.05-8.1, I cannot find any
patch that could introduce this issue.
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c9

--- Comment #9 from heming zhao <heming.zhao@suse.com> ---
This bug has been root-caused. In lib/metadata/cache_manip.c:

```
int cache_vol_set_params()
{
    ...
    if (!meta_size) {
        /*
         * heming: a 128 GB cachevol matches none of the three
         * if/else-if branches below, so meta_size stays zero and
         * later ends up at the default value of 4 MB. That makes
         * device-mapper report "No space left on device".
         */
        if (pool_lv->size < (128 * ONE_MB_S))
            meta_size = 16 * ONE_MB_S;
        else if (pool_lv->size < ONE_GB_S)
            meta_size = 32 * ONE_MB_S;
        else if (pool_lv->size < (128 * ONE_GB_S))
            meta_size = 64 * ONE_MB_S;

        if (meta_size > (pool_lv->size / 2))
            meta_size = pool_lv->size / 2;

        if (meta_size < min_meta_size)
            meta_size = min_meta_size;

        if (meta_size % extent_size)
            meta_size += extent_size - meta_size % extent_size;
    }
    ...
}
```

Upstream has fixed this issue with commit
c08704cee7e34a96fdaa453faf900683283e8691.

I will provide a test rpm package ASAP.
http://bugzilla.suse.com/show_bug.cgi?id=1172566
http://bugzilla.suse.com/show_bug.cgi?id=1172566#c10

heming zhao <heming.zhao@suse.com> changed:

           What    |Removed              |Added
----------------------------------------------------------------------------
          Assignee|deanraccoon@gmail.com|heming.zhao@suse.com

--- Comment #10 from heming zhao <heming.zhao@suse.com> ---
Assigning to me.

I have created private rpm packages in my project:
https://build.opensuse.org/package/show/home:hmzhao:branches:openSUSE:Factor...

Please download and try; I will wait for your feedback. Thanks.
https://bugzilla.suse.com/show_bug.cgi?id=1172566
https://bugzilla.suse.com/show_bug.cgi?id=1172566#c11

--- Comment #11 from heming zhao <heming.zhao@suse.com> ---
Item values for calculating min_metadata_size:

extent_size: 0x2000 (4 MB at 512 B sector size; can be obtained with vgdisplay)
poolmetadatasize: 0x0 (when the --poolmetadatasize parameter is not given)

min_meta_size is calculated by _cache_min_metadata_size(). The formula is:

chunk_overhead = DM_BYTES_PER_BLOCK + DM_MAX_HINT_WIDTH + DM_HINT_OVERHEAD_PER_BLOCK
              => 16 + (4 + 16) + 8 => 44 => 0x2c (fixed value)

transaction_overhead = 4 MB (0x400000, fixed value)

nr_chunks = pool_lv->size / chunk_size (variable value)
         => 128 GB / 128 K => 0x100000
            (or 96 GB / 128 K => 0xc0000, 96 GB / 256 K => 0x60000,
             128 GB / 256 K => 0x80000)

So from the above: the bigger the pool_lv size, the bigger nr_chunks; the
bigger the chunk_size, the smaller nr_chunks.

min_meta_size = (transaction_overhead + nr_chunks * chunk_overhead +
                 (SECTOR_SIZE - 1)) >> SECTOR_SHIFT;

For 96 GB with 128 K chunks:
  (0x400000 + 0xc0000 * 0x2c + (0x200 - 1)) >> 9 => 0x12800
  => 0x12800 * 0x200 (sector size) => 37 MB

If the chunk size changes to 256 K, with all other values unchanged,
min_meta_size becomes:
  (0x400000 + 0x60000 * 0x2c + (0x200 - 1)) >> 9 => 0xa400 => 20.5 MB

When the pool LV grows from 96 GB to 128 GB and the chunk size from 128 K to
256 K, with everything else unchanged, min_meta_size becomes:
  (0x400000 + 0x80000 * 0x2c + (0x200 - 1)) >> 9 => 0xd000 => 26 MB

If the pool LV is 128 GB with 512 K chunks, everything else unchanged,
min_meta_size becomes:
  (0x400000 + 0x40000 * 0x2c + (0x200 - 1)) >> 9 => 0x7800 => 15 MB
----------

For easy calculation, use the following formula:

min_metadata_size = {(0x400000 + (pool_lv->size / chunk_size) * 0x2c +
                     (sector_size - 1)) >> (sector size shift bits)} + extent_size

Sector size shift bits:
- 512 bytes: 9
- 1024: 10
- 2048: 11
- 4096: 12

pool_lv->size:
- the cache LV size; in this bug it is the test/giant-cache size
  (96 GB or 128 GB)

chunk_size:
- default 4 MB
- can be set with the "--chunksize XX" parameter
- in this bug it is 256 or 512
  (please note: 128 K -> 256 sectors, at 512 B sector size)

Finally, pass the calculated min_metadata_size to lvconvert:

lvconvert --type cache --cachevol giant-cache --cachemode writeback \
    --chunksize 256 test/giant --poolmetadatasize 30m
https://bugzilla.suse.com/show_bug.cgi?id=1172566

Marcus Meissner <meissner@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                CC|        |meissner@suse.com
https://bugzilla.suse.com/show_bug.cgi?id=1172566
https://bugzilla.suse.com/show_bug.cgi?id=1172566#c14

--- Comment #14 from Swamp Workflow Management <swamp@suse.de> ---
SUSE-RU-2020:1795-1: An update that has one recommended fix can now be
installed.

Category: recommended (important)
Bug References: 1172566
CVE References:
Sources used:
SUSE Linux Enterprise Module for Basesystem 15-SP2 (src):
    lvm2-2.03.05-8.3.1, lvm2-device-mapper-2.03.05-8.3.1
SUSE Linux Enterprise High Availability 15-SP2 (src):
    lvm2-lvmlockd-2.03.05-8.3.1

NOTE: This line indicates an update has been released for the listed
product(s). At times this might be only a partial fix. If you have questions
please reach out to maintenance coordination.
https://bugzilla.suse.com/show_bug.cgi?id=1172566
https://bugzilla.suse.com/show_bug.cgi?id=1172566#c15

heming zhao <heming.zhao@suse.com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
            Status|NEW     |RESOLVED
        Resolution|---     |FIXED

--- Comment #15 from heming zhao <heming.zhao@suse.com> ---
The fixed code is merged; closing this bug.
https://bugzilla.suse.com/show_bug.cgi?id=1172566
https://bugzilla.suse.com/show_bug.cgi?id=1172566#c16

--- Comment #16 from Swamp Workflow Management <swamp@suse.de> ---
openSUSE-RU-2020:0921-1: An update that has one recommended fix can now be
installed.

Category: recommended (important)
Bug References: 1172566
CVE References:
Sources used:
openSUSE Leap 15.2 (src):
    lvm2-2.03.05-lp152.7.3.1, lvm2-device-mapper-2.03.05-lp152.7.3.1,
    lvm2-lvmlockd-2.03.05-lp152.7.3.1