Comment # 5 on bug 1186726 from Anthony Iliopoulos

Thanks for the report, Martin. Looks like you're on some rather old xfs
filesystems (v4), and although there's nothing wrong with that (they should be
relatively robust), the newer format (v5) has been around for quite a while and
features many more metadata verifiers that catch a much wider range of
corruption. So in general it is highly recommended to move data to xfs
filesystems formatted with a recent mkfs.xfs, which enables the crc (v5)
format by default. Also, v4 is scheduled for deprecation (in Sep 2025), which
could impact you if you're running on TW.
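
If you want a quick way to tell which format a filesystem uses, the crc field
in the xfs_info output is the thing to look at (crc=1 means v5, crc=0 means
v4). A minimal sketch follows; the helper name and the sample text are made up
for illustration, and treating a missing crc field as v4 is an assumption
about very old xfsprogs output.

```python
import re


def xfs_format_version(xfs_info_output: str) -> str:
    """Classify xfs_info text as 'v5' (crc=1) or 'v4' (crc=0 or no crc field)."""
    match = re.search(r"\bcrc=(\d)", xfs_info_output)
    if match is None:
        # Assumption: output without a crc field comes from a v4 filesystem.
        return "v4"
    return "v5" if match.group(1) == "1" else "v4"


# Illustrative sample only, not taken from the reporter's system.
SAMPLE = """meta-data=/dev/foo  isize=256    agcount=4, agsize=655360 blks
         =          sectsz=512   attr=2, projid32bit=1
         =          crc=0        finobt=0, sparse=0, rmapbt=0
"""

if __name__ == "__main__":
    print(xfs_format_version(SAMPLE))
```

Feeding it the output of "xfs_info /mountpoint" should be enough to tell
whether a given filesystem is worth migrating.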

Now regarding this bug, could you attach a metadump of the filesystem? Just
"xfs_metadump /dev/foo /tmp/xfs.md" will do the trick. With that I can check
the actual fs parameters (e.g. the xfs_info output) and also verify whether
there are any more corruptions.
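
In case it helps to script the capture, here is a small sketch that only
builds the command line. The function name and its keyword arguments are
hypothetical; -g (show progress) and -w (print warnings) are options of
xfs_metadump. The filesystem should be unmounted, mounted read-only, or
frozen while the dump is taken. By default the tool obfuscates most file
names, so the dump should be safe to attach.

```python
import shlex


def metadump_argv(device, target, show_progress=False, warnings=False):
    """Build the argument list for xfs_metadump (hypothetical helper)."""
    argv = ["xfs_metadump"]
    if show_progress:
        argv.append("-g")  # report progress while dumping
    if warnings:
        argv.append("-w")  # print warnings about inconsistent metadata
    argv += [device, target]
    return argv


if __name__ == "__main__":
    # /dev/foo and /tmp/xfs.md are the placeholders used above.
    print(shlex.join(metadump_argv("/dev/foo", "/tmp/xfs.md", show_progress=True)))
```

The resulting list can be handed to subprocess.run() as root once the
filesystem is no longer mounted read-write.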

It's hard to say what exactly happened, but given the persistent
"xlog_space_left: head behind tail" messages and the kernel oopses that
happened just before them, I'd hazard a guess that some of those oopses (which
are unrelated to xfs) corrupted the in-memory xfs log, which was then persisted
to disk and replayed during the next mount (the "Starting recovery" messages).
In principle, as long as recovery completes without issues (mount succeeds),
there should be no dormant corruptions. In this particular case, I think that
because of the old v4 format, log replay during recovery doesn't feature as
many robustness checks as with the later format, and thus probably allowed
inconsistent metadata (due to the previous in-memory corruption) to be
replayed back to the fs.
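
To give an idea of what the "head behind tail" message means: the log is a
circular buffer, and both head and tail are tracked as a (cycle, offset) pair.
The sketch below is a simplified model of that free-space calculation, not the
actual kernel code; the function signature and the numbers in the example are
invented for illustration. The warning corresponds to the last branch, which
should be unreachable when the in-memory log state is sane.

```python
def xlog_space_left(log_size, head_cycle, head_bytes, tail_cycle, tail_bytes):
    """Return (free_bytes, warning) for a circular log; simplified model."""
    if tail_cycle == head_cycle and head_bytes >= tail_bytes:
        # Head and tail in the same cycle: used space lies between them.
        return log_size - (head_bytes - tail_bytes), None
    if tail_cycle + 1 < head_cycle:
        # Head is more than one full cycle ahead: the log is full.
        return 0, None
    if tail_cycle < head_cycle:
        # Head has wrapped once: free space is the gap up to the tail.
        return tail_bytes - head_bytes, None
    # Head appears to be behind the tail, which a consistent log never shows.
    return log_size, "xlog_space_left: head behind tail"


if __name__ == "__main__":
    print(xlog_space_left(1000, head_cycle=1, head_bytes=300,
                          tail_cycle=1, tail_bytes=100))
    print(xlog_space_left(1000, head_cycle=1, head_bytes=100,
                          tail_cycle=1, tail_bytes=400))
```

The second call models a corrupted state: head and tail are in the same cycle
but the head offset is lower than the tail offset, so the model falls through
to the warning branch.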

All the other errors that you see coming from xfs are just xfs encountering
that inconsistent metadata at some later point and protecting the fs from
further corruption. So they are definitely related, and the reason they occur
later is simply that this is when your workload happened to touch those
inconsistent bits of metadata (e.g. the particular extent allocation btree
block and the particular inode allocation block that were apparently
inconsistent).

All of those issues should indeed be fixed by xfs_repair, so you shouldn't
expect to see any more of those messages unless there are new corruptions being
introduced.

What	Removed	Added
CC		ailiopoulos@suse.com
Assignee	kernel-bugs@opensuse.org	ailiopoulos@suse.com