[Bug 1008107] Potential XFS Kernel bug - _xfs_buf_find: Block out of range
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Sun, 29 Jan 2017 19:14:06 +0000
  • Message-id: <bug-1008107-21960-Lfak3oDawz@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1008107
http://bugzilla.suse.com/show_bug.cgi?id=1008107#c7

--- Comment #7 from David Taylor <david@xxxxxxxxxxxxxxxxxxxxx> ---
(In reply to Luis Rodriguez from comment #5)


> How big of a partition are we talking about here? If it's large, it's
> harder to diagnose and find the culprit.

It's 8G, so not that large a partition. I let the xfs_repair run sit for
several hours and figured it wasn't going anywhere at that point.

Filesystem Size Used Avail Use% Mounted on
/dev/mapper/rootvg-lvvar 8.0G 1.6G 6.5G 20% /var

Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/rootvg-lvvar 4194304 3511 4190793 1% /var


>> Booting from the Leap USB stick, I was able to get a little further,
>> but was unsuccessful in getting /var back. I had tried mounting
>> read-only with norecovery, but it still refused. Zeroing the log with
>> the -L option to xfs_repair was the only way to get past the problem.
>> (I have backups of the logs from the night before, so I just lost a
>> little syslog data from some other systems; not a big issue here.)

> Once you blow away a corrupt fs, I'm afraid we cannot inspect a
> possible root cause anymore. After a corruption happens, the thing to
> do is to get an xfs metadump of the partition; then, hopefully, with
> some analysis it *may* be possible to iron out the root cause. The
> bigger the xfs metadump, though, the harder it will be to figure out
> the issue, hence my concern about the size of the partition.

I tried to do a metadata dump before running the -L option, but that
hung for several hours too, so I figured it wasn't going to complete.
The target metadump file stayed at zero bytes.
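
For anyone retracing this, the sequence from the rescue environment was
roughly as follows (the metadump target path here is invented):

mount -o ro,norecovery /dev/mapper/rootvg-lvvar /mnt
xfs_metadump -g /dev/mapper/rootvg-lvvar /tmp/var.metadump
xfs_repair -L /dev/mapper/rootvg-lvvar

The mount still refused, and the metadump hung with a zero-byte output
file; only the log zeroing got the filesystem mountable again.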


>> Once I had /var mounted, I was able to look at the messages in the
>> log, which referred to _xfs_buf_find: Block out of range errors.

> It seems these errors happen once a filesystem is already corrupted:
> it means we are looking up data which we know cannot possibly be
> there. One possible reason for these invalid lookups is that a
> corruption had already happened on the filesystem before this point.
> The error, then, is likely an after-effect of looking for files on an
> already faulty fs.

>> These occurred when logrotate was trying to swap logs around. It was
>> still writing logs to my main system messages file (I use syslog-ng
>> rather than systemd journal logging)

> Ah, good to know! Sounds like you could likely try to write a mini
> script or program to try to reproduce this! Based on your description,
> the way I'd write such a script is as follows:
>
> Write a multithreaded program which issues tons of writes, with /
> without sync, onto a log file, while another thread moves the main
> "log" file. You must synchronize the threads handling writes to the
> log file with the thread which moves the log file around.
>
> Come to think of it, an alternative is to just use a daemon which
> issues tons of writes to the system log file and configure your log
> manager to rotate logs very, very frequently.
>
> If you can reproduce this with a super small partition you are my hero :D

I'll see what I can do; unfortunately, I have a daughter with medical
issues that is currently consuming much of what little free time I had.
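
When I do get a chance, the quick shell variant of that idea would look
roughly like this (untested; the mount point, writer count, and rotation
interval are all arbitrary):

#!/bin/sh
# Hammer a small throwaway XFS filesystem with concurrent log appends
# while "rotating" the file out from under the writers, roughly what
# logrotate plus syslog-ng were doing on /var. Runs until interrupted.
LOG=/mnt/xfstest/messages

# Several writers appending as fast as they can. Each echo reopens the
# file, so a fresh one is created after every rotation.
for i in 1 2 3 4; do
    while :; do
        echo "writer $i $(date +%s.%N) xxxxxxxxxxxxxxxxxxxxxxxx" >> "$LOG"
    done &
done

# The "rotator": move the live file aside very frequently, syncing so
# the metadata churn actually reaches the disk.
while :; do
    mv -f "$LOG" "$LOG.1" 2>/dev/null
    sync
    sleep 0.1
done

One thing this doesn't model is that syslog-ng keeps its file descriptor
open across the rename until it is told to reopen, so a small C program
that holds the fd open would be closer to the real sequence.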


>> up until the point I power cycled the system, so I guess it had
>> sufficient allocation on that file without requesting more. As squid
>> had stopped working, along with logins, etc., I expect the /var file
>> system failure was preventing the opening and/or writing of other
>> logs (thus appearing locked up).

> Very likely, yes: a write could be stalling the system, and such
> writes prevent daemons from doing anything useful. Which should also
> mean log managers should probably seriously consider something like a
> log_optional() which would log if possible, but would not block
> operations if it can't. This should allow logins to complete, provided
> the other system files needed for the service are functional,
> especially for sshd. I'd consider filing a FATE request for this to be
> evaluated.

>> Best I can tell, based on the SMART results (I have smartd running)
>> and the lack of any other kernel warnings of disk failures, this does
>> not appear to have been a disk failure.

> Show the results, please. No new errors at all??

Nothing that I could see. The error log is empty.

shadows:/home/dtaylor # smartctl -a /dev/sda
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.1.36-44-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: ST3320413AS
Serial Number: W2A7A966
LU WWN Device Id: 5 000c50 045752f4a
Firmware Version: JC66
User Capacity: 320,072,933,376 bytes [320 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jan 29 13:54:33 2017 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED


shadows:/home/dtaylor # smartctl -l error /dev/sda
smartctl 6.2 2013-11-07 r3856 [x86_64-linux-4.1.36-44-default] (SUSE RPM)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Error Log Version: 1
No Errors Logged
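
If it would help, I can also kick off a full offline surface scan and
report back; something like:

shadows:/home/dtaylor # smartctl -t long /dev/sda

and then check the result later with:

shadows:/home/dtaylor # smartctl -l selftest /dev/sda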


>> There were a total of six dumps within several seconds, all
>> referencing the same PID (logrotate). Here's the first one. I'll
>> attach an edited copy of the log that contains more details, plus
>> reboot information, for those who need/want to know. I did try to
>> clean unnecessary noise out of the attached log, as well as removing
>> identifying bits that I didn't think needed to be publicly posted.

>> Oct 29 01:00:05 shadows kernel: [665260.471535] XFS (dm-9): _xfs_buf_find: Block out of range: block 0x7fffffff8, EOFS 0x1000000
>> Oct 29 01:00:05 shadows kernel: [665260.471581] ------------[ cut here ]------------
>> Oct 29 01:00:05 shadows kernel: [665260.471626] WARNING: CPU: 3 PID: 4863 at ../fs/xfs/xfs_buf.c:473 _xfs_buf_find+0x2a1/0x2f0 [xfs]()

> Yeah, I'm afraid at this point the fs was already very likely
> corrupted. We want to inspect an xfs metadump of a fs which is corrupt
> in this state (no repair done). It's preferable that we actually
> reproduce the issue, though.
>
> The next step, provided you can figure out how to reproduce it, is to
> test against the kernel of the day. If it cannot be reproduced there,
> then clearly we have a fix we can and should cherry-pick onto Leap. If
> you can reproduce it on the KOTD, we'd then move on to linux-next. If
> you can reproduce it on linux-next -- well, my friend, we have a gem
> to explore and try to fix upstream.
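
(For reference, assuming the kernel of the day still comes out of the
Kernel:HEAD project on OBS, pulling it in should be something like this,
where the repository alias is my own invention:

zypper ar -f http://download.opensuse.org/repositories/Kernel:/HEAD/standard/ kernel-head
zypper in --from kernel-head kernel-default)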

>> Several items about the attached log, so as to hopefully reduce any
>> confusion: I had attached a Samsung 850 SSD to the machine to serve
>> as a storage device for recovering bits of data between the nightly
>> backup and when it crashed.

> Let's be clear: a crash did not happen. However, an fs corruption
> probably did, and you had no access to the system after that.

True, it did not kernel panic and die. I apologize for not being precise
in this forum, but from an operations perspective I tend to use that
term whenever a system becomes unresponsive and/or unable to do any
useful work.


> Do you mean it was cp -a'ing /var/log over to the SSD for backups
> when the system became unavailable?

The SSD is not normally attached (and the same goes for the USB stick).

> The USB stick was attached when the corruption probably happened?

No, neither the SSD nor the stick was attached at the time the system
became unresponsive; those came afterwards. I was gathering information
from the system and backing up bits of data after I got past the
inability to mount /var. I figured that any output or logging I produced
would potentially show the stick and/or SSD, so I was pointing out that
they were only on the system for recovery purposes. They were not there
when the problem cropped up, so they could not have contributed to the
issue.

--
You are receiving this mail because:
You are on the CC list for the bug.