Mailinglist Archive: opensuse-bugs (4258 mails)

[Bug 1008107] Potential XFS Kernel bug - _xfs_buf_find: Block out of range
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Sun, 29 Jan 2017 17:59:25 +0000
  • Message-id: <bug-1008107-21960-pEhmNyVgDc@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1008107
http://bugzilla.suse.com/show_bug.cgi?id=1008107#c5

--- Comment #5 from Luis Rodriguez <lurodriguez@xxxxxxxx> ---
(In reply to David Taylor from comment #0)
Had complaints that the squid proxy was not working for several machines on
the network, so I investigated the Leap 42.1 box I have handling proxy
services. It would ping and ports would respond to Nagios TCP checks
(service checks were faulting), but I couldn't log in and none of the
services on the box were responsive.

At this point a corruption must have happened already.

I power cycled the machine and came up
to the emergency/maintenance recovery login.

When this happens, it means XFS could not safely repair the filesystem
automatically by replaying the available log.

Once in the maintenance mode I
determined the /var file system was corrupted and any attempt to mount it,
xfs_repair, etc. had the effect of hanging the system indefinitely requiring
another power cycle to recover.

How big a partition are we talking about here? If it's large, it's harder to
diagnose and find the culprit.

Booting from the Leap USB stick, I was able
to get a little further, but was unsuccessful in getting /var back. I had
tried mounting readonly with norecover but it still refused. Flushing the
log/metadata with the -L option to xfs_repair was the only way to get past
the problem. (I have backups of the logs from the night before, so just lost
a little syslog data from some other systems, not a big issue here).

Once you blow away a corrupt fs, I'm afraid we can no longer inspect a possible
root cause. After a corruption happens, the thing to do is to take a metadata
dump of the partition with xfs_metadump before repairing anything; then, with
some analysis, it *may* be possible to iron out the root cause. The bigger the
metadata dump, the harder it will be to figure out the issue, hence my concern
about the size of the partition.
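
As a rough illustration of the capture step described above, here is a minimal
sketch that builds (and optionally runs) an xfs_metadump command. The device
and output paths are placeholders, not values from this report. The helper
names are hypothetical. xfs_metadump is meant to be run against an unmounted,
read-only mounted, or frozen filesystem, and it copies metadata only, not file
contents.

```python
import shlex
import subprocess


def metadump_command(device, output, show_progress=True):
    """Build an xfs_metadump invocation (metadata only, no file data)."""
    cmd = ["xfs_metadump"]
    if show_progress:
        cmd.append("-g")  # -g: show dump progress
    cmd += [device, output]
    return cmd


def capture_metadump(device, output, dry_run=True):
    """Print the command by default; only run it when dry_run is False."""
    cmd = metadump_command(device, output)
    if dry_run:
        print(shlex.join(cmd))
        return cmd
    # Needs root, and the filesystem should not be mounted read-write.
    subprocess.run(cmd, check=True)
    return cmd


# Placeholder paths: substitute the corrupted device and a target on another disk.
capture_metadump("/dev/example-var", "/mnt/rescue/var.metadump")
```

The point is to take the dump before any xfs_repair run, and especially before
xfs_repair -L, so the corrupted state is still there to analyze.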

Once I had /var mounted, I was able to look at the messages in the log which
referred to _xfs_buf_find: Block out of range errors.

It seems these errors happen once a filesystem is already corrupted. They mean
we are looking up a block which clearly cannot exist on this filesystem, so one
possible reason for these invalid lookups is that a corruption had already
happened on the filesystem before this point. The error is then likely an
after-effect of looking up files on an already faulty fs.

These occurred when
logrotate was trying to swap logs around. It was still writing logs against
my main system messages file (I use syslog_ng vs systemd journal logging)

Ah good to know! Sounds like you could likely try to write a mini script or
program to try to reproduce this! Based on your description the way I'd write
such a script is as follows:

Write a multithreaded program which issues tons of writes, with and without
sync, to a log file while another thread moves the main "log" file. You must
synchronize the threads writing to the log file with the thread which moves the
log file around.
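
A minimal sketch of such a program might look like the following. It is only
an illustration of the idea above, not a known reproducer: the file name
"messages", the thread counts, and the timings are all assumptions, and the
directory passed to run() would need to live on a small XFS test partition.

```python
import os
import threading
import time


def writer(path, stop, lock, use_fsync, counts, idx):
    """Append lines to path, reopening it each time so rotation is picked up."""
    n = 0
    while not stop.is_set():
        with lock:  # serialize against the rotating thread
            with open(path, "a") as f:
                f.write(f"writer {idx} line {n}\n")
                if use_fsync:
                    f.flush()
                    os.fsync(f.fileno())
        n += 1
    counts[idx] = n


def rotator(path, stop, lock, keep, interval, rotations):
    """Rename the live log to path.N, roughly as a log rotation would."""
    n = 0
    while not stop.is_set():
        time.sleep(interval)
        with lock:
            if os.path.exists(path):
                os.replace(path, f"{path}.{n % keep}")
                n += 1
    rotations.append(n)


def run(directory, seconds=2.0, writers=4, keep=5, interval=0.01):
    """Hammer one log file from several writers while rotating it."""
    path = os.path.join(directory, "messages")
    stop = threading.Event()
    lock = threading.Lock()
    counts = [0] * writers
    rotations = []
    threads = [
        # Even-numbered writers fsync every line, odd-numbered ones do not.
        threading.Thread(target=writer,
                         args=(path, stop, lock, i % 2 == 0, counts, i))
        for i in range(writers)
    ]
    threads.append(threading.Thread(
        target=rotator, args=(path, stop, lock, keep, interval, rotations)))
    for t in threads:
        t.start()
    time.sleep(seconds)
    stop.set()
    for t in threads:
        t.join()
    return sum(counts), rotations[0]
```

Calling run("/mnt/xfs-test", seconds=600) on a scratch mount (the path is a
placeholder) and watching dmesg for the _xfs_buf_find warning would be one way
to use it.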

Come to think of it, an alternative is to just use a daemon which issues tons
of writes to the system log file, and configure your log manager to rotate logs
very frequently.

If you can reproduce this with a super small partition you are my hero :D

up until the point I power cycled the system so I guess it had sufficient
allocation on that file without requesting more. As squid had stopped
working, along with logins, etc, I expect the /var file system failure was
preventing opening and/or writing other logs (thus appearing locked up).

Very likely, yes. A write could be stalling the system, and such stalled writes
prevent daemons from doing anything useful. This should also mean log managers
should probably seriously consider a log_optional() or similar, which would log
only if possible and would not block operations if it can't. This should allow
logins to complete, provided the other system files needed for the service are
functional. Especially for sshd. I'd consider filing a FATE request for this to
be evaluated.
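
To make the log_optional() idea concrete, here is a small sketch of one
possible shape for it, assuming a bounded in-memory queue drained by a
background thread. Nothing like this exists in any log manager as far as this
comment knows; the class name, the sink callable, and the queue size are all
made up for illustration.

```python
import queue
import threading


class OptionalLogger:
    """Queue log lines for a background writer; drop them rather than block."""

    def __init__(self, sink, maxsize=1024):
        self._queue = queue.Queue(maxsize=maxsize)
        self._sink = sink  # callable doing the real (possibly stalling) write
        self.dropped = 0   # not thread-safe; good enough for a sketch
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        while True:
            line = self._queue.get()
            self._sink(line)  # may hang for a long time on a sick filesystem

    def log_optional(self, line):
        """Return True if queued, False if dropped. Never waits on the sink."""
        try:
            self._queue.put_nowait(line)
            return True
        except queue.Full:
            self.dropped += 1
            return False
```

With this shape, a daemon such as sshd would keep serving logins while the
writer thread is stuck on /var, at the cost of losing log lines once the queue
fills up.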

Best I can tell based on the SMART results (I have smartd running) and lack
of any other kernel warnings of disk failures, this does not appear to have
been a disk failure.

Please show the results. No new errors at all?

There were a total of six dumps within several seconds, all referencing the
same PID (logrotate). Here's the first one. I'll attach an edited copy of
the log that contains more details plus reboot information for those who
need/want to know. I did try and clean up unnecessary noise from the
attached log as well as identifying bits that I didn't think needed publicly
posted.

Oct 29 01:00:05 shadows kernel: XFS (dm-9): _xfs_buf_find: Block out of range: block 0x7fffffff8, EOFS 0x1000000
Oct 29 01:00:05 shadows kernel: [665260.471535] XFS (dm-9): _xfs_buf_find: Block out of range: block 0x7fffffff8, EOFS 0x1000000
Oct 29 01:00:05 shadows kernel: [665260.471581] ------------[ cut here ]------------
Oct 29 01:00:05 shadows kernel: [665260.471626] WARNING: CPU: 3 PID: 4863 at ../fs/xfs/xfs_buf.c:473 _xfs_buf_find+0x2a1/0x2f0 [xfs]()
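
For a sense of scale, the two numbers in that message can be compared directly.
The sketch below assumes both values are counted in 512-byte basic blocks,
which is the usual unit for XFS disk addresses but is not stated in the log
itself, so the sizes it derives are estimates. The function name is made up
for illustration.

```python
BASIC_BLOCK_BYTES = 512  # assumption: values are 512-byte basic blocks


def describe(block, eofs, unit=BASIC_BLOCK_BYTES):
    """Compare a requested block address against the end of the filesystem."""
    return {
        "out_of_range": block >= eofs,
        "fs_size_gib": eofs * unit / 2**30,
        "requested_offset_tib": block * unit / 2**40,
        "times_past_eofs": block / eofs,
    }


# Values taken from the kernel message quoted above.
info = describe(0x7FFFFFFF8, 0x1000000)
for key, value in info.items():
    print(f"{key}: {value}")
```

Under that assumption the filesystem would be about 8 GiB, while the requested
block sits just under 16 TiB, which looks more like a garbage block number
than an off-by-a-little lookup.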

Yeah, I'm afraid at this point the fs was already very likely corrupted. We
want to inspect the xfs meta dump of a fs which is corrupt in this state (no
repair done). It's preferable that we actually reproduce the issue, though.

The next step, provided you can figure out how to reproduce it, is to test
against the kernel of the day (kotd). If it cannot be reproduced there, then
clearly we have a fix we can and should cherry-pick onto Leap. If you can
reproduce on the kotd, we'd then move on to linux-next. If you can reproduce on
linux-next, well my friend, we have a gem to explore and try to fix upstream.

Several items about the attached log so as to hopefully reduce any
confusion. I had attached a Samsung 850 SSD to the machine to serve as a
storage device for recovering bits of data between the nightly backup and
when it crashed.

Let's be clear: a crash did not happen. However, a fs corruption probably did,
and you had no access to the system after that.

Do you mean it was cp -a'ing /var/log over to the SSD for backups when the
system became unavailable?

It is not normally attached (same goes for the USB stick).

The USB stick was attached when the corruption probably happened?

--
You are receiving this mail because:
You are on the CC list for the bug.