Mailinglist Archive: opensuse-bugs (4650 mails)

[Bug 1043449] xfsaild sometimes prevents suspend
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Thu, 15 Jun 2017 17:40:12 +0000
  • Message-id: <bug-1043449-21960-WuBgs5Ya8J@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1043449
http://bugzilla.suse.com/show_bug.cgi?id=1043449#c1

Luis Rodriguez <lurodriguez@xxxxxxxx> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(oleg.b.antonyan@gmail.com)

--- Comment #1 from Luis Rodriguez <lurodriguez@xxxxxxxx> ---
(In reply to Oleg Antonyan from comment #0)
> Related to https://bugzilla.opensuse.org/show_bug.cgi?id=962250 but still
> present in 4.11.3
>
> Tumbleweed latest (08.06.2017) x86_64
>
> 1 of 5 suspend trials result in this error

Can you describe your partition setup in more detail? Where is XFS used? On the
primary partition?
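
One way to answer the question above is to list the XFS entries in the mount
table. This is only a minimal sketch: the helper name list_xfs_mounts is
hypothetical, and the optional file argument exists purely so the function can
be tried against a saved copy of the table instead of /proc/mounts.

```shell
#!/bin/sh
# Print "device mountpoint" for every XFS filesystem in a mounts table.
# Field 3 of each line is the filesystem type; defaults to /proc/mounts.
list_xfs_mounts() {
    awk '$3 == "xfs" { print $1, $2 }' "${1:-/proc/mounts}"
}
```

Calling list_xfs_mounts with no argument reads the live mount table; pasting
that output together with the output of "lsblk -f" should be enough to describe
the layout.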

Also, were you running some data performance tests during suspend/resume, or
was the system just idle?

> dmesg:
> [24442.429942] Freezing of tasks failed after 20.007 seconds (1 tasks
> refusing to freeze, wq_busy=0):

Wow, after 20 seconds. That is a lot. Were you doing some sort of work while
testing suspend/resume?
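
To tally how many suspend attempts hit this, the kernel log can be searched for
the message quoted above. A minimal sketch, assuming the log text is piped in
on stdin; the helper name count_freeze_failures is hypothetical.

```shell
#!/bin/sh
# Count log lines that report a failed task freeze (input on stdin).
# grep -c prints the number of matching lines, including 0.
count_freeze_failures() {
    grep -c 'Freezing of tasks failed'
}

# Typical use on a live system:
#   dmesg | count_freeze_failures
```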

> [24442.429972] xfsaild/sda2 D 0 285 2 0x00000000

Can you provide the output of:

sudo smartctl -a /dev/sda

XFS has an "Active Item List" (AIL) for items that have been written to the log
but not yet to disk. xfsaild() is the kthread that monitors the AIL and pushes
it; there is one per XFS mount.
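
Since there is one such thread per mount, they are easy to spot in the process
list. The following is a rough sketch, assuming "PID STAT COMM" lines on stdin
(for example from procps ps); the helper name filter_xfsaild is hypothetical.

```shell
#!/bin/sh
# Keep only xfsaild kernel threads from "PID STAT COMM" lines on stdin.
# A STAT beginning with "D" means uninterruptible sleep, the state shown
# for xfsaild/sda2 in the dmesg output above.
filter_xfsaild() {
    awk '$3 ~ /^xfsaild\// {
        note = ($2 ~ /^D/) ? " (uninterruptible sleep)" : ""
        print $1, $2, $3 note
    }'
}

# Typical use on a live system:
#   ps -e -o pid=,stat=,comm= | filter_xfsaild
```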

What type of disk do you have? What is on /dev/sda? What does

I can only infer that /dev/sda2 is not your main partition, so you might be
running some sort of workload off of it. Since this is all work written to the
log but not yet to disk, and xfsaild is taking a while, I could also guess that
there may not be enough delays for xfsaild to run often enough. The reasons for
this can vary, and there are some hard corner cases I suspect are problematic,
but it's unclear what type of issue caused this for you. The simplest way to
think of this problem could be a fast disk with tons of load spread equally
over the CPUs, and us not having enough time to flush the pending items off the
log.

Can you reproduce by doing some simple task, like recompiling the Linux kernel
in a loop, with:

make mrproper -j$(getconf _NPROCESSORS_ONLN)
make allyesconfig -j$(getconf _NPROCESSORS_ONLN)
make -j$(getconf _NPROCESSORS_ONLN)

Over and over again. Do this in a loop, perhaps echoing a count to see where
it's at. Then just loop your suspend tests. Can you reproduce easily? If not,
what do you think it is about your workload that triggers this?
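
The build loop could be sketched roughly as follows. It is meant to be run from
the top of a kernel source tree; the function name build_loop, its iteration
argument, and the MAKE override are hypothetical conveniences rather than
anything required.

```shell
#!/bin/sh
# Rebuild the kernel repeatedly, echoing a counter before each pass.
# MAKE can be overridden (hypothetical knob); it defaults to plain "make".
MAKE="${MAKE:-make}"
JOBS="$(getconf _NPROCESSORS_ONLN)"

# build_loop [iterations]: 0 or no argument means loop until interrupted.
build_loop() {
    iterations="${1:-0}"
    count=0
    while [ "$iterations" -eq 0 ] || [ "$count" -lt "$iterations" ]; do
        count=$((count + 1))
        echo "=== build iteration $count ==="
        # Stop the loop as soon as any build step fails.
        "$MAKE" mrproper -j"$JOBS" &&
        "$MAKE" allyesconfig -j"$JOBS" &&
        "$MAKE" -j"$JOBS" || return 1
    done
}
```

Start it with "build_loop" in one terminal and run the suspend trials from
another; the last counter value printed shows how far the build got when a
suspend attempt failed.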

--
You are receiving this mail because:
You are on the CC list for the bug.