Re: [opensuse-kernel] Re: openSUSE Kernel: Push Patches Upstream

10 May 2011

On Fri, Apr 29, 2011 at 12:28:50PM +0200, Michal Hocko wrote:
...
On Thu 28-04-11 13:35:13, Jeff Mahoney wrote:
[...]
...
Mel Gorman / Michal Hocko:
- patches.fixes/grab-swap-token-oops
There are still no in-kernel users of gup from kernel threads AFAICS.
Besides that I think there is no issue in the current upstream because
we do not get to handle_mm_fault (and grab_swap_token) path if there
is no mm_struct. find_extend_vma will return NULL for (mm == NULL) and
so we either go into gate_vma path (which is not handled by the patch and
still can be an issue because of pgd_offset_gate) or fail with -EFAULT.
The same applies to openSUSE-11.4 and SLES10_SP4_BRANCH branches. If Mel
doesn't see anything I would vote for dropping the patch from all
supported branches.
That patch belongs to an SGI developer so there is a possibility that
they have an RDMA driver or MPI accelerator that was doing direct
writes to userspace after pinning the page with get_user_pages(). GPFS
could also be doing weird things from kernel threads and pinning
pages for IO in a kernel thread context.

For upstream, it's not currently a problem. The only in-tree user of
get_user_pages I'm aware of is KSM calling get_user_pages (ok, it's
not called directly, it's just very get_user_pages like) and it always
passes in a valid mm from do_swap_page so it does not need this patch.

I'd agree with Michal - drop this patch because even out-of-tree
drivers should only be trying to grab the swap token from
do_swap_page(). If they are doing something else, it's probably best
we find out about it.
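
The control flow Michal describes can be sketched as a small userspace
model. This is only an illustration of the argument, not the kernel code:
struct mm_model, model_find_extend_vma() and model_gup() are hypothetical
names invented here, and the real get_user_pages() does far more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct; only what the sketch needs. */
struct mm_model {
    bool has_vma_at_addr;   /* would the vma lookup succeed for this address? */
};

enum gup_outcome {
    GUP_HANDLE_MM_FAULT,    /* the only path that can reach grab_swap_token */
    GUP_GATE_VMA,           /* gate area lookup, not covered by the patch */
    GUP_EFAULT              /* lookup failed, caller sees -EFAULT */
};

/* Models find_extend_vma(): with no mm there is nothing to search. */
static bool model_find_extend_vma(const struct mm_model *mm)
{
    return mm != NULL && mm->has_vma_at_addr;
}

/* Models the decision described in the thread, nothing more. */
static enum gup_outcome model_gup(const struct mm_model *mm, bool in_gate_area)
{
    if (model_find_extend_vma(mm))
        return GUP_HANDLE_MM_FAULT;
    if (in_gate_area)
        return GUP_GATE_VMA;
    return GUP_EFAULT;
}

/* Usage: a caller without an mm (mm == NULL) never reaches the fault path. */
static void model_gup_selfcheck(void)
{
    assert(model_gup(NULL, false) == GUP_EFAULT);
    assert(model_gup(NULL, true) == GUP_GATE_VMA);
}
```

In this model a NULL mm can only end in the gate path or in -EFAULT, which
is why the extra check in grab_swap_token looks unnecessary upstream.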
...
...
- patches.suse/files-slab-rcu.patch
Nick's VFS black magic. The patch is not applied in openSUSE-11.4.
Sorry, I cannot say anything about this patch.
Drop it unless there is a known specific use case it helps. There is
now a whole host of other VFS black magic merged to mainline and it
improves open/close performance in other ways. Slab-destroy-by-rcu
means there is a variable amount of time spent freeing slab
objects. Many users probably don't care, but realtime people have
complained about variable latency when freeing slab objects before
and this is the kind of thing that can surprise them.
...
...
- patches.suse/mm-devzero-optimisation.patch
  [against SLES10, issue may not exist anymore]
Yes we can drop it from SLE11-SP1 and openSUSE-11.4 branches. The zero
page has been reintroduced in 2.6.32. The main reason for the above
patch was the page fault overhead (coming from allocation and zeroing)
during copying from /dev/zero. Now that we have zero page again this is
not the case anymore, so we do not need any special /dev/zero hacks.
Sure.
...
...
- patches.fixes/aggressive-zone-reclaim.patch [disabled since 2.6.36]
I assume that the patch has been reverted due to changes in the area,
right?
It's not clear. I think it might have been dropped due to conflicts and
there was no motivation to keep it updated.
...
My impression of the patch is that it is highly workload specific.
Extremely so.
...
Especially decreasing ZONE_RECLAIM_PRIORITY to 0 is very tricky because
it makes reclaim really aggressive. I think this part is highly
controversial for upstream.
It would only make sense on a machine running threads that were tightly
bound to their NUMA node which will probably only be true on HPC
configurations. For things like mail servers running on Nehalem with
large NUMA distances, this patch could have very surprising results.
...
Nevertheless, I quite like the ZONE_RECLAIM_LOCKED part.
Although, this can be really subtle because we might end up reclaiming
too much if there is really heavy parallel load or multiple high order
allocators.
Yes.

A consequence of removing it is that a number of parallel allocator
requests that happen at the same time dump all the memory of a node.
Worse, they could start reclaiming each other's memory in the node
which would manifest as unexpected stalls at unpredictable times. This
is true for normal reclaim of course but in __zone_reclaim it's
potentially worse as it's focusing on clean pages.
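
To illustrate what the ZONE_RECLAIM_LOCKED part buys, here is a minimal
userspace sketch of a per-zone "one reclaimer at a time" guard. It assumes
a C11 atomic_flag standing in for the zone flag; struct zone_model and the
zone_reclaim_*() helpers are hypothetical names for this example and are
not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a zone; only the fields the sketch needs. */
struct zone_model {
    atomic_flag reclaim_locked;   /* stands in for ZONE_RECLAIM_LOCKED */
    int reclaim_passes;           /* how many reclaim passes actually ran */
};

/* Returns true if the caller became the single reclaimer for this zone. */
static bool zone_reclaim_begin(struct zone_model *z)
{
    return !atomic_flag_test_and_set(&z->reclaim_locked);
}

/* Lets the next allocator reclaim from this zone again. */
static void zone_reclaim_end(struct zone_model *z)
{
    atomic_flag_clear(&z->reclaim_locked);
}

/* Usage: the second allocator arriving mid-reclaim backs off. */
static int demo_two_allocators(void)
{
    struct zone_model z = { ATOMIC_FLAG_INIT, 0 };

    if (zone_reclaim_begin(&z))
        z.reclaim_passes++;       /* first allocator reclaims */
    if (zone_reclaim_begin(&z))
        z.reclaim_passes++;       /* not reached: zone is still locked */
    zone_reclaim_end(&z);

    assert(z.reclaim_passes == 1);
    return z.reclaim_passes;
}
```

Without the guard both callers would run a pass against the same node,
which is the "dump all the memory of a node" case described above.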
...
I will wait for Mel before reverting, but mm-devzero-optimisation.patch is
one that can go away for sure.
Ditch it. It is likely to adversely affect common workloads that are
heavily file based.

-- 
Mel Gorman
SUSE Labs