On Fri, Apr 29, 2011 at 12:28:50PM +0200, Michal Hocko wrote:
On Thu 28-04-11 13:35:13, Jeff Mahoney wrote: [...]
Mel Gorman / Michal Hocko: - patches.fixes/grab-swap-token-oops
There are still no in-kernel users of gup from kernel threads AFAICS. Besides that I think there is no issue in the current upstream because we do not get to handle_mm_fault (and grab_swap_token) path if there is no mm_struct. find_extend_vma will return NULL for (mm == NULL) and so we either go into gate_vma path (which is not handled by the patch and still can be an issue because of pgd_offset_gate) or fail with -EFAULT.
The same applies to openSUSE-11.4 and SLES10_SP4_BRANCH branches. If Mel doesn't see anything I would vote for dropping the patch from all supported branches.
That patch belongs to an SGI developer so there is a possibility that they have an RDMA driver or MPI accelerator that was doing direct writes to userspace after pinning the page with get_user_pages(). GPFS could also be doing weird things from kernel threads and pinning pages for IO in a kernel thread context. For upstream, it's not currently a problem. The only in-tree user of get_user_pages I'm aware of is KSM calling get_user_pages (ok, it's not called directly, it's just very get_user_pages-like) and it always passes in a valid mm from do_swap_page so it does not need this patch. I'd agree with Michal - drop this patch because even out-of-tree drivers should only be trying to grab the swap token from do_swap_page(). If they are doing something else, it's probably best we find out about it.
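To make the control flow Michal describes easier to follow, here is a minimal userspace sketch of the lookup decision. It is only a model: struct mm_model, the model_* functions, the MODEL_* return codes and the gate range are hypothetical stand-ins, not the kernel's definitions. It shows that with mm == NULL the lookup can only end in the gate path or in -EFAULT, so the fault path (and grab_swap_token) is never reached.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct: a single VMA covering [start, end). */
struct mm_model {
	unsigned long vma_start;
	unsigned long vma_end;
};

/* Illustrative gate range only; the real range is architecture specific. */
#define MODEL_GATE_START 0xffffe000UL
#define MODEL_GATE_END   0xfffff000UL

/* Hypothetical result codes for the two non-error outcomes. */
#define MODEL_FAULT_PATH 1	/* handle_mm_fault() would be reachable */
#define MODEL_GATE_PATH  2	/* gate_vma handling, not covered by the patch */

/* Models find_extend_vma(): without an mm there is no VMA to find. */
static int model_find_vma(const struct mm_model *mm, unsigned long addr)
{
	if (mm == NULL)
		return 0;
	return addr >= mm->vma_start && addr < mm->vma_end;
}

/* Models the gate-area check, which does not depend on mm here. */
static int model_in_gate_area(unsigned long addr)
{
	return addr >= MODEL_GATE_START && addr < MODEL_GATE_END;
}

/* Models the per-address decision: fault path, gate path or -EFAULT. */
static int model_gup_lookup(const struct mm_model *mm, unsigned long addr)
{
	if (model_find_vma(mm, addr)) {
		assert(mm != NULL);	/* a VMA hit implies a valid mm */
		return MODEL_FAULT_PATH;
	}
	if (model_in_gate_area(addr))
		return MODEL_GATE_PATH;
	return -EFAULT;
}
```

Calling model_gup_lookup(NULL, addr) for an ordinary address returns -EFAULT, and for an address inside the illustrative gate range it returns MODEL_GATE_PATH, which is the case the patch does not handle.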
- patches.suse/files-slab-rcu.patch
Nick's VFS black magic. The patch is not applied in openSUSE-11.4. Sorry, I cannot say anything about this patch.
Drop it unless there is a known specific use case it helps. There is now a whole host of other VFS black magic merged to mainline and it improves open/close performance in other ways. Slab-destroy-by-rcu means there is a variable amount of time spent freeing slab objects. Many users probably don't care, but realtime people have complained about variable latency when freeing slab objects before and this is the kind of thing that can surprise them.
- patches.suse/mm-devzero-optimisation.patch [against SLES10, issue may not exist anymore]
Yes, we can drop it from the SLE11-SP1 and openSUSE-11.4 branches. The zero page was reintroduced in 2.6.32. The main reason for the above patch was the page fault overhead (coming from allocation and zeroing) when copying from /dev/zero. Now that we have the zero page again this is no longer the case, so we do not need any special /dev/zero hacks.
Sure.
- patches.fixes/aggressive-zone-reclaim.patch [disabled since 2.6.36]
I assume that the patch has been reverted due to changes in the area, right?
It's not clear. I think it might have been dropped due to conflicts and there was no motivation to keep it updated.
My impression of the patch is that it is highly workload specific.
Extremely so.
Especially decreasing ZONE_RECLAIM_PRIORITY to 0 is very tricky because it makes reclaim really aggressive. I think this part is highly controversial for upstream.
It would only make sense on a machine running threads that were tightly bound to their NUMA node, which will probably only be true on HPC configurations. For things like mail servers running on Nehalem with large NUMA distances, this patch could have very surprising results.
Nevertheless, I quite like the ZONE_RECLAIM_LOCKED part. Although, this can be really subtle because we might end up reclaiming too much if there is really heavy parallel load or multiple high order allocators.
Yes. A consequence of removing it is that a number of parallel allocator requests that happen at the same time dump all the memory of a node. Worse, they could start reclaiming each other's memory in the node, which would manifest as unexpected stalls at unpredictable times. This is true for normal reclaim of course, but in __zone_reclaim it's potentially worse as it's focusing on clean pages.
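As a rough illustration of what the ZONE_RECLAIM_LOCKED part buys, the following userspace sketch models a per-zone flag that lets only one reclaimer into a zone at a time. It is a sketch under stated assumptions, not the kernel code: struct zone_model, model_zone_init(), model_zone_reclaim() and MODEL_RECLAIM_NOSCAN are hypothetical names, and a C11 atomic_flag stands in for the zone flag bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct zone; only what the sketch needs. */
struct zone_model {
	atomic_flag reclaim_locked;	/* stands in for ZONE_RECLAIM_LOCKED */
	unsigned long reclaimed;	/* pages reclaimed so far */
};

/* Hypothetical return code: another reclaimer already owns the zone. */
#define MODEL_RECLAIM_NOSCAN (-1L)

static void model_zone_init(struct zone_model *zone)
{
	atomic_flag_clear(&zone->reclaim_locked);
	zone->reclaimed = 0;
}

/*
 * Only the caller that wins the test-and-set reclaims. Everyone else
 * backs off immediately instead of reclaiming from the same zone.
 */
static long model_zone_reclaim(struct zone_model *zone, unsigned long nr_pages)
{
	assert(zone != NULL);

	if (atomic_flag_test_and_set(&zone->reclaim_locked))
		return MODEL_RECLAIM_NOSCAN;

	zone->reclaimed += nr_pages;	/* placeholder for the real reclaim work */

	atomic_flag_clear(&zone->reclaim_locked);
	return (long)nr_pages;
}
```

Without the flag check, every concurrent caller would run the reclaim step against the same zone, which is the "dump all the memory of a node" behaviour described above.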
I will wait for Mel before reverting, but mm-devzero-optimisation.patch is one that can go away for sure.
Ditch it. It is likely to adversely affect common workloads that are heavily file based. -- Mel Gorman SUSE Labs