[Bug 1048190] New: Please take over 80-hotplug-cpu-mem.rules
http://bugzilla.suse.com/show_bug.cgi?id=1048190

Bug ID: 1048190
Summary: Please take over 80-hotplug-cpu-mem.rules
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: fbui@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Hi,

systemd carries 80-hotplug-cpu-mem.rules, which basically encodes the onlining policy for newly added memory/cpu. This was a temporary solution, but now it seems that the kernel eventually gained some support to handle memory onlining:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...

So it would be nice if we could make use of that and drop the bits from systemd/udev.

Unfortunately there doesn't seem to be a counterpart for cpus, but this rule should probably be hosted by one of the packages maintained by the kernel team.

Thanks.

-- You are receiving this mail because: You are on the CC list for the bug.
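For readers unfamiliar with what "onlining" means here, the following is a minimal sketch of the userspace side, assuming the usual sysfs layout in which each memory block exposes a `state` file under /sys/devices/system/memory. It is not the content of 80-hotplug-cpu-mem.rules; the function name and the injectable root directory are hypothetical and only there for illustration.

```python
from pathlib import Path


def online_memory_blocks(sysfs_root="/sys/devices/system/memory", state="online"):
    """Write `state` to every memory block that currently reports 'offline'.

    Returns the names of the blocks that were touched. The default root is
    the real sysfs location; tests can point it at a scratch directory.
    """
    changed = []
    for state_file in sorted(Path(sysfs_root).glob("memory[0-9]*/state")):
        if state_file.read_text().strip() == "offline":
            # On a real system this write asks the kernel to online the block.
            state_file.write_text(state)
            changed.append(state_file.parent.name)
    return changed
```

Passing a different `state` string (for example "online_movable") is how a different policy would be expressed in this sketch.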
Franck Bui
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c1
Michal Hocko
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c2
--- Comment #2 from Michal Hocko
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c3
--- Comment #3 from Michal Hocko
Unfortunately there doesn't seem to be a counterpart for cpus, but this rule should probably be hosted by one of the packages maintained by the kernel team.
Why is keeping the udev rule in udev/systemd a problem in the first place? This is where we keep other udev rules.
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c4
--- Comment #4 from Franck Bui
And btw. I am working on memory hotplug to actually drop this auto onlining nonsense. It's been the wrong thing to do since the beginning. Let's not spread it even more.
Any pointers to discussions related to this work?
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c5
--- Comment #5 from Franck Bui
Why is keeping the udev rule in udev/systemd a problem in the first place?
Because the policy doesn't belong to udev either. And the udev rule looks like a workaround as it's described in the commit message I pointed out in comment #0.
This is where we keep other udev rules.
That doesn't necessarily mean that *all* kinds of rules must be hosted by udev.
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c6
--- Comment #6 from Michal Hocko
(In reply to Michal Hocko from comment #2)
And btw. I am working on memory hotplug to actually drop this auto onlining nonsense. It's been the wrong thing to do since the beginning. Let's not spread it even more.
Any pointers to discussions related to this work?
My attempt to drop auto_online can be found at http://lkml.kernel.org/r/20170227092817.23571-1-mhocko@kernel.org

There was opposition to doing that, mainly based on the observation that some memory ballooning solutions based on memory hotplug could lead to memory depletion, because userspace might not react fast enough to online memory while the physical hotadd would already consume memory resources for the new memory.

Let's put aside what I think about memory hotplug based ballooning solutions; this is fixable by either postponing memory resource allocation to the online phase or by allocating those resources from the newly hotadded memory range. This is still on my todo list, but a first (large) step was redefining the semantics of the memory online phase, pulled in the 4.13 merge window: http://lkml.kernel.org/r/20170515085827.16474-1-mhocko@kernel.org

Once we remove the memory resource allocation problem, there won't be any real reason to have a broken policy in the kernel anymore, and I will suggest removing it from the kernel again.
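To make the auto_online discussion concrete, here is a small sketch of how userspace could check whether the kernel is configured to online hot-added memory by itself. It assumes the `auto_online_blocks` file under /sys/devices/system/memory on kernels that carry the feature; the helper name and the injectable root are hypothetical.

```python
from pathlib import Path


def kernel_auto_onlines(memory_root="/sys/devices/system/memory"):
    """Return True if the kernel reports that it onlines new memory itself.

    A missing file is treated as "no in-kernel policy", so a udev rule (or
    some other userspace agent) would still be responsible for onlining.
    """
    knob = Path(memory_root) / "auto_online_blocks"
    if not knob.is_file():
        return False
    # Any value other than "offline" means the kernel onlines blocks itself.
    return knob.read_text().strip() != "offline"
```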
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c7
--- Comment #7 from Franck Bui
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c8
--- Comment #8 from Michal Hocko
(In reply to Michal Hocko from comment #3)
Why is keeping the udev rule in udev/systemd a problem in the first place?
Because the policy doesn't belong to udev either.
I am not a udev expert, but to me udev rules are basically about a policy for what to do when the kernel emits an event. Our default policy in this case is to online that memory. Different use cases might override this rule and use something else (e.g. online the memory as movable, or have more sophisticated rules based on the specific HW). AFAIK udev allows overriding the default system rule file.
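The override behaviour mentioned above can be sketched as follows. The sketch assumes the documented udev convention that a rules file in /etc/udev/rules.d replaces a file of the same name in /run/udev/rules.d or /usr/lib/udev/rules.d; the function name is hypothetical and this is not udev's own implementation.

```python
from pathlib import Path

# Highest priority first, following the documented udev search order.
DEFAULT_RULE_DIRS = (
    "/etc/udev/rules.d",
    "/run/udev/rules.d",
    "/usr/lib/udev/rules.d",
)


def effective_rule_files(rule_dirs=DEFAULT_RULE_DIRS):
    """Return the rule files that win after same-name overrides are applied.

    Files are returned sorted by file name, which is the order in which
    rules files are processed regardless of the directory they live in.
    """
    chosen = {}
    for directory in rule_dirs:
        base = Path(directory)
        if not base.is_dir():
            continue
        for path in base.glob("*.rules"):
            # The first directory that provides a given name wins.
            chosen.setdefault(path.name, path)
    return [chosen[name] for name in sorted(chosen)]
```

With this convention, an administrator who dislikes the default onlining policy can drop a file named 80-hotplug-cpu-mem.rules into /etc/udev/rules.d and the packaged copy is ignored.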
And the udev rule looks like a workaround as it's described in the commit message I pointed out in comment #0.
That commit message was based on a misunderstanding of the general concept of memory hotplug, coming from a very specific use case which doesn't really need to distinguish between different ways of memory onlining.
This is where we keep other udev rules.
That doesn't necessarily mean that *all* kinds of rules must be hosted by udev.
I am not sure where to draw the line, but considering that we are talking about system resources I would expect this to be in the udev package.
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c9
--- Comment #9 from Michal Hocko
BTW, there was already another hack added on top of this rule: when some memory is added, the rule remounts all tmpfs filesystems whose sizes were specified in % of the total memory size, so that their sizes are updated...
This really seems pretty hackish.
Shouldn't this event be handled by the kernel instead?
I do not think so because, again, that is not a generally advisable thing to do. I can imagine somebody not wanting to increase the shmem size when new memory is added, or wanting a smaller or larger portion of the new memory to be considered. The kernel doesn't know about that.
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c10
--- Comment #10 from Franck Bui
I do not think so because, again, that is not a generally advisable thing to do. I can imagine somebody not wanting to increase the shmem size when new memory is added, or wanting a smaller or larger portion of the new memory to be considered. The kernel doesn't know about that.
In this case, the issue is not about policy but the way it's implemented: remounting all tmpfs filesystems from userspace seems pretty ugly.
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c11
--- Comment #11 from Michal Hocko
(In reply to Michal Hocko from comment #9)
I do not think so because, again, that is not a generally advisable thing to do. I can imagine somebody not wanting to increase the shmem size when new memory is added, or wanting a smaller or larger portion of the new memory to be considered. The kernel doesn't know about that.
In this case, the issue is not about policy but the way it's implemented: remounting all tmpfs filesystems from userspace seems pretty ugly.
Ohh, I do not pretend this is a thing of beauty at all. But it would be even uglier to do from the kernel IMHO. That would basically require the kernel hooking into memory hotplug and rebinding each shmem filesystem from there. Now, the maximum size can be specified as an absolute size or in %, and we lose that information during the mount, so we do not know whether to change the limit at all.

Now you could rightfully object that increasing the size unconditionally from the udev rule is not correct for the same reason. I would agree. The only reason we do so is that some systems online a large part of memory too late, after most shmem filesystems are mounted already. So it is a workaround. If somebody dislikes this decision, it is trivial to override this policy because it is in userspace. If it was in the kernel, our chances would be worse (there would have to be an explicit tunable to control this behavior).
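The point about losing the "absolute vs. %" information can be illustrated with a short sketch. It assumes the tmpfs `size=` mount option syntax (a plain byte count, a k/m/g suffix, or a percentage of RAM) and ignores page rounding; the function name is hypothetical and this is not the kernel's parser.

```python
_SUFFIXES = {"k": 2**10, "m": 2**20, "g": 2**30}


def resolve_tmpfs_size(size_opt, total_ram_bytes):
    """Turn a tmpfs size= value into a byte limit for a given amount of RAM."""
    opt = size_opt.strip().lower()
    if opt.endswith("%"):
        # Percentage of physical RAM at the time the option is evaluated.
        return total_ram_bytes * int(opt[:-1]) // 100
    if opt and opt[-1] in _SUFFIXES:
        return int(opt[:-1]) * _SUFFIXES[opt[-1]]
    return int(opt)


ram = 8 * 2**30  # pretend the machine has 8 GiB when the filesystem is mounted
# Both requests resolve to the same limit, so the limit alone cannot tell
# whether it should grow when more memory is hot-added.
print(resolve_tmpfs_size("50%", ram) == resolve_tmpfs_size("4g", ram))
```

Re-evaluating the same options after doubling the RAM gives different answers for the two mounts, which is exactly the information a component that only sees the resolved limit no longer has.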
http://bugzilla.suse.com/show_bug.cgi?id=1048190#c12
Jiri Slaby
Joey Lee