[opensuse] Transparent Huge Pages
I've just noticed that Transparent Huge Pages are enabled on my Leap 15.0 system. I don't believe I've done anything to cause that since I didn't know what they were until I recently read the redis log which says: "WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled." So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop? And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that? -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org
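For illustration, the `echo never > ...` command in the Redis warning amounts to writing one of three keywords into a sysfs file. The following is only a minimal Python sketch of that step, assuming the sysfs path quoted in the warning; the helper name `set_thp_mode` and its `path` parameter are hypothetical, and writing to the real file needs root.

```python
from pathlib import Path

# Path quoted in the Redis warning; writing to it requires root.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")
VALID_MODES = ("always", "madvise", "never")


def set_thp_mode(mode: str, path: Path = THP_ENABLED) -> None:
    """Write a THP mode keyword to the given file (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Same effect as: echo <mode> > /sys/kernel/mm/transparent_hugepage/enabled
    path.write_text(mode + "\n")
```

As with the `echo` command, a value written this way lasts only until the next reboot.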
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3. Apparently there is a benefit in decreasing the number of lookups, but the downside shows up on systems running databases or dealing with sparse data. I don't know whether this helps or hurts the normal user. -- David C. Rankin, J.D.,P.E.
On 17.03.19 at 02:37, David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3. Apparently there is a benefit in decreasing the number of lookups, but the downside shows up on systems running databases or dealing with sparse data. I don't know whether this helps or hurts the normal user.
It hurts me regularly when I run VMware virtual machines on Leap 15. Then khugepaged and the VM keep the assigned CPU cores at 100% for anywhere from a couple of seconds up to a minute. Therefore I disable THP when I care about the performance of the VM. Hendrik
On 17.03.2019 at 12:20, Hendrik Woltersdorf wrote:
On 17.03.19 at 02:37, David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3. Apparently there is a benefit in decreasing the number of lookups, but the downside shows up on systems running databases or dealing with sparse data. I don't know whether this helps or hurts the normal user.
It hurts me regularly when I run VMware virtual machines on Leap 15. Then khugepaged and the VM keep the assigned CPU cores at 100% for anywhere from a couple of seconds up to a minute. Therefore I disable THP when I care about the performance of the VM.
Do you disable it on the host or in the guest?
On 17.03.19 at 12:17, Andrei Borzenkov wrote:
On 17.03.2019 at 12:20, Hendrik Woltersdorf wrote:
On 17.03.19 at 02:37, David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3. Apparently there is a benefit in decreasing the number of lookups, but the downside shows up on systems running databases or dealing with sparse data. I don't know whether this helps or hurts the normal user.
It hurts me regularly when I run VMware virtual machines on Leap 15. Then khugepaged and the VM keep the assigned CPU cores at 100% for anywhere from a couple of seconds up to a minute. Therefore I disable THP when I care about the performance of the VM.
Do you disable it on the host or in the guest?
On the host.
David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3.
On my Lenovo laptop with Leap42.3 and 8Gb memory, it is enabled. -- Per Jessen, Zürich (13.3°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On 17/03/2019 10.45, Per Jessen wrote:
David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
Strange, they are disabled on 42.3.
On my Lenovo laptop with Leap42.3 and 8Gb memory, it is enabled.
cer@Telcontar:~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
cer@Telcontar:~>
I suppose that is disabled. This machine has 8 GiB, a desktop with 15.0.
I obtain the same result on my small laptop, which has 4 GiB.
-- Cheers / Saludos, Carlos E. R. (from 15.0 x86_64 at Telcontar)
On Sun, 17 Mar 2019 13:23:35 +0100 "Carlos E. R." <robin.listas@telefonica.net> wrote:
On 17/03/2019 10.45, Per Jessen wrote:
David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
I still don't understand why madvise doesn't solve all problems.
Strange, they are disabled on 42.3.
On my Lenovo laptop with Leap42.3 and 8Gb memory, it is enabled.
cer@Telcontar:~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
cer@Telcontar:~>
I suppose that is disabled. This machine has 8 GiB, a desktop with 15.0
No, that is enabled. The square brackets indicate the chosen option.
I obtain the same result on my small laptop, which has 4 GiB.
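The bracket convention described in this message can be read mechanically. A small Python sketch, assuming the file keeps the `[selected] other other` layout shown in the output above; `active_thp_mode` is a hypothetical helper name.

```python
import re
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> str:
    """Return the bracketed keyword from text such as '[always] madvise never'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError(f"no bracketed option in {text!r}")
    return match.group(1)


if __name__ == "__main__":
    # Only meaningful on a Linux kernel built with THP support.
    if THP_ENABLED.exists():
        print(active_thp_mode(THP_ENABLED.read_text()))
```

Given the output quoted above, the function would report `always`, which means THP is switched on for all eligible memory.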
On 17/03/2019 13.36, Dave Howorth wrote:
On Sun, 17 Mar 2019 13:23:35 +0100 "Carlos E. R." wrote:
On 17/03/2019 10.45, Per Jessen wrote:
David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
I still don't understand why madvise doesn't solve all problems.
I don't even know what huge pages are about. Do you have some link for dummies out there? :-D
Wikipedia has nothing on THP.
Google points to
<https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/>
«Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.
However, database workloads often perform poorly with THP, because they tend to have sparse rather than contiguous memory access patterns. You should disable THP on Linux machines to ensure best performance with MongoDB.»
I don't think I understand much of that.
Is 8 GiB of RAM considered "large amount of memory"? So I'm wondering if I should disable it.
Strange, they are disabled on 42.3.
On my Lenovo laptop with Leap42.3 and 8Gb memory, it is enabled.
cer@Telcontar:~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
cer@Telcontar:~>
I suppose that is disabled. This machine has 8 GiB, a desktop with 15.0
No, that is enabled. The square brackets indicate the chosen option.
Ah, of course. -- Cheers / Saludos, Carlos E. R. (from 15.0 x86_64 at Telcontar)
On Sun, 17 Mar 2019 13:54:04 +0100 "Carlos E. R." <robin.listas@telefonica.net> wrote:
On 17/03/2019 13.36, Dave Howorth wrote:
On Sun, 17 Mar 2019 13:23:35 +0100 "Carlos E. R." wrote:
On 17/03/2019 10.45, Per Jessen wrote:
David C. Rankin wrote:
On 03/16/2019 07:31 PM, Dave Howorth wrote:
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
And I don't understand why redis recommend a setting of never rather than madvise, if anybody has any thoughts on that?
I still don't understand why madvise doesn't solve all problems.
I don't even know what huge pages are about. Do you have some link for dummies out there? :-D
No, I just started googling a few hours before you :)
Wikipedia has nothing on THP.
Google points to
<https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/>
«Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.
However, database workloads often perform poorly with THP, because they tend to have sparse rather than contiguous memory access patterns. You should disable THP on Linux machines to ensure best performance with MongoDB.»
I don't think I understand much of that.
Is 8 GiB of RAM considered "large amount of memory"? So I'm wondering if I should disable it.
Dunno, I'm wondering too. That's why I'm asking questions.
Strange, they are disabled on 42.3.
On my Lenovo laptop with Leap42.3 and 8Gb memory, it is enabled.
cer@Telcontar:~> cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
cer@Telcontar:~>
I suppose that is disabled. This machine has 8 GiB, a desktop with 15.0
No, that is enabled. The square brackets indicate the chosen option.
Ah, of course.
Hello, On Sun, 17 Mar 2019, Dave Howorth wrote:
On Sun, 17 Mar 2019 13:54:04 +0100 "Carlos E. R." <robin.listas@telefonica.net> wrote: [..]
Wikipedia has nothing on THP. [..] Is 8 GiB of RAM considered "large amount of memory"? So I'm wondering if I should disable it.
Dunno, I'm wondering too. That's why I'm asking questions.
/usr/src/linux/Documentation/vm/transhuge.txt HTH, -dnh -- Get back there in front of the computer NOW. Christmas can wait. -- Linus "the Grinch" Torvalds, 24 Dec 2000 on linux-kernel
On Mon, 18 Mar 2019 22:51:48 +0100 David Haller <dnh@opensuse.org> wrote:
/usr/src/linux/Documentation/vm/transhuge.txt
Thanks, somebody on the redis list just pointed me to that page on the kernel website. Unfortunately it doesn't even tell me what the default is for the 'enabled' control, let alone explain why or what the effect of changing it would be in terms I can understand. I'm thinking I'll file bug reports with both openSUSE and redis if nobody knows and see what the various devs have to say.
Dave Howorth wrote:
On Mon, 18 Mar 2019 22:51:48 +0100 David Haller <dnh@opensuse.org> wrote:
/usr/src/linux/Documentation/vm/transhuge.txt
Thanks, somebody on the redis list just pointed me to that page on the kernel website. Unfortunately it doesn't even tell me what the default is for the 'enabled' control, let alone explain why or what the effect of changing it would be in terms I can understand.
Very simplified - it's all about performance. The TLB (which caches address translations from virtual to physical) has a small, fixed number of entries. With a large amount of memory mapped in 4K pages, far more translations are needed than the TLB can hold, so TLB misses become frequent, which will slow down the access.
By using 2M pages instead of 4K pages, each TLB entry covers 512 times as much memory, so we will have fewer TLB misses. A miss on a 2M page is also somewhat cheaper, because the page-table walk is one level shorter.
I would say using THP is probably most beneficial to a virtual host, but I don't know what the effect (if any/measurable) might be for a real server or a desktop.
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
-- Per Jessen, Zürich (6.8°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
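To put rough numbers on the TLB argument above, here is a small Python sketch of "TLB reach", the amount of memory that can be translated without a miss. The entry count is an assumption chosen purely for illustration (real TLB sizes differ between CPUs), and `tlb_reach` is a hypothetical helper name.

```python
KIB = 1024
MIB = 1024 * KIB

BASE_PAGE = 4 * KIB  # the 4K page size mentioned in the message
HUGE_PAGE = 2 * MIB  # the 2M page size mentioned in the message

# ASSUMPTION: illustrative TLB size, not a figure taken from this thread.
ASSUMED_TLB_ENTRIES = 1536


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of memory covered when every TLB entry maps one page."""
    return entries * page_bytes


if __name__ == "__main__":
    print("4K reach (MiB):", tlb_reach(ASSUMED_TLB_ENTRIES, BASE_PAGE) // MIB)
    print("2M reach (MiB):", tlb_reach(ASSUMED_TLB_ENTRIES, HUGE_PAGE) // MIB)
    print("coverage ratio:", HUGE_PAGE // BASE_PAGE)
```

Whatever the real entry count is, the ratio between the two page sizes stays at 512, which is where the reduction in misses comes from.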
On Tue, 19 Mar 2019 13:49:46 +0100 Per Jessen <per@computer.org> wrote:
Dave Howorth wrote:
On Mon, 18 Mar 2019 22:51:48 +0100 David Haller <dnh@opensuse.org> wrote:
/usr/src/linux/Documentation/vm/transhuge.txt
Thanks, somebody on the redis list just pointed me to that page on the kernel website. Unfortunately it doesn't even tell me what the default is for the 'enabled' control, let alone explain why or what the effect of changing it would be in terms I can understand.
Very simplified - it's all about performance. The TLB (which caches address translations from virtual to physical) has a small, fixed number of entries. With a large amount of memory mapped in 4K pages, far more translations are needed than the TLB can hold, so TLB misses become frequent, which will slow down the access.
By using 2M pages instead of 4K pages, each TLB entry covers 512 times as much memory, so we will have fewer TLB misses. A miss on a 2M page is also somewhat cheaper, because the page-table walk is one level shorter.
I would say using THP is probably most beneficial to a virtual host, but I don't know what the effect (if any/measurable) might be for a real server or a desktop.
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
I don't understand why madvise isn't the most useful. Surely it is reasonable that applications that want huge pages simply ask for them?
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
Indeed, but I still don't understand why it insists on 'never'. Surely 'madvise' fixes its problem?
Dave Howorth wrote:
On Tue, 19 Mar 2019 13:49:46 +0100 Per Jessen <per@computer.org> wrote:
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
I don't understand why madvise isn't the most useful. Surely it is reasonable that applications that want huge pages simply ask for them?
Just to be on the safe side - madvise() is for apps to tell the kernel about their intended use of memory, i.e. size and access pattern. I guess it means using 2M pages if the app consumes memory in large chunks.
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
Indeed, but I still don't understand why it insists on 'never'. Surely 'madvise' fixes its problem?
Erring on the side of caution? If they understand madvise() as little as we do, saying 'never' is safe. -- Per Jessen, Zürich (7.4°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
FWIW... I run a busy php service. Running perf on it showed its primary CPU activity was memory compaction. Google suggested that this was THP, and indeed turning off THP dropped the CPU load noticeably. From what research I could understand, I believe this is essentially the same problem as redis. Lots of small transactions triggering a lot of excess memory compaction in order to allocate a larger page.
On 0319, Per Jessen wrote:
Dave Howorth wrote:
On Tue, 19 Mar 2019 13:49:46 +0100 Per Jessen <per@computer.org> wrote:
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
I don't understand why madvise isn't the most useful. Surely it is reasonable that applications that want huge pages simply ask for them?
Just to be on the safe side - madvise() is for apps to tell the kernel about their intended use of memory, i.e. size and access pattern. I guess it means using 2M pages if the app consumes memory in large chunks.
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
Indeed, but I still don't understand why it insists on 'never'. Surely 'madvise' fixes its problem?
Erring on the side of caution? If they understand madvise() as little as we do, saying 'never' is safe.
-- Josef Fortier Systems Administrator fortier@augsburg.edu Phone: 612-330-1479
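Besides perf, the compaction activity described in this message can also be watched through the kernel's counters in /proc/vmstat. The sketch below assumes the running kernel exposes counters whose names start with `thp_` and `compact_`; the exact set varies with kernel version and configuration, and `thp_compaction_counters` is a hypothetical helper name.

```python
from pathlib import Path

VMSTAT = Path("/proc/vmstat")


def thp_compaction_counters(text: str) -> dict:
    """Pick the THP and compaction counters out of /proc/vmstat-style text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line is "<name> <value>".
        if len(parts) == 2 and parts[0].startswith(("thp_", "compact_")):
            counters[parts[0]] = int(parts[1])
    return counters


if __name__ == "__main__":
    if VMSTAT.exists():
        for name, value in sorted(thp_compaction_counters(VMSTAT.read_text()).items()):
            print(name, value)
```

The counters are cumulative since boot, so sampling twice and subtracting shows which ones are moving under load.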
On Tue, 19 Mar 2019 11:09:50 -0500 Josef Fortier <fortier@augsburg.edu> wrote:
FWIW...
I run a busy php service. Running perf on it showed its primary CPU activity was memory compaction. Google suggested that this was THP, and indeed turning off THP dropped the CPU load noticeably. From what research I could understand, I believe this is essentially the same problem as redis. Lots of small transactions triggering a lot of excess memory compaction in order to allocate a larger page.
Thanks for the info. Yes, I agree it sounds like a similar situation. I haven't come across perf before, so thanks for that pointer; it looks a bit complicated to run though :( My machine isn't very busy so I doubt I can conduct any useful experiment. Although:
$ cat /proc/meminfo
...
AnonHugePages: 350208 kB
...
so it's allocating THP for something.
I wonder if setting /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' temporarily would leave your system in the improved state or make it revert to the heavily-loaded state?
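The AnonHugePages figure above can be turned into a page count with a few lines of Python. This is only a sketch, assuming the 2 MiB huge page size discussed elsewhere in the thread; `anon_huge_kib` and `as_huge_pages` are hypothetical helper names.

```python
import re
from pathlib import Path

MEMINFO = Path("/proc/meminfo")
HUGE_PAGE_KIB = 2048  # ASSUMPTION: 2 MiB huge pages, as on x86_64


def anon_huge_kib(text: str) -> int:
    """Extract the AnonHugePages value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^AnonHugePages:\s+(\d+)\s+kB", text, re.MULTILINE)
    if match is None:
        raise ValueError("no AnonHugePages line found")
    return int(match.group(1))


def as_huge_pages(kib: int) -> int:
    """Convert a kB figure into a count of whole huge pages."""
    return kib // HUGE_PAGE_KIB


if __name__ == "__main__":
    if MEMINFO.exists():
        kib = anon_huge_kib(MEMINFO.read_text())
        print(kib, "kB =", as_huge_pages(kib), "huge pages")
```

For the figure quoted above, 350208 kB comes to 171 huge pages, or about 342 MiB of anonymous memory currently backed by THP.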
On 0319, Per Jessen wrote:
Dave Howorth wrote:
On Tue, 19 Mar 2019 13:49:46 +0100 Per Jessen <per@computer.org> wrote:
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
I don't understand why madvise isn't the most useful. Surely it is reasonable that applications that want huge pages simply ask for them?
Just to be on the safe side - madvise() is for apps to tell the kernel about their intended use of memory, i.e. size and access pattern. I guess it means using 2M pages if the app consumes memory in large chunks.
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
Indeed, but I still don't understand why it insists on 'never'. Surely 'madvise' fixes its problem?
Erring on the side of caution? If they understand madvise() as little as we do, saying 'never' is safe.
* Dave Howorth <dave@howorth.org.uk> [03-19-19 13:34]:
On Tue, 19 Mar 2019 11:09:50 -0500 Josef Fortier <fortier@augsburg.edu> wrote:
FWIW...
I run a busy php service. Running perf on it showed its primary CPU activity was memory compaction. Google suggested that this was THP, and indeed turning off THP dropped the CPU load noticeably. From what research I could understand, I believe this is essentially the same problem as redis. Lots of small transactions triggering a lot of excess memory compaction in order to allocate a larger page.
Thanks for the info. Yes, I agree it sounds like a similar situation. I haven't come across perf before, so thanks for that pointer; it looks a bit complicated to run though :( My machine isn't very busy so I doubt I can conduct any useful experiment. Although:
$ cat /proc/meminfo
...
AnonHugePages: 350208 kB
...
so it's allocating THP for something.
I wonder if setting /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' temporarily would leave your system in the improved state or make it revert to the heavily-loaded state?
On 0319, Per Jessen wrote:
Dave Howorth wrote:
On Tue, 19 Mar 2019 13:49:46 +0100 Per Jessen <per@computer.org> wrote:
The settings are fairly obvious -
always - always use huge pages
never - never use huge pages
madvise - use huge pages when madvise() requests it.
I am also not certain of when the latter will be useful.
I don't understand why madvise isn't the most useful. Surely it is reasonable that applications that want huge pages simply ask for them?
Just to be on the safe side - madvise() is for apps to tell the kernel about their intended use of memory, i.e. size and access pattern. I guess it means using 2M pages if the app consumes memory in large chunks.
Wrt redis, I guess it uses lots of small chunks of memory, which could suffer from always using chunks that are much bigger than needed.
Indeed, but I still don't understand why it insists on 'never'. Surely 'madvise' fixes its problem?
Erring on the side of caution? If they understand madvise() as little as we do, saying 'never' is safe.
You're in good shape.
my desktop/workstation AnonHugePages: 3717120 kB
my server AnonHugePages: 126976 kB
-- (paka)Patrick Shanahan Plainfield, Indiana, USA @ptilopteri http://en.opensuse.org openSUSE Community Member facebook/ptilopteri Registered Linux User #207535 @ http://linuxcounter.net Photos: http://wahoo.no-ip.org/piwigo paka @ IRCnet freenode
On Tue, 19 Mar 2019 20:15:37 +0300 Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On 19.03.2019 at 16:20, Dave Howorth wrote:
I don't understand why madvise isn't the most useful.
Because then THP stops being Transparent.
But given the difficulties redis and php and presumably other programs appear to have, the whole idea of transparency is a bug not a feature, is it not? Surely it's better to have a design where programs that want to use a new feature ask for it and programs that don't aren't forced to use it?
Dave Howorth wrote:
On Tue, 19 Mar 2019 20:15:37 +0300 Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On 19.03.2019 at 16:20, Dave Howorth wrote:
I don't understand why madvise isn't the most useful.
Because then THP stops being Transparent.
But given the difficulties redis and php and presumably other programs appear to have, the whole idea of transparency is a bug not a feature, is it not?
Not if every system that doesn't run redis is doing better because of it ?
Surely it's better to have a design where programs that want to use a new feature ask for it and programs that don't aren't forced to use it?
huge pages is not really a user/application level feature. -- Per Jessen, Zürich (4.4°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
On Tue, 19 Mar 2019 20:08:57 +0100 Per Jessen <per@computer.org> wrote:
Dave Howorth wrote:
On Tue, 19 Mar 2019 20:15:37 +0300 Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On 19.03.2019 at 16:20, Dave Howorth wrote:
I don't understand why madvise isn't the most useful.
Because then THP stops being Transparent.
But given the difficulties redis and php and presumably other programs appear to have, the whole idea of transparency is a bug not a feature, is it not?
Not if every system that doesn't run redis is doing better because of it ?
There are a *lot* of systems that run PHP. And I'm sure there are lots of other applications that use small memory blocks. Why should they be penalised?
Surely it's better to have a design where programs that want to use a new feature ask for it and programs that don't aren't forced to use it?
huge pages is not really a user/application level feature.
what's madvise about then?
Dave Howorth wrote:
On Tue, 19 Mar 2019 20:08:57 +0100 Per Jessen <per@computer.org> wrote:
Dave Howorth wrote:
On Tue, 19 Mar 2019 20:15:37 +0300 Andrei Borzenkov <arvidjaar@gmail.com> wrote:
On 19.03.2019 at 16:20, Dave Howorth wrote:
I don't understand why madvise isn't the most useful.
Because then THP stops being Transparent.
But given the difficulties redis and php and presumably other programs appear to have, the whole idea of transparency is a bug not a feature, is it not?
Not if every system that doesn't run redis is doing better because of it ?
There are a *lot* of systems that run PHP.
Sure, we have quite a few, but php is just a scripting language. The vast majority of php apps will run with zero impact whether thp is on or off.
And I'm sure there are lots of other applications that use small memory blocks. Why should they be penalised?
IMO, that is not the general case - a few might suffer when running at high load, that's all.
Surely it's better to have a design where programs that want to use a new feature ask for it and programs that don't aren't forced to use it?
huge pages is not really a user/application level feature.
what's madvise about then?
It's only advising the memory manager about intended sizes and patterns of memory access. Whether or not that leads to huge pages is up to the memory manager and in no way visible to the application. -- Per Jessen, Zürich (1.4°C) http://www.hostsuisse.com/ - virtual servers, made in Switzerland.
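As a concrete illustration of an application giving that advice, here is a minimal Python sketch. It assumes Linux and Python 3.8 or later, where `mmap.madvise()` exists and `mmap.MADV_HUGEPAGE` is defined only on platforms that support it; `map_with_hugepage_hint` is a hypothetical helper name.

```python
import mmap


def map_with_hugepage_hint(length: int):
    """Create an anonymous private mapping and hint that huge pages are welcome.

    Returns the mapping and whether the hint was accepted by the kernel.
    """
    buf = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    hint = getattr(mmap, "MADV_HUGEPAGE", None)  # absent on some platforms
    applied = False
    if hint is not None and hasattr(buf, "madvise"):
        try:
            buf.madvise(hint)  # advisory only; the kernel may still use small pages
            applied = True
        except OSError:
            # For example, a kernel built without THP support rejects the hint.
            pass
    return buf, applied


if __name__ == "__main__":
    region, ok = map_with_hugepage_hint(4 * 1024 * 1024)
    region[:5] = b"hello"  # the memory is used exactly as it would be without the hint
    print("hint accepted:", ok)
    region.close()
```

The return value only says whether the kernel accepted the hint. Whether the region ends up backed by huge pages is not reported to the program, which matches the point made above.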
Huge pages are the ones that are larger than the "standard" 4K pages Linux uses on x86_64.
On larger memory systems, say 256GB servers or 128G desktops, it might be useful to use 'always'. At a 2MB page size, you can fit 512 "items" into 1G, so for an 8GB system, that translates to 4096 items that can fit into memory at a time (maybe a bit more if they only use hugepages for data and not code).
If you run a program that wants to store half a million words in memory and it wants to allocate 1 memory segment for each word, it thinks that would be 4K (standard page size) * 512K = 2048 MB or 2GB.
But if each page is 2M, for 512K entries, it would take 1024*1024 MB, or about one TB of space. Most systems would have to have a large paging file to support that, which is guaranteed to cause the latency associated with swapping.
If you had a system with 1-2TB of memory in it, it would still be rather slow as there is no way that memory would fit in the cpu cache.
The transparent part has to do with whether or not an application can use 'huge pages' without being aware that they exist (i.e. needing NO changes to the program, thus "transparent"). madvise refers to the name of the system call that can tell the kernel how your application wants to use memory.
One of those calls can tell the kernel what areas of memory should be best handled in 'huge pages'. That's the ideal situation - programs that run best with large contiguous memory segments ask for them and generally run faster because they need 512x fewer pages to map their data into active memory. This is very beneficial as system memory sizes grow.
For a VM that fits into a 64M memory space, it usually would be more efficient to map that in via 32 'map' calls vs. 16384 'map' calls, meaning the host would likely be more efficient running that program with transparent huge pages (while the client would likely use normal size pages, assuming it simply used memory as a linear address space). If using huge pages on the host requires that the client also use huge pages, the client would only have 32 things it could map into memory and would likely suffer performance problems.
I don't know which vm techs allow for treating memory differently from the host vs. the client viewpoint.
Hopefully this gives a clearer idea about what these options mean and when they might be useful...(?)
If not, feel free to ask more questions and I can try to answer them, though if you want me to respond more quickly, sending a copy directly to me would likely have me seeing it more quickly...
Linda
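The page-count arithmetic in this message can be checked with a short Python sketch. The 2 MiB huge page size comes from the message; the 4 KiB base page size is an assumption matching x86_64, and both helper names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB

BASE_PAGE = 4 * KIB  # ASSUMPTION: x86_64 base page size
HUGE_PAGE = 2 * MIB  # huge page size used in the message


def one_page_per_item(items: int, page_bytes: int) -> int:
    """Memory consumed if every item occupies a page of its own."""
    return items * page_bytes


def pages_to_map(region_bytes: int, page_bytes: int) -> int:
    """Pages needed to map a region, rounded up."""
    return -(-region_bytes // page_bytes)


if __name__ == "__main__":
    words = 512 * 1024  # "half a million words", one page each
    print("base pages:", one_page_per_item(words, BASE_PAGE) // GIB, "GiB")
    print("huge pages:", one_page_per_item(words, HUGE_PAGE) // TIB, "TiB")
    print("64 MiB VM :", pages_to_map(64 * MIB, HUGE_PAGE), "huge pages vs",
          pages_to_map(64 * MIB, BASE_PAGE), "base pages")
```

The one-page-per-word case is the worst case the message describes. A program that packs many small items into each page would see far less growth.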
On Tue, 19 Mar 2019 12:59:24 -0700 L A Walsh <suse@tlinx.org> wrote:
Huge pages are the ones that are larger than the "standard" 4K pages Linux uses on x86_64.
On larger memory systems, say 256GB servers or 128G desktops, it might be useful to use 'always'. At a 2MB page size, you can fit 512 "items" into 1G, so for an 8GB system, that translates to 4096 items that can fit into memory at a time (maybe a bit more if they only use hugepages for data and not code).
If you run a program that wants to store half a million words in memory and it wants to allocate 1 memory segment for each word, it thinks that would be 4K (standard page size) * 512K = 2048 MB or 2GB.
But if each page is 2M, for 512K entries, it would take 1024*1024 MB, or about one TB of space. Most systems would have to have a large paging file to support that, which is guaranteed to cause the latency associated with swapping.
If you had a system with 1-2TB of memory in it, it would still be rather slow as there is no way that memory would fit in the cpu cache.
The transparent part has to do with whether or not an application can use 'huge pages' without being aware that they exist (i.e. needing NO changes to the program, thus "transparent"). madvise refers to the name of the system call that can tell the kernel how your application wants to use memory.
One of those calls can tell the kernel what areas of memory should be best handled in 'huge pages'. That's the ideal situation - programs that run best with large contiguous memory segments ask for them and generally run faster because they need 512x fewer pages to map their data into active memory. This is very beneficial as system memory sizes grow.
For a VM that fits into a 64M memory space, it usually would be more efficient to map that in via 32 'map' calls vs. 16384 'map' calls, meaning the host would likely be more efficient running that program with transparent huge pages (while the client would likely use normal size pages, assuming it simply used memory as a linear address space). If using huge pages on the host requires that the client also use huge pages, the client would only have 32 things it could map into memory and would likely suffer performance problems.
I don't know which vm techs allow for treating memory differently from the host vs. the client viewpoint.
Hopefully this gives a more clear idea about what these options mean and when they might be useful...(?)
Yes, it makes things fairly clear. They're useful on big machines and they're useful with VMs. Since I have a modest machine and never use VMs I question why my system is set to 'always' by default? Especially given that it does apparently have negative effects on programs that I *do* run. And since a VM technology can easily request huge pages, as can other programs that use large amounts of memory in suitable ways, why do they have to be 'transparent'?
if not, feel free to ask more questions and I can try to answer them, though if you want me to respond more quickly, sending a copy directly to me would likely have me seeing it more quickly...
Feel free to modify the reply-to settings if you want personal replies.
On 3/19/2019 1:17 PM, Dave Howorth wrote:
On Tue, 19 Mar 2019 12:59:24 -0700
Yes, it makes things fairly clear. They're useful on big machines and they're useful with VMs. Since I have a modest machine and never use VMs I question why my system is set to 'always' by default?
if not, feel free to ask more questions and I can try to answer them, though if you want me to respond more quickly, sending a copy directly to me would likely have me seeing it more quickly...
Feel free to modify the reply-to settings if you want personal replies.
The normal default used to be Reply-All, but I wouldn't want to have it sent just to me if someone is using reply-to-group -- neither would I want to change it to me in the case where they *didn't* want a quicker response, or primarily wanted to stand on a soapbox and address the group, not really wanting or needing a personal reply. Since you were the person who started the thread, I would submit that you more likely want to address the group and that you didn't need a faster response from me.
FWIW, I agree with you -- setting it to always might be useful on a distro aimed at servers or larger memory machines, but on many desktops with 8-32GB, setting it to madvise would seem the wisest choice of action and most likely to cause least disruption. But like the scheduler settings for no-preempt, voluntary preempt, fully preemptable, it seems someone set them with some particular goal in mind other than causing least disruption. Like it was one of the "doers" who took the attitude of those who do make the rules. Unfortunately, it seems, rarely are such decisions "informed".
Dave Howorth wrote:
On Tue, 19 Mar 2019 12:59:24 -0700 L A Walsh <suse@tlinx.org> wrote:
[snip]
Yes, it makes things fairly clear. They're useful on big machines and they're useful with VMs. Since I have a modest machine and never use VMs I question why my system is set to 'always' by default?
Probably because it is a sane default for SLES and will most likely not impact your performance.
Especially given that it does apparently have negative effects on programs that I *do* run.
I would suggest s/apparently/theoretically/. redis just probed the setting, but it has no idea what effect it will really have. -- Per Jessen, Zürich (1.8°C) http://www.hostsuisse.com/ - dedicated server rental in Switzerland.
On Wed, 20 Mar 2019 08:35:46 +0100 Per Jessen <per@computer.org> wrote:
I would suggest s/apparently/theoretically/. redis just probed the setting, but it has no idea what effect it will really have.
http://antirez.com/news/84
Dave Howorth wrote:
On Wed, 20 Mar 2019 08:35:46 +0100 Per Jessen <per@computer.org> wrote:
I would suggest s/apparently/theoretically/. redis just probed the setting, but it has no idea what effect it will really have.
Sure. I meant theoretically in your case. Having THP enabled in openSUSE is a safe and reasonable default - 99% of installations will not be running redis or memcached and of those 1% remaining, only 1% will be running in environments where the setting will matter. -- Per Jessen, Zürich (9.0°C) http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.
On Wednesday, 20 March 2019 11:01:41 CET, Dave Howorth wrote:
On Wed, 20 Mar 2019 08:35:46 +0100 Per Jessen <per@computer.org> wrote:
I would suggest s/apparently/theoretically/. redis just probed the setting, but it has no idea what effect it will really have.
http://antirez.com/news/84
Anyway, thank you for this thread. I changed the setting from always to madvise and guess what, I have fewer problems now: 3 GB of memory used instead of 4 GB, and fewer problems when running kmail, as it seems there is a problem with a memory leak (after baloo crashing) that appears to be less severe with this setting. I am using postgres, and the change does not seem to negatively affect performance on my (admittedly historical) X201 with 8 GB RAM.
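Before changing the setting, it can help to see which mode is currently active. The kernel lists all modes in the sysfs file named in the redis warning and puts the active one in square brackets, for example `always [madvise] never`. A minimal sketch for reading it follows; the function names are made up for illustration.

```python
from pathlib import Path

# Path taken from the redis warning quoted in this thread.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed word from a line such as 'always [madvise] never'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None  # not Linux, or kernel built without THP


print("active THP mode:", current_thp_mode())
```

Writing a new mode to the same file requires root, and the change does not survive a reboot unless it is made persistent by some boot-time mechanism.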
Dave Howorth wrote:
I've just noticed that Transparent Huge Pages are enabled on my Leap 15.0 system. I don't believe I've done anything to cause that since I didn't know what they were until I recently read the redis log which says:
"WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled."
So why are THP enabled by default? I could understand if they were set to madvise for example, but are they likely to make a huge difference on my 8 GB desktop?
Looking at a couple of Leap 42.3 and Leap 15 systems, with memory varying from 4G to 16G, they all have it enabled. THP is supposedly good for performance machines with larger amounts of memory. -- Per Jessen, Zürich (12.8°C) http://www.dns24.ch/ - your free DNS host, made in Switzerland.
participants (11)
- Andrei Borzenkov
- Carlos E. R.
- Dave Howorth
- David C. Rankin
- David Haller
- Hendrik Woltersdorf
- Josef Fortier
- L A Walsh
- Patrick Shanahan
- Per Jessen
- stakanov