On Thu, Aug 27, 2015 at 11:50:35PM -0400, Jeff Mahoney wrote:
On 8/27/15 10:04 PM, Navin Parakkal wrote:
Hi, I would like to know why SUSE, through SLES and openSUSE, still uses CONFIG_SLAB as the allocator instead of CONFIG_SLUB?
I've seen worst-case scenarios locking up on buffer_head when there is fragmentation and the number of slabs is in the millions.
Is there any particular case where it performs better than SLUB? I'm not investigating low-end systems, where SLOB is another option.
Mel can probably comment further, but there are two answers.
- Our testing hasn't shown any clear winner under all workloads
between the two allocators and our mm experts have many years of experience working with the slab code.
This is a repeat response as my previous message got held up by the moderator because firstname.lastname@example.org is not subscribed. Sorry to those who get it twice.
This question was raised before and my answer at the time was http://lists.opensuse.org/opensuse-kernel/2014-06/msg00047.html . Much of what I said there is still relevant even if it lacked data at the time. I recently looked at the performance of SLAB vs SLUB on 4 mid-range servers using kernel 4.1. The objective was to see if there was a compelling reason to switch to SLUB in openSUSE. I didn't release the report as it would need a lot of polish before it'd be fit for public consumption.
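For anyone wanting to reproduce this sort of comparison, the first step is confirming which allocator a given kernel was built with and what the slab caches look like under load. A minimal sketch, assuming the distribution ships its config in /boot as SLES and openSUSE do:

```shell
# Which slab allocator is this kernel built with?
# Exactly one of CONFIG_SLAB, CONFIG_SLUB or CONFIG_SLOB should be =y.
grep -E '^CONFIG_SL[AOU]B=' /boot/config-"$(uname -r)"

# Inspect per-cache usage, e.g. to watch buffer_head growth under
# fragmentation. -o prints once and exits, -s a sorts by active objects.
sudo slabtop -o -s a | head -n 15

# The raw data behind slabtop is also available directly:
sudo grep buffer_head /proc/slabinfo
```

The config path is the usual one on SUSE kernels; on other distributions /proc/config.gz may be the only copy available.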
tldr: There is no known compelling reason to make the switch but there is no known barrier to it either. SLUB is potentially easier to debug but it has some potential degradation over long periods of time that is not quantified. There were also some gaps in the study that ideally would be closed before making a switch. Right now, I'm favouring keeping the status quo but would be open to hearing about a realistic workload or hardware configuration that clearly distinguished between them.
SLOB was ruled out as an option because it's known to not scale on even low-end hardware. It really is for embedded environments with constrained memory. The following workloads were evaluated:

	dbt5 on ext4
	pgbench on ext4
	netperf with multiple sizes, both TCP and UDP, streaming and RR
	netperf with multiple sizes, both TCP and UDP RR
	netperf with multiple instances with 256 bytes fixed size
	Hackbench with pipes
	Hackbench with sockets
	Deduplication and compression of large files
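The exact parameters used in the study are not given above; as an illustration only, representative invocations of the netperf and hackbench portions (assuming netperf with a netserver running on the target, and the rt-tests build of hackbench) might look like:

```shell
# netperf TCP round-trip with a small fixed payload
# (-t selects the test type, -r sets request,response sizes)
netperf -H 192.168.0.2 -t TCP_RR -- -r 256,256

# netperf UDP streaming test against the same host
netperf -H 192.168.0.2 -t UDP_STREAM

# hackbench with pipes: 10 groups of sender/receiver tasks,
# 100 loops of messages each
hackbench --pipe -g 10 -l 100
```

The host address, group counts and payload sizes here are placeholders, not the study's configuration.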
Ultimately, there was little difference in performance between SLAB and SLUB but crucially, neither was consistently better or worse than the other. I'm not going to write up the results in email format but by and large, they are not that interesting. Instead here is a slightly edited version of the study's conclusion.
=== begin report extract ===
Early evaluations of SLAB and SLUB indicated that SLUB was slower in a variety of workloads. The shorter paths and smaller management overhead were theoretically beneficial but offset by a reliance on high-order pages and a long slow path. In recent years there have been a number of aggressive attempts to improve the performance of SLUB while SLAB remained stable. This study indicates that in many cases there is little difference in performance. In cases where there are large performance differences, neither SLAB nor SLUB is the consistent winner. For network workloads, it appeared that SLUB was generally faster for TCP-based workloads but we cannot make a general statement on whether openSUSE users' workloads prefer TCP over UDP performance.
The observation of this study is that there are no known performance-related barriers to using SLUB in openSUSE if it's based on a 4.1 kernel, albeit there is also no compelling reason to make the switch. With no known distinction in performance, the fact that SLUB can be debugged at runtime is compelling and the lower memory footprint is attractive. This is offset by some limitations in the study.
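The runtime debugging referred to here is SLUB's slub_debug facility and its sysfs interface; a sketch of how it is typically used, with buffer_head as an example cache:

```shell
# SLUB debugging can be enabled at boot without rebuilding the kernel,
# via the kernel command line, e.g.:
#   slub_debug=FZP              F=sanity checks, Z=red zoning, P=poisoning
#   slub_debug=FZP,buffer_head  restrict debugging to one cache
#
# At runtime, SLUB exposes per-cache state under /sys/kernel/slab:
cat /sys/kernel/slab/buffer_head/objects
cat /sys/kernel/slab/buffer_head/order

# SLAB has no equivalent runtime switch; its debugging
# (CONFIG_DEBUG_SLAB) requires a rebuilt kernel.
```

Note that caches may be merged by SLUB, so a named cache can appear as an alias of another under /sys/kernel/slab.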
A key limitation of the network portions of this study was the fact that it was mostly over localhost. An evaluation was conducted on a pair of machines that match the description but the results showed no difference between SLAB and SLUB. The results are not that interesting as they indicate that the network speed was the limiting factor. It would be necessary to install faster network cards or a faster switch to properly conduct that evaluation. The machines that were used in 2007 and 2012 to demonstrate network-related performance regressions were also 4-node machines, which is an important factor as SLAB and SLUB differ in how remote data is managed. That configuration should be replicated if possible.
A second important gap in the study was an evaluation of an in-memory database. There was a configuration included but the results were not reported. Kernel 4.1 has a regression that caused the workload to swap and other data indicates that this regression was introduced between 4.0 and 4.1. The root cause of this regression is not known at the time of writing.
The final recommendation is to keep the status quo until the gaps are addressed or a compelling reason is found to justify the risk of switching.