On Tue, May 27, 2014 at 11:49:08AM -0400, Cristian Rodríguez wrote:
> Just out of curiosity, why do SUSE kernels default to CONFIG_SLAB and not CONFIG_SLUB? I just checked the rest of the current distribution world and everyone else takes the kernel default.
The upstream default of SLUB was not based on performance data. This is the original commit that set the default: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=a0... To the best of my recollection, it was made the default so that it would be tested by the wider community. Subsequent kernel releases carried a large number of patches to address performance concerns in SLUB. It has been some time since an upstream evaluation was done, and at that time SLUB was found to have problems on some workloads, particularly networking-intensive ones, which are sl[a|u]b intensive. Performance changes to SLUB are now relatively rare, but I'm not aware of any recent SLUB vs SLAB evaluation that would decide the matter one way or the other.
It's true that a number of distributions have changed their default to SLUB, but to the best of my knowledge this was not based on any performance evaluation either. The distributions changed their config to match the upstream default in the mistaken belief that the upstream choice must have been made for performance reasons.
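For anyone wanting to check which allocator a given kernel was built with, the following is a minimal sketch. It assumes the kernel config is available as plain text (on many distributions it is installed as /boot/config-<release>, though that location is not guaranteed); the function name slab_allocator is purely illustrative.

```python
import re


def slab_allocator(config_text):
    """Return 'SLAB', 'SLUB' or 'SLOB' if that allocator is enabled in the
    given kernel config text, or None if no such option is set.

    Lines such as '# CONFIG_SLUB is not set' and related options such as
    CONFIG_SLUB_DEBUG=y are deliberately not matched.
    """
    for line in config_text.splitlines():
        m = re.match(r"^CONFIG_(SLAB|SLUB|SLOB)=y\s*$", line)
        if m:
            return m.group(1)
    return None
```

To use it, read the config file into a string and pass the contents to slab_allocator(). Note that this only reports what was configured; it says nothing about why.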
For better or worse, SLAB has known performance characteristics. I'm not aware of any performance bug reports in openSUSE that identified SLAB as a bottleneck that SLUB would address. While SLAB has some design concerns, such as the fact that alien caches can grow to a large size, it has also been found that the performance of some workloads depends on those large remote caches. In some circumstances it is easier to debug problems using SLUB, but that in itself is not a justification for the change.
SLUB is also not without its performance concerns. IIRC, getting the best performance out of SLUB requires the use of large contiguous pages to avoid list locks. While this will show better results for benchmarks that fit in memory, performance can degrade if the machine is under memory pressure, and the ability of SLUB to use contiguous pages may decline the longer the system is up. To the best of my knowledge, this potential problem has never been properly analysed. Furthermore, while SLUB's extensive use of atomic operations makes intuitive sense, it is not guaranteed to be a win in all cases, which makes its performance profile harder to predict.
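As a rough way to watch how the supply of contiguous pages changes with uptime, the free-list counts in /proc/buddyinfo can be summarised. The sketch below assumes the usual layout of that file (one line per node and zone, followed by the number of free blocks at each order starting from order 0). The function names and the choice of min_order are illustrative only, and free-list fragmentation is at best a proxy for what SLUB can actually allocate.

```python
def free_blocks_by_order(buddyinfo_text):
    """Sum the free block counts per order across all nodes and zones."""
    totals = []
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        # Expected shape: "Node", "0,", "zone", "<name>", count0, count1, ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        counts = [int(p) for p in parts[4:]]
        if len(counts) > len(totals):
            totals.extend([0] * (len(counts) - len(totals)))
        for order, n in enumerate(counts):
            totals[order] += n
    return totals


def high_order_fraction(totals, min_order):
    """Fraction of free base pages held in blocks of at least min_order."""
    # A free block of a given order covers 2**order base pages.
    pages = [n << order for order, n in enumerate(totals)]
    total = sum(pages)
    return sum(pages[min_order:]) / total if total else 0.0
```

Sampling this periodically on a long-running machine, for example by passing the contents of /proc/buddyinfo to free_blocks_by_order() and then calling high_order_fraction() with a fixed min_order, would show whether high-order blocks become scarcer over time. It would not by itself show any effect on SLUB performance.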
Due to the lack of concrete performance data identifying bottlenecks in SLAB, there has been little motivation to evaluate SLUB vs SLAB in the openSUSE context. Clearly identifying a sensible workload (not a microbenchmark) that shows, across a range of machine sizes, a SLAB bottleneck that is fundamental to its design may be grounds for revisiting this. However, the bar that would convince me to do a full evaluation would be relatively high, much higher than "just because".