Michael Calmer wrote:
It is not only this. Please think of data centers with thousands of virtual machines whose virtual hard disks are placed on a storage system.
This storage has some RAID level, and admins are really fighting for every MB the OS does not need, simply because a few MB multiplied by 1000 or 2000 makes a difference in the costs that have to be spent on the storage. And if we talk about other architectures where storage is extra expensive, this hurts even more.
Sorry, based on my experience with customers running mid-size data centers, I don't buy this cost argument. And really large data centers (Google, Facebook etc.) will run their own Linux distribution with a partitioning scheme and file-system hierarchy that allows mounting read-only parts of the OS from a central location, which saves lots of storage space. Having said this: if you're really eager to optimize for small/reusable data storage in large data centers, there are more promising measures to reduce storage size.
If I look at JeOS, I can see we do not even have the standard kernel installed, but only kernel-default-base, just to save some MB.
But IIRC this was done because of initrd size.
The next thing is containers, which are becoming really popular these days. Every container runs only one application but brings a Python stack if the software is written in Python.
The reason why applications bring their own module stack (pip install in a virtualenv, own devpi index) is that application programmers don't want to deal with the arbitrary changes made by OS packagers, especially since OS packagers usually do not test the Python modules they update. I will probably also take this route for my Æ-DIR to avoid having to change my Ansible code each time OS packagers make random package name changes etc. And this will also relieve me of packaging Python modules.

Ciao, Michael.