[opensuse-factory] Proposal to remove pyc/pyo from Python on TW
  • From: Alberto Planas Dominguez <aplanas@xxxxxxx>
  • Date: Thu, 04 Oct 2018 16:52:13 +0200
  • Message-id: <8807336.6jyTb4SfVb@lena>

As you know, the Python packages ship the pyc/pyo precompiled files inside
the RPM. This is mostly a good idea, as it makes the first execution of the
Python code faster: the stage where the interpreter compiles the .py code is
skipped.

But this also makes the Python stack a bit fat. Most of the time this is not a
problem, but things are changing. We have JeOS and MicroOS, both minimal
images (built with different goals and technology) that aim to be small and
slim. But we want to include a bit of Python in there, like salt-minion or
cloud-init. And now the relative size of Python is evident.

For Python 2.7 and 3.7 it is possible to remove the pyc files from the system
and instruct the interpreter to avoid recreating the pyc once the code is
executed. The Python interpreter, by default, will compile and store the pyc
on disk for each `import`, but this behavior can be disabled when we call

But this will make the initial execution of a big Python stack a bit slow, as
the pyc needs to be recreated in memory for each invocation. The slowness can
be relevant in some situations, so it is better not to enable this feature.
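
The sentence above is cut off in the archive, so it is not known which switch
the author had in mind. As a minimal sketch, these are the standard CPython
mechanisms that stop the interpreter from writing pyc files: the `-B` command
line flag and the PYTHONDONTWRITEBYTECODE environment variable, both of which
set `sys.dont_write_bytecode`. The helper name below is hypothetical and only
there for illustration.

```python
import os
import subprocess
import sys


def child_dont_write_bytecode(args=(), env_overrides=None):
    """Start a child interpreter and report its sys.dont_write_bytecode.

    Hypothetical helper: it only demonstrates the standard switches and is
    not taken from the original mail.
    """
    # Start from a clean baseline so an inherited setting does not leak in.
    env = {k: v for k, v in os.environ.items()
           if k != "PYTHONDONTWRITEBYTECODE"}
    env.update(env_overrides or {})
    result = subprocess.run(
        [sys.executable, *args, "-c",
         "import sys; print(sys.dont_write_bytecode)"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("-B flag:", child_dont_write_bytecode(["-B"]))
    print("env var:", child_dont_write_bytecode(
        env_overrides={"PYTHONDONTWRITEBYTECODE": "1"}))
```

With either switch the interpreter still compiles each imported module, it
just keeps the result in memory, which is the per-invocation cost described
above.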

But in Python 3.8 there is a new feature, bpo-33499, that recognizes a new
environment variable (PYTHONPYCACHEPREFIX) that changes the place where
__pycache__ is stored [2]. I backported this feature to 3.7 and created a
JeOS image that includes salt-minion. I created a small shim that replaces the
python3.7 binary to enable this cache prefix feature, pointing it to
/var/cache/pycache/<username>, and I removed from the image all the python compiled
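
To make the cache prefix idea concrete, here is a rough sketch of what such a
shim could set up and where the bytecode would end up. It assumes a stock
CPython 3.8 or later, where `sys.pycache_prefix` exists; the backport to 3.7
mentioned above is not part of upstream 3.7. The function names are
hypothetical and this is not the author's shim.

```python
import importlib.util
import sys

# Cache root quoted in the mail; one subdirectory per user.
CACHE_ROOT = "/var/cache/pycache"


def shim_environment(username, base_env):
    """Return a copy of base_env with PYTHONPYCACHEPREFIX set for username.

    Hypothetical helper: a real shim would pass this environment to the
    real interpreter, for example through os.execve().
    """
    env = dict(base_env)
    env["PYTHONPYCACHEPREFIX"] = f"{CACHE_ROOT}/{username}"
    return env


def bytecode_location(source_path, prefix):
    """Return where CPython 3.8+ would cache bytecode for source_path."""
    previous = sys.pycache_prefix
    sys.pycache_prefix = prefix
    try:
        # cache_from_source() honours sys.pycache_prefix on 3.8+.
        return importlib.util.cache_from_source(source_path)
    finally:
        sys.pycache_prefix = previous


if __name__ == "__main__":
    env = shim_environment("alice", {"PATH": "/usr/bin"})
    print(env["PYTHONPYCACHEPREFIX"])
    print(bytecode_location("/usr/lib/python3/site-packages/salt/minion.py",
                            env["PYTHONPYCACHEPREFIX"]))
```

With a prefix set, the interpreter mirrors the source tree under the prefix
instead of writing a __pycache__ directory next to each module, so the
packaged tree can stay read-only and free of pyc files.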

I chose salt-minion because SaltStack is a relevant Python codebase. I needed
to port 150 Python libraries to 3.7 to create the first PoC.

The PoC works properly locally. I still have some bits that I need to publish
in the repo, but the general idea seems to work OK. I can also publish the
size gain for the ISO with and without the patch, to have more data to

I also estimated some gains for different scenarios. For example in a normal
TW installation:

* Python 2.7 + 3.6
- pyc/pyo: 127M total
- py: 109M total

* Python 3.6 only
- pyc/pyo: 91M total
- py: 70M total

The pyc/pyo size is larger than the py code size, so we can potentially halve
the size of the Python 3 stack.
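
For anyone who wants to reproduce this kind of estimate on their own system,
a minimal sketch follows. It is not the method used for the numbers above
(the mail does not say how they were measured); it simply walks a directory
tree and sums the file sizes of sources and compiled files. The function name
is hypothetical.

```python
import os


def bytecode_vs_source_sizes(root):
    """Sum the sizes in bytes of .py files and of .pyc/.pyo files under root."""
    totals = {"py": 0, "pyc/pyo": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlink targets twice
            if name.endswith(".py"):
                totals["py"] += os.path.getsize(path)
            elif name.endswith((".pyc", ".pyo")):
                totals["pyc/pyo"] += os.path.getsize(path)
    return totals


if __name__ == "__main__":
    import sys
    target = sys.argv[1] if len(sys.argv) > 1 else "/usr/lib"
    for kind, size in bytecode_vs_source_sizes(target).items():
        print(f"{kind}: {size / 2**20:.1f}M total")
```

Note that this counts apparent file sizes, not the disk blocks used or the
compressed size inside an RPM or an image, so the figures are only a rough
guide.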

Maybe for a normal TW installation the absolute gain is not much (91M). But
for other scenarios it can be relevant, like in OpenStack Cloud, where the
size of the Python code is big. I made some calculations based on all the
different OpenStack services:

* Python 2.7 OpenStack services
- pyc/pyo: 1.2G total
- py: 804M total

Saving 1.2G on each node is a more important number.

So, my proposal is to remove the pyc from the Python 3 packages and enable the
cache layer on Tumbleweed starting with Python 3.7. I do not know whether to
do that by default or under certain configurations, as I am not sure how to that feature

Any ideas? Any suggestions? What do you think if I follow this path?

Some ideas that I have: add a new %pycache-clean macro that removes the
__pycache__ from the RPM, add a new rpmlint check to make sure that there are
no pyc files in a python3 package, update the wiki, and update the py2pack
code to generate good python3 spec files for openSUSE.
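
As a rough illustration of the first two ideas, the logic could look like the
sketch below. The real %pycache-clean macro and rpmlint check do not exist
yet, so everything here is an assumption: the function names are hypothetical
and the code only shows the intended behavior on a build root, not how it
would be wired into RPM or rpmlint.

```python
import os
import shutil


def clean_pycache(buildroot):
    """Remove every __pycache__ directory under buildroot.

    Sketch of what a %pycache-clean style macro could do; returns the
    list of removed directories.
    """
    removed = []
    for dirpath, dirnames, _filenames in os.walk(buildroot):
        if "__pycache__" in dirnames:
            target = os.path.join(dirpath, "__pycache__")
            shutil.rmtree(target)
            dirnames.remove("__pycache__")  # do not descend into it
            removed.append(target)
    return removed


def find_bytecode(buildroot):
    """Return all .pyc/.pyo files under buildroot.

    Sketch of the condition a packaging check could flag: an empty
    result means the package ships no precompiled files.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(buildroot):
        for name in filenames:
            if name.endswith((".pyc", ".pyo")):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

A check built on `find_bytecode` would also catch legacy pyc files placed
next to the sources, which removing only the __pycache__ directories would
miss.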

But if most of the community does not agree with this approach, I can drop
the idea : )


p.s: Sorry if the message is delivered twice, once from the .com account.
My bad.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham
Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany
