Re: [opensuse-factory] Proposal to remove pyc/pyo from Python on TW
  • From: Alberto Planas Dominguez <aplanas@xxxxxxx>
  • Date: Fri, 05 Oct 2018 10:23:56 +0200
  • Message-id: <4652492.78I8LDDklW@lena>
On Thursday, October 4, 2018 7:23:45 PM CEST Robert Schweikert wrote:
On 10/4/18 10:52 AM, Alberto Planas Dominguez wrote:

But this also makes the Python stack a bit fat. Most of the time this is
not a problem, but things are changing. We have JeOS and MicroOS, both
minimal images (built with different goals and technology) that aim to
be small and slim.

Can you share the definition of "small and slim"? What is the target
size we want to get to, and why does it matter if the image is a "bit"
bigger?

I can think of one trade-off: in the cloud, when a new instance is created,
the image file is copied. Therefore a smaller image improves the overall
instance start-up, as there is less data to copy. However, from my
experience in GCE, where we at some point built 8 GB images and then
switched to 10 GB images, there was no noticeable difference between the
two image sizes w.r.t. the start-up time of an instance.

That is a good example, indeed. The main problem is that the ratio of .pyc
to .py size is around 1.3 to 1.5, so for each KB of .py I have to add 1.3 to
1.5 KB of .pyc. We are more than doubling the space.
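
The ratio can be checked on any installed tree. The following is only a
minimal sketch: the function name bytecode_ratio is made up for
illustration, and it simply sums file sizes, so it says nothing about the
compressed size inside an rpm.

```python
import os


def bytecode_ratio(root):
    """Return (py_bytes, pyc_bytes, ratio) for all files below root.

    ratio is pyc_bytes / py_bytes, i.e. how many bytes of bytecode are
    shipped per byte of source. Hypothetical helper, not an existing tool.
    """
    py_bytes = 0
    pyc_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(".py"):
                py_bytes += os.path.getsize(path)
            elif name.endswith((".pyc", ".pyo")):
                pyc_bytes += os.path.getsize(path)
    ratio = pyc_bytes / py_bytes if py_bytes else 0.0
    return py_bytes, pyc_bytes, ratio
```

Calling it on the directory that holds the Python 3 standard library or
site-packages (the exact path depends on the system) gives the numbers for
that tree.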

I can see how removing more than half of the size can improve the overall
speed of upgrading a system, the size of the rpm and drpm, the download time
of JeOS and MicroOS, or the time to upload something to Cinder.

But we want to include a bit of Python in there, like salt-minion or
cloud-init. And now the relative size of Python is evident.

Well, especially for cloud-init: at the last couple of get-together
events of upstream contributors, start-up time for cloud-init was a big
discussion point. A lot of effort has gone into making cloud-init
faster. The results of this effort would be eliminated with such a move.

I plan to measure this. The first boot can be slower, but I do not have
numbers yet. This argument can indeed be relevant and make the proposal a
bad one, but I do not think that the big chunk of time goes into the pyc
generation in the cloud-init case, as there are more architectural
problems in that.
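
As a rough starting point for such a measurement, the cost of compiling
source to bytecode can be compared with the cost of loading bytecode that
is already cached. This is only a sketch under strong assumptions: the
function name compile_vs_load_seconds is invented here, it works on one
in-memory source string, and it ignores disk I/O, imports of dependencies
and everything else cloud-init does at boot.

```python
import marshal
import time


def compile_vs_load_seconds(source, filename="<sketch>", repeat=100):
    """Return (compile_seconds, load_seconds) for `repeat` iterations.

    compile_seconds: time spent compiling the source text to a code object,
    which is the work needed when no .pyc exists yet.
    load_seconds: time spent unmarshalling the same code object, which
    approximates reading it back from an existing .pyc.
    """
    # Compile once up front so the marshalled blob exists even if repeat is 0.
    code = compile(source, filename, "exec")
    blob = marshal.dumps(code)

    start = time.perf_counter()
    for _ in range(repeat):
        compile(source, filename, "exec")
    compile_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for _ in range(repeat):
        marshal.loads(blob)
    load_seconds = time.perf_counter() - start

    return compile_seconds, load_seconds
```

Real numbers for the first-boot case still have to come from timing the
actual service start on an image with and without the pyc files.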

Having these numbers is interesting, but what is the goal of the image
you want to build and what is the benefit of the smaller size for JeOS
or MicroOS?

I have one locally that I am replicating in OBS. It will appear today or
this weekend in the repo pointed to in the first email.

Maybe for a normal TW installation the absolute gain is not much (91M).

Well, it is not just the install. We would be penalizing every user with
a start-up time penalty to save 91M. Sorry, that appears to me to be an
optimization for the corner case at the expense of the most common path.

I do not see the penalty, sorry. The proposal is not to wipe out the pyc
files and use -B when calling the Python code; it is about moving the pyc
generation to /var/cache (or some other cache location) and delaying it
until the first boot. The difference is that the pyc files will still be
there, maybe on a RAM disk, or on a faster fs, or on the same old hard disk
as always.
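
To illustrate the idea of keeping the pyc files outside the source tree:
Python 3.8 and later expose sys.pycache_prefix (also settable through the
PYTHONPYCACHEPREFIX environment variable). Whether this is the cache layer
meant in this proposal is an assumption, and the helper name
compile_into_cache is made up for this sketch.

```python
import py_compile
import sys


def compile_into_cache(source_path, cache_root):
    """Byte-compile source_path, storing the .pyc below cache_root.

    Assumes Python 3.8+, where sys.pycache_prefix exists. The prefix is
    restored afterwards so the rest of the process is not affected.
    Returns the path of the written .pyc file.
    """
    previous = sys.pycache_prefix
    sys.pycache_prefix = cache_root
    try:
        # With no explicit cfile, py_compile derives the target path from
        # the source path and the active pycache prefix.
        return py_compile.compile(source_path, doraise=True)
    finally:
        sys.pycache_prefix = previous
```

With the prefix set for the whole interpreter, a normal import writes its
pyc below the cache directory in the same way, and no __pycache__ directory
is created next to the source.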

If there is a considerable penalty on the first launch, we can think of
alternatives, like making the feature optional or prepopulating a subset of
the stack.
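
Prepopulating a subset could look roughly like the sketch below, which uses
the standard compileall module. The function name prepopulate and the idea
of passing an explicit list of directories are assumptions for
illustration; which packages belong in the subset is left open here.

```python
import compileall


def prepopulate(package_dirs):
    """Byte-compile every directory in package_dirs ahead of time.

    Returns True only if all directories compiled without errors.
    Hypothetical helper; the selection of directories is up to the caller.
    """
    all_ok = True
    for path in package_dirs:
        # quiet=1 prints errors only; compile_dir recurses into subpackages
        # and returns a true value when every file compiled successfully.
        ok = compileall.compile_dir(path, quiet=1)
        all_ok = bool(ok) and all_ok
    return all_ok
```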

So, my proposal is to remove the pyc from the Python 3 packages and enable
the cache layer on Tumbleweed starting with Python 3.7. I do not know
whether to do that by default or only under certain configurations, as I
am not sure how to make that feature optional.

Any ideas?

IMHO there are mechanisms for you to do this for the corner cases, i.e.
JeOS and MicroOS image builds. It is very easy with kiwi to run "find /
-name '*.pyc' | xargs rm" during the image build stage. This gives you
what you are after, a smaller image size, without penalizing everyone else.

Well, this is how I am testing it now.
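
For reference, the same clean-up can be sketched in Python, which also
reports how much space was freed. The function name remove_bytecode is
made up; like the find command above, it deletes files without asking, so
it should only be pointed at an image root that is being built.

```python
import os


def remove_bytecode(root):
    """Delete every .pyc file below root.

    Returns (files_removed, bytes_freed). Hypothetical helper that mirrors
    the find/xargs one-liner; it leaves empty __pycache__ directories behind.
    """
    files_removed = 0
    bytes_freed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                # lstat so that a symlink is counted by its own size.
                bytes_freed += os.lstat(path).st_size
                os.remove(path)
                files_removed += 1
    return files_removed, bytes_freed
```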

What do you think if I follow this path?

I oppose this path. We'd be penalizing every start-up of every instance
in EC2. We have feature requests to improve our boot performance and
this is counteracting our efforts.

Not true, as the cache will be populated after the first boot. And again, by
far the slowest path is not the pyc generation. But I agree that I need to
deliver the numbers.

--
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham
Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany

