Re: [opensuse-factory] Proposal to remove pyc/pyo from Python on TW

9 Oct 2018

      On Monday, October 8, 2018 7:17:08 PM CEST Robert Schweikert wrote:
...
On 10/8/18 12:07 PM, Alberto Planas Dominguez wrote:
...
[Dropping a very unproductive content]
I am really trying hard to leave out "color commentary" and adjectives,
it would be nice to see the effort reciprocated.
Indeed. Sorry if my choice of words was not correct. Maybe in Spanish I would 
use something more neutral.

[...]
...
...
When cloud-init is loaded, Python will read all the `import`s and the
required subtree of pyc will be generated before the execution of _any_
Python code. You are not compiling only the pyc from cloud-init, but for
all the dependencies that are required.
Unless there is some lazy loading in cloud-init based on something like
stevedore (which I do not see), or it is full of `import`s inside functions
and methods, the pyc generation of the required subtree will be the first
thing that Python will do.
I think we are in agreement; the problem appears to be that we have a
different idea about what
"create the initials pyc needed for cloud-init:"
means. For me this arrived as what you stated, i.e. "pyc needed for
cloud-init" which says nothing about dependencies. For you this
statement appears to imply that the pyc files for the dependencies were
also generated in this test.
More explicit and concise communication would certainly help.
I see, I hope it is clear now. The pyc subtree is generated before any 
Python code runs, as the files are generated as soon as the `import` 
statement is parsed.
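
To make the point above concrete, here is a minimal sketch showing that a
plain `import` is enough to produce the pyc file. The module name
`pyc_demo_mod` and the temporary directory are hypothetical, chosen only for
the demonstration; nothing here touches cloud-init or site-packages.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

# Bytecode writing can be disabled via PYTHONDONTWRITEBYTECODE; enable it
# explicitly so the demonstration behaves the same everywhere.
sys.dont_write_bytecode = False


def import_and_find_pyc(module_name="pyc_demo_mod"):
    """Create a throwaway module, import it, and report on its pyc file.

    Returns (module value, pyc existed before import, pyc exists after).
    """
    workdir = pathlib.Path(tempfile.mkdtemp())
    source = workdir / f"{module_name}.py"
    source.write_text("VALUE = 42\n")

    # Ask the import system where the cached bytecode for this file lives.
    pyc = pathlib.Path(importlib.util.cache_from_source(str(source)))
    existed_before = pyc.exists()

    sys.modules.pop(module_name, None)
    sys.path.insert(0, str(workdir))
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(module_name)
    finally:
        sys.path.remove(str(workdir))

    return module.VALUE, existed_before, pyc.exists()


if __name__ == "__main__":
    print(import_and_find_pyc())
```

The same mechanism applies to every module pulled in through the `import`
chain, which is why the dependencies are compiled together with cloud-init
itself.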

[...]
...
...
...
In any case your
second comparison appears to be making a leap that I, at this point, do not
agree with. You are equating the generation of pyc code in a "hot
system" to the time it takes to load everything in a "cold system". A
calculation of percentage contribution of pyc creation in a "cold
system" would only be valid if that scenario were tested. Which we have
not done, but would certainly not be too difficult to test.
I do not get the point. In the end we measured the proportion of the time
Python spends generating the pyc for cloud-init and all the dependencies
needed for the service, in relation to the overall time that cloud-init
spends during the initialization of the service.
I am not sure what you mean by hot and cold here, as I removed all the
pyc from site-packages to measure the ratio between the time spent
generating the pyc and the time that cloud-init needs to start the
service.
OK, maybe I can explain this better. We agree that there is a
significant difference in the cloud-init execution time between initial
start up of the VM vs. a reboot, even if the cloud-init cache (not the
pyc files) is cleared. This implies that something is working behind the
scene to our advantage and makes a VM reboot faster w.r.t. cloud-init
execution when compared to the "start a new instance" scenario. Given that
we do not know what this "makes it work faster" part is, we should not
draw any conclusion that the pyc build will take equally as short/long
of a time on initial start up as it takes in a "reboot the VM" scenario.
This will have to be tested.
I understand now. It looks like the version of cloud-init in place for SLE12 
SP3 does not have the `clear` command. In any case I inspected site-packages 
and all the pycs are there after the reboot, so I can consider the time spent 
generating pycs to be constant, independent of whether the cloud-init cache 
is already populated or not, as the cache only affects the non-constant part 
of the total time that cloud-init needs to start.

In any case I will look into that.
...
...
...
...
The cost is amortized, and the corner case, IMHO, is more yours than
mine.
Your case is a fresh boot of a just installed EC2 VM. I agree that there
is a penalty of ~10% (or a 0.54% in my SLE12 SP3 OpenStack case), but
this is only for this first boot.
Which is a problem for those users that start a lot of instances to
throw them away and start new instances the next time they are needed.
This would be a typical autoscaling use case or a typical test use case.
Correct. The 0.205s will be added for each fresh VM. Am I correct to
assume that this is also a scenario where the resize in the initial boot
is happening? If so, the overall impact is much less than the 10% that we
are talking about, and closer to the 0.5% that I measured in
OpenStack.
The data I presented as an example was generated with a 10GB image size
for the creation of an instance with a 10GB root volume size. So there
is a, what should be a negligible, contribution from the growpart
script, which is called by cloud-init and runs in a subprocess.
I call it "negligible" in this case as growpart will exit very fast if
no resizing is required. It still takes time for process start up etc.
but again I consider this negligible.
But you are correct, by increasing the time it takes for other things
cloud-init calls, such as root volume resize, one can decrease the
percentage of time allocated to pyc creation.
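
The dilution effect described above can be sketched with a few lines of
arithmetic. Only the 0.205 s figure comes from this thread; the total
cloud-init times below are hypothetical values picked to show the trend, not
measurements.

```python
def pyc_share(pyc_seconds, total_seconds):
    """Return pyc generation time as a percentage of the total run time."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * pyc_seconds / total_seconds


PYC_SECONDS = 0.205  # figure quoted in this thread

if __name__ == "__main__":
    # Hypothetical totals: a short boot, a boot with a slow resize, and a
    # boot that runs a long user-data script.
    for total in (2.0, 40.0, 300.0):
        share = pyc_share(PYC_SECONDS, total)
        print(f"total={total:6.1f}s  pyc share={share:5.2f}%")
```

The absolute cost stays at 0.205 s in every row; only the denominator
changes, which is why the percentage alone says little about the real cost.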
...
If I would want to take this to an extreme I could start a process
during user data processing that runs for several minutes and thus I
could make an argument that pyc creation takes almost no time. However,
that would be misleading.
(Read the next as a joke) This point is where my vocabulary becomes too 
scarce to choose a colorless word to answer this argument. But I understand 
the idea behind it :-)
...
I think in an effort to arrive as close as reasonably possible at the
"real cost" of the pyc generation for the cloud-init example, we should
minimize the externally executed processes by cloud-init, such as
minimizing the runtime for growpart, which is done by not manipulating
the instance root volume size as compared to the image size.
Very true, removing everything that is slow during boot will make the pyc 
time more relevant in the total amount. IMHO this is very far in the 
future, if it happens at all.

But I encourage you and all the cloud stakeholders to make this argument 
relevant, as this will make the distribution very attractive for Cloud.
...
...
...
It is relatively easy to calculate a cost for this with some estimates.
If my test for my application needs 2000 (an arbitrary number I picked)
test instances, and every test instance takes .2 seconds longer to boot,
to use your number, then the total time penalty is ~6.7 minutes. If this
test uses an instance type that costs me $10 per hour the slow down
costs me ~ $1.1 every time I run my test. So if the test case runs once
a week it would amount to ~$57 per year.
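
The estimate above can be reproduced as follows. This is a sketch under the
stated numbers (2000 instances, 0.2 s per boot, $10 per hour, one run per
week); it additionally assumes 52 runs per year and that the extra time is
billed linearly, which real cloud billing may not do.

```python
INSTANCES = 2000              # arbitrary number from the estimate
EXTRA_SECONDS_PER_BOOT = 0.2  # pyc generation penalty per fresh instance
PRICE_PER_HOUR = 10.0         # dollars, instance type from the estimate
RUNS_PER_YEAR = 52            # assumption: one test run per week


def pyc_cost(instances, extra_seconds, price_per_hour, runs_per_year):
    """Return (total extra seconds, cost per run, cost per year)."""
    total_seconds = instances * extra_seconds
    cost_per_run = total_seconds / 3600.0 * price_per_hour
    return total_seconds, cost_per_run, cost_per_run * runs_per_year


total_seconds, per_run, per_year = pyc_cost(
    INSTANCES, EXTRA_SECONDS_PER_BOOT, PRICE_PER_HOUR, RUNS_PER_YEAR
)

if __name__ == "__main__":
    print(f"extra time per run: {total_seconds:.0f} s "
          f"({total_seconds / 60:.1f} min)")
    print(f"cost per run:  ${per_run:.2f}")
    print(f"cost per year: ${per_year:.2f}")
```

Without intermediate rounding the yearly figure lands slightly under $58;
rounding the per-run cost to $1.1 first gives the ~$57 quoted above.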
Imagine the cost of the kiwi resize operation; it must be around some
thousands of dollars.
But you are right. If there is a weekly re-scaling of 2000 instances
during the 52 weeks of a year, you can measure the cost of the pyc
generation.
It is my understanding that CPU time is cheap relative to network
transfer and storage. Can we measure the savings in network and storage
here?
Not in the Public Cloud, there is no data. Network data into the
framework is always free and the size of the root volumes in our images
is already 10GB (30GB in Azure) and thus offers up ample space for the
pyc files. Meaning there is no gain if the actual disk space used by the
packages we install is smaller and we have more empty space in the 10GB
(30GB in Azure) image.
So uploading ISO / qcow2 images and storing them for a long period is free?

[...]
...
...
Or better, the user can add a `python -m compileall` in the kiwi
config.sh,
that will populate /var/cache for the cloud images only.
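
For reference, `python -m compileall` is a front end to the standard
`compileall` module. The sketch below pre-compiles a small throwaway tree the
same way; the `pkg` package and the temporary directory are hypothetical
stand-ins for site-packages. Where the pyc files end up depends on the
interpreter configuration: by default they go into `__pycache__` next to the
sources, so populating /var/cache assumes an interpreter set up with a pycache
prefix.

```python
import compileall
import importlib.util
import pathlib
import tempfile


def precompile_tree(root):
    """Byte-compile every .py file under root.

    Returns True when all files compiled without errors.
    """
    return bool(compileall.compile_dir(str(root), quiet=1, force=True))


# Build a tiny package to stand in for a real site-packages tree.
demo_root = pathlib.Path(tempfile.mkdtemp())
(demo_root / "pkg").mkdir()
(demo_root / "pkg" / "__init__.py").write_text("")
(demo_root / "pkg" / "mod.py").write_text("def answer():\n    return 42\n")

ok = precompile_tree(demo_root)

# Location of the compiled file as the import system would look it up.
compiled = pathlib.Path(
    importlib.util.cache_from_source(str(demo_root / "pkg" / "mod.py"))
)

if __name__ == "__main__":
    print("compiled cleanly:", ok)
    print("pyc present:", compiled.exists())
```

Run at image build time, a step like this would move the compile cost out of
the first boot.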
Well, I could turn around and state that this is a "hack" and "...we
don't want any of your proposed hacks.." in our image creation process.
Hopefully sounds familiar.... ;)
Indeed, the hack is breaking the content of the RPM by removing files from 
the places where zypper put them (/usr). There is no hack in making 
/var/cache hot before the service runs :-)
...
Anyway, on a more serious note, if we can resolve the security concerns
and properly handle the upgrade mechanism while not generating multiple
packages I am not categorically opposed to such an addition in our
Public Cloud image builds.
Thanks!

I see the security problem, indeed. I will work to provide an answer to this 
problem and propose it here.

Meanwhile the image from [1] provides some of the pieces that I talked about 
in the first email. This is the image that I am using for another project 
related to Salt, but as of today:

* Python 3.7 is installed (in TW we still have 3.6)
* Python 3.7 contains the patch from 3.8 to enable storage of pycache in a 
different file system
* I added two shim loaders (written in shell), that replace python3.7 and 
python3.7m, to enable the 3.8 feature
* As a hack, I removed all the pycache from the ISO (I know, I know ...), so 
everything is generated on demand under /var/cache
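
The 3.8 feature mentioned in the list above is the pycache prefix
(`PYTHONPYCACHEPREFIX`, `-X pycache_prefix`, `sys.pycache_prefix`). A minimal
sketch of its effect, assuming Python 3.8 or newer and using temporary
directories as hypothetical stand-ins for /usr and /var/cache:

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile


def import_with_prefix(module_name="prefix_demo_mod"):
    """Import a throwaway module with bytecode redirected to another tree.

    Returns (pyc exists, pyc is under the prefix, __pycache__ beside source).
    """
    src_dir = pathlib.Path(tempfile.mkdtemp())    # stands in for /usr
    cache_dir = pathlib.Path(tempfile.mkdtemp())  # stands in for /var/cache
    source = src_dir / f"{module_name}.py"
    source.write_text("VALUE = 7\n")

    old_prefix = sys.pycache_prefix
    old_flag = sys.dont_write_bytecode
    sys.pycache_prefix = str(cache_dir)
    sys.dont_write_bytecode = False
    sys.modules.pop(module_name, None)
    sys.path.insert(0, str(src_dir))
    try:
        importlib.invalidate_caches()
        importlib.import_module(module_name)
        # With a prefix set, this path mirrors the source path under it.
        pyc = pathlib.Path(importlib.util.cache_from_source(str(source)))
    finally:
        sys.path.remove(str(src_dir))
        sys.pycache_prefix = old_prefix
        sys.dont_write_bytecode = old_flag

    in_prefix = cache_dir in pyc.parents
    return pyc.exists(), in_prefix, (src_dir / "__pycache__").exists()


if __name__ == "__main__":
    print(import_with_prefix())
```

The source tree stays untouched, which is the property that makes a read-only
/usr with a writable cache location workable.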

[1] https://build.opensuse.org/project/monitor/home:aplanas:Images?arch_x86_64=1&defaults=0&repo_images=1&succeeded=1
...
...
I think that we need a productive argumentation here.
And I thought we were having that for the most part. (My contribution to
color commentary)
Your data was very helpful, you are right.

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham 
Norton, HRB 21284 (AG Nürnberg)
Maxfeldstraße 5, 90409 Nürnberg, Germany
