On 10/5/18 4:23 AM, Alberto Planas Dominguez wrote:
On Thursday, October 4, 2018 7:23:45 PM CEST Robert Schweikert wrote:
On 10/4/18 10:52 AM, Alberto Planas Dominguez wrote:
But this also makes the Python stack a bit fat. Most of the time this is not a problem, but things are changing. We have JeOS and MicroOS, both minimal images (built with different goals and technology) that aim to be small and slim.
Can you share the definition of "small and slim"? What is the target size we want to get to, and why does it matter if the image is a "bit" bigger?
I can think of one trade-off: in the Cloud, when a new instance is created the image file is copied. Therefore a smaller image improves the overall instance start-up, as there is less data to copy. However, from my experience in GCE, where we at some point built 8 GB images and then switched to 10 GB images, there was no noticeable difference between the two image sizes w.r.t. start-up time of an instance.
That is a good example, indeed. The main problem is that the ratio pyc / py is around 1.3 to 1.5, so for each KB of .py I have to add 1.3 to 1.5 KB of .pyc. We are more than doubling the space.
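For reference, a rough way to check that ratio on an installed system is to compare the on-disk size of the sources with that of the compiled files; the python3.6 path below is only an example and will differ per system:

# find /usr/lib/python3.6 -name '*.py' -print0 | du -ch --files0-from=- | tail -1
# find /usr/lib/python3.6 -name '*.pyc' -print0 | du -ch --files0-from=- | tail -1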
I can see how removing more than half of the size can improve the overall speed of upgrading a system, the size of the rpm and drpm packages, the download time of JeOS and MicroOS, or the time to upload something to Cinder.
Can you please formulate your goals concisely and stick to them? Are we back to discussing side effects? This is confusing. You started out stating that there is a goal to reduce image sizes for JeOS and MicroOS builds and that Python was a contributing factor to image "bloat". This was seconded by Thorsten, who provided a specific use case where we have people asking for 150 MB images. The image size and the size of the rpm are only tangentially related. For example, an rpm contains docs, which contribute to the rpm size, but these can easily be excluded at install time with the --excludedocs option and thus would not contribute to a measure such as image size. So the relationship you are creating with the download and upgrade example is really a different matter from the goal, as I read it, in the original proposal. If your goal is to reduce the size of the Python packages, then we probably need a different solution compared to a goal that produces a smaller image size when Python is part of an image.
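For instance, docs can be skipped either per install or permanently via an rpm macro; a sketch only, with a made-up package name:

# rpm -Uvh --excludedocs some-package.rpm
# echo '%_excludedocs 1' >> /etc/rpm/macros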
But we want to include a bit of Python in there, like salt-minion or cloud-init. And now the relative size of Python is evident.
Well, especially for cloud-init: at the last couple of get-together events of upstream contributors, start-up time for cloud-init was a big discussion point. A lot of effort has gone into making cloud-init faster. The results of this effort would be eliminated by such a move.
I plan to measure this. The first boot can be slower, but I am not yet able to provide numbers here. This argument may indeed be relevant and make the proposal a bad one, but I do not think that the big chunk of time in the cloud-init case goes to pyc generation, as there are bigger architectural problems there.
Well, I think the common agreement is that pyc generation is pretty slow. But let's put some perspective behind that and look at data rather than taking common beliefs as facts.

On a t2.micro instance in AWS, running the SUSE stock SLES 15 BYOS image, the instance was booted (first boot), then the cloud-init cache was cleared with "cloud-init clean", followed by "shutdown -r now", i.e. a soft reboot of the VM.

# systemd-analyze blame | grep cloud
 6.505s cloud-init-local.service
 1.013s cloud-config.service
  982ms cloud-init.service
  665ms cloud-final.service

All these services are part of cloud-init.

Clear the cloud-init cache so it will re-run:

# cloud-init clean

Clear out all Python artifacts:

# cd /
# find . -name '__pycache__' | xargs rm -rf
# find . -name '*.pyc' | xargs rm
# find . -name '*.pyo' | xargs rm

This should reasonably approximate the state you are proposing, I think. Reboot, then:

# systemd-analyze blame | grep cloud
 7.469s cloud-init-local.service
 1.070s cloud-init.service
  976ms cloud-config.service
  671ms cloud-final.service

So that is a 13% increase in the runtime of the cloud-init-local service. And this is just a quick and dirty test with a soft reboot of the VM. Numbers would probably be worse with a stop-start cycle. I'll leave that to be disproven by those interested.
Having these numbers is interesting, but what is the goal of the image you want to build, and what is the benefit of the smaller size for JeOS or MicroOS?
I have one locally that I am replicating in OBS. It will appear today or this weekend in the repo pointed to in the first email.
Maybe for a normal TW installation the absolute gain is not much (91 MB).
Well, it is not just the install. We would be penalizing every user with a start-up time penalty to save 91 MB; sorry, that appears to me to be an optimization for a corner case at the expense of the most common path.
I do not see the penalization, sorry.
Well, I'd say the penalty is shown above, 13% in one particular example. This or worse would hit our users every time they start a new instance in AWS, GCE, Azure, OpenStack, ...
The proposal is not to wipe out pyc
The way I read your proposal was to eliminate py{c,o} files from the packages, i.e. we would have to byte-compile when any Python module is used.
and use -B when calling the Python code, is about moving the pyc generation to /var/cache (or some other cache location) and delaying the pyc generation until the first boot.
OK, this part of the statement seems in line with my understanding of your proposal. You say "first boot"; are you implying a process that does a system-wide byte compilation of all installed Python code? That would probably add a rather large time penalty to first boot and is not going to work for us in the Public Cloud. But data would have to be collected on such a process to make a real decision. Or do you mean "module load" when you say "first boot", i.e. the byte-compilation takes place when a Python module is loaded for the first time? The effect of this is shown in the example above. I have an issue with a 13% drop in performance for every user on initial start-up. This is penalizing the majority to cover one specific use case. Sorry, it is hard for me to see this any other way.
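To make the two readings concrete (a sketch only; the paths and the cloud-init example are illustrative): a system-wide first-boot compilation would be something along the lines of

# python3 -m compileall -q /usr/lib/python3.6

while the "module load" variant means CPython byte-compiles whatever it imports and finds uncached, either without writing the result at all

# python3 -B /usr/bin/cloud-init init

or, on interpreters that support a relocated cache, writing it under a cache directory instead of next to the sources:

# PYTHONPYCACHEPREFIX=/var/cache/pycache python3 /usr/bin/cloud-init init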
The difference is that the pyc will still be there, maybe in a RAM disk, or on a faster filesystem, or on the same old hard disk as always.
If there is a considerable penalty on the first launch, we can think about alternatives, like making the feature optional or prepopulating a subset of the stack.
So, my proposal is to remove the pyc from the Python 3 packages and enable the cache layer on Tumbleweed since Python 3.7. I do not know whether to do that by default or only under certain configurations, as I am not sure how to make that feature optional.
Any ideas?
IMHO there are mechanisms for you to do this for the corner cases, i.e. the JeOS and MicroOS image builds. It is very easy with kiwi to run "find / -name '*.pyc' | xargs rm" during the image build stage. This gives you what you are after, a smaller image size, without penalizing everyone else.
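In a kiwi image description that would roughly amount to adding something like the following to config.sh (a sketch, untested), which runs chrooted into the new image root:

# remove compiled Python artifacts from the image tree
find / -name '__pycache__' -type d -print0 | xargs -0 rm -rf
find / -name '*.py[co]' -delete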
Well, this is how I am testing it now.
What do you think if I follow this path?
I oppose this path. We'd be penalizing every start-up of every instance in EC2. We have feature requests to improve our boot performance, and this is counteracting our efforts.
Not true, as the cache will be populated after the first boot.
How is my statement not true? Above you state that the cache is filled on "first boot". I hope we can all agree that byte-compilation is not a zero-time operation. Therefore, there is a time penalty in the boot process in some way, shape, or form. Increasing the boot time is counter to our efforts to reduce the boot time for our cloud images.
And again, by far the slowest path is not the pyc generation. But I agree that I need to deliver the numbers.
The execution-time numbers in my crude test above show that pyc generation is not the slowest part for cloud-init, but a >10% slowdown is not insignificant. I think fiddling at the Python package level is not the best approach to solve the problem, unless of course making the Python packages smaller is your primary goal.

Later,
Robert

--
Robert Schweikert
MAY THE SOURCE BE WITH YOU
Distinguished Architect LINUX
Team Lead Public Cloud
rjschwei@suse.com
IRC: robjo