[opensuse-packaging] Packaging hints for transactional updates

Hi,
I have started to collect issues with RPMs and transactional-updates, and how to avoid them:
https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates
Luckily, so far there are not many, and most RPMs are fine.
Thorsten
--
Thorsten Kukuk, Distinguished Engineer, Senior Architect SLES & CaaSP
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany
GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
--
To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-packaging+owner@opensuse.org

Hello Thorsten, On Mar 20 15:06 Thorsten Kukuk wrote (excerpt):
https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates
It reads:
-------------------------------------------------------------------------
... instead of creating a snapshot, updating the current system and rolling back if an error happened, we create a snapshot, update this snapshot, and do a "rollback" to that snapshot if no error did occur.
-------------------------------------------------------------------------
Can you explain there why it is done this way, or add a link that points to an explanation? (I have my own idea of what the reason could be, but I would prefer to also know the "official" reason ;-)
Kind Regards
Johannes Meixner

On Tue, Mar 21, Johannes Meixner wrote:
Can you explain therein the reason behind why it is done this way or add a link that points to an explanation?
That's explained in the very first sentences: "Transactional updates are atomic. This means, either the update is fully applied without any error, or no change is made to the system. Additionally, transactional updates should not influence the currently running processes."
If you update the running system, none of this can be fulfilled.
Thorsten
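
The flow described above (create a snapshot, update the snapshot, switch to it only if no error occurred) can be sketched in a few lines of Python. This is only an in-memory model under stated assumptions: a snapshot is a plain dict of package versions, and `transactional_update`, `UpdateError`, `good_update` and `bad_update` are hypothetical names made up for illustration, not part of the real transactional-update script.

```python
import copy


class UpdateError(Exception):
    """Raised by an update step that fails (hypothetical)."""


def transactional_update(snapshots, default, apply_update):
    """Apply apply_update to a copy of the default snapshot.

    snapshots: dict mapping snapshot id -> dict of package versions
    default:   id of the snapshot that would be booted next
    Returns the id of the snapshot to boot next.
    """
    new_id = max(snapshots) + 1
    candidate = copy.deepcopy(snapshots[default])
    try:
        apply_update(candidate)
    except UpdateError:
        # Failure: the candidate is thrown away, nothing else changed.
        return default
    # Success: keep the candidate and make it the next default.
    snapshots[new_id] = candidate
    return new_id


def good_update(pkgs):
    pkgs["glibc"] = "2.25"


def bad_update(pkgs):
    pkgs["glibc"] = "2.25"  # a partial change ...
    raise UpdateError("post-install script failed")  # ... then an error


snapshots = {1: {"glibc": "2.24"}}
after_bad = transactional_update(snapshots, 1, bad_update)
after_good = transactional_update(snapshots, 1, good_update)
```

In this model the failed run leaves snapshot 1 as the default and creates nothing, while the successful run adds a new snapshot and returns its id. The snapshot that is currently running is never modified in either case.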

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
That's explained in the very first sentences:
"Transactional updates are atomic. This means, either the update is fully applied without any error, or no change is made to the system. Additionally, transactional updates should not influence the currently running processes."
If you update the running system, none of this can be fulfilled.
But it also means the "running system" part that is transactionally modified had better be read-only, as otherwise you lose changes made between transaction start and commit?
Richard.
--
Richard Biener <rguenther@suse.de>

On Tue, Mar 21, Richard Biener wrote:
But it also means the "running system" part that is transactionally modified had better be read-only, as otherwise you lose changes made between transaction start and commit?
If you want to be 100% safe: yes, the root filesystem should be read-only, as we do for SUSE CaaSP for this reason.
But if you run transactional-update at night with automatic reboot, or make sure that all data is written to subvolumes other than the root subvolume, you are mostly safe, too.
So my openSUSE Tumbleweed installations are updated automatically at 3 o'clock at night and reboot if patches were successfully applied.
But some people are already thinking about porting the read-only root subvolume code to Tumbleweed for transactional updates. This will require quite a lot of changes to existing RPMs, though.
Thorsten

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
If you want to be 100% safe: yes, the root filesystem should be read-only, as we do for SUSE CaaSP for this reason.
But I wonder if the current scheme of doing the modification in the live system and rolling back on error works closer to 100% then (in the case of read-write root). Given a transaction abort should be the minority of cases ...
But if you run transactional-update at night with automatic reboot, or make sure that all data is written to subvolumes other than the root subvolume, you are mostly safe, too.
Doesn't sound like "no changes to the running system" to me then ;)
So my openSUSE Tumbleweed installations are updated automatically at 3 o'clock at night and reboot if patches were successfully applied.
But some people already think about porting the read-only root subvolume code to Tumbleweed for transactional updates. But this will require quite a lot of changes to existing RPMs.
So what issue are we solving then? That is, is this for CaaSP only, where we can guarantee the r/o root?
Richard.

On Tue, Mar 21, Richard Biener wrote:
So what issue are we solving then? That is, is this for CaaSP only where we can guarantee the r/o root?
The problems we want to solve are all the broken systems you regularly read about on the factory list after an update, broken because the update in the running system broke other running processes. Especially the desktop.
Thorsten

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
The problems we want to solve are all the broken systems you regularly read about on the factory list after an update, broken because the update in the running system broke other running processes. Especially the desktop.
Sure. But do we now exchange this for all the broken systems where replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (and can you prove that?) Richard.

On Tue, Mar 21, Richard Biener wrote:
Sure. But do we now exchange this for all the broken systems where replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (and can you prove that?)
Sorry, but I understand neither what you are writing here nor what your problem is.
What I can prove is: all the problems people had with updates, where their running applications crashed, the update did not finish and their system was left in an unbootable state, can be solved with this.
And about your fear that the snapshot is in a broken state after the update even if zypper did not return any error: if this happens, it would also happen with your normal running system. It does not matter whether you update a snapshot or the real system, the installed RPMs are the same.
And, don't forget: if you don't like transactional updates, nobody is forcing you to use them.
Thorsten

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
Sorry, but I neither understand what you are writing here nor what your problem is.
What I can prove is: all the problems people had with updates, where their running applications crashed, the update did not finish and their system was left in an unbootable state, can be solved with this.
True - this is because you are not updating the running system but the one that is activated after the next reboot (as far as I understand).
And about your fear that the snapshot is in a broken state after the update even if zypper did not return any error: if this happens, it would also happen with your normal running system. It does not matter whether you update a snapshot or the real system, the installed RPMs are the same.
No, I am referring to the time window between creating the snapshot and activating it. For a true transaction you'd need to verify that the root you are about to replace with the updated snapshot is in the same state as at the time of snapshot creation (thus, it had better be read-only). Otherwise you are losing data.
And, don't forget: if you don't like transactional updates, nobody is forcing you to use them.
Of course. But the system you are implementing sounds like a more dangerous variant of effectively downloading the update in the running system, rebooting, and at a defined state (say, in initrd context) creating the snapshot, installing into it and continuing to boot from it. _Without_ the issue of that inconsistency due to the time window during which the root is active between creating the snapshot and activating it.
Richard.
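
The check asked for above (verify that the root is still in the state it had when the snapshot was taken before activating the updated snapshot) could look roughly like the following Python sketch. It is purely illustrative: the root filesystem is modelled as a dict of path to content, and `fingerprint` and `safe_to_activate` are hypothetical helpers, not something the transactional-update script is known to provide.

```python
import hashlib
import json


def fingerprint(root):
    """Stable hash of a root modelled as a dict of path -> content."""
    data = json.dumps(root, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def safe_to_activate(root_now, fingerprint_at_snapshot):
    """True only if the root did not change since the snapshot was taken."""
    return fingerprint(root_now) == fingerprint_at_snapshot


root = {"/etc/motd": "hello"}
taken = fingerprint(root)          # recorded when the snapshot is created
snapshot = dict(root)
snapshot["/usr/bin/tool"] = "v2"   # the update touches only the snapshot

unchanged_ok = safe_to_activate(root, taken)

root["/etc/motd"] = "edited while the update ran"  # concurrent write
changed_ok = safe_to_activate(root, taken)
```

With a read-only root the second case cannot occur, which is why the read-only setup closes this window entirely.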

On Tue, Mar 21, Richard Biener wrote:
No, I am referring to the time window between creating the snapshot and activating it. For a true transaction you'd need to verify that the root you are about to replace with the updated snapshot is in the same state as at the time of snapshot creation (thus, it had better be read-only). Otherwise you are losing data.
If you clearly separate data from applications, as you have to do for snapshot and rollback anyway, the risk is really very low. And if you use a read-only root filesystem, the risk is zero. But these problems are not specific to transactional updates: you already have the same problems today if you use rollback. And there the risk of data loss is much, much higher.
But the system you are implementing sounds like a more dangerous variant of effectively downloading the update in the running system, rebooting, and at a defined state (say, in initrd context) creating the snapshot, installing into it and continuing to boot from it.
That's how Windows is doing it and GNOME tries to implement it. You should watch my presentation at FOSDEM this year to see what negative impact this has already had in the past.
Thorsten

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
If you clearly separate data from applications, as you have to do for snapshot and rollback anyway, the risk is really very low. And if you use a read-only root filesystem, the risk is zero. But these problems are not specific to transactional updates: you already have the same problems today if you use rollback. And there the risk of data loss is much, much higher.
But a rollback always implies data loss (namely all the data written since). That's known by people (at least by those that have some mental concept of "transactional"). Data loss by activation of an update is surprising.
What happens e.g. in this situation:
% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)
foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?
But the system you are implementing sounds like a more dangerous variant of effectively downloading the update in the running system, rebooting, and at a defined state (say, in initrd context) creating the snapshot, installing into it and continuing to boot from it.
That's how Windows is doing it and GNOME tries to implement it.
But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.
You should watch my presentation at FOSDEM this year to see what negative impact this has already had in the past.
Perhaps, are the slides somewhere?
Ciao,
Michael.

On Tue, Mar 21, Michael Matz wrote:
What happens e.g. in this situation:
% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)
foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?
Nothing happens and foo is there, if the user only uses the transactional-update script and not a mix of different tools.
But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.
Sorry, but exactly the opposite is true. Windows is not doing any transaction at all. If your system breaks, it's broken. What Windows calls a "transaction" in this case covers only what is written into the Windows registry, not what is written to the hard disk.
Perhaps, are the slides somewhere?
Yes, in the FOSDEM archive.
Thorsten

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
What happens e.g. in this situation:
% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)
foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?
Nothing happens and foo is there, if the user only uses the transactional-update script and not a mix of different tools.
That's nice. Is zypper, used for installing the rpm in the third step above, a different tool for these purposes? (And does it mean that the rpm is installed twice, once into the used-after-reboot snapshot and once into the currently used one?)
But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.
Sorry, but exactly the opposite is true. Windows is not doing any transaction at all. If your system breaks, it's broken. What Windows calls a "transaction" in this case covers only what is written into the Windows registry, not what is written to the hard disk.
Um. I'm not sure how to say this, but you do know that NTFS provides snapshots as well, and that Windows makes use of them for update purposes, much like our rollback scenario with btrfs?
And as the kernel-related updates for it are installed at points where there are definitely no other writers, namely at shutdown or bootup, yes, that is actually transactional (or at least more so; all this talk about "transactional" related to updates and filesystems is a bit dodgy, as it's quite a bit unlike transactions in the database sense).
That this is done in Windows initially had different reasons, but now the side effect is that the updates are safe from concurrent writes. (That would be comparable to us installing the updates from initrd just after mounting the filesystems.)
Perhaps, are the slides somewhere?
Yes, in the Fosdem archive.
Looking. I'm still hoping I have some basic misunderstanding of the whole "transactional" updates idea.
Ciao,
Michael.

On Tue, Mar 21, Michael Matz wrote:
Um. I'm not sure how to say this, but you do know that NTFS provides snapshots as well, and that Windows makes use of them for update purposes, much like our rollback scenario with btrfs?
If it is used for Windows updates, it does not work in practice; otherwise an update wouldn't have put the Windows installation on my laptop into a corrupt state that Windows cannot repair. And only Wikipedia assumes that Windows Update is using it; Microsoft itself does not mention it.
Ah, wait, yes, Windows can do a rollback and Microsoft explains it, but first you need to get the current broken state to boot so that you can start the GUI to do the rollback ...
If there are other ways, MS is hiding them well in their documentation.
Thorsten

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
If it is used for Windows updates, it does not work in practice; otherwise an update wouldn't have put the Windows installation on my laptop into a corrupt state that Windows cannot repair. And only Wikipedia assumes that Windows Update is using it; Microsoft itself does not mention it. Ah, wait, yes, Windows can do a rollback and Microsoft explains it, but first you need to get the current broken state to boot so that you can start the GUI to do the rollback ... If there are other ways, MS is hiding them well in their documentation.
Like us, they can boot from a snapshot (or rather, at boot time you can restore a snapshot from the "emergency initrd", which even has a graphical user interface!). Depending on the Windows installation and settings, snapshotting is not always active (like with us), and sometimes they remove snapshots too early.
But sure: they have bugs, that's no wonder. Us too, if I may be so bold :) (That your laptop can't be recovered is not a good data point; my sometimes-Windows-gaming desktop was recoverable (and actually needed it only once in its whole lifetime, unlike us when we botched the bootloader installation for the 20th time), so can I say now that it does work in practice? :) )
In any case, bugs in their implementation of it don't directly show an inherent flaw in their approach. I do see some problems with that approach, but far fewer than in the transactional-updates approach. But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)
Ciao,
Michael.

On Tue, Mar 21, Michael Matz wrote:
In any case bugs in their implementation of it don't directly show an inherent flaw in their approach. I do see some problems with that approach, but far fewer than in the transactional-updates approach. But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)
As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe. And since you can apply them at any time, you don't even need to spend time waiting for your mission-critical server to come back up ;)
Thorsten

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe.
Yes. And as we were saying, if you don't do that you don't have anything at all.
And since you can apply them at any time, you don't even need to spend time waiting for your mission-critical server to come back up ;)
You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).
Ciao,
Michael.

On 03/21/2017 05:33 PM, Michael Matz wrote:
And since you can apply them at any time, you don't even need to spend time waiting for your mission-critical server to come back up ;)
You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes.
Are we only talking about updates of the kernel and libc, or about all programs? For the former we need to reboot, but not for the latter. I always just restart services, not the whole server. For GUI systems, a logout/login is often sufficient.
Are transactional upgrades enforcing reboots after every upgrade?
--
python programming - mail server - photo - video - https://sebix.at
cryptographic key at https://sebix.at/DC9B463B.asc and on public keyservers

On Tue, Mar 21, Sebastian wrote:
Are transactional upgrades enforcing reboots after every upgrade?
Transactional upgrades always need a reboot to activate the changes. That is the case with all implementations, independent of how they are implemented. Otherwise you couldn't do it "atomically" and without influence on the running system.
If you have a read-write root filesystem, you could apply small changes with zypper, and only do the big ones with transactional-updates. But in this case, you should not continue to use zypper after a transactional update until you reboot, otherwise those changes will be lost.
But since openSUSE Tumbleweed is only updated at most once a day, you can run the update at night, including the reboot, without any problems or risks.
Thorsten
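
Why changes made with zypper between a transactional update and the reboot get lost can be illustrated with a tiny Python model. It assumes, for illustration only, that a system is just a set of package names; `reboot`, `running`, `pending` and the package name `glibc-update` are hypothetical.

```python
def reboot(running, pending):
    """Return the package set that is active after the reboot."""
    return pending if pending is not None else running


running = {"bash", "glibc"}

# The transactional update works on a snapshot of the running system.
pending = set(running) | {"glibc-update"}

# Afterwards, foo is installed directly into the running system.
running.add("foo")

# The reboot activates the snapshot, which never saw foo.
active = reboot(running, pending)
lost = running - active
```

Here `lost` contains only `foo`: the package existed in the old root, but the snapshot that becomes active was taken before it was installed.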

On 03/21/2017 08:34 PM, Thorsten Kukuk wrote:
But since openSUSE Tumbleweed is only updated at max. once a day, you can run the update in the night, including reboot, without any problems or risks.
So this only applies to Tumbleweed? Just now, or in the future too?

On 21.03.2017 at 17:33, Michael Matz <matz@suse.de> wrote:
And since you can apply them at any time, you don't even need to spend time waiting for your mission-critical server to come back up ;)
You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).
I'm sure I'm missing something, but for a system where data, applications, and configuration are separated reasonably well into their own btrfs subvolumes (which I think is the case for a default SUSE Linux setup), what kind of writes exactly could happen, in the same subvolume as the one the transactional snapshot is done to, that could really lead to data loss?
Obviously none if the subvolume in question is mounted read-only. But even if it isn't, what kinds of (intended) writes would happen to /usr/* during a transactional update run?
I'm always surprised how well traditional RPM updates work in the running (server) system, although we are basically relying on old software that hasn't been updated yet to peacefully co-exist with new software that has been updated already.
But there are real problems, and from time to time we hit them. For example, we've run into cases where updating Salt with Salt fails because the running Salt process may lazy-load updated code that doesn't match the running code's APIs any more.
And of course it doesn't work smoothly at all if you try to update GNOME or KDE from within a running desktop session.
Compared to the alternative approaches (Windows, Mac, iOS, Gnome, Android), I see the "Kukuk-approach" as the best choice if we can get those (IMHO rather hypothetical) snapshot-related data losses under control:
The downtime is really reduced to just the reboot, while the other approaches leave the system in a non-productive state during at least one reboot plus the time all updates take to be applied, which can take many minutes even if the updates have been completely downloaded before.
Joachim

On Tue, 21 Mar 2017, Joachim Werner wrote:
On 21.03.2017 at 17:33, Michael Matz <matz@suse.de> wrote:
Hi,
On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe.
Yes. And as we were saying, if you don't do that you don't have anything at all.
And since you can apply them at any time, you even don't need to spend the time waiting that your mission critical server is alive again ;)
You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).
I'm sure I'm missing something, but for a system where data, applications, and configuration are separated reasonably well into their own btrfs subvolumes (which I think is the case for a default SUSE Linux setup), what kind of writes exactly could happen, in the same subvolume as the one the transactional snapshot is done to, that could really lead to data loss?
Obviously none if the subvolume in question is mounted read-only. But even if it isn't, what kinds of (intended) writes would happen to /usr/* during a transactional update run?
I'm always surprised how well traditional RPM updates work in the running (server) system, although we are basically relying on old software that hasn't been updated yet to peacefully co-exist with new software that has been updated already.
But there are real problems, and from time to time we hit them. For example, we've run into cases where updating Salt with Salt fails because the running Salt process may lazy-load updated code that doesn't match the running code's APIs any more.
And of course it doesn't work smoothly at all if you try to update a Gnome or KDE from within a running desktop session.
Compared to the alternative approaches (Windows, Mac, iOS, Gnome, Android), I see the "Kukuk-approach" as the best choice if we can get those (IMHO rather hypothetical) snapshot-related data losses under control:
The downtime is really reduced to just the reboot, while the other approaches leave the system in a non-productive state during at least one reboot plus the time all updates take to be applied, which can take many minutes even if the updates have been completely downloaded before.
But as you are safely doing it at night anyway, that little extra time doesn't matter.
I suppose that the transactional update should work this way:
1) download updates
2) force the sub-volumes we are going to snapshot r/o
3) snapshot, apply updates ... time passes (hopefully the running system is happy with the r/o state)
4) you reboot, the updated snapshot gets activated
Then we're safe. But if you leave out 2) there's the possibility of breakage (or you need a verification step before 4) that the subvolumes you snapshotted didn't change, so that you can roll back in that case and try again). You can also easily do 2) and 3) during/after 4). Your theory above suggests that 2) is not going to be an issue for the running system or your productivity.
Richard.
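The verification step mentioned as an alternative to step 2) can be sketched roughly as follows. This is only an illustration under assumptions: it digests a throwaway temporary directory standing in for the snapshotted subvolume, and the helper names `tree_digest` and `safe_to_activate` are hypothetical. A real implementation would more likely rely on filesystem features than on hashing every file.

```python
import hashlib
import os
import tempfile


def tree_digest(root):
    """Return a SHA-256 digest over the relative paths and contents of all files under root."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            h.update(os.path.relpath(path, root).encode())
            h.update(b"\0")
            with open(path, "rb") as f:
                h.update(f.read())
            h.update(b"\0")
    return h.hexdigest()


def safe_to_activate(digest_at_snapshot, source_root):
    """True if the snapshot source is unchanged since the snapshot was taken (hypothetical check)."""
    return tree_digest(source_root) == digest_at_snapshot


# Stand-in for the running root: a temporary directory with one config file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "etc"))
    conf = os.path.join(root, "etc", "app.conf")
    with open(conf, "w") as f:
        f.write("port=80\n")

    digest = tree_digest(root)  # recorded in step 3), when the snapshot is taken
    unchanged_ok = safe_to_activate(digest, root)

    # Someone edits the running system while the update is applied to the snapshot.
    with open(conf, "w") as f:
        f.write("port=8080\n")
    changed_ok = safe_to_activate(digest, root)

print(unchanged_ok, changed_ok)
```

If the check fails before step 4), the updated snapshot would be discarded and the update retried, as described above.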

On Tue, Mar 21, Michael Matz wrote:
But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)
I did implement the initrd approach two years ago during hackweek in a few hours. But it requires that you always boot twice: first to run zypper from an initrd, second to activate all changes and get back to a consistent system. You gain nothing from this except sitting for a long time in front of a machine that is unusable for normal work. Thorsten

On Wed, 22 Mar 2017, Thorsten Kukuk wrote:
On Tue, Mar 21, Michael Matz wrote:
But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)
I did implement the initrd approach two years ago during hackweek in a few hours. But it requires that you always boot twice, first to run zypper from a initrd, second to activate all changes and get back to a consistent system. You gain nothing from this except sitting a long time for a for normal work unuseable machine.
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting? Updates usually are small, so this won't take long (assuming you have downloaded them before, of course). The interesting part is of course some intelligence to decide which updates to apply online and which ones to do "transactional". Richard.

On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?
Some services may be taken over from the initrd. In this case the initrd has been generated using old versions, and they will continue to be used until the next reboot even if you update root to the new version. This may lead to rather hard-to-debug issues, because everything after boot will indicate that you have the new versions installed.

On Wed, 22 Mar 2017, Andrei Borzenkov wrote:
On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?
Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.
Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself. Yes, there are implementation difficulties but it can work. You still do have to wait for those updates to be applied of course. Richard.

On Wed, Mar 22, Richard Biener wrote:
On Wed, 22 Mar 2017, Andrei Borzenkov wrote:
On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?
Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.
Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.
And afterwards you have to reboot to activate the new kernel and initrd ... Thorsten

Hi, On Wed, 22 Mar 2017, Thorsten Kukuk wrote:
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?
Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.
Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.
And afterwards you have to reboot to activate the new kernel and initrd ...
Nah. Updates that affect the kernel or initrd will simply be applied in the running system (and the initrd rebuilt). Their nature is such that they can't possibly affect anything in the running system. No need for two reboots. Ciao, Michael.

Richard Biener wrote:
On Wed, 22 Mar 2017, Andrei Borzenkov wrote:
On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?
Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.
Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.
Yes, there are implementation difficulties but it can work. You still do have to wait for those updates to be applied of course.
That way already exists. It's called "offline system updates". A description is online [1]. In short, PackageKit has a mode to just download RPMs and notify the system of their presence. On the next reboot systemd boots into a special target, installs the downloaded files and reboots again. This mode is not just for small updates. This is what you'd use for updating e.g. whole desktop environments with hundreds of packages. So applying updates this way does take time. I'm sure it will take even more time just in the moment one needs the system to be back ASAP :-)
The "transactional update" mechanism on the other hand would download and install packages while the user can still use the system. Since the installation goes to a separate snapshot of the file system, the running system is not impacted. After successful installation a reboot would boot into the newly created snapshot just as quick (or slow :-)) as usual.
Both methods obviously need safeguards. If the installation step of the offline updates is too dumb (like rpm -U --force --nodeps *) it has the potential to bring the system into an inconsistent state, in case packages got installed or removed after the updates were downloaded. Transactional updates would always lead to a consistent system at least. However, if safeguards are not in place to handle or disallow package installations/removals after creating the new snapshot, one would miss those modifications after reboot. I guess a better integration with zypper and rpm itself is needed to prevent that.
However, both approaches are unsuitable for small updates IMO. On a traditional server it would be crazy to require a reboot just to apply e.g. a security update on apache or even systemd when both services can handle inline replacement just fine. We have to be better than that. That also applies to the desktop. Who wants to reboot to update Firefox after all?
So IMO the perfect solution would allow both: installing small updates like today, plus having a way to apply "disruptive" changes without impacting the running system. For the latter the offline updates approach is a cheap solution that can be used immediately and on any file system. It's not really the best way for users though. Transactional updates are more effort on the engineering side. The changes required to packages and the system are the right thing to do anyway though (like sorting out the /srv mess or separating data/config migrations from installation). So it's worthwhile to fix the packages independent of the discussion about the right update mechanism.
cu Ludwig
[1] https://www.freedesktop.org/software/systemd/man/systemd.offline-updates.htm...
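The safeguard against package installations or removals after the snapshot was created could look roughly like the sketch below. It is only a model under assumptions: the package sets are plain dictionaries with made-up names and versions, and `lost_changes` is a hypothetical helper, not part of zypper, rpm or transactional-update.

```python
def lost_changes(at_snapshot, running_now):
    """Compare two name->version mappings and return what would vanish after booting the snapshot."""
    installed = {n: v for n, v in running_now.items() if n not in at_snapshot}
    removed = {n: v for n, v in at_snapshot.items() if n not in running_now}
    changed = {
        n: (at_snapshot[n], v)
        for n, v in running_now.items()
        if n in at_snapshot and at_snapshot[n] != v
    }
    return installed, removed, changed


# Hypothetical package state when the snapshot was created ...
at_snapshot = {"apache2": "2.4.23", "vim": "8.0"}
# ... and in the running system just before the reboot.
running_now = {"apache2": "2.4.25", "vim": "8.0", "htop": "2.0.2"}

installed, removed, changed = lost_changes(at_snapshot, running_now)
safe = not (installed or removed or changed)

print("installed since snapshot:", installed)
print("removed since snapshot:", removed)
print("changed since snapshot:", changed)
print("safe to boot into snapshot:", safe)
```

A tool could either refuse to activate the snapshot when `safe` is false, or replay the listed changes into the snapshot first; which of the two is preferable is left open here.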

On Wed, Mar 22, Ludwig Nussel wrote:
However, both approaches are unsuitable for small updates IMO. On a traditional server it would be crazy to require a reboot just to apply e.g. a security update on apache or even systemd when both services can handle inline replacement just fine.
And your apache case is exactly why big customers want to have transactional updates: for them it is not acceptable that the web server is restarted during an update and a transaction with a customer may fail because of that. They prefer to have scheduled reboots for this. So there are hundreds of different use cases and requirements, and it is clear that there will not be one solution fitting everything. Thorsten

On 2017-03-22T14:15:24, Thorsten Kukuk <kukuk@suse.de> wrote:
However, both approaches are unsuitable for small updates IMO. On a traditional server it would be crazy to require a reboot just to apply e.g. a security update on apache or even systemd when both services can handle inline replacement just fine.
And your apache case is the case, why big customers want to have transactional updates: for them it is not acceptable, that the web server is restarted during an update and maybe the transaction with the customer will fail because of that. They prefer to have scheduled reboots for this.
Any customer that runs more than one server as in the above will have some sort of HA/Load Balancing setup where they quiesce one at a time, update it, and bring it back online. Whether that includes a reboot or not is basically irrelevant - of course faster updates are always good, but the operational impact is low. I still really like transactional updates because then the system is always consistent, which is just nice, and it does have the potential of reducing the brief restart cycle.

Hi, On Wed, 22 Mar 2017, Ludwig Nussel wrote:
transactional updates would always lead to a consistent system at least.
Just reiterating this doesn't make it true unfortunately. You have to make the source of the snapshot read-only for this to become true and that will have problematic consequences for things like /etc. Ciao, Michael.

On 2017-03-22T15:48:26, Michael Matz <matz@suse.de> wrote:
transactional updates would always lead to a consistent system at least.
Just reiterating this doesn't make it true unfortunately. You have to make the source of the snapshot read-only for this to become true and that will have problematic consequences for things like /etc.
In the context of a container host, or a static root image with all data living in containers or user homes (or application containers/apps ;-), it is true. /etc would need to be excluded from the snapshot if modifications are expected there, but then packages would also be required to not update files there - that'd need to be deferred to the first boot, perhaps by only running the %postin/%posttrans scriptlets then. Regards, Lars

Hi, On Wed, 22 Mar 2017, Lars Marowsky-Bree wrote:
On 2017-03-22T15:48:26, Michael Matz <matz@suse.de> wrote:
transactional updates would always lead to a consistent system at least.
Just reiterating this doesn't make it true unfortunately. You have to make the source of the snapshot read-only for this to become true and that will have problematic consequences for things like /etc.
In the context of a container host, or a static root image with all data living in containers or user homes (or application containers/apps ;-), it is true.
/etc would need to be excluded from the snapshot if modifications are expected there, but then packages would also be required to not update files there - that'd need to be deferred to the first boot, perhaps by only running the %postin/%posttrans scriptlets then.
Yes. That is part of the "problematic consequences" I meant. Ciao, Michael.

On Wed, Mar 22, 2017 at 5:57 PM, Lars Marowsky-Bree <lmb@suse.de> wrote:
/etc would need to be excluded from the snapshot if modifications are expected there, but then packages would also be required to not update files there - that'd need to be deferred to the first boot, perhaps by only running the %postin/%posttrans scriptlets then.
That was one of the reasons (or at least justifications) for the /usr merge ... having the OS image under one and only one directory that can be atomically snapshotted. And you also need /var/lib/rpm as part of root, which argues for moving it into /usr as well (it *is* part of root anyway if the installation is done on btrfs, but that forces every other subtree of /var to be created as a separate subvolume).

Hi, On Tue, 21 Mar 2017, Michael Matz wrote:
Perhaps, are the slides somewhere?
Yes, in the Fosdem archive.
Looking. I'm still hoping I have some basic misunderstanding of the whole "transactional" updates idea.
Did now. No, I understood correctly, and my worries are real :) Ciao, Michael.

21.03.2017 19:26, Michael Matz wrote:
Hi,
On Tue, 21 Mar 2017, Michael Matz wrote:
Perhaps, are the slides somewhere?
Yes, in the Fosdem archive.
Looking. I'm still hoping I have some basic misunderstanding of the whole "transactional" updates idea.
Did now. No I understood correctly, and my worries are real :)
Care to share the link, to spare others the effort and time? Thank you.

On Tue, 2017-03-21 at 15:45 +0100, Richard Biener wrote:
Sure. But do we now exchange this for all the broken systems where replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (and can you prove that?)
That question implies never daring to test a different approach than what is currently in use - as you will not be able to guarantee that a new idea works better than the old way of doing things without testing it. So far, nobody has said that TW will be switching hard to transactional updates - but having the option available and being able to gain experience is certainly a nice thing to have. Cheers, Dominique -- Dimstar / Dominique Leuenberger <dimstar@opensuse.org>

Hi, On Tue, 21 Mar 2017, Dimstar / Dominique Leuenberger wrote:
On Tue, 2017-03-21 at 15:45 +0100, Richard Biener wrote:
Sure. But do we now exchange this for all the broken systems where replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (and can you prove that?)
That question implies to never dare testing a different approach than what is currently in use - as you will not be able to guarantee that a new idea as such works better than the old way of doing things without testing it.
Well, but it must at least be sound if it causes work for others, right? So, proving might be a bit too harsh indeed, but the transactional updates (at least as I understand them right now) seem a bit dubious in that they replace a certain set of problems with other (IMHO worse) problems. How many of the problems solved by transactional updates are actually solved by the required reboot (or could be solved by that alone), and not by the installing-into-snapshot aspect? Ciao, Michael.

On Tue, 2017-03-21 at 16:28 +0100, Michael Matz wrote:
That question implies to never dare testing a different approach than what is currently in use - as you will not be able to guarantee that a new idea as such works better than the old way of doing things without testing it.
Well, but it must at least be sound if it causes work for others, right? So, proving might be a bit too harsh indeed, but the transactional updates (at least as I understand them right now) seem a bit dubious in that they replace a certain set of problem with other (IMHO worse) problems.
Don't get me wrong: I like the fact that people DISCUSS about the solution and potential issues they see. And I'm sure Thorsten agrees on that too. Just don't always phrase everything in a way stating that all the new stuff just is more broken than all the old broken stuff.
How many of the problems solved by transactional updates are actually solved by the required reboot (or could be solved by that alone), and not by the installing-into-snapshot aspect?
Updating a system is a tricky thing. We have been using zypper dup from within a running session for a long time now, with the chance to roll back when it fails. Every so often users complain that the session crashed halfway, leaving them in a bad place.
GNOME upstream tries to eliminate this by downloading all the updates to the local cache, rebooting the machine into a minimal running system, updating all the packages and rebooting into a working system (unlike Windows, though, it does NOT tell you when you have to do it: you actively click on 'reboot and update' from within GNOME Software or select the checkbox 'perform updates on next boot' when you reboot/shutdown your system) - of course everybody complains that booting to apply updates is not what we are used to doing on Unix/Linux, and if you understand the setup well enough, you can actually judge on it - most 'users' I tend to talk to would not know how to get started.
Transactional Updates does a similar thing, from the other angle: update a snapshot that you will boot into once the update is complete. Of course, just like GNOME's approach, it also requires you to reboot the system. In addition, it requires a very strict separation of program files managed by the package manager from the data part (as would be the case with rollback).
I'm not yet sure if any of the methods being implemented / tested at this moment will give us ALL solutions to all problems... but exploring the options is certainly the way to go. No shame in admitting at one point 'oh well, that did not work out as expected - but we learned this and that about the problem' Cheers, Dominique

Hi, On Tue, 21 Mar 2017, Dimstar / Dominique Leuenberger wrote:
Don't get me wrong: I like the fact that people DISCUSS about the solution and potential issues they see. And I'm sure Thorsten agrees on that too. Just don't always phrase everything in a way stating that all the new stuff just is more broken than all the old broken stuff.
Well, I'll try, but unfortunately I think this is actually the case :-/
GNOME upstream tries to eliminate this by downloading all the updates to the local cache, reboot the machine to a minimal running system, update all the packages and reboot the system into a working system (unlike Windows, though, it does NOT tell you when you have to do it: you actively click on 'reboot and update' from within GNOME Software or select the checkbox 'perform updates on next boot' when you reboot/shutdown your system) - of course everybody complains that booting to apply updates is not what we are used to do on Unix/Linux, and if you understand the setup well enough, you can actually judge on it - most 'users' I tend to talk to would not know how to get started.
Transactional Updates does a similar thing, from the other angle: update a snapshot that you will boot to once the update is complete. Of course, just like GNOME's approach, it also requires you to reboot the system. And in plus it requires a very strict separation of program files managed by the package manager from the data part (as would be the case with rollback).
So, from the reboot-is-necessary aspect, both are equivalent. People will either dislike this or not, but it is the same for both. Also the time of activation of the new stuff is the same (at reboot). So the difference is only how the new bytes become activated. With the initrd/Windows/GNOME approach: installed on top of whatever was there at reboot time; with transactional-updates: _replacing_ whatever was there at reboot time with whatever was there at reboot-time minus $arbitrary_time plus updates. I mean, it seems so very obvious to me that the latter is the worse of the two.
I'm not yet sure if any of the methods being implemented / tested at this moment will give us ALL solutions to all problems... but exploring the options is certainly the way to go. No shame in admitting at one point 'oh well, that did not work out as expected - but we learned this and that about the problem'
Fair enough. Experiments certainly will provide real knowledge instead of mere arguments :) Ciao, Michael.
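The difference between the two activation strategies argued about in this message can be modelled in a few lines. This is a toy sketch under assumptions: the file system is a dictionary, the paths and contents are invented, and it models only the case where the changed file lives in the snapshotted subvolume and that subvolume was not made read-only.

```python
def apply_updates(state, updates):
    """Return a copy of a path->content mapping with the package updates applied."""
    new_state = dict(state)
    new_state.update(updates)
    return new_state


# Root file system at the moment the snapshot is taken (hypothetical contents).
root_at_snapshot = {"/usr/bin/tool": "v1", "/etc/app.conf": "default"}
updates = {"/usr/bin/tool": "v2"}

# Between snapshot and reboot ($arbitrary_time), the admin edits a config file.
root_at_reboot = dict(root_at_snapshot)
root_at_reboot["/etc/app.conf"] = "tuned"

# initrd/offline approach: updates are installed on top of the state at reboot time.
offline_result = apply_updates(root_at_reboot, updates)

# Snapshot approach: the state at reboot time is replaced by snapshot state plus updates.
transactional_result = apply_updates(root_at_snapshot, updates)

print(offline_result["/etc/app.conf"], transactional_result["/etc/app.conf"])
```

Both results contain the updated `/usr/bin/tool`; only the snapshot variant drops the later edit. With a read-only root, or with `/etc` kept outside the snapshot, the edit in the middle could not happen in the snapshotted tree and both results would agree.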

21.03.2017 19:05, Michael Matz wrote: ...
GNOME upstream tries to eliminate this by downloading all the updates to the local cache, reboot the machine to a minimal running system, update all the packages and reboot the system into a working system (unlike Windows, though, it does NOT tell you when you have to do it: you actively click on 'reboot and update' from within GNOME Software or select the checkbox 'perform updates on next boot' when you reboot/shutdown your system) - of course everybody complains that booting to apply updates is not what we are used to do on Unix/Linux, and if you understand the setup well enough, you can actually judge on it - most 'users' I tend to talk to would not know how to get started.
Transactional Updates does a similar thing, from the other angle: update a snapshot that you will boot to once the update is complete. Of course, just like GNOME's approach, it also requires you to reboot the system. And in plus it requires a very strict separation of program files managed by the package manager from the data part (as would be the case with rollback).
So, from the reboot-is-necessary aspect, both are equivalent. People will either dislike this or not, but the same for both. Also the time for the activation of new stuff is the same (at reboot).
Not really. GNOME applies updates after reboot, which means the reboot may take an arbitrarily long time and may require a second reboot (e.g. if kernel/glibc are updated). A transactional update prepares the full environment in advance, so the reboot time is reduced to the minimum and it is ensured that you need just one reboot.

On Tuesday, 21 March 2017 at 20:28 +0300, Andrei Borzenkov wrote:
21.03.2017 19:05, Michael Matz wrote: ...
GNOME upstream tries to eliminate this by downloading all the updates to the local cache, reboot the machine to a minimal running system, update all the packages and reboot the system into a working system (unlike Windows, though, it does NOT tell you when you have to do it: you actively click on 'reboot and update' from within GNOME Software or select the checkbox 'perform updates on next boot' when you reboot/shutdown your system) - of course everybody complains that booting to apply updates is not what we are used to do on Unix/Linux, and if you understand the setup well enough, you can actually judge on it - most 'users' I tend to talk to would not know how to get started.
Transactional Updates does a similar thing, from the other angle: update a snapshot that you will boot to once the update is complete. Of course, just like GNOME's approach, it also requires you to reboot the system. And in plus it requires a very strict separation of program files managed by the package manager from the data part (as would be the case with rollback).
So, from the reboot-is-necessary aspect, both are equivalent. People will either dislike this or not, but the same for both. Also the time for the activation of new stuff is the same (at reboot).
Not really. GNOME applies updates after reboot; which means reboot may take arbitrary large time and may require second reboot (e.g. if kernel/glibc are updated). Transactional update prepares full environment in advance so reboot time is reduced to the minimum and it is ensured you need just once reboot.
It would be possible to "plug" transactional-update in as a replacement for PackageKit (it isn't really GNOME) offline-updates: if the transactional-update package is installed and offline-updates are enabled, transactional-update would be used instead of PK offline-update and then a reboot would be triggered. (I planned to test this during Hackweek but unfortunately didn't have time.) -- Frederic Crozat Enterprise Desktop Release Manager SUSE

Hello, On Mar 21 16:39 Dimstar / Dominique Leuenberger wrote (excerpt):
Transactional Updates ... ... requires a very strict separation of program files managed by the package manager from the data part
How is that requirement enforced? I assume via btrfs subvolumes for the "data part". But what about /etc/? Files in /etc/ belong both to the package manager and to the user when the user works as admin. I assume that while a transactional update is running, the user must not change files in /etc/. I assume on enterprise systems the admins know that, but what about inexperienced openSUSE users who may run YaST or other config programs while a transactional update is running? Perhaps a transactional update may even run unnoticed by the user in the background? Kind Regards Johannes Meixner

On Tue, Mar 21, Michael Matz wrote:
Well, but it must at least be sound if it causes work for others, right?
It does not. The work for transactional updates is the same as for snapshots and rollback. This is the nice idea behind it. Thorsten

On 03/21/2017 11:26 PM, Thorsten Kukuk wrote:
On Tue, Mar 21, Richard Biener wrote:
On Tue, 21 Mar 2017, Thorsten Kukuk wrote:
On Tue, Mar 21, Johannes Meixner wrote:
Hello Thorsten,
On Mar 20 15:06 Thorsten Kukuk wrote (excerpt):
https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates
It reads: ------------------------------------------------------------------------- ... instead of creating a snapshot, updating the current system and rolling back if an error happened, we create a snapshot, update this snapshot, and do a "rollback" to that snapshot if no error did occur. -------------------------------------------------------------------------
Can you explain therein the reason behind why it is done this way or add a link that points to an explanation?
That's explained in the very first sentences:
"Transactional updates are atomic. This means, either the update is fully applied without any error, or no change is made to the system. Additionally, transactional updates should not influence the currently running processes."
If you update the running system, none of this can be fulfilled.
But it also means the "running system" part that is transactionally modified had better be read-only, as otherwise you lose changes made between transaction start and commit?
If you want to be 100% safe: yes, the root filesystem should be read-only, as we do for SUSE CaaSP for this reason. But if you run the transactional-update at night with automatic reboot, or make sure that all data is written to other subvolumes besides the root subvolume, you are mostly safe, too.
Only just got to catching up on the thread, sorry if this was already mentioned. A further issue here is that some programs write often; the Enlightenment desktop, for example, has been known to regularly update its config during operation. Admittedly this is probably only an issue if someone is using btrfs without a separate /home partition, but it probably needs to be considered that much modern desktop software doesn't like it when it can't write to its normal writable location, whether that's somewhere in ~/.local or ~/.cache or anywhere else that could be within the root file system. -- Simon Lees (Simotek) http://simotek.net Emergency Update Team keybase.io/simotek SUSE Linux Adelaide Australia, UTC+10:30 GPG Fingerprint: 5B87 DB9D 88DC F606 E489 CEC5 0922 C246 02F0 014B

On Thursday, 23 March 2017 at 20:57 +1030, Simon Lees wrote:
Only just got to catching up on the thread, sorry if this was already mentioned. A further issue here is that some programs write often; the Enlightenment desktop, for example, has been known to regularly update its config during operation. Admittedly this is probably only an issue if someone is using btrfs without a separate /home partition, but it probably needs to be considered that much modern desktop software doesn't like it when it can't write to its normal writable location, whether that's somewhere in ~/.local or ~/.cache or anywhere else that could be within the root file system.
Even without a separate home partition, /home will be in a different btrfs subvolume, so it won't be snapshotted. It could be an issue for /root, though ;) -- Frederic Crozat Enterprise Desktop Release Manager SUSE

On Mon, Mar 20, 2017 at 03:06:28PM +0100, Thorsten Kukuk wrote:
I started to collect issues with RPMs and transactional-updates and how to avoid them: https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates
1. The document talks about things in/outside the main root btrfs subvolume. As not all packagers are using Btrfs and the default partitioning proposal themselves (and the default layout also keeps changing), it would be helpful if the document were more specific about which standard directories should be expected to be outside the root subvolume.
2. In section "Data and applications"
"/srv" is a real nightmare in this regard: it contains applications, user data and configuration files mixed up
That IMHO sounds rather like "/opt"; is "/srv" really meant here? IIRC there was a recommendation recently that packages shouldn't install anything into /srv except empty subdirectories.
Michal Kubeček

On Thu, Mar 23, Michal Kubecek wrote:
1. The document talks about things in/outside main root btrfs subvolume. As not all packagers are using BtrFS and default partitioning proposal themselves (and the default layout also keeps changing), it would be helpful if the document was more specific about which standard directories should be expected to be outside the root subvolume.
Thanks for the feedback, I will try to be more specific here.
2. In section "Data and applications"
"/srv" is a real nightmare in this regard: it contains applications, user data and configuration files mixed up
That IMHO sounds rather like "/opt", is "/srv" really meant here? IIRC there was a recommendation recently that packages shouldn't install anything into /srv except empty subdirectories.
/opt is a separate issue, since only ISVs are allowed to write there and distributors should not install anything there. And yes, I mean /srv. The recommendation exists because not separating data is already a problem with snapshots and rollback today. But the existence of a recommendation does not mean that all packages are immediately "fixed", nor even that people start working on this ... So today /srv is still a real nightmare for snapshots and rollback, and I'm afraid this will only change if we enforce the recommendation. But to do so, we would need solutions, and for quite some problems we don't have one. Thorsten
participants (13)
- Andrei Borzenkov
- Dimstar / Dominique Leuenberger
- Frederic Crozat
- Joachim Werner
- Johannes Meixner
- Lars Marowsky-Bree
- Ludwig Nussel
- Michael Matz
- Michal Kubecek
- Richard Biener
- Sebastian
- Simon Lees
- Thorsten Kukuk