[opensuse-packaging] Packaging hints for transactional updates

Thorsten Kukuk

20 Mar 2017

14:06

Hi,

I started to collect issues with RPMs and transactional-updates and how to avoid them:

https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates

Luckily, so far it's not much and most RPMs are fine.

Thorsten

-- Thorsten Kukuk, Distinguished Engineer, Senior Architect SLES & CaaSP SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nuernberg, Germany GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
-- To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-packaging+owner@opensuse.org

Johannes Meixner

21 Mar 2017

09:23

Hello Thorsten, On Mar 20 15:06 Thorsten Kukuk wrote (excerpt):

...

https://en.opensuse.org/openSUSE:Packaging_for_transactional-updates

It reads:

-------------------------------------------------------------------------
... instead of creating a snapshot, updating the current system and rolling back if an error happened, we create a snapshot, update this snapshot, and do a "rollback" to that snapshot if no error did occur.
-------------------------------------------------------------------------

Could you explain there why it is done this way, or add a link that points to an explanation? (I have my own personal idea what the reason could be, but I would prefer to also know the "official" reason ;-)

Kind Regards
Johannes Meixner

Thorsten Kukuk

12:40

On Tue, Mar 21, Johannes Meixner wrote:

Could you explain there why it is done this way, or add a link that points to an explanation?

That's explained in the very first sentences: "Transactional updates are atomic. This means, either the update is fully applied without any error, or no change is made to the system. Additionally, transactional updates should not influence the currently running processes."

If you update the running system, none of this can be fulfilled.

Thorsten
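As a rough illustration of the flow described on the wiki page (create a snapshot, update the snapshot, switch to it only if no error occurred), here is a minimal toy model in Python. It is only a sketch: snapshots are plain dictionaries, and `transactional_update`, `good_update` and `bad_update` are hypothetical names invented for this example. It does not call btrfs, snapper, zypper or the real transactional-update script.

```python
import copy


def transactional_update(snapshots, default_id, apply_update):
    """Return the snapshot id to boot next; the default snapshot is never modified."""
    candidate = copy.deepcopy(snapshots[default_id])  # "create a snapshot"
    try:
        apply_update(candidate)                       # "update this snapshot"
    except Exception:
        return default_id                             # error: discard the copy, nothing changes
    new_id = max(snapshots) + 1
    snapshots[new_id] = candidate
    return new_id                                     # no error: this becomes the new default


def good_update(root):
    # Hypothetical update that succeeds.
    root["libfoo"] = "2.0"


def bad_update(root):
    # Hypothetical update that fails halfway through.
    root["libfoo"] = "2.0"
    raise RuntimeError("package scriptlet failed")


snapshots = {1: {"libfoo": "1.0"}}
after_failure = transactional_update(snapshots, 1, bad_update)
after_success = transactional_update(snapshots, 1, good_update)
print(after_failure, after_success, snapshots[1])
```

In this model the failed update leaves both the default id and the contents of snapshot 1 untouched, which is the "either fully applied or no change" property the wiki text describes.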

Richard Biener

12:50

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

That's explained in the very first sentences:

"Transactional updates are atomic. This means, either the update is fully applied without any error, or no change is made to the system. Additionally, transactional updates should not influence the currently running processes."

If you update the running system, none of this can be fulfilled.

But doesn't it also mean that the part of the "running system" that is transactionally modified had better be read-only, as otherwise you lose changes made between transaction start and commit?

Richard.

-- Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Thorsten Kukuk

12:56

On Tue, Mar 21, Richard Biener wrote:

But it also means the "running system" part that is transactionally modified should better be readonly as otherwise you lose changes done during transaction start and commit?

If you want to be 100% safe: yes, the root filesystem should be read-only, as we do for SUSE CaaSP for this reason.

But if you run the transactional-update at night with automatic reboot, or make sure that all data is written to subvolumes other than the root subvolume, you are mostly safe, too.

So my openSUSE Tumbleweed installations are updated automatically at 3 o'clock at night and reboot if patches were successfully applied.

Some people are already thinking about porting the read-only root subvolume code to Tumbleweed for transactional updates. But this will require quite a lot of changes to existing RPMs.

Thorsten
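To make the point about separate subvolumes concrete, here is a small toy model in Python. It is a sketch under stated assumptions: the dictionary `system`, its two "subvolumes" and all file paths are hypothetical, and only the root subvolume is assumed to be covered by the snapshot.

```python
import copy

# Two hypothetical subvolumes; only "root" is snapshotted.
system = {
    "root": {"/usr/bin/tool": "v1"},
    "home": {"/home/user/notes.txt": "draft"},
}

# The update runs in a copy of the root subvolume, not in the running root.
snapshot = copy.deepcopy(system["root"])
snapshot["/usr/bin/tool"] = "v2"

# Writes that happen between snapshot creation and the reboot:
system["root"]["/etc/local.conf"] = "edited"      # lands in the old root
system["home"]["/home/user/notes.txt"] = "final"  # lands in another subvolume

# Reboot: the updated snapshot replaces the old root, other subvolumes are kept.
system["root"] = snapshot

print(sorted(system["root"]), system["home"])
```

In this model the write to the running root is gone after the reboot, while the write to the other subvolume survives. With a read-only root the first write could not have happened at all.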

Richard Biener

14:31

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

If you want to be 100% safe: yes, the root filesystem should be read-only, as we do for SUSE CaaSP for this reason.

But I wonder if the current scheme of doing the modification in the live system and rolling back on error works closer to 100% then (in the case of read-write root). Given a transaction abort should be the minority of cases ...

...

But, if you run the transactional-update in the night with automatic reboot, or make sure that all data is written in other subvolumes beside the root subvolume, you are mostly safe, too.

Doesn't sound like "no changes to the running system" to me then ;)

...

So my openSUSE Tumbleweed installations are updated at 3 o'clock in the night automatically and reboot if patches were successfully applied.

But some people already think about porting the read-only root subvolume code to Tumbleweed for transactional updates. But this will require quite a lot of changes to existing RPMs.

So what issue are we solving then? That is, is this for CaaSP only where we can guarantee the r/o root?

Richard.

Thorsten Kukuk

14:33

On Tue, Mar 21, Richard Biener wrote:

...

So what issue are we solving then? That is, is this for CaaSP only where we can guarantee the r/o root?

The problems we want to solve are all the broken systems you can regularly read about on the factory list after an update, because the update in the running system broke other running processes. Especially the desktop.

Thorsten

Richard Biener

14:45

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

The problems we want to solve are all the broken systems you can regularly read about on the factory list after an update, because the update in the running system broke other running processes. Especially the desktop.

Sure. But do we now exchange this for all the systems broken by replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (And can you prove that?)

Richard.

Thorsten Kukuk

14:51

On Tue, Mar 21, Richard Biener wrote:

Sure. But do we now exchange this for all the broken systems where replacing root with the snapshot after the transaction? And are we sure the number of broken systems will actually shrink with this change? (and can you prove that?)

Sorry, but I understand neither what you are writing here nor what your problem is.

What I can prove is: all the problems people had with updates, where their running applications crashed, the update did not finish and their system was left in an unbootable state, can be solved with this.

And about your fear that the snapshot is in a broken state after the update even if zypper did not return any error: if this happens, it would also happen with your normal running system. It does not matter if you update a snapshot or the real system, the installed RPMs are the same.

And, don't forget: if you don't like transactional updates, nobody is forcing you to use them.

Thorsten

Richard Biener

15:07

On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

Sorry, but I neither understand what you are writing here nor what your problem is.

What I can prove is: all the problems people had with updates, where their running applications crashed, the update did not finish and their system was left in an unbootable state, can be solved with this.

True - this is because you are not updating the running system but the one that is activated after the next reboot (as far as I understand).

...

And about your fear that the snapshot is in a broken state after the update even if zypper did not return any error: if this happens, it would also happen with your normal running system. It does not matter if you update a snapshot or the real system, the installed RPMs are the same.

No, I am referring to the time window between creating the snapshot and activating it. For a true transaction you'd need to verify that the root you are about to replace with the updated snapshot is in the same state as at the time of snapshot creation (thus, it had better be read-only). Otherwise you are losing data.

...

And, don't forget: if you don't like transactional updates, nobody is forcing you to use them.

Of course. But the system you are implementing sounds like a more dangerous way of doing what is effectively this: download the update in the running system, reboot, and at a defined state (say, in initrd context) create the snapshot, install into it and continue booting from it. That would come _without_ the inconsistency caused by the time window during which the root is active between creating the snapshot and activating it.

Richard.
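The check asked for above (verify that the root about to be replaced is still in the state it had when the snapshot was created) could be sketched roughly as follows. This is a toy model, not something the thread says transactional-update does: the root is a plain dictionary of path to content, and `fingerprint` and `safe_to_activate` are hypothetical helpers invented for this example.

```python
import hashlib


def fingerprint(root):
    """Hash all (path, content) pairs of a toy root filesystem in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(root):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(root[path].encode())
        digest.update(b"\0")
    return digest.hexdigest()


def safe_to_activate(running_root, fingerprint_at_snapshot):
    """True only if the running root is unchanged since the snapshot was taken."""
    return fingerprint(running_root) == fingerprint_at_snapshot


running_root = {"/etc/motd": "hello"}
taken = fingerprint(running_root)      # recorded when the snapshot is created

unchanged_ok = safe_to_activate(running_root, taken)

running_root["/etc/motd"] = "changed"  # a concurrent write during the time window
changed_ok = safe_to_activate(running_root, taken)

print(unchanged_ok, changed_ok)
```

A read-only root makes the comparison trivially true, which is why both sides of the discussion treat it as the safe case.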

Thorsten Kukuk

15:12

On Tue, Mar 21, Richard Biener wrote:

...

No, I am referring to the time window between creating the snapshot and activating it. For a true transaction you'd need to verify that the root you are about to replace with the updated snapshot is in the same state as at the time of snapshot creation (thus, it had better be read-only). Otherwise you are losing data.

If you clearly separate data from applications, as you have to do for snapshot and rollback anyway, the risk is really very low. And if you use a read-only root filesystem, the risk is zero. But these are not problems of transactional updates only; you have the same problems already today if you use rollback. And there the risk of data loss is much, much higher.

...

But the system you are implementing sounds a more dangerous way of effectively downloading the update in the running system, rebooting, and at defined state (say, in initrd context) create the snapshot, install into it and continue booting from it.

That's how Windows is doing it and GNOME tries to implement it. You should watch my presentation at FOSDEM this year about the negative impact this already had in the past.

Thorsten

Michael Matz

15:24

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

...

If you clearly separate data from applications, as you have to do for snapshot and rollback anyway, the risk is really very low. And if you use a read-only root filesystem, the risk is zero. But these are not problems of transactional updates only; you have the same problems already today if you use rollback. And there the risk of data loss is much, much higher.

But a rollback always implies data-loss (namely all the data written since). That's known by people (at least by those that have some mental concept of "transactional"). Data-loss by activation of an update is surprising.

What happens e.g. in this situation:

% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)

foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?

...

...
But the system you are implementing sounds a more dangerous way of effectively downloading the update in the running system, rebooting, and at defined state (say, in initrd context) create the snapshot, install into it and continue booting from it.

That's how Windows is doing it and GNOME tries to implement it.

But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.

...

You should watch my presentation at FOSDEM this year about the negative impact this already had in the past.

Perhaps, are the slides somewhere?

Ciao, Michael.

Thorsten Kukuk

15:28

On Tue, Mar 21, Michael Matz wrote:

What happens e.g. in this situation:

% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)

foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?

Nothing happens and foo is there, if the user only uses the transactional-update script and not a mix of different tools.

...

But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.

Sorry, but exactly the other way around is true. Windows is not doing any transaction at all. If your system breaks, it's broken. What Windows calls "transaction" in this case is only what is written into the windows registry, not on harddisk.

...

Perhaps, are the slides somewhere?

Yes, in the Fosdem archive.

Thorsten

Michael Matz

15:45

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

...

...
What happens e.g. in this situation:

% user installs update
% user installs more updates
% user adds repo and installs a new rpm foo
% user reboots (because finally he's annoyed by the warning of having to reboot)

foo is gone from the rebooted system, right (except of course for the desktop item for that program, now pointing to a dead file, which is even more surprising)? Are at least updates 1 and 2 merged?

Nothing happens and foo is there, if the user only uses the transactional-update script and not a mix of different tools.

That's nice. Does zypper, used for installing the rpm in the third step above, count as a different tool for these purposes (and does it mean that the rpm is installed twice, once into the used-after-reboot snapshot, and once into the currently used one)?

...

...
But the way Windows does it (I don't know about GNOME) is definitely more like a proper transaction than installing into a snapshot and just activating it at next reboot over whatever was there before. That's not transactional at all because the abort-transaction with concurrent writes is missing.

Sorry, but exactly the other way around is true. Windows is not doing any transaction at all. If your system breaks, it's broken. What Windows calls "transaction" in this case is only what is written into the windows registry, not on harddisk.

Um. I'm not sure how to say this, but you do know that NTFS provides snapshots as well and Windows makes use of them for update purposes much similar to our rollback scenario with btrfs? And as the kernel-related updates for it are installed at points where there are definitely no other writers, namely at shutdown or bootup, yes, that is actually transactional (or at least more so, all this talk about "transactional" related to updates and filesystems is a bit dodgy as it's quite a bit unlike transactions in the database sense). That this is done in Windows had initially different reasons, but now the side-effect is that the updates are safe from concurrent writes. (that would be comparable to us installing the updates from initrd just after mounting the FSes)

...

...
Perhaps, are the slides somewhere?

Yes, in the Fosdem archive.

Looking. I'm still hoping I have some basic misunderstanding of the whole "transactional" updates idea.

Ciao, Michael.

Thorsten Kukuk

15:56

On Tue, Mar 21, Michael Matz wrote:

...

Um. I'm not sure how to say this, but you do know that NTFS provides snapshots as well and Windows makes use of them for update purposes much similar to our rollback scenario with btrfs?

If it is used for Windows Update, it does not work in practice; otherwise an update wouldn't have put the Windows installation on my laptop into a corrupt state that Windows could not repair. And only Wikipedia assumes that Windows Update is using it; Microsoft itself does not mention it.

Ah, wait, yes, Windows can do a rollback and Microsoft explains it, but first you need to get the current broken state booting in order to start the GUI to do the rollback ... If there are other ways, MS is hiding them well in their documentation.

Thorsten

Michael Matz

16:22

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

...

On Tue, Mar 21, Michael Matz wrote:

...
Um. I'm not sure how to say this, but you do know that NTFS provides snapshots as well and Windows makes use of them for update purposes much similar to our rollback scenario with btrfs?

If it is used for Windows Updates it does not work in practice, else the windows installation on my laptop wouldn't go in a corrupt state unrepairable by Windows by an update. And only wikipedia assumes that the Windows Update is using it, Microsoft itself is not mentioning it. Ah, wait, yes, Windows can do a rollback and Microsoft explains it, but at first you need to get the current broken state booting to start the GUI to do the rollback ... If there are other ways, MS is hiding them well in their documentation.

Like us they can boot from a snapshot (or rather at boot time you can restore a snapshot from the "emergency initrd" (which even has a graphical user interface!)). Depending on the Windows installation and settings, snapshotting is not always active (like with us), and sometimes they remove snapshots too early.

But sure: they have bugs, that's no wonder. Us too, if I may be so bold :) (That your laptop can't be recovered is not a good data point; my sometimes-windows-gaming desktop was recoverable (and actually needed it only once in its whole lifetime, unlike us when we botched the bootloader installation for the 20th time), so can I say now that it does work in practice? :) )

In any case, bugs in their implementation of it don't directly show an inherent flaw in their approach. I do see some problems with that approach, but far fewer than in the transactional-updates approach. But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)

Ciao, Michael.

Thorsten Kukuk

16:25

On Tue, Mar 21, Michael Matz wrote:

...

In any case bugs in their implementation of it don't directly show an inherent flaw in their approach. I do see some problems with that approach, but far fewer than in the transactional-updates approach. But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)

As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe. And since you can apply them at any time, you don't even need to spend time waiting for your mission-critical server to come back up ;)

Thorsten

Michael Matz

16:33

Hi, On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

...

As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe.

Yes. And as we were saying, if you don't do that you don't have anything at all.

...

And since you can apply them at any time, you even don't need to spend the time waiting that your mission critical server is alive again ;)

You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).

Ciao, Michael.

Sebastian

18:04

On 03/21/2017 05:33 PM, Michael Matz wrote:

...
And since you can apply them at any time, you even don't need to spend the time waiting that your mission critical server is alive again ;)

You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes.

Are we only talking about updates of the kernel and libc or about all programs? For the first one, we need to reboot, but not for the latter one. I always just restart services, not the whole server. For GUI systems, often a logout/login is sufficient.

Are transactional upgrades enforcing reboots after every upgrade?

-- python programming - mail server - photo - video - https://sebix.at cryptographic key at https://sebix.at/DC9B463B.asc and on public keyservers

Thorsten Kukuk

19:34

On Tue, Mar 21, Sebastian wrote:

Are we only talking about updates of the kernel and libc or about all programs? For the first one, we need to reboot, but not for the latter one. I always just restart services, not the whole server. For GUI systems, often a logout/login is sufficient.

Are transactional upgrades enforcing reboots after every upgrade?

Transactional upgrades always need a reboot to activate the changes. That is the case with all implementations, independent of how they are implemented. Otherwise you couldn't do it "atomically" and without influence on the running system.

If you have a read-write root filesystem, you could apply small changes with zypper, and only do the big ones with transactional-updates. But in this case, you should not continue to use zypper until you reboot, otherwise these changes will be lost.

But since openSUSE Tumbleweed is only updated at most once a day, you can run the update at night, including the reboot, without any problems or risks.

Thorsten
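The warning about mixing zypper with a pending transactional update can be illustrated with a small toy model in Python. It is only a sketch under stated assumptions: the package sets are plain dictionaries, the package names `bar` and `foo` are hypothetical, and no real zypper or transactional-update call is made.

```python
import copy

# Packages in the currently running root (name -> version).
running = {"bar": "1.0"}

# A transactional update works on a copy that becomes active after the reboot.
pending = copy.deepcopy(running)
pending["bar"] = "2.0"

# Before rebooting, the admin installs a package directly into the running root.
running["foo"] = "1.0"

# Reboot: the pending snapshot becomes the new root.
after_reboot = pending

# Anything installed only into the old running root is missing afterwards.
lost = sorted(set(running) - set(after_reboot))
print(lost)
```

In this model `foo` ends up in `lost` because it was installed after the snapshot was taken and only into the old root, which matches the advice above not to use zypper between a transactional update and the reboot.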

Sebastian

19:48

On 03/21/2017 08:34 PM, Thorsten Kukuk wrote:

...

But since openSUSE Tumbleweed is only updated at max. once a day, you can run the update in the night, including reboot, without any problems or risks.

So this only applies to Tumbleweed? Just now or in the future too?

Joachim Werner

21:04

...

On 21.03.2017 at 17:33, Michael Matz <matz@suse.de> wrote:

Hi,

...
On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe.

Yes. And as we were saying, if you don't do that you don't have anything at all.

...
And since you can apply them at any time, you even don't need to spend the time waiting that your mission critical server is alive again ;)

You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).

I'm sure I'm missing something, but for a system where data, applications, and configuration are separated reasonably well into their own btrfs subvolumes (which I think is the case for a default SUSE Linux setup), what kind of writes exactly could happen, in the same subvolume as the one the transactional snapshot is done to, that could really lead to data loss?

Obviously none if the subvolume in question is mounted read-only. But even if it isn't, what kinds of (intended) writes would happen to /usr/* during a transactional update run?

I'm always surprised how well traditional RPM updates work in the running (server) system, although we are basically relying on old software that hasn't been updated yet to peacefully co-exist with new software that has been updated already.

But there are real problems, and from time to time we hit them. For example, we've run into cases where updating Salt with Salt fails because the running Salt process may lazy-load updated code that doesn't match the running code's APIs any more.

And of course it doesn't work smoothly at all if you try to update a Gnome or KDE from within a running desktop session.

Compared to the alternative approaches (Windows, Mac, iOS, Gnome, Android), I see the "Kukuk-approach" as the best choice if we can get those (IMHO rather hypothetical) snapshot-related data losses under control:

The downtime is really reduced to just the reboot, while the other approaches leave the system in a non-productive state during at least one reboot plus the time all updates take to be applied, which can take many minutes even if the updates have been completely downloaded before.

Joachim

Richard Biener

22 Mar 22 Mar

08:42

On Tue, 21 Mar 2017, Joachim Werner wrote:

...

...
Am 21.03.2017 um 17:33 schrieb Michael Matz <matz@suse.de>:

Hi,

...
On Tue, 21 Mar 2017, Thorsten Kukuk wrote:

As I wrote: switch your btrfs root subvolume to read-only and transactional-updates are 100% safe.

Yes. And as we were saying, if you don't do that you don't have anything at all.

...
And since you can apply them at any time, you even don't need to spend the time waiting that your mission critical server is alive again ;)

You mean with read-only / ? How could updates become active without reboot? You can't change the files already opened by running processes. So eventually you _have_ to wait for reboot/app-restart (let's ignore live patching for this thread :) ), but that's of course the same with all update approaches (in other words I don't see how transactional-updates specifically change anything in this regard).

I'm sure I'm missing something, but for a system where data, applications, and configuration are separated reasonably well into their own btrfs subvolumes (which I think is the case for a default SUSE Linux setup), what kind of writes exactly could happen, in the same subvolume as the one the transactional snapshot is done to, that could really lead to data loss?

Obviously none if the subvolume in question is mounted read-only. But even if it isn't, what kinds of (intended) writes would happen to /usr/* during a transactional update run?

I'm always surprised how well traditional RPM updates work in the running (server) system, although we are basically relying on old software that hasn't been updated yet to peacefully co-exist with new software that has been updated already.

But there are real problems, and from time to time we hit them. For example, we've run into cases where updating Salt with Salt fails because the running Salt process may lazy-load updated code that doesn't match the running code's APIs any more.

And of course it doesn't work smoothly at all if you try to update a Gnome or KDE from within a running desktop session.

Compared to the alternative approaches (Windows, Mac, iOS, Gnome, Android), I see the "Kukuk-approach" as the best choice if we can get those (IMHO rather hypothetical) snapshot-related data losses under control:

The downtime is really reduced to just the reboot, while the other approaches leave the system in a non-productive state during at least one reboot plus the time all updates take to be applied, which can take many minutes even if the updates have been completely downloaded before.

But as you are safely doing it during the night anyway, that little extra time doesn't matter.

I suppose that the transactional update should work this way:

1) download updates
2) force sub-volumes we are going to snapshot r/o
3) snapshot, apply updates
... time passes (hopefully running system is happy with r/o state)
4) you reboot, updated snapshot gets activated

then we're safe. But if you leave out 2) there's the possibility of breakage (or you need a verification step before 4) which checks that the subvolumes you snapshotted didn't change, so that you can roll back in that case and try again).

You can do 2) and 3) also during/after 4) easily. Your theory above suggests that 2) is not going to be an issue for the running system or your productivity.

Richard.

--
Richard Biener <rguenther@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
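The numbered steps in the message above, together with the verification step it mentions as the alternative to step 2, can be sketched as a small in-memory simulation. This is only a toy model under stated assumptions: `Subvolume`, its `generation` counter, and `transactional_update` are hypothetical names for illustration and are not part of btrfs, snapper, or the transactional-update tool.

```python
from dataclasses import dataclass, field


@dataclass
class Subvolume:
    """Toy model of a root subvolume (hypothetical, not a btrfs API)."""
    files: dict = field(default_factory=dict)
    read_only: bool = False
    generation: int = 0  # bumped on every successful write

    def write(self, path, content):
        if self.read_only:
            raise PermissionError(f"{path}: subvolume is read-only")
        self.files[path] = content
        self.generation += 1

    def snapshot(self):
        # A snapshot starts as a writable copy of the current state.
        return Subvolume(files=dict(self.files), generation=self.generation)


def transactional_update(root, updates, force_read_only=True, concurrent_writes=()):
    """Return the subvolume to boot next: the updated snapshot, or the
    unchanged root if the verification step detects a conflicting write."""
    base_generation = root.generation
    root.read_only = force_read_only          # step 2
    snap = root.snapshot()                    # step 3: snapshot ...
    for path, content in updates.items():     # ... and apply updates to it
        snap.write(path, content)

    # "Time passes": the running system may try to modify the old root.
    for path, content in concurrent_writes:
        try:
            root.write(path, content)
        except PermissionError:
            pass  # blocked by step 2, so nothing can be lost

    # Verification before step 4: did the snapshotted root change?
    if root.generation != base_generation:
        return root   # discard the snapshot and try again later
    return snap       # step 4: reboot into the updated snapshot
```

With `force_read_only=True` a concurrent write is rejected and the updated snapshot is returned. With `force_read_only=False` the same write succeeds on the old root, the generation check notices it, and the snapshot is discarded instead of silently dropping the change.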

Thorsten Kukuk

07:24

On Tue, Mar 21, Michael Matz wrote:

...

But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)

I did implement the initrd approach two years ago during hackweek in a few hours. But it requires that you always boot twice: first to run zypper from an initrd, second to activate all changes and get back to a consistent system. You gain nothing from this except sitting for a long time in front of a machine that is unusable for normal work.

Thorsten

Richard Biener

08:45

On Wed, 22 Mar 2017, Thorsten Kukuk wrote:

...

On Tue, Mar 21, Michael Matz wrote:

...
But I guess somebody will have to properly implement the initrd approach for us so that we can really compare both on the system we care about. After all, possibly I'm wrong ;)

I did implement the initrd approach two years ago during hackweek in a few hours. But it requires that you always boot twice: first to run zypper from an initrd, second to activate all changes and get back to a consistent system. You gain nothing from this except sitting for a long time in front of a machine that is unusable for normal work.

Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting? Updates usually are small, so this won't take long (assuming you have downloaded them before, of course). The interesting part is of course some intelligence to decide which updates to apply online and which ones to do "transactional".

Richard.
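The "intelligence" asked for at the end of the message above could start out as something as plain as an admin-maintained set of disruptive packages. The sketch below is an assumption-laden illustration, not an existing zypper or transactional-update feature: `plan_updates`, the `disruptive` set, and the package names in the usage note are hypothetical, and dependencies between packages are ignored entirely.

```python
def plan_updates(pending, disruptive):
    """Split pending package names into an ordered two-phase plan.

    pending:    iterable of package names with available updates
    disruptive: set of names that should only be updated in a snapshot
                (hypothetical, maintained by the admin)

    The online phase comes first on purpose: a snapshot taken afterwards
    already contains those changes, so nothing is lost on reboot.
    """
    pending = list(pending)
    online = [p for p in pending if p not in disruptive]
    transactional = [p for p in pending if p in disruptive]

    plan = []
    if online:
        plan.append(("online", online))
    if transactional:
        plan.append(("transactional", transactional))
    return plan
```

For example, `plan_updates(["apache2", "gnome-shell", "firefox"], {"gnome-shell"})` yields an online phase for `apache2` and `firefox` followed by a transactional phase for `gnome-shell`. The ordering reflects the warning earlier in the thread that changes made with zypper after the snapshot was created are lost on reboot.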

Andrei Borzenkov

08:52

On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:

...

Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?

Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.

Richard Biener

09:01

On Wed, 22 Mar 2017, Andrei Borzenkov wrote:

...

On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:

...
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?

Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.

Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself. Yes, there are implementation difficulties but it can work. You still do have to wait for those updates to be applied of course.

Richard.

Thorsten Kukuk

09:05

On Wed, Mar 22, Richard Biener wrote:

...

On Wed, 22 Mar 2017, Andrei Borzenkov wrote:

...
On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:

...
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?

Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.

Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.

And afterwards you have to reboot to activate the new kernel and initrd ...

Thorsten

Michael Matz

14:50

Hi, On Wed, 22 Mar 2017, Thorsten Kukuk wrote:

...

...
...
...
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?

Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.

Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.

And afterwards you have to reboot to activate the new kernel and initrd ...

Nah. Updates that affect the kernel or initrd will simply be applied in the running system (and the initrd rebuilt). The nature of them is such that they can't possibly affect anything in the running system. No need for two reboots.

Ciao, Michael.

Ludwig Nussel

10:19

Richard Biener wrote:

...

On Wed, 22 Mar 2017, Andrei Borzenkov wrote:

...
On Wed, Mar 22, 2017 at 11:45 AM, Richard Biener <rguenther@suse.de> wrote:

...
Not sure why you need to reboot twice - you'd apply a kernel update online in a non-transactional way (we have working ways of "rolling" back to the old kernel). So the reboot updates the kernel, from the initrd you apply the update and simply continue booting?

Some services may be taken over from initrd. In this case initrd had been generated using old versions and they will continue to be used even if you update root to new version until next reboot. This may lead to rather hard to debug issues because everything after boot will indicate you have new versions installed.

Well, apply the update from init (systemd) then before it spawns anything else. init can reload itself.

Yes, there are implementation difficulties but it can work. You still do have to wait for those updates to be applied of course.

That way already exists. It's called "offline system updates". A description is online [1]. In short, PackageKit has a mode to just download rpms and notify the system of their presence. On next reboot systemd boots into a special target, installs the downloaded files and reboots again.

This mode is not just for small updates. This is what you'd use for updating e.g. whole desktop environments with hundreds of packages. So applying updates this way does take time. I'm sure it will take even more time just in the moment one needs the system to be back ASAP :-)

The "transactional update" mechanism on the other hand would download and install packages while the user can still use the system. Since the installation goes to a separate snapshot of the file system, the running system is not impacted. After successful installation a reboot would boot into the newly created snapshot just as quick (or slow :-)) as usual.

Both methods obviously need safeguards. If the installation step of the offline updates is too dumb (like rpm -U --force --nodeps *) it has the potential to bring the system to an inconsistent state, in case packages got installed or removed after the updates were downloaded. Transactional updates would always lead to a consistent system at least. However, if safeguards are not in place to handle or disallow package installations/removals after creating the new snapshot, one would miss those modifications after reboot. I guess a better integration with zypper and rpm itself is needed to prevent that.

However, both approaches are unsuitable for small updates IMO. On a traditional server it would be crazy to require a reboot just to apply e.g. a security update on apache or even systemd when both services can handle inline replacement just fine. We have to be better than that. That also applies to the desktop. Who wants to reboot to update Firefox after all?

So IMO the perfect solution would allow both, installing small updates like today, plus having a way to apply "disruptive" changes without impacting the running system. For the latter the offline updates approach is a cheap solution that can be used immediately and on any file system. It's not really the best way for users though. Transactional updates are more effort on the engineering side. The changes required to packages and the system are the right thing to do anyway though (like sorting out the /srv mess or separating data/config migrations from installation). So it's worthwhile to fix the packages independent of the discussion about the right update mechanism.

cu
Ludwig

[1] https://www.freedesktop.org/software/systemd/man/systemd.offline-updates.htm...

--
 (o_   Ludwig Nussel
 //\
 V_/_  http://www.suse.com/
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
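One possible shape for the safeguard discussed in the message above (noticing package installations or removals that happened in the running system after the snapshot was created) is a plain comparison of two package lists. This is a minimal sketch, not how zypper, rpm, or transactional-update work: `package_drift` and `safe_to_activate` are hypothetical helpers, and both inputs are assumed to be name-to-version mappings of the running system, for example parsed from `rpm -qa` output once at snapshot time and once just before the reboot.

```python
def package_drift(at_snapshot, running_now):
    """Compare two name -> version mappings of the running system.

    at_snapshot: packages of the running system when the snapshot was made
    running_now: packages of the running system just before the reboot
    Returns (installed, removed, changed) as dictionaries.
    """
    installed = {n: v for n, v in running_now.items() if n not in at_snapshot}
    removed = {n: v for n, v in at_snapshot.items() if n not in running_now}
    changed = {
        n: (at_snapshot[n], v)
        for n, v in running_now.items()
        if n in at_snapshot and at_snapshot[n] != v
    }
    return installed, removed, changed


def safe_to_activate(at_snapshot, running_now):
    """True if no package change would be lost by booting the snapshot."""
    installed, removed, changed = package_drift(at_snapshot, running_now)
    return not (installed or removed or changed)
```

If `safe_to_activate` returns False, the caller could refuse to switch to the new snapshot, or replay the reported changes inside it. Which of the two is appropriate is a policy decision that this sketch does not make.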

Thorsten Kukuk

13:15

On Wed, Mar 22, Ludwig Nussel wrote:

...

However, both approaches are unsuitable for small updates IMO. On a traditional server it would be crazy to require a reboot just to apply e.g. a security update on apache or even systemd when both services can handle inline replacement just fine.

And your apache case is exactly why big customers want to have transactional updates: for them it is not acceptable that the web server is restarted during an update and that a transaction with a customer may fail because of that. They prefer to have scheduled reboots for this.

So, there are hundreds of different use cases and requirements, and it is clear that there will not be one solution fitting everything.

Thorsten