Feature changed by: Duncan Mac-Vicar (dmacvicar)
Feature #308626, revision 57
Title: Snapshot/Rollback interface to the ZYpp stack

openSUSE-11.3: Rejected by Gerald Pfeifer (geraldpfeifer)
  reject date: 2010-11-17 00:53:34
  reject reason: openSUSE 11.3 has been released months ago.
  Priority
  Requester: Mandatory

- openSUSE-11.4: New
+ openSUSE-11.4: Rejected by Duncan Mac-Vicar (dmacvicar)
+ reject date: 2013-03-05 17:37:32
+ reject reason: not done for 11.4 but later
  Priority
  Requester: Important

openSUSE 12.1: Done
  Priority
  Requester: Important

Requested by: Duncan Mac-Vicar (dmacvicar)
- Product Manager: Federico Lucifredi (flucifredi)
Partner organization: openSUSE.org

Description:
Ability for libzypp, exposed via zypper, to snapshot the system either automatically or explicitly before commit happens, plus an interface to revert the system.

The interface should be generic and allow for different possible implementations, which should be offered depending on the context:
* btrfs based if /root is btrfs
* simple, based on the package list before commit, plus package history and the current package list to determine the diff of the last transaction
* two root partitions, where the interface would clone the root filesystem before commit and switch them at rollback time.

The feature includes the concept and a working implementation with one or more of the strategies described above.

References:
https://fedoraproject.org/wiki/Features/SystemRollbackWithBtrfs
http://dev.chromium.org/chromium-os/chromiumos-design-docs/filesystem-autoup...
http://wiki.rpath.com/wiki/Conary:conary_rollback

packages: libzypp snapper zypper

Relations:
- grub1 patch for btrfs boot (url: )

Documentation Impact:
Requires description of the functionality in zypper options

Test Case:
zypper install foo bar
zypper rollback
System should be as before the installation. Corner cases are not important.

Use Case:
Administrator installs a bunch of updates, which leaves the system in an unworkable state.
He types zypper rollback to undo the changes.

Business case (Partner benefit):
openSUSE.org:
* Customer asked about snapshotting during our last meeting.
* Feature parity with Solaris, which is implementing package rollback based on ZFS
* RHEL goes the same route, announced btrfs rollbacks in yum
* We can do better, offering something more flexible and not btrfs dependent to support different scenarios: SLES (btrfs), Appliances (dual system partition), openSUSE (simple package list)

Discussion:

#1: Federico Lucifredi (flucifredi) (2010-03-05 07:29:41)
PM: Important for 11.3. (sorry, on web client).

#3: Joachim Werner (joachimwerner) (2010-04-28 18:10:50)
I have SLEPOS-specific use cases for this feature, too. IBM implemented a similar rollback feature in IRES2, based on tripwire AFAIK.

Some technical considerations we need to understand:

Rolling back the right things

What customers really want is not a dumb rollback of everything on the system to the old state, but just the changes done by the update. So it may be necessary to make sure the snapshot only applies to the configuration directories (usually /etc) and the code directories (/usr) affected by the update, but not things like logfiles (which should be continuous) or application data (nobody wants to get their database files rolled back by accident, or data a user has saved in /home/username in the meantime).

Another potential issue with some applications is that they can be in an inconsistent state if they aren't stopped during a snapshot. This is a more generic problem with snapshots as such. For this, Windows has mechanisms to inform an application that a shadow copy is about to be taken. In Linux there is no single interface to trigger that, although some databases and other applications have their own way of switching into a read-only, snapshotting-safe mode.

So, this is all doable, but just applying a system-wide snapshot seems to be too simple.
Maybe we can just copy what Red Hat is doing here (if it's good enough). Red Hat has finished this feature in Fedora, and it's probably going to be in RHEL6 at some point, too. https://fedoraproject.org/wiki/Features/SystemRollbackWithBtrfs contains a few considerations on this.

Avoiding situations that can't be recovered

zypper rollback sounds like a good plan at first, but it is not in all cases. If you apply an update that breaks the system, this could mean that a kernel update went wrong and you can't even boot beyond a certain stage. So there needs to be a way of entering the recovery from the earliest possible level, the boot prompt. There needs to be an option to boot into the snapshotted system, and from there eliminate the changes made after the snapshot and declare the snapshot to be the current state.

For the worst case (a broken grub package or so kills the bootloader configuration) we need to have a recovery workflow (from CD or so) that can still bring back the old working state.

Do one thing right, not three things that solve the same problem

If we decide to go down the btrfs route (which I like) we should not also solve the same problem using several other approaches. Why use dual system partitions if you can have snapshots?

#7: Duncan Mac-Vicar (dmacvicar) (2010-10-21 10:43:26)
I would like to clarify some issues:

For rollbacks, the ideal situation would be to have the root filesystem completely under btrfs; otherwise you run into the problem that kernels are not affected by the rollback, and this increases complexity.

* a) For this we need to boot with btrfs. Matthias told me we should ignore this problem because under EFI you have to put the kernel under /efi in a FAT partition, but my own research tells me that what is in FAT /efi is the grub EFI program, and therefore the kernel could be in btrfs if the grub EFI program supports finding the kernel with btrfs.
Can anyone confirm this?
* b) For booting with btrfs we need either grub2, or syslinux, or a patch for grub1 (added as a reference to the feature). Which route will we follow here? Do we have anyone empowered to make this decision soon?
* c) If b) is solved with the grub1 patch, I suggest we open a feature entry to track that one.

#9: Olaf Kirch (okir) (2010-10-22 12:56:49) (reply to #7)
My preferred solution would be to enable grub1 for btrfs; it seems to pose the least risk, assuming the btrfs code is stable enough. (BTW either my fate client is too old, or the link you added is broken).

#13: Gerald Pfeifer (geraldpfeifer) (2010-11-17 00:53:01)
One further idea that came up is adding a bootloader entry (or several) that, when chosen, would revert the system into an old state as represented/described by that entry and boot into it. This is more involved than the more basic scenarios here and might be suitable for another, linked FATE?

#15: Federico Lucifredi (flucifredi) (2011-03-25 21:02:37) (reply to #13)
I like that. If at all possible, we should provide a boot item into the previous snapshot.

#14: Jörg Schmela (herz-von-hessen) (2011-02-25 18:42:40)
therefore my full Consent

#18: Michael Andres (mlandres) (2011-06-28 12:32:03)
libzypp ability to execute pre/post commit plugin scripts is done (see http://doc.opensuse.org/projects/libzypp/HEAD/plugin-commit.html). Arvin is about to provide the snapper plugin, which actually takes the snapshots.

#19: Arvin Schnell (aschnell) (2011-06-29 10:12:38) (reply to #18)
I have now added a python-plugin for zypp that calls snapper pre and post commit. I was told that this is all we provide for SLE11 SP2, so I'm setting this feature to Done although many more ideas are mentioned here.

#20: Duncan Mac-Vicar (dmacvicar) (2011-09-21 13:33:56)
State was wrong. ZYpp does not support this on 11.4. Set it to done on 12.1. Coolo, can you reject for 11.4?

--
openSUSE Feature: https://features.opensuse.org/308626
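
Editorial illustration of comment #19: the plugin described there brackets each commit with a snapper "pre" and "post" snapshot. The sketch below is not that plugin and does not use the libzypp plugin protocol; it only shows the pre/post pairing through the snapper command line (`snapper create --type pre --print-number`, then `--type post --pre-number N`). The names `commit_with_snapshots`, `run_command` and the `commit` callback are hypothetical and introduced for this example only, and taking the post snapshot even when the commit fails is a choice made here, not something the thread specifies.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns the command's stdout as text.
Runner = Callable[[List[str]], str]


def run_command(argv: List[str]) -> str:
    """Run a command, raise on non-zero exit, and return its stdout."""
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout


def commit_with_snapshots(
    commit: Callable[[], None],
    description: str,
    run: Runner = run_command,
) -> Tuple[int, int]:
    """Take a 'pre' snapshot, run commit(), then take the paired 'post' snapshot.

    Returns the (pre, post) snapshot numbers printed by snapper.
    `commit` is a stand-in (assumption) for whatever performs the package
    transaction; in the real setup libzypp drives the plugin instead.
    """
    pre = int(run([
        "snapper", "create",
        "--type", "pre",
        "--print-number",
        "--description", description,
    ]).strip())
    try:
        commit()
    finally:
        # Assumption: record the post snapshot even if the commit raised,
        # because a failed commit may still have changed the system.
        post = int(run([
            "snapper", "create",
            "--type", "post",
            "--pre-number", str(pre),
            "--print-number",
            "--description", description,
        ]).strip())
    return pre, post
```

With the default runner this needs root privileges and a configured snapper setup. Passing a different `run` callable allows a dry run that only records the commands that would be issued.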