[openFATE 308626] Snapshot/Rollback interface to the ZYpp stack
Feature changed by: Jose Ricardo De Leon Solis (derhundchen)
Feature #308626, revision 13
Title: Snapshot/Rollback interface to the ZYpp stack

openSUSE-11.3: Evaluation
Priority
Requester: Mandatory

+ openSUSE-11.4: Unconfirmed
+ Priority
+ Requester: Important

Requested by: Duncan Mac-Vicar (dmacvicar)
Developer: (Novell)
Developer: (Novell)
Partner organization: openSUSE.org

Description:
libzypp should be able to snapshot the system before a commit happens, either automatically or explicitly, and should offer an interface to revert the system. Both should be exposed via zypper.

The interface should be generic and allow for different possible implementations, offered depending on the context:

* btrfs based, if /root is btrfs
* simple, based on the package list before commit, plus the package history and the current package list to determine the diff of the last transaction
* two root partitions, where the interface would clone the root filesystem before commit and switch them at rollback time.

The feature includes the concept and a working implementation of one or more of the strategies described above.

References:
https://fedoraproject.org/wiki/Features/SystemRollbackWithBtrfs
http://dev.chromium.org/chromium-os/chromiumos-design-docs/filesystem-autoup...
http://wiki.rpath.com/wiki/Conary:conary_rollback

Documentation Impact:
Requires description of the functionality in zypper options

Test Case:
zypper install foo bar
zypper rollback
System should be as before the installation. Corner cases are not important.

Use Case:
Administrator installs a bunch of updates, which leaves the system in an unworkable state. He types zypper rollback to undo the changes.

Business case (Partner benefit):
openSUSE.org:
* Customer asked about snapshotting during our last meeting.
* Feature parity with Solaris, which is implementing package rollback based on ZFS
* RHEL goes the same route, announced btrfs rollbacks in yum
* We can do better, offering something more flexible and not btrfs dependent to support different scenarios: SLES (btrfs), Appliances (dual system partition), openSUSE (simple package list)

Discussion:
#1: Federico Lucifredi (flucifredi) (2010-03-05 07:29:41)
PM: Important for 11.3. (sorry, on web client).

#3: Joachim Werner (joachimwerner) (2010-04-28 18:10:50)
I have SLEPOS-specific use cases for this feature, too. IBM implemented a similar rollback feature in IRES2, based on tripwire AFAIK.

Some technical considerations we need to understand:

Rolling back the right things

What customers really want is not a dumb rollback of everything on the system to the old state, but a rollback of just the changes done by the update. So it may be necessary to make sure the snapshot only applies to the configuration directories (usually /etc) and the code directories (/usr) affected by the update, but not to things like logfiles (which should be continuous) or application data (nobody wants to get their database files rolled back by accident, or data a user has saved in /home/username in the meantime).

Another potential issue is that some applications can end up in an inconsistent state if they aren't stopped during a snapshot. This is a more generic problem with snapshots as such. For this, Windows has mechanisms to inform an application that a shadow copy is about to be taken. In Linux there is no single interface to trigger that, although some databases and other applications have their own way of switching into a read-only, snapshotting-safe mode.

So, this is all doable, but just applying a system-wide snapshot seems to be too simple. Maybe we can just copy what Red Hat is doing here (if it's good enough). Red Hat has finished this feature in Fedora, and it's probably going to be in RHEL6 at some point, too.
https://fedoraproject.org/wiki/Features/SystemRollbackWithBtrfs contains a few considerations on this.

Avoiding situations that can't be recovered

zypper rollback sounds like a good plan at first, but it is not sufficient in all cases. If you apply an update that breaks the system, this could mean that a kernel update went wrong and you can't even boot beyond a certain stage. So there needs to be a way of entering the recovery from the earliest possible level, the boot prompt. There needs to be an option to boot into the snapshotted system, and from there eliminate the changes made after the snapshot and declare the snapshot to be the current state.

For the worst case (a broken grub package or the like kills the bootloader configuration) we need to have a recovery workflow (from CD or similar) that can still bring back the old working state.

Do one thing right, not three things that solve the same problem

If we decide to go down the btrfs route (which I like), we should not also solve the same problem using several other approaches. Why use dual system partitions if you can have snapshots?

--
openSUSE Feature:
https://features.opensuse.org/308626
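
As an illustration of the "simple" strategy from the description (package list before commit compared with the current package list), here is a minimal Python sketch. It is not libzypp or zypper code: the name rollback_plan, the name-to-version dictionary and the action labels are all hypothetical, chosen only to show how the diff of the last transaction could be turned into a list of undo steps.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: package name -> installed version.
PackageList = Dict[str, str]


def rollback_plan(before: PackageList, after: PackageList) -> List[Tuple[str, str, str]]:
    """Return the actions that would turn the `after` state back into `before`.

    Each action is (verb, package name, version), where the verb is one of the
    made-up labels "remove", "install" or "restore".
    """
    plan = []
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old == new:
            continue  # untouched by the transaction
        if old is None:
            plan.append(("remove", name, new))    # added by the transaction
        elif new is None:
            plan.append(("install", name, old))   # removed by the transaction
        else:
            plan.append(("restore", name, old))   # version changed by the transaction
    return plan


# Example data, invented for illustration only.
before = {"foo-lib": "1.0", "kernel": "2.6.31", "old-tool": "0.9"}
after = {"foo-lib": "1.1", "kernel": "2.6.31", "foo": "2.0", "bar": "1.4"}

for action in rollback_plan(before, after):
    print(action)
```

A package-list diff like this only covers what the package manager itself changed. It does not restore configuration files or handle an unbootable system, which is what the considerations in comment #3 are about.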