On 11/23/2014 04:01 PM, Andrei Borzenkov wrote:
> Production systems have limited maintenance window when changes can be applied. If anything goes wrong they do not have luxury of spending time to recover - you need system up and running as soon as possible. btrfs opens up possibility of having cheap and space efficient fallback to known good state so you can easily revert changes. Whether you are going to use this possibility or not - is up to you. I think this possibility is very useful and that current implementation in (open)SUSE makes it less convenient to use than it could be.
Where time is critical there are other mechanisms, such as 'hot cut-over'.

What's missing from this part of the thread is the specifics. Not all catastrophic instances are the same. Change management, that is, applying changes and updates, is not as simple for critical production systems such as those at banks, telcos, brokerages and the like. I know this first-hand, having done my time on the change control committee for a national bank. Companies that have the capability test out changes on non-critical systems before applying them to critical systems. That goes just as much for "mainframes" as for armies of PCs.

Yes, the COW mechanism we now have makes reversing changes much easier, and opens up alternatives for organizations that don't have the back-room of test-bed systems and staff to exercise the hell out of updates. But it does mean that the business and operational processes involved in applying updates to critical systems have to be worked out.

On my non-critical, or at least non-time-critical, system I can run 'zypper up' daily, and 99.9% of the time I don't have to concern myself with reversing an update. Even so, I think that a business which applies all available updates willy-nilly is foolhardy. Relying on snapshotting to be able to reverse changes unconditionally is optimistic.

A proper change control procedure will examine each individual update to see whether it, or one of the changes it implies, will affect any of the production processes and, if so, how. Realistically, we found that many were obvious, straightforward and easy to verify. It helps when the change references a CVE. One of the advantages of Linux over a big IBM 3xxx, or whatever class they are calling mainframes now, is that changes can be tested on a comparatively cheap machine :-)

The Btrfs snapshotting revolves around COW. COW is not a true copy; what's in a snapshot is not a real copy of the 'prior state'. It is not a backup. Don't try treating it as one.
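To make the "not a true copy" point concrete, here is a toy Python model of copy-on-write. It is only a sketch: the BlockStore and Volume classes and the file names are invented for illustration, and this is not how Btrfs is actually implemented. It just shows that a snapshot copies references rather than data, so it can undo a later write but shares the fate of any block it has in common with the live volume.

```python
class BlockStore:
    """Hypothetical block pool shared by a live volume and its snapshots."""

    def __init__(self):
        self.blocks = {}  # block id -> bytes

    def allocate(self, data):
        block_id = len(self.blocks)
        self.blocks[block_id] = data
        return block_id


class Volume:
    """Hypothetical volume: a mapping from file name to block id."""

    def __init__(self, store, block_map=None):
        self.store = store
        self.block_map = dict(block_map or {})

    def write(self, name, data):
        # Copy-on-write: never overwrite in place, allocate and repoint.
        self.block_map[name] = self.store.allocate(data)

    def read(self, name):
        return self.store.blocks[self.block_map[name]]

    def snapshot(self):
        # Only the mapping is copied; no data block is duplicated.
        return Volume(self.store, self.block_map)


store = BlockStore()
live = Volume(store)
live.write("config", b"known good")
live.write("data", b"payload")

snap = live.snapshot()
blocks_at_snapshot = len(store.blocks)  # taking the snapshot allocated nothing

# An "update" changes one file; the snapshot still sees the old contents.
live.write("config", b"after update")
reverted = snap.read("config")

# Simulated damage to a block that both the live volume and the snapshot
# reference: the snapshot offers no protection, because it is the same block.
store.blocks[live.block_map["data"]] = b"\x00corrupt"
both_damaged = live.read("data") == snap.read("data") == b"\x00corrupt"
```

In this model the snapshot is good for reverting the "config" change, which is the use case under discussion, but once the shared "data" block is damaged both views are damaged. A backup in the usual sense would hold its own copy of the data somewhere else.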
You can't take it away like you can a backup to tape or DVD or USB. And it's not like an LVM2 snapshot either! There are ways to crash a file system where the 'main' file is corrupted but the COW copy has not been able to be ... well, copied. Bear that in mind too.

-- 
 /"\
 \ /  ASCII Ribbon Campaign
  X   Against HTML Mail
 / \