Robert,
thanks for your feedback.
There is one important thing I did not state clearly, namely the
interactions between the branches (like libs underlying and affecting
a desktop, or, what Coolo mentioned, libs that are part of some desktop
and that are used by packages outside that desktop).
Robert Schweikert wrote:
On 12/02/2013 12:16 PM, Susanne Oberhauser-Hirschoff wrote:
Let's assume there is some code stable base, call it Factory.
The goal is to get updates in there, reliably, regularly, to get it to the next level of being a stable code base.
For "leaf packages" that is simple: build the package, test its functionality, then release it.
Well, I do not think it is that simple. One could argue that Perl is a leaf package. But we have perl-bootloader, and thus an update to Perl could break perl-bootloader, which in turn would be a bad thing with pretty far-reaching effects. Further, perl-bootloader does not stand on its own; it uses Perl modules that would, on their own, definitely be considered leaf packages.
A similar argument can be made for KIWI, which depends on a lot of leaf packages; but KIWI is very important for creating our ISO images. Thus, the line for leaf packages is blurry at best.
There are packages that can be integrated late without much harm because they have very little to no impact on other packages. That, to me, is what a leaf package means with respect to integration; perl is certainly not amongst them. And this whole tree is arranged by a reasonable (heuristic) order of integration.
Now supposed there is a cascade of staging projects, which potentially 'release', say, every week or every other week (that's the "cadence").
They build a tree structure, something like this:
[ASCII-art tree diagram, garbled in transit: staging branches (toolchain, libs, gcc, KDE, GNOME, ...) merging step by step into --------> Factory]
The number of nodes from the root (Factory) to a branch corresponds to the interactions that need to be tested for what goes into that branch.
This gives growing rings of scope for interaction testing and integration success. Successful builds and automatic tests are necessary, sometimes even sufficient, for interaction testing and integration success. They propagate automatically, to give a 'tentative' next build. That, however, does not affect the 'last known good' build --- that last known good state remains available, too.
Yes, however, what is being neglected is that there is a fundamental problem with the cadence. The cadence itself is influenced by the process, through rebuild times and other snafus that are inevitable.
[waterfall, one common build target in all branches at any time, hundreds of staging projects, thesis on how (toolchain) changes propagate]
Ah. I definitely did not communicate clearly. Every package needs two ways to get into the next stable base, the next Factory: either flow very quickly through the system, no matter which branch you twig off, because the integration impact is low; or flow in with the big wave that comes down the integration stream.

The first path is what I call the 'fast track'. It is for minor updates, patches, fixes, things that don't impact much and can be released with little integration testing, or with sufficiently reliable automatic tests. The second path is the one for things that really need integration work. That path indeed takes longer, but I don't get to match your math with my model. I don't see why a useful tree would have more than a few dozen branches with a total depth of at most half a dozen stages.

Then there is an assumption that there is exactly one build and integration target for new stuff. That indeed would cause the delays and friction you describe. However, packages added at a branch are built against both what is coming next from upstream and the 'last known good state' from upstream.

First let's look from above:

[ASCII-art diagram, garbled in transit: seen from above, 'devel' branches fork off an upstream line, pass through an 'integrate' phase at 'the fork' with the 'other branch', and merge back for the next cycle; the stages are labelled UPSTREAM (pull), THIS BRANCH (add new stuff locally, integrate), THE FORK (integrate with other), DOWNSTREAM (release, or don't)]

In a real-world factory production line (steam, dust, oil) that is using kanban, when you move a part down the line, at well-defined points you will send a 'signal kanban' to branches that are going to be merged soon. Like "make me a new engine, I'll need it in a short while". This is how rebuilds and automatic tests should be triggered for what you anticipate to provide soon.

So here comes the part that I did not spell out clearly: There have to be *three* target builds for each branch, built from *one* source:

- Build for the last accepted good state, Factory unmodified. This gives you clarity on whether your local changes work.
- Build for the 'fast track' from upstream, changes that heuristically have no or only well-known integration impact.
- Build for the proposed next state, the bigger update from upstream, if any.

The "chaperones" of each branch will make two major decisions during each cycle. At the beginning of each cycle they decide what to do in this cycle in this branch: small changes? big changes? and when will they start integration through 'the fork' with the 'other branch'?

They may then work on three possible things, until they make the cycle-end decision:

* If there is a fast track from upstream, they will move that along, possibly adding their own fast-track stuff, possibly negotiating with their upstream what has priority, upstream additions or their own. They might also release it mid-cycle to further accelerate downstream processing of this fast track.
* If there is work on their branch, they work on it, possibly getting it ready as the next upgrade.
* They switch to integration with the 'other branch' to have a next integrated version ready for their downstream. They may not do this at all during a cycle. That's saying: no upgrade this time from this branch.

At the end of the cycle the "chaperones" decide whether they release their work for their downstream, i.e. they decide what they offer to their downstream to start the next cycle with. They also consciously decide what they provide as fast track to their downstream: the fast track they got from upstream (only doing integration with the other branch), only their own fast-track stuff (holding back the fast track from upstream), or both.

The real novelty of this tree structure is clear focus and scope for integration: What do I need to ensure at this decision point? It is *not* less work. It just gives the work that needs to be done clear areas of responsibility: in each branch, the work is local. At each fork, the work is local for two branches, meaning collaboration.
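The three-targets-per-branch idea could be sketched, very roughly, like this (a toy model; all names are hypothetical illustrations, not any real tooling):

```python
from dataclasses import dataclass

# Rough sketch of the three build targets each branch maintains, all
# built from one source tree. Names here are hypothetical, for illustration.

@dataclass
class Branch:
    name: str
    last_known_good: str   # built against Factory unmodified
    fast_track: str        # built against upstream's low-impact updates
    integration: str       # built against upstream's proposed next state

    def cycle_end_offer(self, fast_ok: bool, integration_ok: bool) -> dict:
        """The chaperones' cycle-end decision: what to offer downstream.
        The last known good state is always available as a fallback."""
        offer = {"last_known_good": self.last_known_good}
        if fast_ok:
            offer["fast_track"] = self.fast_track
        if integration_ok:
            offer["integration"] = self.integration
        return offer

libs = Branch("libs", "base-2013.48", "base-2013.48+fixes", "base-2013.49rc")
print(sorted(libs.cycle_end_offer(fast_ok=True, integration_ok=False)))
# prints ['fast_track', 'last_known_good']
```

The point of the sketch is only that the fallback is structural: whatever the chaperones decide, downstream always has a last known good base to build against.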
At the end of each cycle there are two outputs after each fork:

- a fast track build
- a major update/upgrade build

So at the end of each cycle, there is a sane starting state for the next branch downstream to pull from and build upon. It is then their job to try with the most recent 'good to pull', the most recent 'next version'.

The more I think about it, the more the tree resembles the merging part of a git merge graph. The novelty here is that this proposal structures the branches around things that integrate easily locally, in the branches, and around where the real integration then needs to happen, at the bifurcation/branch point/fork. The tree structure gives a heuristically proven order in which to usually best do this.

In the past there was concern about an explosion of builds in such a model. This doesn't happen, though: you only build for

- the last accepted good state. It gives you clarity on what you have locally changed. This will be your fallback fast-track output for your downstream.
- the proposed minor updates to said last known good state. This is accepting things on the fast track from upstream, for integration with your fast-track upgrades. If you manage to use it successfully, *this* will be your fast-track merge with the other branch and your joint offer to your downstream.
- the proposed next state from upstream. This is where you also contribute your big changes.

So there is no build explosion. You decide which of your results is good enough to be provided to your downstream in two buckets (fast track and big update/upgrade), plus the underlying factory. If you scope rebuilds to not percolate ahead of your integration review and test, it may even mean fewer builds.
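One way to see why there is no explosion: with three fixed targets per branch, the build count grows linearly with the number of branches (toy arithmetic, assuming no extra rebuild targets beyond the three):

```python
def total_build_targets(num_branches: int) -> int:
    """Three fixed build targets per branch (last known good, fast
    track, next integration): linear growth, not combinatorial."""
    return 3 * num_branches

# A tree of a few dozen branches stays in the low hundreds of targets:
print(total_build_targets(36))  # prints 108
```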
Thus at each branch, every week a decision can be made: is the combination of the 'new' stuff good enough already to *pull* (!!) it in together? Or do we --- for the combination! --- have to stick with what we had so far, the last known good version?
There is no way to know if you can pull things in together because staging branches do not get cross-built against each other. In the figure above everything is nicely spaced, but that probably does not reflect the real world. If libs and the desktops are ready at the same time, one can still not merge them into factory at the same time because they have not been built against each other; they have been built against "current" factory.
They won't be ready at the same time in this way. At each junction point, these options exist:

- just one of the branches is good enough
- no branch is good enough
- both branches are good to merge

Only if both are good to merge do integration builds need to be created and tested and then, based on that result, be provided one level further downstream or not. So downstream will always see three options of base builds:

- last known good state (no changes, aka "Factory")
- fast track changes
- serious new integration changes

And downstream will make a conscious decision what they can digest at this point in time.
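The junction logic above can be written down almost literally (a sketch; the returned strings are just labels for the three outcomes, not real states):

```python
def junction_decision(a_good: bool, b_good: bool) -> str:
    """At a fork: only when both branches are good does an integration
    build get created and tested before anything moves downstream."""
    if a_good and b_good:
        return "build and test the integration, then offer it downstream"
    if a_good or b_good:
        return "offer only the good branch's build downstream"
    return "downstream stays on the last known good state (Factory)"
```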
But the technical requirement creates a people problem ;). People hate the waterfall and the hurry-up-and-wait stuff.
That's what the 'fast track' option is for. Which should really only be used for things that should create little to no pain, and that should be an easily overseeable amount of changes, where the chaperones at each branch can tell what to look at.
Therefore, what tends to happen is that multiple staging branches get merged into the reference branch based on heuristic historical data of "no adverse interaction when merging a given set of branches in the past". This data has a number of problems:
- past behavior does not guarantee future performance: if perl-bootloader or kiwi depend on a new leaf package, the heuristic data of those staging trees is useless, as a new set of interactions is created
- the heuristic knowledge is intrinsic to the chaperone of the reference branch (granted, this is not necessarily much different than it is today, but we are looking for improvement and not "the same")
- the bus factor remains 1, i.e. the chaperone of the reference branch
To truly solve that you'd need to solve the halting problem... What alternative do you see to structuring integration work based on heuristics, or even on measured past performance?
At each junction it is clear what needs to be tested. If the new leaf gtk+ application, gimp, can also be integrated with the 'last known good version', the one that is still in Factory, then it can be integrated into that and thus moves on, ahead of the rest of GNOME, at the next cycle.
But this implies that gimp has its own staging branch, thus one is feeding the "ever expanding number of staging branches" monster. One cannot pull a part of a staging branch without placing the pulled pieces into a staging tree of its own and building and testing that staging tree against a "frozen" reference branch.
No monsters under this bed: gimp is on one of the branches. It is built to 'fast track', 'known good' and 'next integration'. That's it.
So if we tilt the above tree and look at it sideways, it almost looks like git integration:
[ASCII-art diagram, garbled in transit: a 'new stuff' line forks off the 'last known good' line ('pull new gimp'), runs in parallel, and rejoins it at 'merge success'; the old 'last known good' line then ends ('R.I.P.')]
True, but one still has to build and test the cherry picked stuff, i.e. that's where the need for yet another staging project is created. This rests on the basic assumption that only stuff built and tested against the reference branch can be merged.
Again: No additional staging project. Just the three build targets. And the *build* is what is handed down or rejected.

The 'last known good' is the reference tree you mention. The 'fast track' is updates with (heuristically) low impact. The 'integration' is upgrades with (heuristically) high impact.

Things on the fast track propagate quickly to become the next 'known good' state. As there is no skipping the line at integration decision points, they will trigger rebuilds of all dependent packages in all branches. If one of them turns the update down (by not using it as the base for their fast track), communication and negotiation and fixes are needed, corresponding to what would normally happen where the two 'normally' meet. Once the 'fast track' is connected all the way from all tips to the root, it is the next 'known good'. Likewise 'integration'.

You also mentioned base libs getting ready at the same time as some app. Well, one of the two will be further up in the tree, and will arrive at the end earlier. They then, at the same time, get into their respective next builds, which can then be handed down. If we assume a tree of 6 steps and a weekly cadence, that's a maximum of 6 weeks, unless the change is on a fast track.

(Lightbulb) now I get something I didn't see before :) In a situation like this, say you worked something all the way from B to ABCDE, which was painful and took several cycles:

          D\   /E
   B\   C\  \ /
     \     \  *
      \     \  \DE     F\
  -----*-----*-----*-----*-------
    A    AB    ABC  ABCDE  ABCDEF

Now A is already working on the next big thing, and they could do it faster, but not as in fast track; they might want to catch up. They don't; B is going in the next ABCDEF. So what I realized is that these are rather layers of upgrades, generations, integration versions, releases, cadence ids, whatever you may call it.
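The 6-week worst case mentioned above is just depth times cadence (toy arithmetic; it ignores the fast track, which can be quicker):

```python
def max_propagation_cycles(tree_depth: int, cadence_weeks: int = 1) -> int:
    """Worst case for a non-fast-track change: it waits for one
    release decision per stage on its way from a tip to the root."""
    return tree_depth * cadence_weeks

print(max_propagation_cycles(6))  # prints 6: six stages, weekly cadence
```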
In this model, you create new builds in your branch, based on a latest common release, and these are handed downstream; and because this is cadenced, each cycle will produce an integrated result at the end. So it's not just two layers (fast track and next generation), but no harm is done, because the number remains reasonably low: the number of layers is the number of cycles for which a branch is releasing a new base for their downstream, be it fast track (update) or full integration (upgrade). And that number should be limited: if a branch is piling up releases that are never picked up and never make it all the way to the user, something is going wrong.

Btw, if the builds are tagged by 'quality', a power beta-tester user can request beta level for some branch. An end user can request release quality for some branch. Developers could also pick alpha-level code for testing, before they tag it beta.

S.

--
Susanne Oberhauser                      SUSE LINUX Products GmbH
+49-911-74053-574                       Maxfeldstraße 5
Processes and Infrastructure            90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)