Robert,
thanks for your feedback.
There is one important thing I did not state clearly, namely the
interactions between the branches (like libs underlying and affecting
a desktop, or, what Coolo mentioned, libs that are part of some desktop
and that are used by packages outside that desktop).
Robert Schweikert wrote:
On 12/02/2013 12:16 PM, Susanne Oberhauser-Hirschoff wrote:
Let's assume there is some code stable base, call it Factory.
The goal is to get updates in there, reliably, regularly, to get it to the next level of being a stable code base.
For "leaf packages" that is simple: build the package, test its functionality, then release it.
Well, I do not think it is that simple. One could argue that Perl is a leaf package. But we have perl-bootloader, and thus an update to Perl could break perl-bootloader, which in turn would be a bad thing with pretty far-reaching effects. Further, perl-bootloader does not stand on its own; it uses Perl modules that would, on their own, definitely be considered leaf packages.
A similar argument can be made for KIWI, which depends on a lot of leaf packages; but KIWI is very important for creating our ISO images. Thus, the line for leaf packages is blurry at best.
There are packages that can be integrated late without much harm because they have very little to no impact on other packages. That, to me, is what a leaf package means with respect to integration; perl is certainly not amongst them. And this whole tree is arranged by a reasonable (heuristic) order of integration.
Now supposed there is a cascade of staging projects, which potentially 'release', say, every week or every other week (that's the "cadence").
They build a tree structure, something like this:
[ASCII-art tree diagram, garbled in transit: staging branches (toolchain, libs, gcc, KDE, GNOME, ...) merging step by step into --------> Factory]
The number of nodes from the root (Factory) to a branch corresponds to the interactions that need to be tested for what goes into that branch.
This gives growing rings of scope for interaction testing and integration success. Successful builds and automatic tests are necessary, sometimes even sufficient, for interaction testing and integration success. They propagate automatically, to give a 'tentative' next build. That, however, does not affect the 'last known good' build --- that last known good state remains available, too.
Yes, however, what is being neglected is that there is a fundamental problem with the cadence. The cadence itself is influenced by the process, through rebuild times and other snafus that are inevitable.
[waterfall, one common build target in all branches at any time, hundreds of staging projects, thesis on how (toolchain) changes propagate]
Ah. I definitely did not communicate clearly. Every package needs two ways to get into the next stable base, the next Factory: either flow very quickly through the system, no matter which branch you twig off, because the integration impact is low; or flow in with the big wave that comes down the integration stream.

The first path is what I call the 'fast track'. It is for minor updates, patches, fixes, things that don't impact much and can be released with little integration testing, or with sufficiently reliable automatic tests. The second path is the one for things that really need integration work. That path indeed takes longer, but I don't get to match your math with my model. I don't see why a useful tree would have more than a few dozen branches with a total depth of at most half a dozen stages.

Then there is an assumption that there is exactly one build and integration target for new stuff. That indeed would cause the delays and friction you describe. However, packages added at a branch are built against both what is coming next from upstream and the 'last known good state' from upstream.

First let's look from above:

[ASCII-art diagram, garbled in transit: seen from above, 'devel' branches fork off an upstream line, pass through an 'integrate' phase at 'the fork' with the 'other branch', and merge back for the next cycle; the stages are labelled UPSTREAM (pull), THIS BRANCH (add new stuff locally, integrate), THE FORK (integrate with other), DOWNSTREAM (release, or don't)]

In a real-world factory production line (steam, dust, oil) that is using kanban, when you move a part down the line, at well-defined points you will send a 'signal kanban' to branches that are going to be merged soon. Like "make me a new engine, I'll need it in a short while". This is how rebuilds and automatic tests should be triggered for what you anticipate to provide soon.

So here comes the part that I did not spell out clearly: There have to be *three* target builds for each branch, built from *one* source:

- Build for the last accepted good state, Factory unmodified. This gives you clarity on whether your local changes work.
- Build for the 'fast track' from upstream, changes that heuristically have no or only well-known integration impact.
- Build for the proposed next state, the bigger update from upstream, if any.

The "chaperones" of each branch will make two major decisions during each cycle. At the beginning of each cycle they decide what to do in this cycle in this branch: small changes? big changes? and when will they start integration through 'the fork' with the 'other branch'?

They may then work on three possible things, until they make the cycle-end decision:

* If there is a fast track from upstream, they will move that along, possibly adding their own fast-track stuff, possibly negotiating with their upstream what has priority, upstream additions or their own. They might also release it mid-cycle to further accelerate downstream processing of this fast track.
* If there is work on their branch, they work on it, possibly getting it ready as the next upgrade.
* They switch to integration with the 'other branch' to have a next integrated version ready for their downstream. They may not do this at all during a cycle. That's saying: no upgrade this time from this branch.

At the end of the cycle the "chaperones" decide whether they release their work for their downstream, i.e. they decide what they offer to their downstream to start the next cycle with. They also consciously decide what they provide as fast track to their downstream: the fast track they got from upstream (only doing integration with the other branch), only their own fast-track stuff (holding back the fast track from upstream), or both.

The real novelty of this tree structure is clear focus and scope for integration: What do I need to ensure at this decision point? It is *not* less work. It just gives the work that needs to be done clear areas of responsibility: in each branch, the work is local. At each fork, the work is local for two branches, meaning collaboration.
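The three-targets-per-branch idea could be sketched, very roughly, like this (a toy model; all names are hypothetical illustrations, not any real tooling):

```python
from dataclasses import dataclass

# Rough sketch of the three build targets each branch maintains, all
# built from one source tree. Names here are hypothetical, for illustration.

@dataclass
class Branch:
    name: str
    last_known_good: str   # built against Factory unmodified
    fast_track: str        # built against upstream's low-impact updates
    integration: str       # built against upstream's proposed next state

    def cycle_end_offer(self, fast_ok: bool, integration_ok: bool) -> dict:
        """The chaperones' cycle-end decision: what to offer downstream.
        The last known good state is always available as a fallback."""
        offer = {"last_known_good": self.last_known_good}
        if fast_ok:
            offer["fast_track"] = self.fast_track
        if integration_ok:
            offer["integration"] = self.integration
        return offer

libs = Branch("libs", "base-2013.48", "base-2013.48+fixes", "base-2013.49rc")
print(sorted(libs.cycle_end_offer(fast_ok=True, integration_ok=False)))
# prints ['fast_track', 'last_known_good']
```

The point of the sketch is only that the fallback is structural: whatever the chaperones decide, downstream always has a last known good base to build against.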
At the end of each cycle there are two outputs after each fork:

- a fast track build
- a major update/upgrade build

So at the end of each cycle, there is a sane starting state for the next branch downstream to pull from and build upon. It is then their job to try with the most recent 'good to pull', the most recent 'next version'.

The more I think about it, the more the tree resembles the merging part of a git merge graph. The novelty here is that this proposal structures the branches around things that integrate easily locally, in the branches, and around where the real integration then needs to happen, at the bifurcation/branch point/fork. The tree structure gives a heuristically proven order in which to usually best do this.

In the past there was concern about an explosion of builds in such a model. This doesn't happen, though: you only build for

- the last accepted good state. It gives you clarity on what you have locally changed. This will be your fallback fast-track output for your downstream.
- the proposed minor updates to said last known good state. This is accepting things on the fast track from upstream, for integration with your fast-track upgrades. If you manage to use it successfully, *this* will be your fast-track merge with the other branch and your joint offer to your downstream.
- the proposed next state from upstream. This is where you also contribute your big changes.

So there is no build explosion. You decide which of your results is good enough to be provided to your downstream in two buckets (fast track and big update/upgrade), plus the underlying factory. If you scope rebuilds to not percolate ahead of your integration review and test, it may even mean fewer builds.
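One way to see why there is no explosion: with three fixed targets per branch, the build count grows linearly with the number of branches (toy arithmetic, assuming no extra rebuild targets beyond the three):

```python
def total_build_targets(num_branches: int) -> int:
    """Three fixed build targets per branch (last known good, fast
    track, next integration): linear growth, not combinatorial."""
    return 3 * num_branches

# A tree of a few dozen branches stays in the low hundreds of targets:
print(total_build_targets(36))  # prints 108
```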
Thus at each branch, every week a decision can be made: is the combination of the 'new' stuff good enough already to *pull* (!!) it in together? Or do we --- for the combination! --- have to stick with what we had so far, the last known good version?
There is no way to know if you can pull things in together because staging branches do not get cross-built against each other. In the figure above everything is nicely spaced, but that probably does not reflect the real world. If libs and the desktops are ready at the same time, one can still not merge them into factory at the same time because they have not been built against each other; they have been built against "current" factory.
They won't be ready at the same time in this way. At each junction point, these options exist:

- just one of the branches is good enough
- no branch is good enough
- both branches are good to merge

Only if both are good to merge do integration builds need to be created and tested and then, based on that result, be provided one level further downstream or not. So downstream will always see three options of base builds:

- last known good state (no changes, aka "Factory")
- fast track changes
- serious new integration changes

And downstream will make a conscious decision what they can digest at this point in time.
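The junction logic above can be written down almost literally (a sketch; the returned strings are just labels for the three outcomes, not real states):

```python
def junction_decision(a_good: bool, b_good: bool) -> str:
    """At a fork: only when both branches are good does an integration
    build get created and tested before anything moves downstream."""
    if a_good and b_good:
        return "build and test the integration, then offer it downstream"
    if a_good or b_good:
        return "offer only the good branch's build downstream"
    return "downstream stays on the last known good state (Factory)"
```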
But the technical requirement creates a people problem ;). People hate the waterfall and the hurry-up-and-wait stuff.
That's what the 'fast track' option is for. Which should really only be used for things that should create little to no pain, and that should be an easily overseeable amount of changes, where the chaperones at each branch can tell what to look at.
Therefore, what tends to happen is that multiple staging branches get merged into the reference branch based on heuristic historical data of "no adverse interaction when merging a given set of branches in the past". This data has a number of problems:
- past behavior does not guarantee future performance: if perl-bootloader or kiwi depend on a new leaf package, the heuristic data of those staging trees is useless, as a new set of interactions is created
- the heuristic knowledge is intrinsic to the chaperone of the reference branch (granted, this is not necessarily much different than it is today, but we are looking for improvement and not "the same")
- the bus factor remains 1, i.e. the chaperone of the reference branch
To truly solve that you'd need to solve the halting problem... What alternative do you see to structuring integration work based on heuristics, or even on measured past performance?
At each junction it is clear what needs to be tested. If the new leaf gtk+ application, gimp, can also be integrated with the 'last known good version', the one that is still in Factory, then it can be integrated into that and thus moves on, ahead of the rest of GNOME, at the next cycle.
But this implies that gimp has its own staging branch, thus one is feeding the "ever expanding number of staging branches" monster. One cannot pull a part of a staging branch without placing the pulled pieces into a staging tree of its own and building and testing that staging tree against a "frozen" reference branch.
No monsters under this bed: gimp is on one of the branches. It is built to 'fast track', 'known good' and 'next integration'. That's it.
So if we tilt the above tree and look at it sideways, it almost looks like git integration:
[ASCII-art diagram, garbled in transit: a 'new stuff' line forks off the 'last known good' line ('pull new gimp'), runs in parallel, and rejoins it at 'merge success'; the old 'last known good' line then ends ('R.I.P.')]
True, but one still has to build and test the cherry picked stuff, i.e. that's where the need for yet another staging project is created. This rests on the basic assumption that only stuff built and tested against the reference branch can be merged.
Again: No additional staging project. Just the three build targets. And the *build* is what is handed down or rejected.

The 'last known good' is the reference tree you mention. The 'fast track' is updates with (heuristically) low impact. The 'integration' is upgrades with (heuristically) high impact.

Things on the fast track propagate quickly to become the next 'known good' state. As there is no skipping the line at integration decision points, they will trigger rebuilds of all dependent packages in all branches. If one of them turns the update down (by not using it as the base for their fast track), communication and negotiation and fixes are needed, corresponding to what would normally happen where the two 'normally' meet. Once the 'fast track' is connected all the way from all tips to the root, it is the next 'known good'. Likewise 'integration'.

You also mentioned base libs getting ready at the same time as some app. Well, one of the two will be further up in the tree, and will arrive at the end earlier. They then, at the same time, get into their respective next builds, which can then be handed down. If we assume a tree of 6 steps and a weekly cadence, that's a maximum of 6 weeks, unless the change is on a fast track.

(Lightbulb) now I get something I didn't see before :) In a situation like this, say you worked something all the way from B to ABCDE, which was painful and took several cycles:

          D\   /E
   B\   C\  \ /
     \     \  *
      \     \  \DE     F\
  -----*-----*-----*-----*-------
    A    AB    ABC  ABCDE  ABCDEF

Now A is already working on the next big thing, and they could do it faster, but not as in fast track; they might want to catch up. They don't; B is going in the next ABCDEF. So what I realized is that these are rather layers of upgrades, generations, integration versions, releases, cadence ids, whatever you may call it.
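The 6-week worst case mentioned above is just depth times cadence (toy arithmetic; it ignores the fast track, which can be quicker):

```python
def max_propagation_cycles(tree_depth: int, cadence_weeks: int = 1) -> int:
    """Worst case for a non-fast-track change: it waits for one
    release decision per stage on its way from a tip to the root."""
    return tree_depth * cadence_weeks

print(max_propagation_cycles(6))  # prints 6: six stages, weekly cadence
```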
In this model, you create new builds in your branch, based on a latest common release, and these are handed downstream; and because this is cadenced, each cycle will produce an integrated result at the end. So it's not just two layers (fast track and next generation), but no harm is done, because the number remains reasonably low: the number of layers is the number of cycles for which a branch is releasing a new base for their downstream, be it fast track (update) or full integration (upgrade). And that number should be limited: if a branch is piling up releases that are never picked up and never make it all the way to the user, something is going wrong.

Btw, if the builds are tagged by 'quality', a power beta-tester user can request beta level for some branch. An end user can request release quality for some branch. Developers could also pick alpha-level code for testing, before they tag it beta.

S.

--
Susanne Oberhauser                      SUSE LINUX Products GmbH
+49-911-74053-574                       Maxfeldstraße 5
Processes and Infrastructure            90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)