Mailinglist Archive: opensuse-buildservice (64 mails)

Re: [opensuse-buildservice] Revision history with wrong files
  • From: Marcus Hüwe <suse-tux@xxxxxx>
  • Date: Wed, 25 Mar 2020 18:28:43 +0100
  • Message-id: <20200325172843.qs4iy7wpezmjeu27@linux>
Hi,

On 2020-03-24 10:33:43 +0100, Hans-Peter Jansen wrote:
> when browsing through the revision history of a package, OBS tends to either
> have trouble with some conflicts:
>
> https://build.opensuse.org/package/show/devel:languages:python/python-PyYAML?rev=33
>
> or display wrong files:
>
> https://build.opensuse.org/package/show/devel:languages:python/python-PyYAML?rev=32

Hmm, this is not "wrong" but maybe "unexpected" from a user's POV. What the webui
does is the following: d:l:p/python-PyYAML@rev32 is a branch; hence, the
webui expands the branch against o:S/python-PyYAML@HEAD (which is rev22 at the
moment). That's why you see the "wrong"/"unexpected" files.
If you now visit the rev=33 page, the webui applies the same logic but in this
case the expansion fails (due to a conflict).
Now, if you want to inspect the expanded "history" of d:l:p/python-PyYAML, it
is not apparent in which files you are interested (IMHO).

Let's briefly have a simplified look at how branches work on a conceptual level
(note: the revision timelines might be more helpful than the accompanying text
and the technical details).

First some definitions...

A file set is a set of files and is associated with a hash. The hash globally
identifies "the" file set. For a file set $f$, the associated hash is denoted
by $h(f)$.

A revision/rev is used to identify a file set within a project/package.

A file set's associated hash is a revision.

A commit of a file set to a project/package yields a numeric revision.
Note: this revision may identify a _different_ file set within the
project/package (see branches below).


###
Example:
Let's assume we start with a new package called "A" in the project "P".
Next, we commit the local files "foo", "bar" and "baz" to P/A:

f1 := {foo, bar, baz} # f1 is a file set
commit(P, A, f1) -> 1 # 1 is the numeric revision

The numeric revision 1 _and_ the hash $h(f1)$ can be used to identify the
file set f1 within P/A. For instance,

osc api /source/P/A?rev=1
osc api /source/P/A?rev=h(f1)

will show the same file set (technical side note: the returned xmls differ
slightly).

Next, we commit another file set:

commit(P, A, {foo, x}) -> 2 # 2 is the numeric revision

Finally, we commit our initial file set $f1$ again:

commit(P, A, f1) -> 3 # 3 is the numeric revision

Now,

osc api /source/P/A?rev=1
osc api /source/P/A?rev=h(f1)
osc api /source/P/A?rev=3

will all show the same file set.
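
To make the "numeric revision vs. hash" distinction concrete, here is a
minimal Python sketch of the example above. Everything in it is an assumption
for illustration: the Package class, file_set_hash() as a stand-in for h(f),
and the file contents. It is not how the OBS backend stores sources.

```python
import hashlib


def file_set_hash(files):
    """Hypothetical stand-in for h(f): hash over sorted (name, content) pairs."""
    digest = hashlib.md5()
    for name in sorted(files):
        digest.update(name.encode())
        digest.update(b"\0")
        digest.update(files[name].encode())
        digest.update(b"\0")
    return digest.hexdigest()


class Package:
    """Toy model of a project/package with a linear revision history."""

    def __init__(self):
        self.history = []  # history[i] is the hash committed as numeric rev i + 1
        self.by_hash = {}  # hash -> file set

    def commit(self, files):
        h = file_set_hash(files)
        self.by_hash[h] = dict(files)
        self.history.append(h)
        return len(self.history)  # the new numeric revision

    def lookup(self, rev):
        """Resolve a numeric revision or a hash to a file set."""
        if isinstance(rev, int):
            rev = self.history[rev - 1]
        return self.by_hash[rev]


f1 = {"foo": "a", "bar": "b", "baz": "c"}  # made-up contents
pkg = Package()
r1 = pkg.commit(f1)                      # numeric revision 1
r2 = pkg.commit({"foo": "a", "x": "d"})  # numeric revision 2
r3 = pkg.commit(f1)                      # numeric revision 3, same file set as 1
```

In this model, pkg.lookup(1), pkg.lookup(3) and pkg.lookup(file_set_hash(f1))
all resolve to the same file set, mirroring the three "osc api" calls.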

###

Let's continue with the definitions.

A file set $f$ is called a branch iff $f$ contains a file called _link, which
has a specific structure.
A _link file comprises an origin project (denoted by $originPrj(f)$), an
origin package (denoted by $originPkg(f)$) and a baserev (denoted by
$baserev(f)$), which is a revision (wrt. origin project/origin package).

(For simplicity, the "linkrev" (xml "rev" attribute) etc. are omitted.)
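
For illustration, the three pieces of a _link file can be pulled out of its
XML like this. The sample XML is my assumption of what a branch's _link
roughly looks like (attribute names "project", "package" and "baserev", with
a made-up baserev value); check a real _link before relying on the layout.

```python
import xml.etree.ElementTree as ET

# Illustrative _link content; the layout is assumed and the values are made up.
LINK_XML = (
    '<link project="P" package="A" baserev="0123abcd">'
    "<patches><branch/></patches>"
    "</link>"
)


def parse_link(xml_text):
    """Return originPrj, originPkg and baserev from a _link document."""
    root = ET.fromstring(xml_text)
    return {
        "originPrj": root.get("project"),
        "originPkg": root.get("package"),
        "baserev": root.get("baserev"),
    }


link = parse_link(LINK_XML)
```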

###
Example: Simple Branch Creation and Expanding

Recall the P/A package from above. The revision timeline looks like this

P/A@r1 P/A@r2 P/A@r3

The latest/HEAD revision in P/A is 3. Next, we want to create a branch
of P/A in package "B" in project "X". The (simplified) branching logic
looks like this

branch(tprj, tpkg, oprj, opkg):
  # tprj is the target project
  # tpkg is the target package
  # oprj is the origin project
  # opkg is the origin package
  - of' := the file set associated with the latest/HEAD revision of oprj/opkg
  - of := expand(of')             # expanded origin file set of oprj/opkg@HEAD
  - f := copy(of)
  - f += '_link'                  # add _link file
  - originPrj(f) := oprj          # set origin project
  - originPkg(f) := opkg          # set origin package
  - baserev(f) := h(of)           # set baserev
  - return commit(tprj, tpkg, f)  # commit new file set

(where expand(...) is discussed below)

After executing

branch(X, B, P, A) -> 1

our revision timeline looks like this


P/A@r1    P/A@r2    P/A@r3
                    |
                    |
                    |------ X/B@r1


Now,

osc api /source/X/B?rev=1

yields the file set {foo, bar, baz, _link}. As a user, you are probably not
interested in this file set. Instead, you want the "expanded" file set,
which can be obtained via "osc api /source/X/B?rev=1&expand=1".

The very simplified expand algorithm looks like this:

expand(f, lof=NONE):
  # f is a file set
  # lof is an optional file set
  - if '_link' not in f then
      # f is no branch => return the file set
      return f
    else
      # f is a branch => expand it
      if lof is NONE
        - lof := the file set associated with the latest/HEAD revision in
                 originPrj(f)/originPkg(f)
      endif
      - elof := expand(lof)
      - bof := the file set associated with baserev(f) in
               originPrj(f)/originPkg(f)
      - f' := copy(f)
      - f'->remove('_link')
      - return merge(f', elof, bof)
    endif

where merge takes three file sets (none of them contains a _link file),
performs a 3-way merge of the files from all sets (via diff3) and returns
the "merged" file set (which also contains no _link).
(Note: "bof" is no branch (has no _link file) by construction.)

Long story short, "osc api /source/X/B?rev=1&expand=1" yields the expanded
file set ef := {foo, bar, baz}, which is the result of
expand({foo, bar, baz, _link}, NONE), and we have $ef = f1$.
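
The branch and expand logic above can be sketched in Python as follows. This
is a toy model under several assumptions that are mine, not OBS behaviour:
REPO is an in-memory dict, the _link is stored as a tuple (oprj, opkg,
baserev) with a numeric baserev, the origin package is not itself a branch,
and merge() compares whole files instead of running diff3.

```python
REPO = {}  # (project, package) -> list of file sets; index i holds rev i + 1


def commit(prj, pkg, files):
    REPO.setdefault((prj, pkg), []).append(dict(files))
    return len(REPO[(prj, pkg)])


def merge(ours, theirs, base):
    """Whole-file 3-way merge (a real implementation would diff3 each file)."""
    merged = {}
    for name in set(ours) | set(theirs) | set(base):
        o, t, b = ours.get(name), theirs.get(name), base.get(name)
        if o == t:
            result = o  # both sides agree
        elif o == b:
            result = t  # only the origin changed the file
        elif t == b:
            result = o  # only the branch changed the file
        else:
            raise ValueError(f"conflict in {name}")
        if result is not None:  # None means the file was removed
            merged[name] = result
    return merged


def expand(files, lof=None):
    if "_link" not in files:
        return dict(files)  # no branch: nothing to expand
    oprj, opkg, baserev = files["_link"]
    history = REPO[(oprj, opkg)]
    if lof is None:
        lof = history[-1]  # default: latest/HEAD revision of the origin
    elof = expand(lof)
    bof = history[baserev - 1]
    ours = {n: c for n, c in files.items() if n != "_link"}
    return merge(ours, elof, bof)


def branch(tprj, tpkg, oprj, opkg):
    history = REPO[(oprj, opkg)]
    f = expand(history[-1])
    f["_link"] = (oprj, opkg, len(history))  # baserev := current origin HEAD
    return commit(tprj, tpkg, f)


f1 = {"foo": "1", "bar": "2", "baz": "3"}  # made-up contents
commit("P", "A", f1)
commit("P", "A", {"foo": "1", "x": "4"})
commit("P", "A", f1)
branch("X", "B", "P", "A")
ef = expand(REPO[("X", "B")][0])  # expanded X/B@r1
```

Here ef equals f1. Because lof defaults to the origin's HEAD, calling
expand() on the very same X/B revision after a later commit to P/A can return
a different file set.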


Next, consider the situation where P/A evolves.

commit(P, A, {foo, bar}) -> 4 # the file "baz" is removed

The revision timeline looks like this:

P/A@r1    P/A@r2    P/A@r3            P/A@r4
                    |
                    |
                    |------ X/B@r1

Now,

osc api /source/X/B?rev=1

still yields the file set {foo, bar, baz, _link} (as before). However, the
expanded file set changes:

osc api '/source/X/B?rev=1&expand=1'

yields {foo, bar}.

As you can see, the expanded file set of X/B@r1 can _change_. That's what you
observe in your d:l:p/python-PyYAML@r32 and d:l:p/python-PyYAML@r33 examples.


In the remainder, let's discuss candidates for the "expected" expanded file
set.

First, we commit a new file "xxx" to X/B:

f2 := {foo, bar, xxx}
commit(X, B, f2) -> 2

The revision timeline looks like this:


P/A@r1    P/A@r2    P/A@r3            P/A@r4
                    |                 |
                    |                 |
                    |------ X/B@r1    |------ X/B@r2


Now,

osc api /source/X/B?rev=2

yields {foo, bar, xxx, _link} and

osc api '/source/X/B?rev=2&expand=1'

yields $f2$.

Next, we commit a new file "yyy" to P/A:

commit(P, A, {foo, bar, yyy}) -> 5

The revision timeline looks like this:


P/A@r1    P/A@r2    P/A@r3            P/A@r4            P/A@r5
                    |                 |
                    |                 |
                    |------ X/B@r1    |------ X/B@r2

###


Finally, we can discuss the potential options for the "expected" expanded file
set for X/B@r1, which could be displayed in the webui.

Let of_i denote the file set in P/A that is identified by revision i
(i = 3, 4, 5).
Let f_1 denote the file set in X/B that is identified by revision 1.
(f_1 = {foo, bar, baz, _link})

#
# Option a: expand against the latest/HEAD revision of the origin package
#

expand(f_1, of_5) = expand(f_1, NONE) = {foo, bar, yyy}

That's the status quo. Probably "unexpected" from a user's POV.

#
# Option b: expand against the file set of P/A that could have been seen before
# committing X/B@r2
#

expand(f_1, of_4) = {foo, bar}

Personally, that's what I would probably expect (assuming the usual commit
workflow: osc up; <change files>; osc ci (the "osc up" results in {foo, bar})).

Potential issue: this does not work if X/B@r2 fixes a conflict (in this
case we could "go back" in the P/A timeline and "try" to expand against
the intermediate file sets until we "reach" the baserev).

#
# Option c: return the expanded file set that was committed in X/B@r1
#

expand(f_1, of_3) = {foo, bar, baz}

(that is, we expand against the baserev)

Advantage: this always works (there are no conflicts by construction)
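
The three options can be compared with a small Python sketch that merges on
file names only. This is a simplification of the 3-way merge: merge_names()
is a hypothetical helper, file contents (and therefore content conflicts)
are ignored, and f_1 is written without its _link. In this model, option a
for X/B@r1 comes out as {foo, bar, yyy}; the file "xxx" belongs to X/B@r2 and
would only appear when expanding that revision.

```python
def merge_names(ours, theirs, base):
    """3-way merge on file names only: keep a name if both sides still have
    it, or if one side added it (that is, it is absent from base)."""
    return {
        name
        for name in ours | theirs
        if (name in ours and name in theirs) or name not in base
    }


of_3 = {"foo", "bar", "baz"}  # P/A@r3, the baserev of X/B@r1
of_4 = {"foo", "bar"}         # P/A@r4
of_5 = {"foo", "bar", "yyy"}  # P/A@r5 (HEAD)
f_1 = {"foo", "bar", "baz"}   # X/B@r1 without its _link

option_a = merge_names(f_1, of_5, of_3)  # expand against HEAD
option_b = merge_names(f_1, of_4, of_3)  # expand against P/A as of X/B@r2
option_c = merge_names(f_1, of_3, of_3)  # expand against the baserev
```

Option c reduces to merging f_1 with its own base, which is why it cannot
conflict.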


What would you "expect"?:)


Marcus