Re: [opensuse-buildservice] linking to packages built with the build service

1 Sep 2007

      On Tue, Mar 06, 2007 at 04:17:07PM +0100, Marcus Rueckert wrote:
...
On 2007-03-02 15:14:43 +0100, Robert Schiele wrote:
...
Currently there is no way to link to a package built on the build service
because if we do so this link will break on every rebuild since the release
number will change.
Because of that I recommend automatically creating symbolic links without
the version number as we have in the update directories on ftp.suse.com.  For
example:
zypper.rpm -> zypper-0.6.15-0.1.i586.rpm
If we do so we can just link to
http://software.opensuse.com/download/.../packagename.rpm.  This link will
never break due to an automatic rebuild.
but it will make the redirector harder.
The redirector works fine with symlinks, as far as I can see.

The scanner we use to update the redirector database simply ignores
symlinks. The redirector canonicalizes every path before the database
lookup. Thus, the database doesn't care about symlinks.
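
To illustrate the two steps just described, here is a rough sketch, not the actual scanner or redirector code. The function names canonical_key and scan are made up for this example, and the "database" is assumed to be keyed by paths relative to the document root.

```python
import os


def canonical_key(docroot, request_path):
    """Resolve symlinks in a requested path and return the database key.

    The key is the real file's path relative to docroot. Returns None if
    the resolved path ends up outside docroot.
    """
    root = os.path.realpath(docroot)
    full = os.path.realpath(os.path.join(root, request_path.lstrip("/")))
    if os.path.commonpath([root, full]) != root:
        return None
    return os.path.relpath(full, root)


def scan(docroot):
    """Yield relative paths of regular files below docroot, skipping symlinks."""
    root = os.path.realpath(docroot)
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinked files never reach the database
            yield os.path.relpath(path, root)
```

With this arrangement a request for zypper.rpm and a request for the versioned file it points to end up with the same key, so only the real file needs a row in the database.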

We currently have a lot of them. Most are in a "full-names-something"
directory of some released products. Which may be interesting or not.

Then, they are used in some convenience places like linking foo-current
to something else, and the like. They could be replaced with Apache
redirects, in some cases, I guess, but it would make them "invisible",
which would be against their purpose.

Symlinks are now also starting to occur in the update/10.3/rpm tree,
where we traditionally offer those "unversioned" symlinks, in exactly
the way Robert proposes.
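
For illustration only, a minimal sketch of how such unversioned symlinks could be refreshed after a rebuild. This is not the tooling used on ftp.suse.com or in the build service. The helper names are hypothetical, and it assumes the usual name-version-release.arch.rpm file layout with one architecture per directory.

```python
import os


def unversioned_name(rpm_filename):
    """Map 'zypper-0.6.15-0.1.i586.rpm' to 'zypper.rpm'."""
    if not rpm_filename.endswith(".rpm"):
        raise ValueError("not an rpm file name: %r" % rpm_filename)
    stem = rpm_filename[: -len(".rpm")]
    # The package name may itself contain hyphens, so split from the right:
    # name - version - release.arch
    name, _version, _release_arch = stem.rsplit("-", 2)
    return name + ".rpm"


def refresh_symlink(directory, rpm_filename):
    """Point <name>.rpm at rpm_filename, replacing any older link atomically."""
    link = os.path.join(directory, unversioned_name(rpm_filename))
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(rpm_filename, tmp)  # relative target, so the tree can be mirrored
    os.replace(tmp, link)          # rename over the old link in one step
    return link
```

Creating the link under a temporary name and renaming it means a client never sees a moment where zypper.rpm is missing.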
...
we aim for a redirector that
doesn't need local file access anymore but works with a sql db as
backend. to check what file is currently behind the symlink we would
But hoo... what a mess of work that would be :-)

Veto.

I neither aim, nor do I agree that we _should_ aim at such a setup,
because I know that it'd be a substantial amount of work.

Here are my thoughts behind this:

It would be required to come up with a database representing all needed
data, and to maintain it, in a way that it stays pretty consistent with
the backend file storage -- otherwise it'll only make things harder. A
number of new tools will be needed to handle tasks which are now done by
existing tools (like rsync).

Of course, the whole matter _would_ be much easier if we dealt with a
different kind of file, one that doesn't change at such a fast pace while
also being so sensitive to small inconsistencies. Like released
distributions. But we have buildservice repositories and security updates,
which keep coming much faster than mirrors are fed.

Running without file storage would btw prevent useful things like:
* serving files which are on no mirror. Factory snapshots, for instance...
* serving certain stuff (metadata) directly (thus, never redirecting) to 
  prevent issues with inconsistent repositories and similar. 
* serving small files directly, because it is faster for us and for the
  clients than querying the database and doing redirection for them
* letting someone else debug and understand the system in a reasonable time :-)

I would compare the effort of reimplementing the required functionality
from scratch with, for instance, the rewrite of Apache's mod_proxy* and
mod_cache*. I have been watching that project for years, and it is
interesting to see that it literally takes years to mature. HTTP may seem
simple, but ...
...
need to read the link and then look up the path to the real file in the
DB. of course we could transfer that symlink -> file mapping into the DB
as well. but that would complicate the query. and for performance reasons
i would like to keep the query count per request low (ideally 1, most
likely it will be 2)
Performance-wise, I am pretty sure that everything which increases the
complexity on the database end is prone to become a problem. I'm glad
if we survive with what we have ;)

Alas, when _I_ think about symlinks ;-) I have a totally different level
of optimization in mind. I would rather like to get rid of them to avoid
the necessity of the FollowSymLinks option, which requires Apache to
stat all directories above the file. I _don't_ even want to think about
multiple queries to the database, and things like that. We get by with a
single query now (plus queries counting packages downloaded from the
buildservice repositories).

IMO, development time can be well spent on things like improved handling
of failure conditions, and a better redirection scheme. There is still
no way to maintain the mirror database other than with a mysql command-line
client. Some poor idiot has to live with that: me...  We have a lot to
do there. And: synchronise repo pushes with rsync pull runs, so they
don't step on each other's toes.  Implement fallback mirror redirection,
mirror preference by network prefix, memcache query lookups for the most
frequently requested objects. Count redirects so we can analyze what's
going on.
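
To make two of these items concrete (mirror preference by network prefix, and fallback redirection), here is one possible shape for the selection logic. It is only a sketch under assumptions: the mirror records, the "alive" flag, the example.org URLs and the pick_mirror name are all invented for illustration, and nothing like this exists in the redirector today.

```python
import ipaddress

# Hypothetical mirror records; in practice these would come from the mirror database.
MIRRORS = [
    {"base": "http://mirror-a.example.org/opensuse",
     "prefixes": ["192.0.2.0/24"], "alive": True},
    {"base": "http://mirror-b.example.org/pub/opensuse",
     "prefixes": ["192.0.0.0/8", "2001:db8::/32"], "alive": True},
]
FALLBACK = "http://download.example.org"  # serve from our own storage


def pick_mirror(client_ip, mirrors=MIRRORS, fallback=FALLBACK):
    """Prefer the live mirror with the longest matching prefix for the client."""
    addr = ipaddress.ip_address(client_ip)
    best, best_len = None, -1
    for mirror in mirrors:
        if not mirror["alive"]:
            continue  # failure handling: never redirect to a dead mirror
        for prefix in mirror["prefixes"]:
            net = ipaddress.ip_network(prefix)
            if net.version != addr.version:
                continue
            if addr in net and net.prefixlen > best_len:
                best, best_len = mirror, net.prefixlen
    if best is not None:
        return best["base"]
    # No prefix matched: fall back to any live mirror, then to ourselves.
    for mirror in mirrors:
        if mirror["alive"]:
            return mirror["base"]
    return fallback
```

The longest-prefix rule means a mirror inside the client's own network wins over one that merely covers a larger surrounding range.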

Really, I don't see a major rewrite of the redirector and database
backend as feasible in the near future. High effort, low win.

What would be the advantage, anyway, other than saving a few bucks for
disks? I don't see it.

Peter
-- 
"WARNING: This bug is visible to non-employees. Please be respectful!"

SUSE LINUX Products GmbH
Research & Development

Dr. Peter Poeml