Re: [suse-mirror] rsync module for the sources - now there
Peter Poeml (poeml@xxxxxxx) wrote on 30 March 2009 20:49:
On Mon, Mar 30, 2009 at 02:45:22PM -0300, Carlos Carvalho wrote:
It's rather cumbersome to have to do separate syncs for parts of the
same repository.

Yes, I see that. I am not sure, though, that a single rsync module for
the entire tree would be better: you would still need to set up
different syncs, because there are parts of the tree that change
frequently (updates) and other parts that change nearly never
(released products). It wouldn't make sense to sync the released
products every four hours, and besides, we would not be able to handle
that load with our resources.

I agree that putting them in full-with-factory is not the best idea.
However sources are not different from the rest: part doesn't change,
part changes often, for example in factory. So update frequency is not
a reason to separate them.

How about creating another module: full-with-factory-and-sources? This
way you'll be sure that only those who *really* want them will bother
you.

Module contents and size are not a problem if there is choice and
explanation of the tree architecture. Choice allows mirrors to use a
module that fits their interest directly; explanation allows them to
use any module that has the contents they want and exclude what they
don't want. Therefore there's no conflict between having many mirrors
and much content; let the mirrors decide. And it's not rocket science,
it's standard practice for most mirrors of all distributions,
particularly for hardware architectures.

About update frequency, I usually sync a release only once, when it
appears, and never again, because releases should NOT change. What do
you mean by "nearly never"? Aren't ALL changes done in updates?
Anyway, if changes do happen they should be announced here.

This separation between releases and the rest needs manual
intervention only at release times and should be enough to avoid
overloading stage.
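
As a rough sketch of that separation, a mirror could mark each release
directory once it has been synced and exclude the marked ones from
every later run. The marker file name (.release-synced), the
distribution/<release>/ layout, the function name and the example host
are all assumptions made for illustration, not a description of the
real tree.

```shell
#!/bin/sh
# Sketch only: the marker name and directory layout are hypothetical.

# frozen_excludes DEST
# Prints one rsync --exclude option per release directory under
# DEST/distribution that contains the marker file .release-synced.
frozen_excludes() {
    for marker in "$1"/distribution/*/.release-synced; do
        # When nothing matches, the unexpanded pattern is skipped here.
        [ -e "$marker" ] || continue
        release=$(basename "$(dirname "$marker")")
        # The leading slash anchors the pattern at the transfer root.
        printf '%s\n' "--exclude=/distribution/$release/"
    done
}

# Possible use (host is a placeholder; release names must not contain
# spaces, because the expansion is deliberately left unquoted):
#   rsync -a --delete $(frozen_excludes /srv/mirror) \
#       rsync.example.org::full-with-factory/ /srv/mirror/
```

The only manual step is creating the marker after the first complete
sync of a new release. Excluded directories are also left alone by
--delete, so the frozen releases stay in place.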

A trigger-based sync mechanism might be a way around this. I have some
things in mind, and know how some other projects deal with this, but
beyond ideas there are not many resources to work on it.

Perhaps the easiest and most effective way is a social one: mirror
tiering. You choose the bigger, better connected and better managed
mirrors spread around the world, and ask them to commit to being
tier-1 mirrors for openSUSE. They'd need to have at least
full-with-factory-and-sources [oh!! :-)], plus factory ppc, and allow
public access via rsync. Only these would have access to stage; the
others would use the tier-1s, so that you keep the crowd off your
machine. Tiering would give you a solid distribution network without
consuming your resources. This is what most distros do. Nothing would
change in how MirrorBrain monitors all mirrors and sends clients to
them. You could perhaps also count on the tier-1s to implement some of
the technical methods below.

In the context of reducing rsync load on the master, triggering is
only useful if it avoids full rsyncs. The only way to avoid them is to
deal with the changes only. This can be done in several ways. One is
what kernel.org does, emailing only the changes. We use it here; it
keeps us very close indeed to the master with negligible load.

Another possibility, as you say, is to have write access, which is
equivalent to having an account on the machine. We also do it here;
SourceForge is a very big example. They're very good at keeping a tree
ten times bigger than a distro's in sync with minimal load.

A third method is to use rsync in a better way. We don't do disk
scanning here when we update; only the master is hammered :-) However
if you give me a list of the files on your site, such as the one created
by find or

rsync localhost::a-[hidden]module-with-everything > filelist

then we'll do *no* disk scans at *either* end, and only pull the
necessary files. We do this for another distro... Even better, if you
give a list of checksums we'll use it both for updating and for
verifying that our repo is correct. We also do it with another distro.
The disadvantage of this method is that it needs a complicated script,
so mirrors are unlikely to use it.
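
To give an idea of what such a script involves, here is a minimal
sketch of the checksum-list variant. It assumes the master publishes a
list in the "checksum  path" format written by sha256sum, and that the
mirror keeps the list from its previous run. The function names, the
list format and the example host are assumptions for illustration, not
the script we actually run.

```shell
#!/bin/sh
# Sketch only: list format, function names and host are assumptions.

# paths_to_fetch MASTER_SUMS LOCAL_SUMS
# Prints, sorted, every path that is new in the master list or whose
# checksum differs from the local list. Only the two lists are read.
paths_to_fetch() {
    tmp_master=$(mktemp) || return 1
    tmp_local=$(mktemp) || { rm -f "$tmp_master"; return 1; }
    LC_ALL=C sort "$1" > "$tmp_master"
    LC_ALL=C sort "$2" > "$tmp_local"
    # comm -23 keeps the lines that appear only in the master list;
    # sed then strips the checksum and its two-character separator.
    LC_ALL=C comm -23 "$tmp_master" "$tmp_local" |
        sed 's/^[^ ]* [ *]//' | LC_ALL=C sort
    rm -f "$tmp_master" "$tmp_local"
}

# pull_changed SOURCE DEST MASTER_SUMS LOCAL_SUMS
# Feeds the path list to rsync on stdin, so only those files are
# requested from the master.
pull_changed() {
    paths_to_fetch "$3" "$4" | rsync -a --files-from=- "$1" "$2"
}

# verify_repo DEST MASTER_SUMS   (MASTER_SUMS must be an absolute path)
# Re-checks the files under DEST against the master list and prints
# only the failures.
verify_repo() {
    ( cd "$1" && sha256sum --check --quiet "$2" )
}

# Possible use (host and module are placeholders):
#   pull_changed rsync.example.org::some-module/ /srv/mirror/ \
#       /var/tmp/master.sha256 /var/tmp/local.sha256
```

After a successful pull the master list would replace the local one.
Files deleted on the master are not handled in this sketch; they would
be the paths present only in the local list.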