On Sun, Jul 25, 2010 at 7:30 PM, Carlos E. R. email@example.com wrote:
On 2010-07-25 19:52, Greg Freemyer wrote:
On Sun, Jul 25, 2010 at 6:19 AM, Carlos E. R. firstname.lastname@example.org wrote:
How about someone making a daemon that shares, via some p2p protocol, the /var/cache/zypp/packages/* contents among all the computers on the local network, so that what one downloads from outside can be reused by any other openSUSE machine on the same network, automatically?
Don't look at me, I have no idea how to do that ;-)
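To make the idea concrete, a bare-bones sketch of such a daemon could simply export the zypp cache over plain HTTP - purely hypothetical, not a real p2p protocol, and the port number is an arbitrary pick:

#!/usr/bin/env python3
# Hypothetical sketch only: export the local zypp package cache over plain HTTP
# so other machines on the LAN can fetch already-downloaded packages from it.
# This is not a real p2p protocol, just the simplest possible "cache exporter".
import functools
import http.server

CACHE_DIR = "/var/cache/zypp/packages"  # where zypper keeps downloaded packages
PORT = 8873                             # arbitrary unprivileged port

# Serve CACHE_DIR read-only; other hosts would point at http://<this-host>:8873/
Handler = functools.partial(http.server.SimpleHTTPRequestHandler,
                            directory=CACHE_DIR)

if __name__ == "__main__":
    with http.server.ThreadingHTTPServer(("", PORT), Handler) as srv:
        print("Sharing %s on port %d" % (CACHE_DIR, PORT))
        srv.serve_forever()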
Ignoring lots of back and forth, and acknowledging that I don't know what sort of load the official repo servers currently see:
I agree with Carlos because:
Zypper dup / yast2 wagon have made the online upgrade method almost painless, certainly easier than upgrading via DVD.
I still prefer using the DVD. Just now I'm upgrading an 11.0 to 11.2, and the damn thing says it cannot find the "content" file on the oss repo - which is not true: I pinged the repo from the same machine and verified the directory exists from another. Online upgrade is very unreliable.
But I suspect there are a lot of users like me. At my office I have 15 or 20 openSUSE installations. Currently only 2 have been upgraded to 11.3, both without using the DVD.
But in order to cut down on the load on the mirrors, as well as on my Internet connection, this very weekend I'm downloading the 11.3 DVD(s) so I can start upgrading the remaining machines.
And then you will find that packages not on the DVD will not be upgraded, but removed instead >:-)
If I could simply (and I mean simply, as in via YaST) have designated one of my upgraded machines as a repo forwarder, I could have skipped the DVD download.
Correct. I did not think of this idea, but it certainly could be done and is very interesting as well.
I envision it working much like DNS: you request a package from the designated repo machine; if it doesn't have it, then it requests it from a bigger repo, etc.
If a tiered approach like that could be set up, then even more major download sites could share the load, just as DNS load is heavily distributed.
It's not peer to peer as Carlos described, but it is a big step forward, and as zypper dup gets more and more popular it seems to be a solution that is needed, as opposed to just a nice-to-have.
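Roughly, the forwarding logic I have in mind is something like this sketch - the cache directory and upstream URL are just placeholders:

# Rough sketch of the DNS-like forwarding idea (paths and URL are placeholders):
# ask the designated local repo machine first; on a miss it pulls the package
# from the next tier up, caches it, and serves it from the cache afterwards.
import os
import urllib.request

LOCAL_CACHE = "/srv/repo-cache"  # hypothetical local package store
UPSTREAM = "http://download.opensuse.org/distribution/11.3/repo/oss"  # next tier

def get_package(relpath):
    """Return a local path for relpath, fetching from upstream on a cache miss."""
    local = os.path.join(LOCAL_CACHE, relpath)
    if not os.path.exists(local):          # cache miss: go one tier up
        os.makedirs(os.path.dirname(local), exist_ok=True)
        urllib.request.urlretrieve(UPSTREAM + "/" + relpath, local)
    return local                           # cache hit (or freshly fetched)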
As to staleness or malware issues, I think, as Carlos proposed, that key metadata such as the MD5 sums should come only from the official repos, but the packages themselves could come from anywhere, as long as the MD5 matches.
Well, currently the metadata doesn't come from the mirrors, but from the central redirector itself. This is done so that the mirror network gets automatic security: if one is a rogue server, it is immediately detected. Each openSUSE machine trying to install something from there will detect it and fail - unless it defines that mirror as _the_ download server.
So that doesn't need to change, and we get the same security with the tiered or p2p layout :-)
FYI: There are no known cases of malware being crafted in such a way that the file has a predetermined MD5. It is theoretically possible, but no one has done it yet. But if a solution like the one proposed above is pursued, possibly a more robust hash than MD5 should be used, i.e. SHA-1 or SHA-256.
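The check itself is trivial - something like this sketch, where the expected hash would come only from the trusted redirector (the path and checksum in the usage comment are made up):

# Sketch of the "metadata from the official repo, payload from anywhere" check:
# the package can come from any mirror or peer, but it is only accepted if its
# hash matches the value obtained from the trusted redirector. SHA-256 shown
# here instead of MD5.
import hashlib

def verify_package(path, expected_sha256):
    """Return True if the file's SHA-256 matches the trusted metadata value."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected_sha256.lower()

# Hypothetical usage - both the path and the checksum are invented:
# verify_package("/var/cache/zypp/packages/repo-oss/x86_64/foo.rpm",
#                "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08")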
PGP is used already :-)
By the way: zypper/yast can currently use aria2c as a downloader, which I believe uses, or can use, metalink data for each package. And a metalink can give info on web mirrors and FTP mirrors, but also p2p links. If I'm correct, we already have what is needed in zypper/yast to use a secure torrent network to get our updates!
It just needs some modifications.
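For example, something along these lines from a script would already use the mirror/torrent info in a metalink and verify the embedded hashes - the file name and directory are placeholders, and I have not double-checked the aria2c options:

# Sketch only: hand a metalink file to aria2c from a script. The metalink can
# list HTTP/FTP mirrors as well as torrent links, and aria2c verifies the
# hashes embedded in it. File name and destination directory are placeholders.
import subprocess

def fetch_with_metalink(metalink_path, dest_dir):
    subprocess.run(["aria2c",
                    "--dir", dest_dir,         # where to store the downloads
                    "--check-integrity=true",  # verify hashes from the metalink
                    metalink_path],            # aria2c recognises .metalink files
                   check=True)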
Is torrent a good solution for small packages?
i.e. With large files, torrent breaks the file into 2MB segments and gains speed by downloading lots of chunks in parallel, but my impression is it takes a few minutes to really get up to speed.
Since a lot of packages are less than 2MB, I could imagine torrents being extremely inefficient because of the high per-package download overhead, but I'm guessing.
(FTP suffers from this as well, but not as badly as torrents from what I've seen. FTP has a few seconds of overhead for every file download initiated, so downloading a couple thousand packages would add a couple of clock hours just from the per-file FTP negotiation - e.g. 2,000 packages at 3-4 seconds each is roughly 2 hours of overhead.)
I still favor a tiered approach that can have an optimized package-initiation protocol. When a repo refresh takes place, does the per-package hash come down from the redirector, or is it fetched one package at a time as the installs/upgrades take place?