Feature added by: Jason Newton (jenewton)

Feature #307984, revision 1
Title: Zeroconf LAN P2P assisted updating

openSUSE-11.3: Unconfirmed
  Priority
  Requester: Desirable

Requested by: Jason Newton (jenewton)

Description:
Have several openSUSE boxen that you need to update? Don't like the dent updating leaves in your quota, or the sluggish behavior of your connection while your machines saturate it fetching the same files over and over again? Just want the total job time reduced? Want to do this without a headache, and let the less technically inclined reap the benefits as well?

The idea is automatic P2P discovery over the LAN with cached packages (with a short TTL of ~1-3 days), given that LAN bandwidth is much cheaper and faster than WAN. Package hashes already take care of the security problems. This should be an opt-in feature that the user is prompted about at install time (somewhere in the process), informing them that the option can reduce download times and bandwidth requirements in a networked openSUSE environment (as well as letting users help openSUSE save on server-side resources).

The difference between this and local mirror repositories is that caching is done lazily (I actually don't know whether SUSE's local repos are lazy, i.e. fetching a package only when needed), that the user merely sets a boolean preference at install time or any time later in YaST, and that everything takes care of itself afterwards - i.e. things just work, but they work much faster. No repos should be added either, so that if a laptop user moves to another network for whatever reason, he doesn't get popups or other problems when his system refreshes the repo cache, or when he decides to continue an upgrade or start a new install.

This obviously doesn't help when there is only one openSUSE computer of a particular version, and it has limited benefit (noarch only?) in mixed CPU architecture environments, but on the bright side at least SHA-1 hashes as UUIDs make local lookups easy.
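To make the "hash as UUID plus short TTL" idea a bit more concrete, here is a minimal sketch of what a per-machine cache lookup might look like. Everything in it is an assumption made for illustration: the `PackageCache` class, its method names and the in-memory dictionary are hypothetical and not part of zypper, libzypp or YaST, and the 2-day default is just one value picked from the ~1-3 day range suggested above.

```python
import hashlib
import time


class PackageCache:
    """Hypothetical in-memory package cache keyed by SHA-1 digest."""

    def __init__(self, ttl_seconds=2 * 24 * 3600, clock=time.time):
        self.ttl_seconds = ttl_seconds  # assumed default: 2 days
        self._clock = clock             # injectable so expiry can be tested
        self._entries = {}              # hex digest -> (stored_at, payload)

    def put(self, payload):
        """Store package bytes and return the digest used as the lookup key."""
        digest = hashlib.sha1(payload).hexdigest()
        self._entries[digest] = (self._clock(), payload)
        return digest

    def get(self, digest):
        """Return the cached bytes, or None if unknown, expired or corrupted."""
        entry = self._entries.get(digest)
        if entry is None:
            return None
        stored_at, payload = entry
        if self._clock() - stored_at > self.ttl_seconds:
            del self._entries[digest]   # the TTL has passed, drop the entry
            return None
        if hashlib.sha1(payload).hexdigest() != digest:
            del self._entries[digest]   # never serve bytes that fail the hash
            return None
        return payload
```

A peer asked for a digest it doesn't hold (or no longer holds) simply answers "no", and the requester falls back to the normal repo download, which is what keeps the feature invisible when the machine is on another network.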
Cooperative caching should also be looked into, to make sure the dogpile effect doesn't occur when multiple boxen try for the same package at the same time. I should also mention that the approach I have in mind, a TTL/UUID-based cache, works nicely for users whose computers update at slightly different times and through automated means.

Why again though? Show me numbers!

Numbers eh... I'm in a house with 4 openSUSE boxes (on Factory, so we get a big block of updates from time to time). Let's create a variable N so we can see how things scale, with N = 4 for my particular case of a 4-computer openSUSE LAN. Let's also define the following:

X  = number of packages an update brings
Et = expected time to download a package
Ez = expected package size
Ei = expected time to install a package

Normally when I upgrade, I upgrade all my boxes at the same time. Usually through word of mouth, my friends do so too - this includes the roommate, who has another 2 openSUSE computers. If R is the rate at which packages download with the full WAN bandwidth, then each of the N computers sharing the WAN link downloads at R/N.

*LAN is considered free.

So in the uncooperative case:
How long will it take you to download your updates? N*Et*X units of time, since every package takes N times as long at R/N.
How much time will be spent installing the applications to the system? X*Ei units of time.
The total amount of data pulled over the WAN is N*X*Ez.
So the total time for each installation to complete is N*Et*X + X*Ei units of time. This sucks, we're redoing the downloads.

Now let's see the local cooperative caching case:
How long will it take you to download your updates? Et*X.
How much time will be spent installing the applications to the system? X*Ei.
The total amount of data pulled over the WAN is X*Ez.
The total time for each installation to complete is Et*X + X*Ei. Much better, no? No more of that N factor.

Plugging in numbers (for my case)... Let's take X for two cases: 300 packages, and the Factory weekly norm of 1200.
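Before the numbers go in, the two models above can be written down as a small sketch. The function names are made up here, and the sample call uses placeholder values (Et = Ei = 1 and Ez = 1) so that the results read as multiples of Et and Ez rather than as real measurements.

```python
def uncooperative(n, x, et, ei, ez):
    """Every machine fetches every package over a WAN link shared n ways."""
    download_time = n * et * x  # each package takes n * Et at rate R/N
    install_time = x * ei
    wan_data = n * x * ez       # the same files cross the WAN n times
    return download_time + install_time, wan_data


def cooperative(x, et, ei, ez):
    """Each package crosses the WAN once; LAN transfers are treated as free."""
    download_time = et * x
    install_time = x * ei
    wan_data = x * ez
    return download_time + install_time, wan_data


# Placeholder inputs: N = 4 machines, X = 300 packages, unit Et, Ei and Ez.
total_u, wan_u = uncooperative(n=4, x=300, et=1.0, ei=1.0, ez=1.0)
total_c, wan_c = cooperative(x=300, et=1.0, ei=1.0, ez=1.0)
print("uncooperative:", total_u, "time units,", wan_u, "size units over WAN")
print("cooperative:  ", total_c, "time units,", wan_c, "size units over WAN")
```

The only difference between the two functions is the factor n on the download and WAN terms, which is the whole argument in one line.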
Note that I derive these all the way at the bottom of the page. They are simply meant to give a feel for the above models.

Et = 11.66 seconds, expected time to download a package
Ez = 7.6 MiB, expected package size
In my case I think Ei = Et = 11.66 seconds.
R then equals .65 MiB/s (note that this is the rate at which you are actually able to download packages from a SUSE repo, which is always <= the true download capacity of your line).

For the 1200 package case:
Without cooperative caching: 4*11.66*1200 + 1200*11.66 s ~ 19.5 hrs
With cooperative caching:    11.66*1200 + 1200*11.66 s ~ 7.8 hrs

For the 300 package case:
Without cooperative caching: 4*11.66*300 + 300*11.66 s ~ 5 hrs
With cooperative caching:    11.66*300 + 300*11.66 s ~ 2 hrs

Efficiency over the naive alternative is defined as the ratio of the time spent in the uncooperative case to the time spent in the cooperative case:

(N*Et*X + Ei*X) / (Et*X + Ei*X)

With Ei = Et this reduces to (N+1)/2, so for the numbers I've given above (N = 4) the efficiency is 250%, and only 40% of the original time is required for the same tasks.

So there you have it: great efficiency gains, users getting back to real work faster, SUSE working smoother, and Novell saving some cash on bandwidth.

**Here is a table of probabilities and expected download times for packages, discretized into three size classes. I've just noted this empirically over time, as I always run zypper by hand on a local terminal.

Et = expected time of a package download, p = probability of a package size class

Class   Size range    Et              p
small   < 3 MiB       1 s             most, let's say ~2/3
med     5-20 MiB      8 s             quite a few, < p(small), so let's say ~1/3
huge    20-200 MiB    5 min = 300 s   5/300 or 20/1200, ~= 1.66/100

Et for any given package is then ~ 11.66 seconds.
Expected package size, Ez, is: p(small)*1.5 + p(med)*12.5 + p(huge)*110 = 7.6 MiB
We can then say my expected R is Ez/Et = 7.6/11.66, or about .65 MiB/second.
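The weighted averages in the table above can be re-run as a quick check. This is only a sketch: `SIZE_CLASSES` and `expected_values` are names made up here, and the weights are the rough estimates from the table, used exactly as stated without normalising them to sum to 1. With those inputs the sums land nearer 8.3 s and 7.0 MiB than the quoted 11.66 s and 7.6 MiB, so the quoted figures are probably best read as ballpark values. The N-fold saving on the download term does not depend on them either way.

```python
# Size classes from the table: (weight p, download seconds, midpoint MiB).
# The weights are rough estimates and do not sum exactly to 1.
SIZE_CLASSES = {
    "small": (2 / 3, 1.0, 1.5),
    "med": (1 / 3, 8.0, 12.5),
    "huge": (5 / 300, 300.0, 110.0),
}


def expected_values(classes):
    """Return (Et, Ez, R): weighted download time, weighted size, and Ez/Et."""
    et = sum(p * seconds for p, seconds, _ in classes.values())
    ez = sum(p * mib for p, _, mib in classes.values())
    return et, ez, ez / et


et, ez, r = expected_values(SIZE_CLASSES)
print(f"Et ~ {et:.2f} s, Ez ~ {ez:.2f} MiB, R ~ {r:.2f} MiB/s")
```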
That is a pretty good estimate of R: despite having a 2MiB line, I drift between 400k and 2MiB for package upgrades a lot. It really is either one or the other, with the former occurring most of the time.

***Note I'm simply using the class midpoints for the sizes above.

--
openSUSE Feature:
https://features.opensuse.org/307984