Feature changed by: Alex Tsariounov (tsariounov)
Feature #306896, revision 14
Title: Zypp-proxy - A proxy cache server for zypper updates

Hackweek IV: Unconfirmed
Priority
Requester: Important

Requested by: Alex Tsariounov (tsariounov)

Description:
This project will create zypp-proxy, a proxy server that caches update packages for machines on the local network. It runs on a locally designated host machine that acts as a proxy to the openSUSE updates repositories. The project is similar in function and requirements to the Debian project apt-proxy, details of which can be found here: http://apt-proxy.sourceforge.net/

This project is useful for those who run many local (both physical and virtual) openSUSE machines and like to keep them up to date. They do not wish to waste bandwidth downloading the same updates over and over to all local machines, whether keeping existing machines up to date or building new machines and having to re-download all updates yet again.

For some people, simply mirroring the entire openSUSE updates repository is sufficient to provide local network updates. For most, however, doing this simply wastes disk space, since they use nowhere near as many packages as that repository provides. These people will find zypp-proxy most useful.

Both server and client setup will be quite simple. The server will use the public openSUSE updates repositories to check for updates. The clients will point to the local server (the proxy) machine for updates rather than the public servers.

When a client requests an update, the zypp-proxy server first checks whether the public server has a more up-to-date package than what it has cached locally. If the public server doesn't, then zypp-proxy serves the locally cached package. If the public server does have a more up-to-date package, then zypp-proxy first downloads it to its local cache and then serves it to the client.
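The request flow described above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the zypp-proxy implementation: the ProxyCache class and the upstream_version and download callbacks are hypothetical names, no libzypp calls are used, and versions are modeled as plain integer tuples rather than real RPM version strings.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Version = Tuple[int, ...]  # simplification: real RPM versions compare differently


@dataclass
class ProxyCache:
    # Hypothetical callback: latest version of a package on the public server.
    upstream_version: Callable[[str], Version]
    # Hypothetical callback: fetch the package payload from the public server.
    download: Callable[[str, Version], bytes]
    # Local cache: package name -> (cached version, payload).
    cache: Dict[str, Tuple[Version, bytes]] = field(default_factory=dict)

    def serve(self, name: str) -> bytes:
        """Return the package payload, refreshing the cache only when stale."""
        latest = self.upstream_version(name)
        cached = self.cache.get(name)
        if cached is None or cached[0] < latest:
            # Public server has something newer (or nothing is cached yet):
            # download into the local cache first, then serve it.
            payload = self.download(name, latest)
            self.cache[name] = (latest, payload)
            return payload
        # Cache is current: serve locally, no package download needed.
        return cached[1]
```

In this sketch the proxy still asks the public server for the latest version on every request, but it downloads the package itself only when the cached copy is missing or older. A second client requesting the same package therefore costs no additional package download.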
How many old versions of packages to keep will be configurable.

The first implementation will support openSUSE 11.1 only, with support for other openSUSE releases to follow.

Discussion:

#3: Alex Tsariounov (tsariounov) (2009-07-16 18:25:40)
I believe, and I could be wrong, that SMT actually creates a complete mirror of the updates repo locally. This may be OK for a datacenter SLES customer or install, but since openSUSE's repos are so much bigger, this trades wasted network bandwidth for wasted disk space. Neither sits well with the primary target of openSUSE, the Linux enthusiast.

Secondly, SMT's name is "Subscription Management Tool". For openSUSE there are no subscriptions, so the name becomes misleading.

Third, SMT is built and installed as an Add-on product. This complexity is not needed for a simple proxy server. A simple rpm install is all that should happen.

Having said that, perhaps there is some code that can be shared. Does SMT use libzypp? I was planning on using libzypp and hence implementing zypp-proxy in either C++ or Python. Python is preferred, but I don't know the status of libzypp's Python bindings.

Perhaps SMT can stand some modifications so that it does not create a complete mirror of the updates repository, but only mirrors the updates that are actually used?

#9: Peter Bowen (pzb) (2009-07-19 09:04:00) (reply to #3)
How would this be different than just squid? If you are only opportunistically (or passively) caching the data, then this seems just like a normal HTTP cache.

#10: Alex Tsariounov (tsariounov) (2009-07-19 22:27:37) (reply to #9)
There are many reasons why zypp-proxy is different from squid. Most of them hinge on the fact that zypp-proxy understands packages.

First, squid caches all HTTP objects, not just packages. If you clean out the cache for privacy, you'll lose your package cache. Zypp-proxy caches only packages, so there's no need to clean out the cache; it keeps it clean automatically, as per the next item.
Second, since squid does not know anything about packages, you cannot keep, for example, the last 3 versions of packages in the cache. Zypp-proxy does that automatically. It should default to keeping the last 3 versions, but you can configure it to keep only the latest version, or the last 10 versions. You could potentially also do things like freeze a package, a set of packages, or even a pattern at a specific version level, or a pattern of version levels. This last bit is out of scope for this hack week project though.

Third, squid is hard to set up. How do you specify how much disk space to use, how often to clean out the cache, what to cache, etc.? Zypp-proxy's goal is to be a zero-conf app, in that you will only need to install it and start using it. It can be such because its purpose is so specific, unlike squid.

#4: Federico Lucifredi (flucifredi) (2009-07-17 21:43:15)
We have considered and are planning to open up SMT more to the community, and as such to be able to leverage it for openSUSE as well. SMT has always been entirely GPL, so there are no licensing issues at all.

SMT-11 has mirror filtering, so the full-repo question is no longer relevant.

A proxy re-implementation from scratch is a waste of time, to be perfectly honest, and certainly one that we as Novell should not spend time on. If you want to work on a cache for openSUSE, you should really speak to the SMT team on how to best contribute to make SMT useful for the community distribution as well. Duncan is probably your best bet for guidance there.

#5: Alex Tsariounov (tsariounov) (2009-07-17 23:01:51)
Hi Federico, I have a couple of questions for you.

How does "mirror filtering" work? What I have in mind for zypp-proxy is that only updates that are actually used by clients are cached. This minimizes disk usage. It also has the nice property of configuring itself automatically, so, for example, the admin does not have to set up any kind of "mirroring rules" for the server.
How are you going to address the fact that the "S" in SMT stands for "Subscription," while on openSUSE there are no subscriptions? This will create user confusion.

Are you going to remove the burden of SMT being an Add-on product? IMHO, there really is no need to go to that extent to install a caching server. Simply providing a package (in the case of zypp-proxy it would only be one package), or a pattern of packages if you use more than one package, as you do for SMT, would be sufficient to install the server. For example, if I want to install squid, I simply say "# zypper in squid", that's all. Squid is possibly more complex than SMT, and it is for sure more complex than what zypp-proxy would be.

I have waited for a long time for a caching updates server to become available for openSUSE. This type of function is fundamental to a distro, and I am somewhat confused that it still does not exist. Apt-proxy was in Debian from the beginning because there was and continues to be a need for it. The same goes for openSUSE. Even yum has a caching mechanism for Fedora. Just search online for others looking for this functionality on openSUSE; you will find a lot of emails, just as I did.

I think SMT is a very nice addition for our SLES/SLED product lines. However, the zypp-proxy project is my itch, and I do not see how SMT can solve it until I have an understanding of the questions I posed above.

Thanks.

#6: Federico Lucifredi (flucifredi) (2009-07-17 23:49:27) (reply to #5)
Alex, I cannot stop you from creating more duplication, that's the way the community works - but to do so internally, with Novell-sponsored time, itch or not, is simply nonsense. I would *strongly* encourage that you use your ITO for something actually useful, and since Duncan wants to get community involvement in SMT, that would be something where you can scratch your itch in a constructive way.

The naming is a minor question. Packaging SMT so that it can be used for openSUSE as well, that is the interesting bit we need to tackle.
Marketing or naming is not a valid reason to start something else.

Filtering works by letting you select patterns or severity levels for what needs to be mirrored. If you want to look into automating selection of dependencies, that may be interesting as well -- if you can make it happen.

Proxy caches are fundamental to a distro used in production. As a company, we try to have distros used in production be our paid-for offering, since the business unit both you and I work for still has to break even. That is why SMT for the openSUSE community has been something that has had to wait... but if you want to help on this topic, we can definitely use a hand!

#7: Federico Lucifredi (flucifredi) (2009-07-17 23:50:24) (reply to #6)
"Select patterns" meaning selecting *name* patterns. Not zypper patterns.

#8: Alex Tsariounov (tsariounov) (2009-07-18 00:50:08) (reply to #6)
It seems that the wind has gone out of the zypp-proxy sails.

However, I don't see that SMT's mirror filtering is close to the cache-proxy model. I suppose I don't see the use case. The use case for the cache-proxy is as follows: I have two identical virtual machines and a fresh proxy server. I update one of the VMs and all the updates get cached. I update the next virtual machine and no external network bandwidth gets used, and so on. A configurable on the server sets how many old versions of packages survive the periodic cleanup thread.

Do you have a preliminary schedule for the openSUSE release of SMT? Would your team be open to implementation of the cache-proxy model? And, finally, Duncan, do you have a git tree somewhere with the SMT code so I can take a peek?

Thanks.

+ #11: Alex Tsariounov (tsariounov) (2009-07-19 22:56:10) (reply to #6)
+ Federico, I do not see zypp-proxy as duplication. But even if it is,
+ we have a number of projects in SUSE that "duplicate" each other to
+ some extent, and that's OK since they usually cater to different
+ audiences. The audience for zypp-proxy is different from that of SMT.
+ SMT caters toward the enterprise subscription customer. Zypp-proxy
+ caters more toward the individual user and developer. Zypp-proxy is
+ different enough from SMT to be very useful indeed, and certainly it
+ is not "nonsense," as you say. Just look at Debian's apt-proxy; just
+ look at people asking for it online and being puzzled why it's not
+ available and why no one is working on it. Why did you not set up SMT
+ from the beginning with this type of functionality? After all, the
+ need was known a long time ago.
+
+ Naming is actually an important question; it is not minor. And while
+ naming or marketing may not be a valid reason for starting something
+ else, a technical reason usually is, at least for engineering. So
+ far, you have not shown that SMT, even for the public openSUSE release,
+ will contain the functionality that I described for zypp-proxy.

--
openSUSE Feature:
https://features.opensuse.org/306896