Peter Poeml wrote:
Hello, good afternoon :-)
There are some new ideas and changes regarding the implementation details of the failover concept [1].
Before, our idea was to modify MediaCurl class [2] to parse a list of mirrors like this [3].
But now, the new idea is to use only metalink files ([4] [5]) and to stop using this type of mirror list altogether.
Libzypp could download files using an external program (e.g. aria2c [6]) to avoid reinventing the wheel.
I am not sure that is a good idea. It would make sense only if you can reimplement all the progress callbacks, authorization callbacks, error handling and error callbacks that MediaCurl currently provides. In libzypp, it seems to be much more complicated.
It is; the Media subsystem is not a simple backend. download.opensuse.org is an HTTP-only world, but Media:: is an abstraction layer for accessing http, ftp, iso, nfs, hard disks, the local filesystem, etc. Adding an HTTP header is not even in the API!
Our question is:
How can we easily add a new handler (fetching files with an external program) with as little intrusion into libzypp's media handling as possible?
I would say the best way would be to reimplement MediaHandler just like MediaDISK and MediaCurl do. Then replace the code in MediaManager to use your new MediaHandler for http. However, as I said, you will need to take care of proxy, progress, etc. You may look into MediaCurl for examples on how to report that.
Is there a central place where one could hook into?
It would be great if it is possible to use e.g. aria2c as external downloader, which already implements nearly everything that we need. It meanwhile seems to me that implementing more stuff in libzypp will not only reinvent the wheel in many regards but also increase the media handling's complexity even further.
I also think so. However, is aria2c a library, or just a command line tool?
An underlying assumption that I have is that there always is some kind of package caching directory where files are downloaded (and used later), so it wouldn't matter if the files are put there by libcurl or by the external process. Is this assumption correct?
No. There are multiple directories where files are copied over and over (because of the layers).

Media always has an attach point (MediaManager::localRoot()). This is passed from the media manager to the specific media handler. Some implementations omit it; for example MediaDIR, which handles access to local files, just passes up url_r.getPathName(), as the url is always a local path and there is no need to copy:

  MediaDIR::MediaDIR( const Url & url_r,
                      const Pathname & /*attach_point_hint_r*/ )
      : MediaHandler( url_r, url_r.getPathName(),

However, this directory is only valid for the lifetime of the media access. Once the media is closed, the directory is deleted (though not in the MediaDIR case).

The caching is handled at a different level. Right now MediaCurl has support so that, if the file it is downloading already exists in its attach point, the right If-Modified-Since header is used. That works if you download the file twice in the same session. However, we don't use this feature; in other words, usually no files are present when downloading.

In the upper layer, MediaSetAccess handles different Media attachments for one url with different media numbers. And the Fetcher layer manages a queue of requests, taking care that the complete queue is either consistent or not applied at all. Fetcher downloads to a tmp directory, and Media does too (depending on the handler); then Media files are put into Fetcher's tmp directory. Fetcher checks whether files with the same checksum already exist in other Fetcher caches. If yes, they get hardlinked into Fetcher's tmp directory. At the end, the directories are swapped with the Fetcher target directory (which usually is a repo cache directory), so the metadata is either complete or not; there is no middle point.
Why this change?
- Because we think that we are reinventing the wheel in things like:
  * Parsing HTTP codes and acting accordingly.
  * Choosing the fastest mirror.
How to implement?
I think we can do the implementation with a few changes:
- Modify media/MediaAccess.cc [7] to check whether tools like aria2c are available on the target system.
- If the check is negative, keep using MediaCurl (as now).
- If the check is positive, use a "new" class called MediaArise (or something like this) which uses aria2c to download files from the network using Metalink.
What do you think about this idea? Any comments on this implementation? Dr. Poeml, please feel free to correct me if there are any errors.
Obviously, any suggestion or comment will be more than welcome :-).
Thanks :-)
Gerard
Overall I like the idea. We have already had good experience using external tools for parsing repos, and we do the same for rpm.
Duncan
--
To unsubscribe, e-mail: zypp-devel+unsubscribe@opensuse.org
For additional commands, e-mail: zypp-devel+help@opensuse.org