[opensuse-packaging] Packaging big files
What is the best way to package big or redistribution-restricted files? Packman seems to use http://sourceforge.net/projects/autodownloader, but it has the problem that it isn't integrated with RPM. From a user's point of view you still get updates automatically, but you can't use rpm -V to check the integrity of the installation.
Is there anything better? Is there any plan for repositories with metadata that points to BitTorrent packages? Or a patched rpmbuild that adds the RPM headers for unpackaged files that get downloaded in a %pre scriptlet? In the worst case: from a quick look I didn't see anything, but is there any library for easily accessing the RPM DB, so that I could patch autodownloader? Something like addFile(package, file_path)?
Also, do we have any official "limit" on the file size of a package? At what point does it become a problem for mirrors?
--
To unsubscribe, e-mail: opensuse-packaging+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-packaging+help@opensuse.org
Hi, On Tue, Jun 09, 2009 at 08:49:40 +0200, Cristian Morales Vega wrote:
What is the best way to package big or redistribution-restricted files? Packman seems to use http://sourceforge.net/projects/autodownloader, but it has the problem that it isn't integrated with RPM. From a user's point of view you still get updates automatically, but you can't use rpm -V to check the integrity of the installation.
Is there anything better? Is there any plan for repositories with metadata that points to BitTorrent packages? Or a patched rpmbuild that adds the RPM headers for unpackaged files that get downloaded in a %pre scriptlet? In the worst case: from a quick look I didn't see anything, but is there any library for easily accessing the RPM DB, so that I could patch autodownloader? Something like addFile(package, file_path)?
You could package the big files somewhere else (separate build service instance), and have users subscribe to the repository. That way, the space would be used on a different system than ours and we wouldn't be bothered. You could also provide the large files (in unpackaged form) elsewhere, and have them downloaded via %post scriptlets. If there's mirroring, you could create metalinks and use aria2c for downloading, which takes care of content verification and mirror load balancing. (You could package the metalink into your buildservice package as well.)
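As a rough illustration of the metalink idea, here is a minimal Python sketch that writes a metalink for one large file. It assumes the Metalink 4 layout from RFC 5854, which is newer than this thread (a 2009 setup would have used the older 3.0 format), and the function names and example URLs are hypothetical.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace of the Metalink 4 format (RFC 5854); an assumption, see lead-in.
METALINK_NS = "urn:ietf:params:xml:ns:metalink"


def describe_file(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) without loading the file at once."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def build_metalink(name, size, sha256_hex, mirror_urls):
    """Return a metalink document (as a string) describing a single file."""
    ET.register_namespace("", METALINK_NS)
    root = ET.Element("{%s}metalink" % METALINK_NS)
    file_el = ET.SubElement(root, "{%s}file" % METALINK_NS, {"name": name})
    ET.SubElement(file_el, "{%s}size" % METALINK_NS).text = str(size)
    hash_el = ET.SubElement(file_el, "{%s}hash" % METALINK_NS, {"type": "sha-256"})
    hash_el.text = sha256_hex
    # Lower priority numbers are preferred; mirrors are listed in given order.
    for priority, url in enumerate(mirror_urls, start=1):
        url_el = ET.SubElement(
            file_el, "{%s}url" % METALINK_NS, {"priority": str(priority)}
        )
        url_el.text = url
    return ET.tostring(root, encoding="unicode")
```

The resulting file is small enough to ship inside the package itself, while the payload stays on the existing mirrors.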
Also, do we have any official "limit" on the file size of a package? At what point does it become a problem for mirrors?
Well, everything which contributes to size contributes to the problem. Which size are you talking about, roughly? If we know what you are up to, we can deal with it, for instance by excluding files from ending up on the "normal" mirrors and mirroring them elsewhere instead. Depending on the popularity, it'll be worthwhile to go this path. Popular content deserves to be mirrored, so widespread mirroring *might* make sense. In contrast, if the content is hardly used, then it's good to make sure it won't end up on too many mirrors (resulting only in a waste of space and bandwidth).
Feel free to contact me with details.
Thanks for your considerate post!
Peter
--
"WARNING: This bug is visible to non-employees. Please be respectful!"
SUSE LINUX Products GmbH
Research & Development
2009/6/25 Peter Poeml <poeml@suse.de>:
Hi,
On Tue, Jun 09, 2009 at 08:49:40 +0200, Cristian Morales Vega wrote:
What is the best way to package big or redistribution-restricted files? Packman seems to use http://sourceforge.net/projects/autodownloader, but it has the problem that it isn't integrated with RPM. From a user's point of view you still get updates automatically, but you can't use rpm -V to check the integrity of the installation.
Is there anything better? Is there any plan for repositories with metadata that points to BitTorrent packages? Or a patched rpmbuild that adds the RPM headers for unpackaged files that get downloaded in a %pre scriptlet? In the worst case: from a quick look I didn't see anything, but is there any library for easily accessing the RPM DB, so that I could patch autodownloader? Something like addFile(package, file_path)?
You could package the big files somewhere else (separate build service instance), and have users subscribe to the repository. That way, the space would be used on a different system than ours and we wouldn't be bothered.
You could also provide the large files (in unpackaged form) elsewhere, and have them downloaded via %post scriptlets. If there's mirroring, you could create metalinks and use aria2c for downloading, which takes care of content verification and mirror load balancing. (You could package the metalink into your buildservice package as well.)
The files are already hosted in a lot of places, so it would be a shame not to use them. But I would like the result after installation to be no different from a real RPM with the files packaged. I could add %ghost file entries and download the real files in a scriptlet (or with autodownloader, as Packman does)... but that would still not be the same. I know everything about these files, hash included, so I would like a way to put that information into the RPM DB. Something that would list all the correct information when I run "rpm -qvl" or "rpm -V".
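To make the gap concrete: the check being asked for amounts to comparing each installed file against a known size and digest. The Python sketch below does that outside of RPM. It does not touch the RPM DB and is not a substitute for "rpm -V"; the manifest layout, the function names and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-GiB data files are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(manifest):
    """Check downloaded files against known metadata.

    manifest maps a file path to (expected_size, expected_sha256); this
    layout is hypothetical. Returns a sorted list of (path, problem) pairs,
    where problem is "missing", "size" or "digest". An empty list means
    every file matched.
    """
    problems = []
    for path, (expected_size, expected_sha256) in sorted(manifest.items()):
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif os.path.getsize(path) != expected_size:
            problems.append((path, "size"))
        elif sha256_of(path) != expected_sha256:
            problems.append((path, "digest"))
    return problems
```

The size comparison runs first because it is cheap and catches truncated downloads before any hashing is done.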
Also, do we have any official "limit" on the file size of a package? At what point does it become a problem for mirrors?
Well, everything which contributes to size contributes to the problem. Which size are you talking about, roughly? If we know what you are up to, we can deal with it for instance by excluding files from ending up on the "normal" mirrors, and instead mirroring them elsewhere. Depending on the popularity, it'll be worthwhile to go this path. Popular content deserves to be mirrored, so widespread mirroring *might* make sense. In contrast, if the content is hardly used then it's good to make sure it won't end up on too many mirrors (resulting only in waste of space and bandwidth).
Feel free to contact me with details.
Thanks for your considerate post!
I thought about this when I saw FreeSpace 2. The enhanced game content is already 1.4 GiB uncompressed (perhaps 1 GiB compressed). Add 900 MiB for the FreeSpace 1 port... and another 250 MiB for cutscenes. That's without thinking about other MODs.
For this specific case I'm not really thinking about putting all this in the OBS, but it made me think about the problem. Big data content is an increasing problem with games. A more realistic example would be Warzone 2100: since version 2.2 it supports videos... 162 MiB. That's a number that makes one wonder whether it's OK or whether it needs special care.
Do you have any stats for the downloads of the previous versions of the game? If it isn't very popular, perhaps a README.openSUSE saying that the user can download the video file and put it in their home directory would be enough.
On Thu, Jun 25, 2009 at 07:12:17 +0200, Cristian Morales Vega wrote:
The files are already hosted in a lot of places, so it would be a shame not to use them. But I would like the result after installation to be no different from a real RPM with the files packaged. I could add %ghost file entries and download the real files in a scriptlet (or with autodownloader, as Packman does)... but that would still not be the same. I know everything about these files, hash included, so I would like a way to put that information into the RPM DB. Something that would list all the correct information when I run "rpm -qvl" or "rpm -V".
Then I'd rather recommend packaging the files as RPM packages. In addition, a repository would make sense for clients to install and update from. If there are packages somewhere, it is trivial to create a repository with the createrepo tool, and the repository could be hosted in a different location than the RPM packages (but link to them). But it sounds as if the files are not distributed as packages so far, but rather in a different form, right? You could go as far as packaging a .repo file in an openSUSE build service package, which is installed into /etc/zypp/repos.d so as to automatically subscribe users to the "foreign" repository hosting the packaged large files.
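For illustration, the .repo file in question is a small INI-style file. The Python sketch below renders one; the alias, name and base URL are placeholders, the function name is hypothetical, and the set of keys shown is an assumption about what a typical file in /etc/zypp/repos.d contains rather than a complete or authoritative list.

```python
import configparser
import io


def render_repo_file(alias, name, baseurl):
    """Return the text of a zypp-style .repo file for one repository."""
    # Interpolation is disabled so a '%' in a URL is written out unchanged.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser[alias] = {
        "name": name,
        "enabled": "1",
        "autorefresh": "1",
        "baseurl": baseurl,
        "type": "rpm-md",  # metadata format produced by createrepo
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder values; a real package would ship the rendered file.
    print(render_repo_file(
        "big-game-data",
        "Large game data (external hosting)",
        "http://download.example.org/repo/",
    ))
```

Since the file is static, a package would normally just ship it as a plain file; generating it is only useful when the same package is built against several hosting locations.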
I thought about this when I saw FreeSpace 2. The enhanced game content is already 1.4 GiB uncompressed (perhaps 1 GiB compressed). Add 900 MiB for the FreeSpace 1 port... and another 250 MiB for cutscenes. That's without thinking about other MODs.
With files that large, downloading presents a challenge in itself, because calling wget in %post won't work well. I'd recommend solving this in the same way as openSUSE does for CD and DVD images, which is to provide metalinks and use a metalink client for downloading. The client automatically handles mirror issues and also brings downloading via BitTorrent into the picture. You would need to set up MirrorBrain on a server that hosts the files, collect the URLs of existing mirrors, and let the MirrorBrain server do the rest. Starting with openSUSE 11.2, the metalink client aria2c will be on every system anyway, and will be used automatically by YaST/zypper. On earlier openSUSE releases you could still pull aria2c in via package dependencies, and do things in %post with it. With files that large, it's worth going this route.
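A minimal sketch of what such a download step could look like, written in Python for readability (a real %post scriptlet would more likely be shell). The function names and paths are hypothetical; the aria2c options used are --metalink-file, --dir, --check-integrity and --continue. This only shows how a helper might invoke aria2c and makes no claim about how YaST/zypper calls it internally.

```python
import shutil
import subprocess


def aria2c_command(metalink_path, target_dir):
    """Build the aria2c argument list for a metalink-driven download."""
    return [
        "aria2c",
        "--metalink-file=" + metalink_path,  # mirrors and hashes come from here
        "--dir=" + target_dir,               # where the payload is stored
        "--check-integrity=true",            # verify against the metalink hashes
        "--continue=true",                   # resume a partial download
    ]


def fetch(metalink_path, target_dir):
    """Run aria2c; raises if it is not installed or the download fails."""
    if shutil.which("aria2c") is None:
        raise RuntimeError("aria2c not found; add it as a package dependency")
    subprocess.run(aria2c_command(metalink_path, target_dir), check=True)
```

Keeping the command construction separate from the call makes it possible to log or inspect the exact command before any network traffic happens.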
For this specific case I'm not really thinking about putting all this in the OBS, but it made me think about the problem. Big data content is an increasing problem with games. A more realistic example would be Warzone 2100: since version 2.2 it supports videos... 162 MiB. That's a number that makes one wonder whether it's OK or whether it needs special care. Do you have any stats for the downloads of the previous versions of the game? If it isn't very popular, perhaps a README.openSUSE saying that the user can download the video file and put it in their home directory would be enough.
I would need to grep the logs. Unfortunately, we don't log build service downloads anymore; if we did, the question would be easy to answer (we even used to have the stats available right in the build service website). The download stats could and should be revived; there are ideas about how to do it, but so far nobody has had the time to really work on it.
Yesterday, these were all the downloads I saw when grepping for "warzone" (shortened from the access_log):
GET /repositories/games/openSUSE_11.0/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 4.28.1 (curl 7.18.1)" unixheads.net NA:CA
GET /repositories/games/openSUSE_11.0/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 4.28.1 (curl 7.18.1)" mirror.leaseweb.com EU:AT
GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" ftp.twaren.net AS:UZ
GET /repositories/games/openSUSE_11.1/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.5) openSUSE-11.1-x86_64" ftp.uni-heidelberg.de EU:FR
GET /repositories/games/openSUSE_11.1/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-x86_64" anorien.csc.warwick.ac.uk EU:GB
GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" anorien.csc.warwick.ac.uk EU:IT
GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" ftp5.gwdg.de EU:FR
The mirror database knows 16 mirrors that have the files. The logged downloads are all of the same version. I don't know the game or its existing versions, so I can't derive much from the log.
Peter
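Until proper download stats exist again, a tally can be taken straight from such log excerpts. The Python sketch below counts downloads per distribution and architecture. The regular expression is an assumption fitted to the shortened lines quoted in this message, not to the full access_log format, and the names are hypothetical.

```python
import re
from collections import Counter

# The seven shortened log lines quoted in the message above.
LOG_LINES = [
    'GET /repositories/games/openSUSE_11.0/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 4.28.1 (curl 7.18.1)" unixheads.net NA:CA',
    'GET /repositories/games/openSUSE_11.0/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 4.28.1 (curl 7.18.1)" mirror.leaseweb.com EU:AT',
    'GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" ftp.twaren.net AS:UZ',
    'GET /repositories/games/openSUSE_11.1/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.5) openSUSE-11.1-x86_64" ftp.uni-heidelberg.de EU:FR',
    'GET /repositories/games/openSUSE_11.1/x86_64/warzone2100-2.1.3-1.2.x86_64.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-x86_64" anorien.csc.warwick.ac.uk EU:GB',
    'GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" anorien.csc.warwick.ac.uk EU:IT',
    'GET /repositories/games/openSUSE_11.1/i586/warzone2100-2.1.3-1.2.i586.rpm HTTP/1.1" 302 "ZYpp 5.30.3 (curl 7.19.0) openSUSE-11.1-i586" ftp5.gwdg.de EU:FR',
]

# Pattern fitted to the shortened lines above (an assumption, see lead-in).
LINE_RE = re.compile(
    r'GET /repositories/(?P<project>[^/]+)/(?P<distro>[^/]+)/(?P<arch>[^/]+)/'
    r'(?P<rpm>\S+\.rpm) HTTP/1\.[01]" (?P<status>\d{3}) "(?P<agent>[^"]*)" '
    r'(?P<mirror>\S+) (?P<region>[A-Z]{2}):(?P<country>[A-Z]{2})'
)


def count_downloads(lines):
    """Count requests per (distribution, architecture); skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        counts[(match["distro"], match["arch"])] += 1
    return counts


if __name__ == "__main__":
    for (distro, arch), n in sorted(count_downloads(LOG_LINES).items()):
        print(distro, arch, n)
```

On the seven quoted lines this gives three i586 and two x86_64 downloads for openSUSE 11.1, and one of each for openSUSE 11.0.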
participants (2)
- Cristian Morales Vega
- Peter Poeml