[opensuse-factory] ludicrous software management
I just booted in order to update to the latest factory ftp tree. I opened YaST to check and see which installation source was set, because I already found out that my usual mirror is out of sync.

Couldn't do it. zypp-updater was running and preventing access to the DB. So, I killed zypp-updater and tried again. Now YaST is stuck downloading God only knows what files, when all I wanted to know was what installation source was set.

Simply ludicrous software management system.

--
"Rejoice and be glad, because great is your reward in heaven." Matthew 5:12 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-factory+help@opensuse.org
On Thu, Nov 30, 2006 at 10:12:02AM -0500, Felix Miata wrote:
I just booted in order to update to the latest factory ftp tree. I opened YaST to check and see which installation source was set, because I already found out that my usual mirror is out of sync.
Couldn't do it. zypp-updater was running and preventing access to the DB. So, I killed zypp-updater and tried again. Now YaST is stuck downloading God only knows what files, when all I wanted to know was what installation source was set.
Simply ludicrous software management system.
You are getting progress bars, right? It's likely just your slow network connection.

Ciao, Marcus
Marcus Meissner wrote:
You are getting progress bars, right?
It's likely just your slow network connection.
But it's a deadlock problem and a must-fix for the next release.

There is _no_way_ to get the configuration fixed if it's broken (e.g. by a bad and slow mirror, bad connection or the like).

- If the connection to a mirror is slow, there is no way to disable it because YaST doesn't give the user access to it before the metadata are downloaded, which can take up to 24 hours if an unlucky user managed to get a _really_ slow mirror like ftp.opensuse.org.
- If the user tries to trick YaST into giving him access by going offline, it doesn't work either because then YaST thinks it's a good idea to hide the remote sources.

The only way I'm aware of is a brute-force attack, going into /var/lib/zypp/db/sources and deleting the XML file directly.

It's broken and can't stay this way.

Andreas
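The brute-force workaround described above could be made a little less destructive by moving the source definition aside instead of deleting it. The following is only a sketch under assumptions: the directory path is the one given in the message, the helper names are hypothetical, nothing is known here about how the files in that directory are named, and it would have to run as root while neither YaST nor zypp-updater is active.

```python
import shutil
from pathlib import Path


def list_source_files(sources_dir):
    """Return the regular files in the sources directory, sorted by name."""
    return sorted(p for p in Path(sources_dir).iterdir() if p.is_file())


def set_aside(source_file, backup_dir):
    """Move one source definition out of the way instead of deleting it."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source_file).name
    shutil.move(str(source_file), str(target))
    return target


if __name__ == "__main__":
    # Path taken from the message above; the directory may not exist elsewhere.
    sources = Path("/var/lib/zypp/db/sources")
    if sources.is_dir():
        for entry in list_source_files(sources):
            print(entry)
```

Listing the directory first at least answers the original question (which sources are configured) without triggering any metadata download. Moving a file back restores the source.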
On Thu, Nov 30, 2006 at 04:22:49PM +0100, Andreas Hanke wrote:
Marcus Meissner wrote:
You are getting progress bars, right?
It's likely just your slow network connection.
But it's a deadlock problem and a must-fix for the next release.
There is _no_way_ to get the configuration fixed if it's broken (e.g. by a bad and slow mirror, bad connection or the like).
- If the connection to a mirror is slow, there is no way to disable it because YaST doesn't give the user access to it before the metadata are downloaded, which can take up to 24 hours if an unlucky user managed to get a _really_ slow mirror like ftp.opensuse.org.
- If the user tries to trick YaST into giving him access by going offline, it doesn't work either because then YaST thinks it's a good idea to hide the remote sources.
The only way I'm aware of is a brute-force attack, going into /var/lib/zypp/db/sources and deleting the XML file directly.
It's broken and can't stay this way.
Is there a bugreport for this?

Ciao, Marcus
Marcus Meissner wrote:
Is there a bugreport for this?
It's a combination of multiple things.

(1) Installation sources in offline mode
    https://bugzilla.novell.com/show_bug.cgi?id=223600

(2) Metadata shouldn't be refreshed when starting yast2 inst_source
    Currently not reported (AFAIK).

(3) download.opensuse.org often redirects to ftp.opensuse.org
    Currently not reported (AFAIK).
On Thursday 30 November 2006 17:26, Marcus Meissner wrote:
Is there a bugreport for this?

#222222 is similar, but for 3rd party repo (I have not tested with the remote factory).
Andras -- Quanta Plus developer - http://quanta.kdewebdev.org K Desktop Environment - http://www.kde.org
On 2006/11/30 16:13 (GMT+0100) Marcus Meissner apparently typed:
On Thu, Nov 30, 2006 at 10:12:02AM -0500, Felix Miata wrote:
I just booted in order to update to the latest factory ftp tree. I opened YaST to check and see which installation source was set, because I already found out that my usual mirror is out of sync.
Couldn't do it. zypp-updater was running and preventing access to the DB. So, I killed zypp-updater and tried again. Now YaST is stuck downloading God only knows what files, when all I wanted to know was what installation source was set.
Simply ludicrous software management system.
You are getting progress bars, right?
Sure, little narrow ones that show less than half the URL that I have to keep widening by more than 100% each time they close and reopen. The bigger question is why does anything need to be downloaded just so that I can check to see what my installation source is set to? Also, why didn't the updater quit when I brought it up from the taskbar and selected to quit? Why does it have to lock the database so tight I can't even look at it?
It's likely just your slow network connection.
My network connection is about as fast as a network connection can be. The problem is mirrors in sync are in such short supply that it's pointless to try anywhere but ftp.gwdg.de for factory ftp, which is hopelessly slow nearly always.

Now while I was waiting for a response the downloading has halted with this message in a window:

ERROR:
There were errors when restoring the source configuration. Not all sources are available for configuration.
http://ftp.gwdg.de/pub/linux/suse/opensuse/distribution/SL-OSS-factory/inst-...: Cannot create the installation source.
Do you want to immediately remove these sources?
Yes No

:-(

--
"Rejoice and be glad, because great is your reward in heaven." Matthew 5:12 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
Hi, On Thu, 30 Nov 2006, Felix Miata wrote:
Now while I was waiting for a response the downloading has halted with this message in a window:
ERROR:
There were errors when restoring the source configuration. Not all sources are available for configuration.
http://ftp.gwdg.de/pub/linux/suse/opensuse/distribution/SL-OSS-factory/inst-...: Cannot create the installation source.
Don't use this symlink. It will disappear, and no other server is carrying it.

Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source

ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.

--
"Rejoice and be glad, because great is your reward in heaven." Matthew 5:12 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
Hi, On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.
Believe me, this is the best you can do. I have a bandwidth shaping of 800 MBit/sec since 8 minutes (500 MBit/sec before), and about 2200 sessions currently. The server is delivering 73 MByte/sec at the moment - so please calculate if "your part" is fair.

BTW: the remaining 27 MByte/sec get delivered by ftp.gwdg.de. ;-))

I can only guarantee the server stability, not the individual throughput. But the summary. Yes, I can. ;->>

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
Eberhard Moenkeberg wrote:
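The "please calculate" invitation can be taken literally. The sketch below uses only the figures quoted in the message (73 MByte/sec, 2200 sessions, 800 MBit/sec shaping). Equal sharing between sessions and a 700 MByte CD image are assumptions made for illustration, and the function names are hypothetical.

```python
def fair_share_kbyte_per_s(total_mbyte_per_s, sessions):
    """Average per-session throughput in kByte/s, assuming equal sharing."""
    return total_mbyte_per_s * 1000 / sessions


def hours_to_fetch(size_mbyte, rate_kbyte_per_s):
    """Hours needed to fetch size_mbyte at a steady rate_kbyte_per_s."""
    return size_mbyte * 1000 / rate_kbyte_per_s / 3600


# Figures from the message: 73 MByte/s delivered to about 2200 sessions.
share = fair_share_kbyte_per_s(73, 2200)

# 800 MBit/s shaping expressed in MByte/s (8 bits per byte).
shaping_limit_mbyte_per_s = 800 / 8

print(f"fair share: {share:.1f} kByte/s per session")
print(f"shaping limit: {shaping_limit_mbyte_per_s:.0f} MByte/s")
# Assumed 700 MByte CD image at the fair-share rate.
print(f"700 MByte at fair share: {hours_to_fetch(700, share):.1f} hours")
```

Under these assumptions the share works out to roughly 33 kByte/s per session, and the 800 MBit/sec limit equals 100 MByte/sec, which matches the 73 + 27 MByte/sec split mentioned in the message. A CD-sized download at that rate would take close to six hours, so the times Felix reports are in the expected range for an evenly shared server.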
Hi,
On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.
Believe me, this is the best you can do. I have a bandwidth shaping of 800 MBit/sec since 8 minutes (500 MBit/sec before), and about 2200 sessions currently. The server is delivering 73 MByte/sec at the moment - so please calculate if "your part" is fair.
BTW: the remaining 27 MByte/sec get delivered by ftp.gwdg.de. ;-))
I can only guarantee the server stability, not the individual throughput. But the summary. Yes, I can. ;->>
As a side note, and to lighten the load on gwdg.de ;)

http://ftp.skynet.be/pub/ is damn fast too for
- stable releases: http://ftp.skynet.be/pub/ftp.opensuse.org/opensuse/distribution/SL-10.1/
- online updates: http://ftp.skynet.be/pub/ftp.suse.com/suse/update/
- 10.2 RC CD ISOs: http://ftp.skynet.be/pub/ftp.opensuse.org/opensuse/distribution/openSUSE-10....
- Packman mirror: http://ftp.skynet.be/pub/packman/
- guru mirror: http://ftp.skynet.be/pub/suser-guru/
- build service mirror: http://ftp.skynet.be/pub/software.opensuse.org/

For some of those repositories, you also have the bigger boys in the European mirror scene: ftp.belnet.be and ftp.heanet.ie

--
Pascal Bleser http://linux01.gwdg.de/~pbleser/
<pascal.bleser@skynet.be> <guru@unixtech.be>
The more things change, the more they stay insane.
Hi, On Thu, 30 Nov 2006, Pascal Bleser wrote:
Eberhard Moenkeberg wrote:
On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.
Believe me, this is the best you can do. I have a bandwidth shaping of 800 MBit/sec since 8 minutes (500 MBit/sec before), and about 2200 sessions currently. The server is delivering 73 MByte/sec at the moment - so please calculate if "your part" is fair.
BTW: the remaining 27 MByte/sec get delivered by ftp.gwdg.de. ;-))
I can only guarantee the server stability, not the individual throughput. But the summary. Yes, I can. ;->>
As a side note, and to lighten the load on gwdg.de ;)
http://ftp.skynet.be/pub/ is damn fast too for
- stable releases: http://ftp.skynet.be/pub/ftp.opensuse.org/opensuse/distribution/SL-10.1/
- online updates: http://ftp.skynet.be/pub/ftp.suse.com/suse/update/
- 10.2 RC CD ISOs: http://ftp.skynet.be/pub/ftp.opensuse.org/opensuse/distribution/openSUSE-10....
- Packman mirror: http://ftp.skynet.be/pub/packman/
- guru mirror: http://ftp.skynet.be/pub/suser-guru/
- build service mirror: http://ftp.skynet.be/pub/software.opensuse.org/
For some of those repositories, you also have the bigger boys in the European mirror scene: ftp.belnet.be and ftp.heanet.ie
Interesting. What is the delivery bandwidth of ftp.skynet.be?

Maybe Christoph should extend the redirect algorithm at download.opensuse.org to respect outgoing bandwidth...

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
On Fri, Dec 01, 2006 at 12:05:33AM +0100, Eberhard Moenkeberg wrote: [...]
Interesting. What is the delivery bandwidth of ftp.skynet.be?
Maybe Christoph should extend the redirect algorithm at download.opensuse.org to respect outgoing bandwidth...
Sure, that would be nice to have -- but how should d.o.o get an idea of the current load on a mirror? Any suggestions? :)

Best, Christoph
Hi. On Fri, 1 Dec 2006, Christoph Thiel wrote:
On Fri, Dec 01, 2006 at 12:05:33AM +0100, Eberhard Moenkeberg wrote:
Interesting. What is the delivery bandwidth of ftp.skynet.be?
Maybe Christoph should extend the redirect algorithm at download.opensuse.org to respect outgoing bandwidth...
Sure, that would be nice to have -- but how should d.o.o get an idea of the current load on a mirror? Any suggestions? :)
By the number of taken redirections within the last "time window". We could further refine that with a load measure of the servers at a counted number of sessions.

BTW: even 3000 http sessions are no problem at ftp-1.gwdg.de - network delivery currently is 78 MB/sec, while disk I/O is 7 MB/sec. The huge RAM/buffer cache of 32 MB really is the biggest helper...

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
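The "number of taken redirections within the last time window" idea might look roughly like the sketch below. It is not how download.opensuse.org works. The class name, the relative bandwidth weights and the window length are all hypothetical.

```python
import time
from collections import deque


class WindowedRedirector:
    """Pick the mirror with the fewest recent redirects per unit of bandwidth."""

    def __init__(self, mirrors, window_seconds=60.0):
        # mirrors: name -> relative outgoing bandwidth (hypothetical weights)
        self.mirrors = dict(mirrors)
        self.window = window_seconds
        self.history = {name: deque() for name in self.mirrors}

    def _expire(self, now):
        # Forget redirects that fell out of the time window.
        for stamps in self.history.values():
            while stamps and now - stamps[0] > self.window:
                stamps.popleft()

    def pick(self, now=None):
        now = time.monotonic() if now is None else now
        self._expire(now)
        # Recent redirects divided by bandwidth approximates relative load.
        name = min(
            self.mirrors,
            key=lambda m: len(self.history[m]) / self.mirrors[m],
        )
        self.history[name].append(now)
        return name


if __name__ == "__main__":
    redirector = WindowedRedirector({"fast-mirror": 3, "slow-mirror": 1})
    print([redirector.pick() for _ in range(8)])
```

With a 3:1 weighting, the mirror with three times the bandwidth ends up receiving about three times as many redirects within a window. The count only reflects traffic the redirector itself handed out, not sessions that reach a mirror directly.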
Hi Eberhard, On Fri, Dec 01, 2006 at 01:02:14AM +0100, Eberhard Moenkeberg wrote:
Interesting. What is the delivery bandwidth of ftp.skynet.be?
Maybe Christoph should extend the redirect algorithm at download.opensuse.org to respect outgoing bandwidth...
Sure, that would be nice to have -- but how should d.o.o get an idea of the current load on a mirror? Any suggestions? :)
By the number of taken redirections within the last "time window". We could further refine that with a load measure of the servers at a counted number of sessions.
Might be an option, but I would rather like to have the mirrors somehow notify the redirector about their status, i.e. be able to automatically influence it.
BTW: even 3000 http sessions are no problem at ftp-1.gwdg.de - network delivery currently is 78 MB/sec, while disk I/O is 7 MB/sec. The huge RAM/buffer cache of 32 MB really is the biggest helper...

I wish I had 32 MB as well ;)
Best, Christoph
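The alternative suggested here, mirrors notifying the redirector about their own status, could be sketched as follows. Everything in it is an assumption made for illustration: the report format (spare capacity in MByte/sec), the staleness limit, the class name and the weighted random choice.

```python
import random


class StatusRedirector:
    """Choose among mirrors that recently reported spare capacity."""

    def __init__(self, max_age=120.0):
        self.reports = {}  # name -> (spare_mbyte_per_s, timestamp)
        self.max_age = max_age

    def report(self, name, spare_mbyte_per_s, now):
        """Called by (or on behalf of) a mirror to publish its status."""
        self.reports[name] = (spare_mbyte_per_s, now)

    def candidates(self, now):
        # Ignore stale reports and mirrors without spare capacity.
        return {
            name: spare
            for name, (spare, stamp) in self.reports.items()
            if now - stamp <= self.max_age and spare > 0
        }

    def pick(self, now, rng=random):
        fresh = self.candidates(now)
        if not fresh:
            return None  # caller would fall back to a default mirror
        names = list(fresh)
        weights = [fresh[name] for name in names]
        return rng.choices(names, weights=weights, k=1)[0]
```

A mirror that stops reporting simply drops out once its last report is older than `max_age`, so a dead or overloaded server no longer receives redirects without anyone editing the redirector's configuration.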
The success of RC1 is incredible. My BitTorrent client (Azureus) already has 13 clients, and I have served 8 times my download of the 5 CDs. Last week, for the beta, I had no more clients 4 days after the release...

jdd
--
http://www.dodin.net http://dodin.org/mediawiki/index.php/GPS_Lowrance_GO
On 2006/11/30 23:16 (GMT+0100) Eberhard Moenkeberg apparently typed:
On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.
Believe me, this is the best you can do. I have a bandwidth shaping of 800 MBit/sec since 8 minutes (500 MBit/sec before), and about 2200 sessions currently. The server is delivering 73 MByte/sec at the moment - so please calculate if "your part" is fair.
I'm not sure how to calculate. ETA is now down to about 20 minutes, which means the total time for almost 800 packages will have amounted to almost 7 hours by the time it finishes. By comparison, 800 is probably not a lot less than how many packages were required yesterday on a fresh install from mirrors.kernel.org that took less than 2 hours for the whole installation.
BTW: the remaining 27 MByte/sec get delivered by ftp.gwdg.de. ;-))
I have no doubt you do the best you can given the demand and what you have to work with. I think the real problem is professed development mirrors whose sync frequency is just fine for releases, but inadequate for development trees. The closest mirror to me is ftp.cise.ufl.edu. Its performance is great, if it has what I need, but it doesn't carry factory.

Mirrors.kernel.org usually has good download performance, but its sync behavior for factory is bad. I wanted the newest kernel this AM and went there to find it. I had no problem fetching it, but it had a dep on some newer perl package. I went to find that package, and there was _no_ version of it on that mirror. That's when I decided to try ftp-1.gwdg.de.

We need testing caliber mirrors only for factory, but don't have it. This is what drives people to overload gwdg. :-(

--
"Rejoice and be glad, because great is your reward in heaven." Matthew 5:12 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
Hi, On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 23:16 (GMT+0100) Eberhard Moenkeberg apparently typed:
On Thu, 30 Nov 2006, Felix Miata wrote:
On 2006/11/30 16:39 (GMT+0100) Eberhard Moenkeberg apparently typed:
Use http://ftp-1.gwdg.de/pub/opensuse/distribution/SL-OSS-factory/inst-source
ftp-1 is my best server these days for all suse and opensuse directories, both regarding performance and actuality.
I switched to this for this update. I started over 5 hours ago. It still shows >2 hours remaining. I can't tell exactly how dismal this is because yast doesn't bother to tell the download rate, but I suspect it's in the modem speed range. From good performing mirrors I can download an average CD iso in less than 20 minutes. I should have been done >4 hours ago.
Believe me, this is the best you can do. I have a bandwidth shaping of 800 MBit/sec since 8 minutes (500 MBit/sec before), and about 2200 sessions currently. The server is delivering 73 MByte/sec at the moment - so please calculate if "your part" is fair.
I'm not sure how to calculate. ETA is now down to about 20 minutes, which means the total time for almost 800 packages will have amounted to almost 7 hours by the time it finishes.
By comparison, 800 is probably not a lot less than how many packages were required yesterday on a fresh install from mirrors.kernel.org that took less than 2 hours for the whole installation.
Be wary of YaST's time estimates. It is good that they are there, but most of the time they are of little value. Towards the end they are good. ;-)
BTW: the remaining 27 MByte/sec get delivered by ftp.gwdg.de. ;-))
I have no doubt you do the best you can given the demand and what you have to work with. I think the real problem is professed development mirrors whose sync frequency is just fine for releases, but inadequate for development trees. The closest mirror to me is ftp.cise.ufl.edu. Its performance is great, if it has what I need, but it doesn't carry factory.
It has long been planned to distribute factory via a push service (as is currently done for repositories). This will give every mirror the chance to be as up to date as possible. The opensuse server would initiate the rsync run as soon as any package has finished compiling successfully.

Let's pray that it comes soon, or let's shout very, very nastily at the SUSE guys who are responsible for it. Looking at the time frame, I would prefer to shout.
Mirrors.kernel.org usually has good download performance, but its sync behavior for factory is bad. I wanted the newest kernel this AM and went there to find it. I had no problem fetching it, but it had a dep on some newer perl package. I went to find that package, and there was _no_ version of it on that mirror. That's when I decided to try ftp-1.gwdg.de.
We need testing caliber mirrors only for factory, but don't have it. This is what drives people to overload gwdg. :-(
If you are patient, you will get through it and it will succeed. It will perform better during the European night and at weekends.

In total, the GWDG servers are delivering almost 200 TB per month. This rate will be doubled (theoretically - I will have to see whether the servers perform accordingly) in October 2007.

Cheers -e
--
Eberhard Moenkeberg (emoenke@gwdg.de, em@kki.org)
Hi,

A suggestion for openSUSE 10.3.

I've noticed that <filelists.xml.gz> is refreshed by YaST YOU if it has changed on the server. It takes me about 4 minutes to download from <mirror.pacific.net.au> and is currently 2.9MB in size.

I had a look at this compressed text file and observed that it contains a very large proportion of redundant data.

/a/b/c/d/e/file1
/a/b/c/d/e/file2
etc

If this file could be compressed further by omitting the redundant data the compressed version would download much faster and may improve the end user's experience of updating their system.
Keith Goggin wrote:
I've noticed that <filelists.xml.gz> is refreshed by Yast YOU if it has changed on the server. It takes me about 4 minutes to download from <mirror.pacific.net.au> and is currently 2.9MB in size.
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
I had a look at this compressed text file and observed that it contains a very large proportion of redundant data.
/a/b/c/d/e/file1 /a/b/c/d/e/file2 etc
If this file could be compressed further by omitting the redundant data the compressed version would download much faster and may improve the end users experience of updating their system.
Yes, there are much more efficient ways to store these data than repomd does. But hey, XML is so much cooler.

Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML.
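One reason a locate database stays small is front compression (incremental encoding): each sorted path is stored as the number of leading characters it shares with the previous entry, plus the differing tail. The sketch below illustrates only that idea on Keith's example paths. It is not the actual locatedb on-disk format, and the function names are hypothetical.

```python
import os


def front_encode(paths):
    """Yield (shared_prefix_length, suffix) pairs for sorted paths."""
    previous = ""
    for path in paths:
        # commonprefix compares character by character, which is what we want.
        shared = len(os.path.commonprefix([previous, path]))
        yield shared, path[shared:]
        previous = path


def front_decode(pairs):
    """Rebuild the original paths from (shared_prefix_length, suffix) pairs."""
    previous = ""
    for shared, suffix in pairs:
        previous = previous[:shared] + suffix
        yield previous


paths = ["/a/b/c/d/e/file1", "/a/b/c/d/e/file2"]
encoded = list(front_encode(paths))
print(encoded)
print(list(front_decode(encoded)) == paths)
```

The second path shrinks to a prefix length and a single character. How much this would save on top of gzip for a real filelists file is not measured here.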
On Friday 01 December 2006 12:54, Andreas Hanke wrote:
Keith Goggin wrote:
I've noticed that <filelists.xml.gz> is refreshed by Yast YOU if it has changed on the server. It takes me about 4 minutes to download from <mirror.pacific.net.au> and is currently 2.9MB in size.
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
I had a look at this compressed text file and observed that it contains a very large proportion of redundant data.
/a/b/c/d/e/file1 /a/b/c/d/e/file2 etc
If this file could be compressed further by omitting the redundant data the compressed version would download much faster and may improve the end users experience of updating their system.
Yes, there are much more efficient ways to store these data than repomd does. But hey, XML is so much cooler.
Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML.
Err ... Yes, it's just that I'm trying not to use emotive language :-)
On Fri, 01 Dec 2006 02:54:41 +0100, Andreas Hanke <andreas.hanke@gmx-topmail.de> wrote:
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
My diy-linux w/uclibc uses 20MB of disk, including perl. My SUSE 10.2 "minimal" install used 900MB of disk. That's some load of #@!$%!
John Kelly wrote:
On Fri, 01 Dec 2006 02:54:41 +0100, Andreas Hanke <andreas.hanke@gmx-topmail.de> wrote:
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
My diy-linux w/uclibc uses 20MB of disk, including perl. My SUSE 10.2 "minimal" install used 900MB of disk. That's some load of #@!$%!
It seems you might be interested in the MiniSuSE project: http://en.opensuse.org/MiniSUSE

Lukas
On Friday 01 December 2006 01:27, Lukas Ocilka wrote:
John Kelly wrote:
On Fri, 01 Dec 2006 02:54:41 +0100, Andreas Hanke
<andreas.hanke@gmx-topmail.de> wrote:
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
My diy-linux w/uclibc uses 20MB of disk, including perl. My SUSE 10.2 "minimal" install used 900MB of disk. That's some load of #@!$%!
It seems you might be interested in the MiniSuSE project: http://en.opensuse.org/MiniSUSE
Lukas
That's for sure, and even more so for MicroSUSE: http://en.opensuse.org/MicroSUSE

Although I would like to see John help the MiniSUSE project take off, his micro Linux seems closer to MicroSUSE, which at the moment is probably in greater need of people who can adapt the present source to use uclibc.

--
Regards, Rajko M.
Andreas Hanke wrote:
Keith Goggin wrote:
I've noticed that <filelists.xml.gz> is refreshed by Yast YOU if it has changed on the server. It takes me about 4 minutes to download from <mirror.pacific.net.au> and is currently 2.9MB in size.
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
Let's have a look at some numbers ;)

On my repository for 10.1:

repo type | size (bytes) | MB    | files
----------+--------------+-------+---------------------------------
yast2     |      1831446 |  1.74 | packages, packages.en
rpm-md    |      1454322 |  1.37 | primary.xml.gz, filelists.xml.gz

yast2 repositories make up a somewhat larger download (21%), actually.

When you look at Factory:

repo type | size (bytes) | MB    | files
----------+--------------+-------+---------------------------------
yast2     |     17876184 | 17.04 | packages, packages.en
rpm-md    |     20263138 | 19.32 | primary.xml.gz, filelists.xml.gz

Here rpm-md is a little larger (12%).
I had a look at this compressed text file and observed that it contains a very large proportion of redundant data. /a/b/c/d/e/file1 /a/b/c/d/e/file2 etc
It's not redundant data. Those are file paths. Parts of them are redundant, sure (e.g. /usr/share, /usr/lib, ...). The same is true of the yast2 repository files (packages, packages.en) -- and those have a lot more redundant data as they're not even compressed.
If this file could be compressed further by omitting the redundant data the compressed version would download much faster and may improve the end users experience of updating their system.
That's exactly what gzip is already doing on those rpm-md XML files.
Yes, there are much more efficient ways to store these data than repomd does. But hey, XML is so much cooler.
XML has its advantages. It's very easy to parse (but not necessarily easy to parse with good performance, SAX/DOM/StAX/...). Using DOM for such files uses insane amounts of memory. Parsing yast2 metadata isn't that complex but you definitely have to write your own error-prone parser instead of just using StAX. It took quite a few attempts and fixes to write the yast2 parser in smart properly, as the various "tags" are not documented (and suddenly new tags appeared, etc...).
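To illustrate the point about streaming versus DOM: a pull-style parse (the Python counterpart of the StAX approach mentioned above) can walk a filelists document one package at a time and discard each subtree when done, so memory use stays bounded. The sample document below is synthetic, and its element layout and namespace are assumptions about what an rpm-md filelists file looks like, not something taken from this thread.

```python
import io
import xml.etree.ElementTree as ET

# Assumed rpm-md filelists namespace; adjust if the real file differs.
NS = "{http://linux.duke.edu/metadata/filelists}"

# Synthetic sample, made up for this sketch.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="2">
  <package pkgid="abc" name="foo" arch="i586">
    <file>/usr/bin/foo</file>
    <file>/usr/share/doc/foo/README</file>
  </package>
  <package pkgid="def" name="bar" arch="noarch">
    <file>/usr/share/bar/data</file>
  </package>
</filelists>
"""


def files_per_package(stream):
    """Yield (package name, list of file paths) without building a full tree."""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == NS + "package":
            yield elem.get("name"), [f.text for f in elem.findall(NS + "file")]
            elem.clear()  # drop the finished package's children


result = dict(files_per_package(io.BytesIO(SAMPLE)))
print(result)
```

For a real download, the stream could be the object returned by `gzip.open("filelists.xml.gz", "rb")`, so the file is decompressed and parsed in one pass.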
Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML.
One might argue that xml.gz _is_ a binary format ;) gzip should already optimize away the XML tags and attributes. filelists.xml.gz contains all files that are in Factory in 19.32 MB and files of RPMs for 1023 different projects in 1.37 MB in my repository.

Note that using bzip2 would be more effective (with filelists.xml.gz from Factory):

Compression         | File size | diff | ratio
--------------------+-----------+------+-------
uncompressed        | 155920703 | +92% |  0.00
original (gzip)     |  12908161 |    0 | 12.07
gzip -9             |  12871908 |  -1% | 12.11
gzip -9 --rsyncable |  13158574 |  +2% | 11.84
bzip2               |  10848980 | -16% | 14.37
lrzip -M            |   4879263 | -63% | 31.95

(diff is the compressed size compared to the original .gz)
(ratio is the uncompressed size divided by the compressed size)

lrzip performs best (to say the least) but takes a long time, especially compared to gzip, and a lot of RAM (-M uses all available RAM, performed on a 2GB box), even on my dual-core AMD64. The ratio is pretty impressive though (see http://ck.kolivas.org/apps/lrzip/README). bzip2 is also significantly slower than gzip and uses somewhat more CPU but still gives a 16% gain.

Now with the yast2 "packages" file:

Compression          | File size | diff | ratio
---------------------+-----------+------+-------
original (no compr.) |  15547367 |    0 |  0.00
gzip -9              |   2340627 | -85% |  6.64
bzip2 -9             |   1797400 | -89% |  8.64
lrzip -M             |   1393749 | -92% | 11.15

Note: you cannot directly compare the yast2 and rpm-md files (except for overall size) as they contain somewhat different subsets of data of a repository.

Obviously there's still a lot to gain from compressing yast2 repository files -- even with a quick gzip -9 the file is 85% smaller. Who files the bug requesting that feature? ;)

But to summarize, the compression already optimizes away the redundant data, especially the XML tags and attributes. That redundancy comes at pretty much no cost in terms of download size. Of course, when the file is parsed, it first has to be uncompressed (or streamed).
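For anyone who wants to repeat this kind of comparison on their own metadata, here is a minimal sketch using only the Python standard library. It is not the procedure used for the tables above: the function name `compare_compressors` and the sample data are made up for illustration, and xz (via `lzma`) stands in for lrzip, which has no standard-library binding. Unlike the tables, it reports a ratio of 1.00 for the uncompressed row.

```python
import bz2
import gzip
import lzma


def compare_compressors(data: bytes) -> list[tuple[str, int, float, float]]:
    """Return (name, size, diff_vs_gzip_percent, ratio) rows for each compressor."""
    sizes = {
        "uncompressed": len(data),
        "gzip -9": len(gzip.compress(data, compresslevel=9)),
        "bzip2 -9": len(bz2.compress(data, compresslevel=9)),
        # Assumption: xz is only a stand-in here, lrzip is not in the stdlib.
        "xz": len(lzma.compress(data)),
    }
    baseline = sizes["gzip -9"]
    rows = []
    for name, size in sizes.items():
        diff = 100.0 * (size - baseline) / baseline  # size change relative to gzip -9
        ratio = len(data) / size                     # uncompressed / compressed
        rows.append((name, size, diff, ratio))
    return rows


if __name__ == "__main__":
    # Synthetic, path-heavy sample that loosely resembles a file list (hypothetical data).
    sample = "".join(
        f"<file>/usr/share/doc/packages/pkg{i % 50}/file{i}.txt</file>\n"
        for i in range(20000)
    ).encode()
    for name, size, diff, ratio in compare_compressors(sample):
        print(f"{name:<14} | {size:>9} | {diff:+7.0f}% | {ratio:6.2f}")
```

To test real metadata, read the decompressed filelists.xml or the yast2 "packages" file into `data` instead of the synthetic sample.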
The parsing performance highly depends on the method (SAX, DOM, StAX, ...). An interesting approach is to store the parsed data in a relational database, which is what ZMD is doing (sqlite DB). A database can use indexes to speed up queries a lot, compared to having to load all the data into memory and process the query yourself. But don't ask me why ZMD is a lot slower than zypp or smart... maybe the cost of inserting all that data into the sqlite database? Or the synchronization with zypp/yast?

cheers
-- 
Pascal Bleser http://linux01.gwdg.de/~pbleser/
<pascal.bleser@skynet.be> <guru@unixtech.be>
The more things change, the more they stay insane.
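To make the streaming-parser-plus-database idea concrete, here is a small sketch in Python. It is not how ZMD, zypp or smart actually work. The element layout (`<package name="...">` containing `<file>` children) is my assumption about filelists.xml, and the table name, index name and `load_filelists` function are invented for the example.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical miniature filelists document (layout assumed, not taken from a real repo).
SAMPLE = b"""<filelists xmlns="http://linux.duke.edu/metadata/filelists" packages="2">
<package pkgid="a1" name="A" arch="i586"><file>/usr/bin/a</file><file>/usr/share/a/data</file></package>
<package pkgid="b1" name="B" arch="i586"><file>/usr/bin/b</file></package>
</filelists>"""


def local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}file' down to 'file'."""
    return tag.rsplit("}", 1)[-1]


def load_filelists(xml_stream, db: sqlite3.Connection) -> int:
    """Stream package/file elements into sqlite instead of building a full DOM."""
    db.execute("CREATE TABLE IF NOT EXISTS files (package TEXT, path TEXT)")
    count = 0
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "package":
            continue
        name = elem.get("name")
        rows = [(name, child.text) for child in elem if local_name(child.tag) == "file"]
        db.executemany("INSERT INTO files VALUES (?, ?)", rows)
        count += len(rows)
        elem.clear()  # drop the children of the finished package to limit memory use
    # The index lets "which package owns this path?" avoid a full table scan.
    db.execute("CREATE INDEX IF NOT EXISTS files_path ON files (path)")
    db.commit()
    return count


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print(load_filelists(io.BytesIO(SAMPLE), db), "file entries loaded")
    query = "SELECT package FROM files WHERE path = ?"
    print(db.execute(query, ("/usr/bin/b",)).fetchone())
```

The insert cost mentioned above is real: the sketch pays it once per refresh, and only later lookups benefit from the index.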
On Friday 01 December 2006 09:46, Pascal Bleser wrote:
Let's have a look at some numbers ;)
[...]

From your numbers it is pretty clear that there are some straightforward ways to optimize:
- use bzip2 for repodata: downloading 2 MB of extra data still takes longer than the extra time bzip2 needs to decompress a file compared to gzip
- use compression for yast repositories
- suggest using yast repositories for SUSE packages

I know this is not that easy to do for repo-md, as other distributions would have to agree to it as well, but for yast repositories it is something that should be considered right from the start for 10.3.

About XML parsing performance, here are some numbers (AMD64, 3200+):
- parsing a 17 MB (!) HTML file in Quanta Plus takes 19 seconds. It uses its own parser, but I'm sure it is more complex and does many more things than an XML parser needed for YaST.
- reading 2 MB of XML from several hundred files using Qt's DOM functions takes 328 ms. So let's say 16 MB would take between 2 and 3 seconds.

I did not measure this in YaST, but if it's slower, there is a big problem, as Qt's DOM implementation is not considered to be fast. ;-)

Andras
-- 
Quanta Plus developer - http://quanta.kdewebdev.org
K Desktop Environment - http://www.kde.org
Pascal Bleser wrote:
Let's have a look at some numbers ;)
On my repository for 10.1:

repo type | size (bytes) | MB    | files
----------+--------------+-------+---------------------------------
yast2     |      1831446 |  1.74 | packages, packages.en
rpm-md    |      1454322 |  1.37 | primary.xml.gz, filelists.xml.gz
yast2 repositories make up a somewhat larger download (21%), actually.
The yast2 metadata are not compressed. If you compress the yast2 metadata or uncompress the repomd XML files, you will see a significant difference.

Not to mention that e.g. /var/lib/locatedb, compressed with gzip, is just 700 KB here - holding filelist information about a whole lot more stuff.

Is it really necessary to repeat the full path of every single file in the metadata and not just the difference to some sort of base path?

Does the depsolver really need the full filelists? It doesn't even use them most of the time:
- Searching for files inside packages doesn't work with repomd although the information is available in the metadata
- Detecting filesystem clashes doesn't work with repomd although the information is available in the metadata

So I can only conclude that the information is most of the time not even used.
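The "difference to some sort of base path" idea is essentially front coding, which is the general technique the locate database uses: each entry records how many leading characters it shares with the previous entry, plus the remaining suffix. A minimal Python sketch follows; the function names are made up, and this is neither the actual locatedb byte format nor anything repomd supports.

```python
def front_encode(paths: list[str]) -> list[tuple[int, str]]:
    """Encode paths as (shared_prefix_length, remaining_suffix) pairs.

    Works best on sorted input, where neighbours share long prefixes.
    """
    encoded = []
    previous = ""
    for path in paths:
        shared = 0
        limit = min(len(previous), len(path))
        while shared < limit and previous[shared] == path[shared]:
            shared += 1
        encoded.append((shared, path[shared:]))
        previous = path
    return encoded


def front_decode(encoded: list[tuple[int, str]]) -> list[str]:
    """Rebuild the original paths from (shared_prefix_length, suffix) pairs."""
    paths = []
    previous = ""
    for shared, suffix in encoded:
        path = previous[:shared] + suffix
        paths.append(path)
        previous = path
    return paths


if __name__ == "__main__":
    files = ["/a/b/c/d/e/file1", "/a/b/c/d/e/file2"]
    packed = front_encode(files)
    print(packed)  # [(0, '/a/b/c/d/e/file1'), (15, '2')]
    assert front_decode(packed) == files
```

Whether this still wins much after gzip or bzip2 has been applied would have to be measured, since those compressors already exploit repeated prefixes.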
On Fri, Dec 01, 2006 at 02:54:41AM +0100, Andreas Hanke wrote:
Keith Goggin wrote:
I've noticed that <filelists.xml.gz> is refreshed by Yast YOU if it has changed on the server. It takes me about 4 minutes to download from <mirror.pacific.net.au> and is currently 2.9MB in size.
It's really insane. This XML stuff has made SUSE distros basically unusable without a broadband connection. 3MB before the distro is even released - crazy!
I had a look at this compressed text file and observed that it contains a very large proportion of redundant data.
/a/b/c/d/e/file1 /a/b/c/d/e/file2 etc
If this file could be compressed further by omitting the redundant data, the compressed version would download much faster and might improve the end user's experience of updating their system.
Yes, there are much more efficient ways to store these data than repomd does. But hey, XML is so much cooler.
Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML.
Yeah, I saw this too and thought about at least using some kind of directory prefix or similar handling. XML really makes it easy to be stupid in data design :/

Ciao, Marcus
On Friday 1 December 2006 08:56, Marcus Meissner wrote:
Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML.
Yeah, I saw this too and thought about at least using some kind of directory prefix or similar handling.
XML really makes it easy to be stupid in data design :/
What about md5summing the filelist and using the md5sum as key? If a new version of a package is released with the same filelist, only the md5sum needs to be transferred. For big packages the compression might be around 100% ;)

-- 
Richard Bos
Without a home the journey is endless
Richard Bos wrote:
On Friday 1 December 2006 08:56, Marcus Meissner wrote:
Maybe one of those talented persons who designed repomd should have a look at /var/lib/locatedb. It's able to store equivalent information about 2 Linuxes, 1 Windows system and a lot of user files in just 2.5MB. Thanks to not using XML. Yeah, I saw this too and thought about at least using some kind of directory prefix or similar handling.
XML really makes it easy to be stupid in data design :/
What about md5sum-ming the filelist, and use the md5sum as key? If a new version of a package is released with the same filelist only the md5sum needs to be transferred. For big packages the compression might be around 100% ;)
Already done.

For yast2 repositories, package managers look at the serial number in media.1/media (the 2nd line), so only the media.1/media file is downloaded (very small, just a few bytes (22 bytes in my repository)). I use a timestamp for that.

For rpm-md repositories, package managers look at the repodata/repomd.xml file. It has SHA checksums and timestamps of the other files (filelists.xml.gz and primary.xml.gz), e.g.:

---8<------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="filelists">
    <location href="repodata/filelists.xml.gz"/>
    <checksum type="sha">...</checksum>
    <timestamp>1165003298</timestamp>
    <open-checksum type="sha">...</open-checksum>
  </data>
  <data type="primary">
    <location href="repodata/primary.xml.gz"/>
    <checksum type="sha">...</checksum>
    <timestamp>1165003298</timestamp>
    <open-checksum type="sha">...</open-checksum>
  </data>
</repomd>
---8<------------------------------------------------------------

That file is somewhat larger than for yast2 repositories, but still very small (951 bytes in my repository).

yast2 and smart look at the serial numbers/timestamps of those files and compare them with the repository metadata they downloaded the last time. They will download the full metadata if and only if the remote serial/timestamp differs.

cheers
-- 
Pascal Bleser http://linux01.gwdg.de/~pbleser/
<pascal.bleser@skynet.be> <guru@unixtech.be>
The more things change, the more they stay insane.
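The check described above can be sketched in a few lines of Python. This is an illustration only, not the code yast2 or smart use: `read_repomd` and `stale_files` are invented names, and comparing per data type (rather than one serial for the whole repository) is my own simplification. The namespace URI is the one shown in the repomd.xml excerpt.

```python
import xml.etree.ElementTree as ET

REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def read_repomd(xml_bytes: bytes) -> dict[str, tuple[str, str]]:
    """Map each data type (e.g. 'filelists') to its (checksum, timestamp)."""
    root = ET.fromstring(xml_bytes)
    index = {}
    for data in root.findall(f"{REPO_NS}data"):
        checksum = data.findtext(f"{REPO_NS}checksum", default="")
        timestamp = data.findtext(f"{REPO_NS}timestamp", default="")
        index[data.get("type")] = (checksum, timestamp)
    return index


def stale_files(cached: dict, remote: dict) -> list[str]:
    """Return the data types whose checksum or timestamp differs from the cached copy."""
    return sorted(kind for kind, stamp in remote.items() if cached.get(kind) != stamp)
```

A client would keep the small repomd.xml from the last refresh, fetch the current one, and only download the big files when `stale_files(read_repomd(old), read_repomd(new))` is non-empty.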
Pascal Bleser wrote:
What about md5sum-ming the filelist, and use the md5sum as key? If a new version of a package is released with the same filelist only the md5sum needs to be transferred. For big packages the compression might be around 100% ;)
Already done.
No, that's not what was asked for.

Just as an example for repomd:

You have a repository with package A-1 and B-1. createrepo writes the filelists of both packages into filelists.xml.

Now A gets upgraded to A-1.1. Which is a great thing, because the package manager has to download the whole filelists.xml again, even though package B has not been touched at all.

What you describe covers only the fact that the metadata aren't downloaded again if nothing changed. But if even the slightest thing changed, everything is downloaded from scratch, even the parts that have not been changed.

Solutions:

- Use a smarter protocol that can "fix" this design problem, e.g. rsync instead of http/ftp.

- Think about a smarter metadata format. SUSE had one (the old plain-text patchinfos) and threw it away in favour of repomd.

- Extend the repomd format to suck less, e.g. by splitting filelists.xml into filelists-dec06.xml, filelists-jan07.xml, filelists-feb07.xml so that at least the unchanged filelists from previous months aren't downloaded over and over again.
On Saturday 2006-12-02 at 01:11 +0100, Andreas Hanke wrote: ...
What you describe covers only the fact that the metadata aren't downloaded again if nothing changed. But if even the slightest thing changed, everything is downloaded from scratch, even the parts that have not been changed.
Mmm. That would need some kind of database update that transmits only the modified items, or better, whole sections. Perhaps the coding effort would be thought a waste with present-day broadband... for those fortunate enough to have it, of course.

Dividing the metadata into portions would be a compromise.

-- 
Cheers, Carlos E. R.
On Saturday 2 December 2006 01:11, Andreas Hanke wrote:
No, that's not what was asked for.
Just as an example for repomd:
You have a repository with package A-1 and B-1. createrepo writes the filelists of both packages into filelists.xml.
Now A gets upgraded to A-1.1. Which is a great thing, because the package manager has to download the whole filelists.xml again, even though package B has not been touched at all.
What you describe covers only the fact that the metadata aren't downloaded again if nothing changed. But if even the slightest thing changed, everything is downloaded from scratch, even the parts that have not been changed.
Solutions:
- Use a smarter protocol that can "fix" this design problem, e.g. rsync instead of http/ftp.
- Think about a smarter metadata format. SUSE had one (the old plain-text patchinfos) and threw it away in favour of repomd.
- Extend the repomd format to suck less, e.g. by splitting filelists.xml into filelists-dec06.xml, filelists-jan07.xml, filelists-feb07.xml so that at least the unchanged filelists from previous months aren't downloaded over and over again.
That's why I proposed md5summing the filelist. A package with multiple versions often has the same files, especially in a released distribution, as only security or important bug fixes are applied to packages. That means that the filelists of packages A-1.1 and A-1.2 are the same. It also means that the md5sums of the filelists of those 2 packages are the same. This could even hold across multiple versions of the distribution.

So when the client has a database with the md5sum of the filelist as key and the filelist as "payload", the client can easily determine the filelist belonging to a package. How this should work in detail, I have not worked out, but it might save a lot of bandwidth.

-- 
Richard Bos
Without a home the journey is endless
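Since the details were left open, here is one possible reading of the proposal as a Python sketch. Everything in it is hypothetical: the class and function names, the canonical form used for hashing (sorted paths joined by newlines), and the assumption that the server would advertise a package-to-digest mapping. No existing repository format does this.

```python
import hashlib


def filelist_key(paths: list[str]) -> str:
    """Hash a file list in a canonical form (sorted, newline-joined)."""
    canonical = "\n".join(sorted(paths)).encode("utf-8")
    return hashlib.md5(canonical).hexdigest()


class FilelistCache:
    """Client-side store mapping a filelist digest to the list of paths."""

    def __init__(self) -> None:
        self._by_key: dict[str, list[str]] = {}

    def store(self, paths: list[str]) -> str:
        """Remember a downloaded file list and return its digest."""
        key = filelist_key(paths)
        self._by_key[key] = sorted(paths)
        return key

    def missing(self, advertised: dict[str, str]) -> list[str]:
        """Given package -> digest from the server, list digests not cached yet."""
        return sorted({key for key in advertised.values() if key not in self._by_key})

    def lookup(self, key: str) -> list[str]:
        """Return the cached file list for a digest."""
        return self._by_key[key]
```

With this layout, an update from A-1.1 to A-1.2 that leaves the file list untouched costs only the digest: `missing()` returns nothing new for A, and only file lists with unseen digests need to be fetched.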
participants (14)
- Andras Mantia
- Andreas Hanke
- Carlos E. R.
- Christoph Thiel
- Eberhard Moenkeberg
- Felix Miata
- jdd
- John Kelly
- Keith Goggin
- Lukas Ocilka
- Marcus Meissner
- Pascal Bleser
- Rajko M
- Richard Bos