Ulrich Windl wrote:
On 23 May 2006 at 14:09, jdd wrote:
If I follow the thread correctly, this seems to be a metadata download problem.
If it's a download problem, why would the CPU be at 100%?
That sounds a lot like XML repository metadata parsing.
libzypp/ZMD is most probably not parsing the data stream as it is
being downloaded, so I presume it downloads everything (for one
repository) first, then parses it.
Would be interesting to do some profiling on parse-metadata.
Anything available for Mono ?
What XML parsing model is being used there, SAX, DOM, StAX ?
Most probably not DOM...
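To make the SAX/DOM difference concrete, here is a rough Python sketch of a
streaming parse that reads gzip'd metadata and never keeps the whole tree in
memory. This is not what libzypp/ZMD does (I haven't looked at its code); the
element names and the tiny sample document are made up for illustration, and
a real primary.xml uses XML namespaces, which I left out:

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Made-up sample standing in for primary.xml.gz (real files use namespaces).
SAMPLE = (
    b"<metadata>"
    b"<package><name>foo</name></package>"
    b"<package><name>bar</name></package>"
    b"</metadata>"
)


def stream_package_names(fileobj):
    """Yield package names one at a time from a gzip-compressed XML stream."""
    with gzip.GzipFile(fileobj=fileobj) as xml_stream:
        for _event, elem in ET.iterparse(xml_stream, events=("end",)):
            if elem.tag == "package":
                yield elem.findtext("name")
                # Drop the children of the element we just handled so the
                # in-memory tree does not grow with the document size.
                elem.clear()


compressed = io.BytesIO(gzip.compress(SAMPLE))
names = list(stream_package_names(compressed))
print(names)
```

The same loop should work on any file-like object, so in principle it could
consume the data while it is still being downloaded.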
When I look at my ("guru") RPM-MD repository for 10.0 (which is large,
but a lot smaller than the FTP tree):
== compressed:
primary.xml.gz = 1,058,326 (bytes)
filelists.xml.gz = 1,029,756
other.xml.gz = 497,642
== uncompressed:
primary.xml = 6,174,750
filelists.xml = 11,032,393
other.xml = 2,950,989
== compression ratio:
primary.xml = 5.83
filelists.xml = 10.71
other.xml = 5.93
Now when I look at SL-10.1/inst-source/suse/repodata:
== compressed:
primary.xml.gz = 8,056,136 (bytes)
filelists.xml.gz = 17,474,199
other.xml.gz = 53,265,854
BTW, other.xml.gz is *huge* (contains %changelog information) - I
don't know whether libzypp/zmd downloads and/or uses "other.xml.gz"
though. smart (http://smartpm.org) doesn't.
== uncompressed:
primary.xml = 47,127,500
filelists.xml = 211,884,423
other.xml = 206,534,422
== compression ratio:
primary.xml = 5.85
filelists.xml = 12.13
other.xml = 3.88
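The ratios are simply uncompressed size divided by compressed size. A small
Python snippet to recompute them from the byte counts listed above (the
dictionary layout and the repository labels are just my own naming):

```python
# Sizes in bytes copied from the listings above: (compressed, uncompressed).
SIZES = {
    "guru-10.0": {
        "primary.xml": (1_058_326, 6_174_750),
        "filelists.xml": (1_029_756, 11_032_393),
        "other.xml": (497_642, 2_950_989),
    },
    "SL-10.1": {
        "primary.xml": (8_056_136, 47_127_500),
        "filelists.xml": (17_474_199, 211_884_423),
        "other.xml": (53_265_854, 206_534_422),
    },
}


def compression_ratio(compressed, uncompressed):
    """Return how many uncompressed bytes each compressed byte expands to."""
    return uncompressed / compressed


for repo, files in SIZES.items():
    for name, (gz_size, raw_size) in files.items():
        ratio = compression_ratio(gz_size, raw_size)
        print(f"{repo:10} {name:14} {ratio:6.2f}")
```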
Assuming that other.xml is not being used by libzypp/ZMD, I would
guess the following memory usage with DOM:
- primary.xml: 47MB on disk => 150-200MB RAM
- filelists.xml: 210MB on disk => 600-800MB RAM
Hmm.. after all... maybe it _is_ DOM ;)
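For anyone who wants to see the DOM overhead for themselves, a minimal Python
sketch that compares peak allocations of a DOM parse against a streaming
parse of the same document. The document is synthetic (made-up element
names, 2000 entries), Python's parsers are not the ones libzypp/ZMD uses, and
the absolute numbers will differ from the guesses above; it only illustrates
that a full in-memory tree costs a multiple of the file size:

```python
import io
import tracemalloc
import xml.dom.minidom
import xml.etree.ElementTree as ET


def make_sample(n_packages):
    """Build a made-up metadata document with n_packages entries."""
    body = "".join(
        f"<package><name>pkg{i}</name><file>/usr/bin/pkg{i}</file></package>"
        for i in range(n_packages)
    )
    return f"<metadata>{body}</metadata>".encode("utf-8")


def peak_bytes(func, data):
    """Return the peak traced allocation in bytes while func(data) runs."""
    tracemalloc.start()
    try:
        func(data)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def parse_dom(data):
    """Build the complete DOM tree, then count the package elements."""
    doc = xml.dom.minidom.parseString(data)
    return len(doc.getElementsByTagName("package"))


def parse_streaming(data):
    """Count package elements, discarding each one's content once seen."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "package":
            count += 1
            elem.clear()
    return count


data = make_sample(2000)
dom_peak = peak_bytes(parse_dom, data)
stream_peak = peak_bytes(parse_streaming, data)
print(f"document size : {len(data):>10} bytes")
print(f"DOM peak      : {dom_peak:>10} bytes")
print(f"streaming peak: {stream_peak:>10} bytes")
```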
Could someone with the mentioned libzypp/ZMD problems have a look at
memory/swap usage as well ?
- vmstat -n 5 999
- sar -r 5 999 (even better; sar is part of the sysstat package)
The yast2 format is possibly more efficient wrt memory and CPU.
Maybe worth investigating whether the memory+CPU problems happen with
RPM-MD repos but not with yast2 repos... ?
I'd say disable (or even remove) all repositories, then just add the
10.1 FTP tree metadata, and run vmstat or sar to monitor CPU+memory
usage. With sysstat, it can be done like this:
sar -r -X `pidof parse-metadata` 5 999
(5 = 5 second interval, 999 = number of iterations)
You can even draw graphs from that data when you store it into a file
(-o option, it's a binary format), but you can't use -o in conjunction
with -X, so the stats would be system-wide:
mkdir ~/sar
sar -o ~/sar/sa.$(date '+%Y_%m_%d') -r -u 5 999
isag -p ~/sar
... then choose the file (click on the "-" button) and choose memory
or CPU graph (and you can even save the picture).
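As an alternative to the per-process sar call above, a rough Python sketch
that samples the resident memory of one process straight from
/proc/<pid>/status. It assumes a Linux /proc filesystem; the function names
are my own, and the script watches itself as a stand-in, so replace the pid
with the one of the process you care about (e.g. from pidof):

```python
import os
import time


def parse_status_kb(status_text, field="VmRSS"):
    """Extract a kB value such as VmRSS from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    return None


def sample_rss(pid, interval=5.0, count=3):
    """Yield VmRSS (kB) of pid every `interval` seconds, `count` times."""
    for i in range(count):
        with open(f"/proc/{pid}/status") as status:
            yield parse_status_kb(status.read())
        if i < count - 1:
            time.sleep(interval)


if __name__ == "__main__":
    # Stand-in target: this script itself. Use the pid of the process you
    # want to watch instead.
    target_pid = os.getpid()
    if os.path.exists(f"/proc/{target_pid}/status"):
        for rss_kb in sample_rss(target_pid, interval=1.0, count=3):
            print(f"VmRSS: {rss_kb} kB")
    else:
        print("no /proc status file found; this sketch needs Linux")
```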
cheers
--
-o) Pascal Bleser http://linux01.gwdg.de/~pbleser/
/\\