Mailinglist Archive: opensuse-factory (757 mails)

Re: [opensuse-factory] Bug 177758
  • From: Pascal Bleser <pascal.bleser@xxxxxxxxx>
  • Date: Tue, 23 May 2006 18:20:21 +0200
  • Message-id: <44733645.2090604@xxxxxxxxx>
Ulrich Windl wrote:
> On 23 May 2006 at 14:09, jdd wrote:
>> If I follow well the thread it seems this is a meta-data
>> download problem.
> If it's a download problem, why would the CPU be at 100%?

That sounds a lot like XML repository metadata parsing.
libzypp/ZMD is most probably not parsing the data stream as it is
being downloaded, so I presume it downloads everything (for one
repository) first, then parses it.
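
To illustrate what "parsing the data stream as it is being downloaded"
could look like, here is a minimal Python sketch. It is not how
libzypp/ZMD works; the function name, the 16-byte chunk size and the
tiny sample document are made up for illustration. Compressed chunks
are inflated and fed to a pull parser as they arrive, so the full
uncompressed XML never has to be held in memory at once.

```python
import gzip
import zlib
import xml.etree.ElementTree as ET


def count_packages_streaming(chunks):
    """Count <package> elements while gzip-compressed chunks arrive."""
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    inflater = zlib.decompressobj(zlib.MAX_WBITS | 16)
    parser = ET.XMLPullParser(events=("end",))
    count = 0

    def drain():
        nonlocal count
        for _event, elem in parser.read_events():
            # Strip an optional "{namespace}" prefix from the tag name.
            if elem.tag.rsplit("}", 1)[-1] == "package":
                count += 1
                elem.clear()  # drop the children and text we no longer need

    for chunk in chunks:
        parser.feed(inflater.decompress(chunk))
        drain()
    parser.feed(inflater.flush())
    parser.close()
    drain()
    return count


# Made-up stand-in for a primary.xml.gz arriving in small pieces.
SAMPLE = (
    b'<metadata xmlns="http://linux.duke.edu/metadata/common" packages="2">'
    b'<package type="rpm"><name>foo</name></package>'
    b'<package type="rpm"><name>bar</name></package>'
    b'</metadata>'
)
COMPRESSED = gzip.compress(SAMPLE)
CHUNKS = [COMPRESSED[i:i + 16] for i in range(0, len(COMPRESSED), 16)]
print(count_packages_streaming(CHUNKS))
```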

It would be interesting to do some profiling on parse-metadata.
Is anything available for Mono?

Which XML parsing model is being used there: SAX, DOM, or StAX?

Most probably not DOM...
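
For readers less familiar with the difference between these models,
here is a small Python sketch (not the Mono code in question; the
class and function names are invented for the example). The SAX
version only sees one event at a time and keeps a counter, while the
DOM version builds the whole tree in memory before it can answer the
same question.

```python
import xml.sax
from xml.dom import minidom


class PackageCounter(xml.sax.ContentHandler):
    """SAX handler: sees elements one by one, keeps only a counter."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "package":
            self.count += 1


def count_with_sax(data):
    handler = PackageCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_with_dom(data):
    # The whole document is materialised as a tree before we can query it.
    doc = minidom.parseString(data)
    return len(doc.getElementsByTagName("package"))


# Made-up miniature metadata document.
SAMPLE_XML = (
    b"<metadata>"
    b"<package><name>foo</name></package>"
    b"<package><name>bar</name></package>"
    b"</metadata>"
)
print(count_with_sax(SAMPLE_XML), count_with_dom(SAMPLE_XML))
```

Both give the same answer; the difference is in peak memory, which
for DOM grows with the size of the document.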

When I look at my ("guru") RPM-MD repository for 10.0 (which is large,
but a lot smaller than the FTP tree):
== compressed:
primary.xml.gz = 1,058,326 (bytes)
filelists.xml.gz = 1,029,756
other.xml.gz = 497,642

== uncompressed:
primary.xml = 6,174,750
filelists.xml = 11,032,393
other.xml = 2,950,989

== compression ratio:
primary.xml = 5.83
filelists.xml = 10.71
other.xml = 5.93

Now when I look at SL-10.1/inst-source/suse/repodata:
== compressed:
primary.xml.gz = 8,056,136 (bytes)
filelists.xml.gz = 17,474,199
other.xml.gz = 53,265,854

BTW, other.xml.gz is *huge* (it contains %changelog information). I
don't know whether libzypp/zmd downloads and/or uses "other.xml.gz",
though. smart doesn't.

== uncompressed:
primary.xml = 47,127,500
filelists.xml = 211,884,423
other.xml = 206,534,422

== compression ratio:
primary.xml = 5.85
filelists.xml = 12.13
other.xml = 3.88
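
The ratios are simply uncompressed size divided by compressed size. A
short Python sketch to recompute them from the byte counts listed
above (the dictionary layout and function name are just for
illustration):

```python
# (compressed bytes, uncompressed bytes), copied from the listings above.
SIZES = {
    "guru 10.0": {
        "primary.xml": (1_058_326, 6_174_750),
        "filelists.xml": (1_029_756, 11_032_393),
        "other.xml": (497_642, 2_950_989),
    },
    "SL-10.1": {
        "primary.xml": (8_056_136, 47_127_500),
        "filelists.xml": (17_474_199, 211_884_423),
        "other.xml": (53_265_854, 206_534_422),
    },
}


def compression_ratio(compressed, uncompressed):
    """Return how many times larger the uncompressed file is."""
    return uncompressed / compressed


for repo, files in SIZES.items():
    for name, (gz_bytes, raw_bytes) in files.items():
        ratio = compression_ratio(gz_bytes, raw_bytes)
        print(f"{repo:10} {name:14} {ratio:6.2f}")
```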

Assuming that other.xml is not being used by libzypp/ZMD, I would
guess the following memory usage with DOM (roughly 3-4x the size on
disk):
- primary.xml: 47MB on disk => 150-200MB RAM
- filelists.xml: 210MB on disk => 600-800MB RAM
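
The guess can be written down as back-of-the-envelope arithmetic. The
3x to 4x overhead factor below is an assumption (a rule of thumb that
roughly matches the ranges above), not a measured value, and the real
factor depends on the DOM implementation:

```python
# Assumed rule of thumb: a DOM tree takes about 3-4x the file size in RAM.
DOM_OVERHEAD = (3.0, 4.0)


def dom_ram_estimate_mb(size_bytes, overhead=DOM_OVERHEAD):
    """Return a (low, high) RAM estimate in MB for a DOM of this file."""
    size_mb = size_bytes / 1_000_000
    return tuple(round(size_mb * factor) for factor in overhead)


for name, size in (("primary.xml", 47_127_500),
                   ("filelists.xml", 211_884_423)):
    low, high = dom_ram_estimate_mb(size)
    print(f"{name}: about {low}-{high} MB RAM")
```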

Hmm.. after all... maybe it _is_ DOM ;)

Could someone with the mentioned libzypp/ZMD problems have a look at
memory/swap usage as well?
- vmstat -n 5 999
- sar -r 5 999 (even better; sar is part of the sysstat package)

The yast2 format is possibly more efficient wrt memory and CPU.
It may be worth investigating whether the memory+CPU problems happen
with RPM-MD repos but not with yast2 repos.

I'd say disable (or even remove) all repositories, then just add the
10.1 FTP tree metadata, and run vmstat or sar to monitor CPU+memory
usage. With sysstat, it can be done like this:

sar -r -X `pidof parse-metadata` 5 999

(5 = 5 second interval, 999 = number of iterations)
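
Where sysstat is not installed, a rough per-process equivalent can be
sketched in Python by sampling the VmRSS line of /proc/<pid>/status on
Linux. This is only a stand-in for the sar command above, and the
function names are made up for the example:

```python
import re
import time


def parse_vm_rss_kb(status_text):
    """Extract the resident set size in kB from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, interval=5, iterations=999):
    """Print the resident memory of one process every `interval` seconds."""
    for _ in range(iterations):
        with open(f"/proc/{pid}/status") as status_file:
            rss_kb = parse_vm_rss_kb(status_file.read())
        print(time.strftime("%H:%M:%S"), rss_kb, "kB")
        time.sleep(interval)
```

For example, sample_rss(1234) would watch the process with PID 1234
(use the PID reported by pidof parse-metadata) at the same 5 second
interval.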

You can even draw graphs from that data when you store it into a file
(-o option, it's a binary format), but you can't use -o in conjunction
with -X, so the stats would be system-wide:

mkdir ~/sar
sar -o ~/sar/sa.$(date '+%Y_%m_%d') -r -u 5 999
isag -p ~/sar

... then choose the file (click on the "-" button) and choose memory
or CPU graph (and you can even save the picture).

-o) Pascal Bleser
/\\ <pascal.bleser@xxxxxxxxx> <guru@xxxxxxxxxxx>
