Hi,
here's my work from the weekend. Basically just using callgrind and
fixing the most egregious sources for slowness. My testcase was "zypper
lu", i.e. just reading in caches.
Result:
# LD_LIBRARY_PATH=../../inst/lib/ time ../../zypper/build/src/zypper lu
...
* Reading repository 'http://ftp.gwdg.de/pub/linux/misc/packman/suse/10.2/' cache
* Reading repository 'openSUSE-10.3-Updates' cache
* Reading repository 'openSUSE-10.3-retail 10.3' cache
* Reading installed packages [100%]
No updates found.
12.82user 0.96system 0:14.07elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+85700minor)pagefaults 0swaps
vs.
# LD_LIBRARY_PATH=./zypp time ../../zypper/build/src/zypper lu
* Reading repository 'http://ftp.gwdg.de/pub/linux/misc/packman/suse/10.2/' cache
* Reading repository 'openSUSE-10.3-Updates' cache
* Reading repository 'openSUSE-10.3-retail 10.3' cache
* Reading installed packages [100%]
No updates found.
8.51user 0.80system 0:09.52elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+69055minor)pagefaults 0swaps
4 seconds shoved off, not bad.
The patch is against current libzypp svn. It does several things:
1) ctor of Dep is slow, hence the hunk in cache/CacheTypes.{cc,h}
2) VersionedCap::encode is very slow. encode is used by the hashing
and for equality testing for the uset in CapFactory, ergo that too is
slow. Can be sped up quite some by implementing own hash() and same()
methods for the Capabilities. So, new virtual methods
CapabilityImpl::hash
CapabilityImpl::same
using encode(), but overrode for VersionCap to for explicit comparison
for same() and quicker hash computation for hash().
Accompanying changes in CapFactory to use these new methods.
3) regexp machinery is very slow. It's actually worthwhile to do
a string test if the regexp could match at all, before even trying the
regexp. In the profile only the two touched regex_match calls show up,
one must have a ':/' (the SplitSpec one), the other at least a ' ' in
the string to match. So test that before. Now not many regex calls
are left for me.
4) The SQL query for reading in the named capabilities is very slow. It's
actually faster to not connect name_id with name in the query, but
instead read in the named table separately, and doing the connection
name_id -> name by hand.
Actually the code I wrote also added an "order by name_id", and made
use of that, to spare many capfactory.fromImpl calls. It saves quite
some amount of time, or better, it would save much time, if sqlite3
wouldn't be so slow with the sorting in the query. But unfortunately
sqlite3 is so much slower with sorting that the saved number of calls
to fromImpl() (75% can be saved) can't make that up :-( Anyway reading
in the names table by hand still improves the time spent in sqlite3
quite much.
I also noticed one more thing which needs quite some time and memory.
Namely target/rpm/RpmPackageImpl.cc happily reads in the rpm changelog and
filelist when constructing the PackageImpl. Even when not needed at all.
The changelogs are usually not needed at all (certainly not for zypper lu
and most other commands). The filelists are not needed either most of the
time, certainly not in their completeness (i.e. they should at least be
filtered against isInterestingFile).
I removed reading filelist and changelogs from the rpm database and it
saved about 40 MB of RAM (!). I don't send this hunk, because for
completeness this should of course be read in lazily from the changelog()
and filenames() methods, but that would need some refactoring to still
have a handle to the database at that time. (Actually also the rpm
database should be read in from a cache and not from the rpm database
directly :-/ )
So, here it is. Have fun. If you find this usefull, I'm even offering to
clean the patch up a bit.
Oh, btw. there are not many more low hanging fruits anymore if the
changelog and filelist problem is solved. It's now much time in sqlite3,
and very much time spent all over the place just allocating objects and
strings, moving, copying and comparing them.
Ciao,
Michael.
Index: zypp/capability/CapabilityImpl.cc
===================================================================
--- zypp/capability/CapabilityImpl.cc (revision 7408)
+++ zypp/capability/CapabilityImpl.cc (working copy)
@@ -10,6 +10,7 @@
*
*/
#include <iostream>
+#include
#include "zypp/base/Logger.h"
#include "zypp/base/Regex.h"
@@ -46,6 +47,18 @@ namespace zypp
return encode() < rhs->encode();
}
+ std::size_t CapabilityImpl::hash() const
+ {
+ std::size_t ret = __gnu_cxx::hash()( encode().c_str() );
+ return ret;
+ }
+
+ bool CapabilityImpl::same (const constPtr &rhs) const
+ {
+ /* refers and kind are known to be the same */
+ return encode() == rhs->encode();
+ }
+
///////////////////////////////////////////////////////////////////
//
// METHOD NAME : CapabilityImpl::capImplOrderLess
@@ -167,11 +180,14 @@ namespace zypp
}
//split: name:/absolute/path
- static const str::regex rx( "([^/]*):(/.*)" );
- str::smatch what;
- if( str::regex_match( name_r, what, rx ) )
+ if (isSplitSpec (name_r))
{
- return new capability::SplitCap( refers_r, what[1], what[2] );
+ static const str::regex rx( "([^/]*):(/.*)" );
+ str::smatch what;
+ if( str::regex_match( name_r, what, rx ) )
+ {
+ return new capability::SplitCap( refers_r, what[1], what[2] );
+ }
}
//name: name
@@ -292,7 +308,8 @@ namespace zypp
// improve regex!
static const str::regex rx( "(.*[^ \t])([ \t]+)([^ \t]+)([ \t]+)([^ \t]+)" );
str::smatch what;
- if( str::regex_match( strval_r,what, rx ) )
+ if( strval_r.find(' ') != std::string::npos
+ && str::regex_match( strval_r,what, rx ) )
{
Rel op;
Edition edition;
Index: zypp/capability/CapabilityImpl.h
===================================================================
--- zypp/capability/CapabilityImpl.h (revision 7408)
+++ zypp/capability/CapabilityImpl.h (working copy)
@@ -110,6 +110,9 @@ namespace zypp
virtual std::string index() const
{ return encode(); }
+ virtual size_t hash() const;
+ virtual bool same (const constPtr &rhs) const;
+
public:
/** Solver hack. */
struct SplitInfo
@@ -318,6 +321,14 @@ namespace zypp
typedef std::setCapabilityImpl::Ptr CapabilityImplPtrSet;
+ inline bool exactly_same_caps_p (const CapabilityImpl::constPtr & lhs,
+ const CapabilityImpl::constPtr & rhs )
+ {
+ return ( lhs->refers() == rhs->refers()
+ && lhs->kind() == rhs->kind()
+ && lhs->same(rhs) );
+ }
+
/////////////////////////////////////////////////////////////////
} // namespace capability
///////////////////////////////////////////////////////////////////
Index: zypp/capability/VersionedCap.cc
===================================================================
--- zypp/capability/VersionedCap.cc (revision 7408)
+++ zypp/capability/VersionedCap.cc (working copy)
@@ -9,6 +9,8 @@
/** \file zypp/capability/VersionedCap.cc
*
*/
+#include <iostream>
+#include
#include "zypp/capability/VersionedCap.h"
using namespace std;
@@ -35,6 +37,38 @@ namespace zypp
return ret;
}
+ bool VersionedCap::same (const CapabilityImpl_constPtr &_rhs) const
+ {
+ intrusive_ptr<const Self> rhs( asKind<Self>(_rhs) );
+ if (!rhs) {
+ /* Gnah! VersionCap::kind() is "NamedCap", hence we might come
+ here with a NamedCap rhs, which then is not the same as us. */
+ intrusive_ptr<const NamedCap> rhs2( asKind<NamedCap>(_rhs) );
+ if (!rhs2) {
+ std::cerr << "Argh! : " << this->kind() << " <-> " << _rhs->kind() << std::endl;
+ return false;
+ }
+ return name() == rhs2->name();
+ }
+ if (name() != rhs->name())
+ return false;
+ const Edition::MatchRange & other = rhs->_range;
+ if (_range.op != other.op)
+ return false;
+ const Edition &ed1 = _range.value;
+ const Edition &ed2 = other.value;
+ return (ed1.epoch() == ed2.epoch()
+ && ed1.version() == ed2.version()
+ && ed1.release() == ed2.release());
+ }
+
+ size_t VersionedCap::hash() const
+ {
+ size_t ret = __gnu_cxx::hash()( name().c_str() );
+ ret ^= __gnu_cxx::hash()( _range.value.version().c_str() );
+ return ret;
+ }
+
std::string VersionedCap::index() const
{ return name(); }
Index: zypp/capability/VersionedCap.h
===================================================================
--- zypp/capability/VersionedCap.h (revision 7408)
+++ zypp/capability/VersionedCap.h (working copy)
@@ -56,6 +56,9 @@ namespace zypp
/** Edition. */
virtual Edition edition () const;
+ virtual size_t hash() const;
+ virtual bool same (const CapabilityImpl_constPtr &rhs) const;
+
protected:
/** Implementation dependent value. */
virtual const Edition::MatchRange & range() const;
Index: zypp/cache/CacheTypes.cc
===================================================================
--- zypp/cache/CacheTypes.cc (revision 7408)
+++ zypp/cache/CacheTypes.cc (working copy)
@@ -67,7 +67,8 @@ namespace zypp
if ( klass == "kind" )
_kind_cache[id] = Resolvable::Kind(name);
if ( klass == "deptype" )
- _deptype_cache[id] = name;
+ /* As Dep has no default ctor we can't use the [] accessor. */
+ _deptype_cache.insert (make_pair (id, Dep(name)));
}
MIL << "archs: " << _arch_cache.size() << endl;
@@ -130,9 +131,9 @@ namespace zypp
Dep CacheTypes::deptypeFor( const data::RecordId &id )
{
- std::map::const_iterator it;
+ std::map::const_iterator it;
if ( (it = _deptype_cache.find(id) ) != _deptype_cache.end() )
- return Dep(it->second);
+ return it->second;
else
{
ERR << "deptype: " << id << endl;
@@ -142,10 +143,10 @@ namespace zypp
data::RecordId CacheTypes::idForDeptype( const Dep & dep )
{
- std::map::const_iterator it;
+ std::map::const_iterator it;
for ( it = _deptype_cache.begin(); it != _deptype_cache.end(); ++it )
{
- if ( dep.asString() == it->second )
+ if ( dep == it->second )
return it->first;
}
ZYPP_THROW(Exception("Inconsistent deptype"));
Index: zypp/cache/CacheTypes.h
===================================================================
--- zypp/cache/CacheTypes.h (revision 7408)
+++ zypp/cache/CacheTypes.h (working copy)
@@ -136,7 +136,7 @@ namespace zypp
std::map _rel_cache;
std::map _kind_cache;
- std::map _deptype_cache;
+ std::map _deptype_cache;
std::map _arch_cache;
Pathname _dbdir;
};
Index: zypp/repo/cached/RepoImpl.cc
===================================================================
--- zypp/repo/cached/RepoImpl.cc (revision 7408)
+++ zypp/repo/cached/RepoImpl.cc (working copy)
@@ -333,7 +333,7 @@ void RepoImpl::read_capabilities( sqlite
//
// }
// }
- sqlite3_command select_named_cmd( con, "select v.refers_kind, n.name, v.version, v.release, v.epoch, v.relation, v.dependency_type, v.resolvable_id from named_capabilities v, names n, resolvables res where v.name_id=n.id and v.resolvable_id=res.id and res.repository_id=:repo_id;");
+ sqlite3_command select_named_cmd( con, "select v.refers_kind, v.name_id, v.version, v.release, v.epoch, v.relation, v.dependency_type, v.resolvable_id from named_capabilities v, resolvables res where res.repository_id=:repo_id and v.resolvable_id=res.id;");
sqlite3_command select_file_cmd( con, "select fc.refers_kind, dn.name, fn.name, fc.dependency_type, fc.resolvable_id from file_capabilities fc, files f, dir_names dn, file_names fn, resolvables res where f.id=fc.file_id and f.dir_name_id=dn.id and f.file_name_id=fn.id and fc.resolvable_id=res.id and res.repository_id=:repo_id;");
@@ -348,6 +348,16 @@ void RepoImpl::read_capabilities( sqlite
sqlite3_command select_other_cmd( con, "select oc.refers_kind, oc.value, oc.dependency_type, oc.resolvable_id from other_capabilities oc, resolvables res where oc.resolvable_id=res.id and res.repository_id=:repo_id;");
+ std::map namemap;
+ sqlite3_command get_names_cmd( con, "select id,name from names;" );
+ {
+ sqlite3_reader reader = get_names_cmd.executereader();
+ while ( reader.read() )
+ {
+ namemap[reader.getint(0)] = reader.getstring(1);
+ }
+ }
+
{
debug::Measure mnc("read named capabilities");
select_named_cmd.bind(":repo_id", repo_id);
@@ -355,27 +365,59 @@ void RepoImpl::read_capabilities( sqlite
// FIXME Move this logic to tick()?
Date start(Date::now());
+ Capability oldcap;
+ Resolvable::Kind oldrefer;
+ Rel oldrel;
+ //std::string oldname;
+ int oldname = -1;
+ std::string oldver, oldrelease;
+ int oldepoch = 0;
+
+
while ( reader.read() )
{
ticks.tick();
Resolvable::Kind refer = _type_cache.kindFor(reader.getint(0));
+ int name = reader.getint(1);
Rel rel = _type_cache.relationFor(reader.getint(5));
- data::RecordId rid = reader.getint64(7);
-
if ( rel == zypp::Rel::NONE )
{
- capability::NamedCap *ncap = new capability::NamedCap( refer, reader.getstring(1) );
- zypp::Dep deptype = _type_cache.deptypeFor(reader.getint(6));
- nvras[rid].second[deptype].insert( capfactory.fromImpl( capability::CapabilityImpl::Ptr(ncap) ) );
+ if (oldname != name || rel != oldrel || refer!=oldrefer)
+ {
+ oldrel = rel;
+ oldrefer = refer;
+ oldname = name;
+ capability::NamedCap *ncap = new capability::NamedCap( refer, namemap[name] );
+ oldcap = capfactory.fromImpl ( capability::CapabilityImpl::Ptr(ncap) );
+ }
}
else
{
- capability::VersionedCap *vcap = new capability::VersionedCap( refer, reader.getstring(1), /* rel */ rel, Edition( reader.getstring(2), reader.getstring(3), reader.getint(4) ) );
- zypp::Dep deptype = _type_cache.deptypeFor(reader.getint(6));
- nvras[rid].second[deptype].insert( capfactory.fromImpl( capability::CapabilityImpl::Ptr(vcap) ) );
+ std::string ver = reader.getstring(2);
+ std::string release = reader.getstring(3);
+ int epoch = reader.getint(4);
+ if (oldname != name || rel != oldrel || refer!=oldrefer
+ || oldver != ver
+ || oldrelease != release
+ || oldepoch != epoch)
+ {
+ oldrel = rel;
+ oldrefer = refer;
+ oldname = name;
+ oldver = ver;
+ oldrelease = release;
+ oldepoch = epoch;
+
+ capability::VersionedCap *vcap = new capability::VersionedCap( refer, namemap[name], /* rel */ rel, Edition( ver, release, epoch ) );
+ oldcap = capfactory.fromImpl( capability::CapabilityImpl::Ptr(vcap) );
+ }
}
+
+ zypp::Dep deptype = _type_cache.deptypeFor(reader.getint(6));
+ data::RecordId rid = reader.getint64(7);
+ nvras[rid].second[deptype].insert(oldcap);
}
}
Index: zypp/CapFactory.cc
===================================================================
--- zypp/CapFactory.cc (revision 7408)
+++ zypp/CapFactory.cc (working copy)
@@ -38,7 +38,8 @@ namespace
{
size_t operator() ( const CapabilityImpl::Ptr & p ) const
{
- return __gnu_cxx::hash()( p->encode().c_str() );
+ //return __gnu_cxx::hash()( p->encode().c_str() );
+ return p->hash();
}
};
@@ -46,9 +47,10 @@ namespace
{
bool operator() ( const CapabilityImpl::Ptr & lhs, const CapabilityImpl::Ptr & rhs ) const
{
- return ( lhs->encode() == rhs->encode()
- && lhs->kind() == rhs->kind()
- && lhs->refers() == rhs->refers() );
+ return ::zypp::capability::exactly_same_caps_p (lhs, rhs);
+ /*return ( lhs->refers() == rhs->refers()
+ && lhs->kind() == rhs->kind()
+ && lhs->encode() == rhs->encode() );*/
}
};
--
To unsubscribe, e-mail: zypp-devel+unsubscribe@opensuse.org
For additional commands, e-mail: zypp-devel+help@opensuse.org