Mailinglist Archive: opensuse-project (537 mails)

[opensuse-project] GSoC: new python obs library (osc code cleanup)
  • From: Marcus Hüwe <suse-tux@xxxxxx>
  • Date: Tue, 29 Mar 2011 18:50:06 +0200
  • Message-id: <20110329165006.GB4221@linux>
Hi,

my name is Marcus Hüwe and I'm currently enrolled at the University of
Paderborn where I'm studying computer science. I spend most of my
free time hacking on osc and, from time to time, on other
buildservice components.

My plan for this year's GSoC is to write a new python obs library
which is independent of osc and can be used by other clients as well.
For the details see the attached "python_obs_lib.txt" file.


Marcus
Abstract
The goal of this GSoC project is to do a code cleanup and partial rewrite
of osc. Currently most of the osc code is written specifically for osc
itself (mostly without keeping reuse in mind). So instead of doing an
osc "internal" code cleanup our plan is to refactor the code into an osc
independent python library. Our intent is that this library can be used
by osc and other python clients as well (for instance some GUI client).
This proposal is based on
http://en.opensuse.org/openSUSE:GSOC_2011_Ideas#Code_love_for_osc

Basic requirements for the new obs library:
* robustness:
The library should be able to validate XML API responses against an XML
schema/DTD. This way all client applications can be sure that the response
is valid and that all mandatory elements are present. At the moment osc
doesn't support such validation: either some "manual" validation is done on
a case-by-case basis or no validation is done at all, which leads to runtime
errors from time to time. A validation error or any other error (like an
HTTP error) will be propagated to the caller via an exception.
Note: providing schemata for all API responses is not within the scope of this
project. Rather, we want to provide validation support in the library so that
the schemata can be added later.
* easy to use responses:
Most of osc's current API methods are anything but convenient: in most
cases a raw XML string is returned which has to be parsed by the caller. It
would be much more convenient to return an object which is backed by the
specific XML response. Example:
- the old version of "show_package_meta(...)" returned a simple XML string
- the new version should return an object "pkgmeta" which can be used
"easily":
pkgmeta.getName() => returns the name
pkgmeta.getProjectName() => returns the project name
# maybe we should also support something like:
pkgmeta.getProject() => returns prjmeta object
pkgmeta.getDescription() => returns the package description
pkgmeta.has_develproject() => True or False
pkgmeta.getDevelproject() => name of develproject or '' (empty string)
- this comes in handy especially when working with the "files meta":
filesmeta.getRev() => returns revision string
filesmeta.has_linkinfo() => True or False
for file in filesmeta.getFiles():
file.getName() => returns filename
file.getMD5() => returns md5sum
...
(for a more concrete example see appendix [1])
If we use such objects we should also provide a "toXML()" method which
"serializes" the object back into the original XML. In the case of the
pkgmeta object it might also come in handy if the object provides a "save()"
method which stores the metadata (if changed) on the server.
* use of callbacks/visitor pattern:
Using such a mechanism ensures that all kinds of clients (CLI, GUI) can use
the library (prominent examples where this mechanism will be used are the
"update()" and "commit()" methods of the "Project" and "Package" classes).
* support different http libraries:
The library shouldn't force users to use a specific http(s) library
implementation.
Instead it should provide an "abstract" class/interface which defines some
general methods. This way User A can use the library with urllib2 and
User B can use libcurl.
Nevertheless the library's default implementation will be based on urllib2.
Additionally it's possible to use some non-standard authentication methods
like oauth (the user just needs to write the "auth handler" and pass it to our
library).
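
A minimal sketch of how the pluggable http layer described in the last
requirement could look. All class and function names here
(AbstractHTTPClient, UrllibClient, FakeClient, fetch_package_meta) are
hypothetical and not part of osc; the default backend uses urllib.request,
the Python 3 successor of the urllib2 module named above, and the URL layout
is only assumed for illustration.

```python
import abc
import urllib.request


class HTTPError(Exception):
    """Raised by any backend when a request fails (hypothetical name)."""


class AbstractHTTPClient(abc.ABC):
    """Interface the library would program against (hypothetical)."""

    @abc.abstractmethod
    def get(self, url):
        """Return the response body as bytes or raise HTTPError."""


class UrllibClient(AbstractHTTPClient):
    """Default backend built on urllib.request."""

    def __init__(self, handlers=()):
        # Extra handlers (for instance a custom auth handler written by
        # the user) are simply passed on to the opener.
        self._opener = urllib.request.build_opener(*handlers)

    def get(self, url):
        try:
            with self._opener.open(url) as resp:
                return resp.read()
        except OSError as exc:
            # urllib.error.URLError and HTTPError are OSError subclasses.
            raise HTTPError(str(exc)) from exc


class FakeClient(AbstractHTTPClient):
    """In-memory backend; stands in for "User B" with another library."""

    def __init__(self, responses):
        self._responses = responses

    def get(self, url):
        try:
            return self._responses[url]
        except KeyError:
            raise HTTPError('not found: ' + url) from None


def fetch_package_meta(client, apiurl, project, package):
    """Library code only sees the abstract interface, never the backend."""
    # Assumed URL layout, for illustration only.
    url = '%s/source/%s/%s/_meta' % (apiurl, project, package)
    return client.get(url)
```

Because fetch_package_meta() only depends on the abstract interface, a
libcurl based backend would just be another subclass that implements get().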

Timeline/development process:
* April 25 - May 23 (Community Bonding Period):
Since I'm quite familiar with the openSUSE/openSUSE BuildService community
I'll use the "Community Bonding Period" mainly for reading documentation and
figuring out which library can be used for the xml validation.
Additionally I'll work on a more fine-grained library layout (like module
structure, class diagram (if needed) etc.)
* May 23 - July 11/15 (Interim Period):
- depending on the choice of the xml library the first task might be to
write some helpers/utilities for it
- basic library methods should be working at the end of this period:
- api response validation
- the library methods return objects (where applicable) as described above
- testsuite for these "basic" methods
- begin with refactoring the "Project" and "Package" classes (these classes
are used to handle working copies) (this is the part where the callbacks
will be used)
* July 15 - August 15 (Interim Period):
- finish the refactoring of the "Project" and "Package" classes (max. one
week)
- port osc code to the new library (or this will be done in an earlier phase
on a per-feature basis, which means: if feature/method A was added to the
library, the corresponding osc code will be ported directly)
The goal is to finish the project before the "suggested pencils down" date
(August 15) so that we have enough time for testing (and the resulting
bugfixing :) ) before the "firm pencils down" date approaches (August 22).

Appendix

[1]:
server response:
<directory name="osc" rev="8365ebaa753f4adad12bec315f54b0a0"
srcmd5="8365ebaa753f4adad12bec315f54b0a0">
<linkinfo baserev="9554a9bfe6bb7bebaf968c93758c9ba7"
lsrcmd5="eb00aafcc08646f33e9fbb7a10e0c88d" package="osc"
project="openSUSE:Factory" srcmd5="9554a9bfe6bb7bebaf968c93758c9ba7" />
<entry md5="e04daeed192c7092ce9ab92e16870485" mtime="1292638008"
name="osc-0.130.1.tar.gz" size="243123" />
<entry md5="d6aa1610f3d114ce8570cdb1fa9c3ff9" mtime="1292638010"
name="osc.changes" size="68819" />
</directory>

Instead of returning the raw xml to the caller we will "deserialize" the
xml into an object (in the following this object is called "fmeta"):
fmeta.getName() => 'osc'
fmeta.has_linkinfo() => True
fmeta.getLinkinfo().getProject() => 'openSUSE:Factory'
fmeta.getFiles() => [<entry_object_1>, <entry_object_2>]
# instead of using a list we could also use another object which provides
# methods like obj.getFile(name='osc.changes') => <entry_object_2>
entry_object_1.getName() => 'osc-0.130.1.tar.gz'
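
The "fmeta" object from this appendix could be sketched roughly as follows,
using xml.etree.ElementTree from the standard library. This is only an
illustration under assumptions: the class names (FilesMeta, Linkinfo, Entry)
and the fromXML()/getFile() helpers are hypothetical, and no schema
validation is done beyond a check of the root tag.

```python
import xml.etree.ElementTree as ET

# The server response from the appendix.
RESPONSE = """<directory name="osc" rev="8365ebaa753f4adad12bec315f54b0a0"
    srcmd5="8365ebaa753f4adad12bec315f54b0a0">
  <linkinfo baserev="9554a9bfe6bb7bebaf968c93758c9ba7"
      lsrcmd5="eb00aafcc08646f33e9fbb7a10e0c88d" package="osc"
      project="openSUSE:Factory" srcmd5="9554a9bfe6bb7bebaf968c93758c9ba7" />
  <entry md5="e04daeed192c7092ce9ab92e16870485" mtime="1292638008"
      name="osc-0.130.1.tar.gz" size="243123" />
  <entry md5="d6aa1610f3d114ce8570cdb1fa9c3ff9" mtime="1292638010"
      name="osc.changes" size="68819" />
</directory>"""


class _Node:
    """Base class: every object is backed by its xml element."""

    def __init__(self, element):
        self._element = element


class Entry(_Node):
    def getName(self):
        return self._element.get('name')

    def getMD5(self):
        return self._element.get('md5')


class Linkinfo(_Node):
    def getProject(self):
        return self._element.get('project')


class FilesMeta(_Node):
    @classmethod
    def fromXML(cls, xml):
        root = ET.fromstring(xml)
        if root.tag != 'directory':
            # Stand-in for real schema validation.
            raise ValueError('unexpected root element: %s' % root.tag)
        return cls(root)

    def getName(self):
        return self._element.get('name')

    def getRev(self):
        return self._element.get('rev')

    def has_linkinfo(self):
        return self._element.find('linkinfo') is not None

    def getLinkinfo(self):
        return Linkinfo(self._element.find('linkinfo'))

    def getFiles(self):
        return [Entry(e) for e in self._element.findall('entry')]

    def getFile(self, name):
        # Lookup by filename; returns None if there is no such entry.
        for entry in self.getFiles():
            if entry.getName() == name:
                return entry
        return None

    def toXML(self):
        # Serializes the backing element; equivalent xml, although the
        # whitespace inside tags may differ from the server response.
        return ET.tostring(self._element, encoding='unicode')


fmeta = FilesMeta.fromXML(RESPONSE)
```

With this sketch fmeta.getLinkinfo().getProject() gives 'openSUSE:Factory'
and fmeta.getFile(name='osc.changes') gives the second entry object.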