
Author Archive

[gsoc] osc2 client – summary of week 5

June 26th, 2012 by


here’s a small summary of the 5th (coding) week. Last week I spent
most of my time working on the cpio module. In the end I did
a complete rewrite of osc’s “old” cpio module. The
cpio implementation details are taken from the cpio 2.11 package.
Currently only the “new ascii” format is supported and we can
only handle regular files (this is sufficient for our needs). Here’s
a small example of how to use it:

from osc.util.cpio import cpio_open

# cpio_open is just a convenience method for CpioArchive(filename=fname)
with cpio_open(fname) as archive:
    for archive_file in archive:
        # print filename
        print(archive_file.hdr.name)
        # print contents
        print(archive_file.read())

It’s also possible to pass a file-like object to a CpioArchive
instance (in this case we do not need to use the with statement).
It can also be “easily” enhanced to support different cpio formats
(one just has to write the specific ArchiveReader and ArchiveWriter
classes:) ). The code is available at [1] and the testcases at [2].
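
To give an idea of what reading the “new ascii” format involves, here is a minimal, self-contained sketch. It is not the code from [1]: the names iter_newc, read_exact, pad and make_entry are made up for this illustration, and it reads each file completely into memory. It only relies on the layout of the format itself: a 6 byte magic “070701”, 13 fields of 8 hex digits each, a NUL-terminated name, padding to 4 byte boundaries and a final “TRAILER!!!” entry.

```python
import io

NEWC_MAGIC = b'070701'
# the 13 header fields, each stored as 8 hexadecimal ascii digits
FIELDS = ('ino', 'mode', 'uid', 'gid', 'nlink', 'mtime', 'filesize',
          'devmajor', 'devminor', 'rdevmajor', 'rdevminor', 'namesize',
          'check')
HEADER_LEN = len(NEWC_MAGIC) + 8 * len(FIELDS)  # 110 bytes


def pad(n):
    """Return the number of bytes needed to align n to 4 bytes."""
    return (4 - n % 4) % 4


def read_exact(fobj, n):
    """Read exactly n bytes from the file-like object fobj."""
    data = fobj.read(n)
    if len(data) != n:
        raise ValueError('unexpected end of archive')
    return data


def iter_newc(fobj):
    """Yield (header dict, name, data) for each entry before the trailer."""
    while True:
        raw = read_exact(fobj, HEADER_LEN)
        if raw[:6] != NEWC_MAGIC:
            raise ValueError('not a "new ascii" cpio header')
        hdr = dict((field, int(raw[6 + 8 * i:14 + 8 * i], 16))
                   for i, field in enumerate(FIELDS))
        # the name is NUL-terminated; header + name are padded to 4 bytes
        name = read_exact(fobj, hdr['namesize'])[:-1].decode('utf-8')
        read_exact(fobj, pad(HEADER_LEN + hdr['namesize']))
        if name == 'TRAILER!!!':
            return  # end-of-archive marker
        data = read_exact(fobj, hdr['filesize'])
        read_exact(fobj, pad(hdr['filesize']))  # the data is padded, too
        yield hdr, name, data


def make_entry(name, data, mode=0o100644):
    """Build a single archive entry (only needed for the demo below)."""
    name_b = name.encode('utf-8') + b'\0'
    values = dict.fromkeys(FIELDS, 0)
    values.update(mode=mode, nlink=1, filesize=len(data),
                  namesize=len(name_b))
    hdr = NEWC_MAGIC + ''.join('%08x' % values[f] for f in FIELDS).encode('ascii')
    entry = hdr + name_b + b'\0' * pad(len(hdr) + len(name_b))
    return entry + data + b'\0' * pad(len(data))


# any object with a read(n) method works, no file on disc is required
archive = make_entry('hello.txt', b'hello\n') + make_entry('TRAILER!!!', b'')
entries = [(name, data) for _, name, data in iter_newc(io.BytesIO(archive))]
print(entries)
```

Because iter_newc only calls read(n) on the object it gets, the same loop works for a plain file and for any other file-like object.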

Todo for this week:
* finish the build module
* start working on the package fetching code


[1] https://github.com/openSUSE/osc2/blob/master/osc/util/cpio.py
[2] https://github.com/openSUSE/osc2/blob/master/test/util/test_cpio.py

[gsoc] osc2 client – summary of week 3 and 4

June 19th, 2012 by


here’s a small summary of the 3rd and 4th (coding) weeks. First of all, I
wasn’t able to do much work in weeks 3 and 4, so I’m still working on
the new build module for the osc2 library.
The initial plan was to copy/reuse some of the existing modules from the
(old) “osc” like the cpio [1] and packagequery modules. But I decided to
refactor/rewrite the cpio module for the following reasons:

  • “save” disc space:
    In our scenario we retrieve a cpio archive from the api (which contains
    binary packages for example). The old cpio module expects a filename
    in order to “unpack” the archive – that is the file has to be stored on
    disc first. Consequently approximately 2 * of free disc
    space is needed.
    The new idea is that we pass a file-like object (in our case an object
    which inherits from “AbstractHTTPResponse”) to the cpio module and
    unpack the archive “on the fly” (without storing the http response
    on disc first).
  • have testcases:
    The old cpio module has no testcases (because some time ago I didn’t
    follow the TDD approach;) ). For nearly all modules in osc2 there exist
    testcases (white-box tests), so it would be nice to have some
    testcases for the cpio module, too (theoretically we could add some
    black-box tests (the current methods aren’t really testable, so
    white-box tests aren’t possible)).
  • have a nice pythonic interface:
    The new interface will look like this:

    from cpio import cpio_open
    # let "f" be a file-like object (for instance a http response)
    with cpio_open(fobj=f) as cpio_archive:
        for a_file in cpio_archive:
            # store file (with correct permissions etc.) in os.curdir
            # alternatively it's also possible to read some data (instead of
            # writing it to disc) via a_file.read(len)
            pass  # placeholder so that the loop body is valid python

    We will also support a plain filename in cpio_open.

Currently the cpio module will only support the “new ascii” (ascii SVR4
no CRC) format and regular files (that’s sufficient for our needs). But
it will be possible to simply pass in a class for a different format
(that is no code has to be altered in order to support a new format).

All of this should be finished by the end of this week.
If you have any questions or suggestions please tell me:)


[1] https://github.com/openSUSE/osc/blob/master/osc/util/cpio.py

[gsoc] osc2 client – summary of week 1 and 2

June 4th, 2012 by


first of all I’m happy that I was accepted for GSoC again. The goal of this year’s project is to enhance
the existing osc2 library which was developed during last GSoC. Additionally I’m going to work on a new osc2 client. For the details have a look at the proposal (or just ask:) ).
Here’s a small summary of the first and second (coding) weeks. Unfortunately I was a bit busy with university (as usual…) and I just implemented the missing code for the “Request” class. Now it is possible to accept, decline etc. reviews and requests. Example:

req = Request.find('123')
req.accept(comment='looks good')
# or accept the second review
req = Request.find('42')
review = req.review[1]
req.accept(comment='ok', review=review)

As usual the test driven development approach is used, which worked quite well last year.
Todo for this week:

  • add a build module to the osc2 library which can be used to build a package (basically a wrapper around the “build” script)

Thoughts about using test driven development for my gsoc project

September 11th, 2011 by


as you might know I participated in the GSoC this year. When the coding period
started my mentors and I decided to use the “test driven development” (TDD)
approach to develop the python obs library. In the following I’ll summarize
why I think using this approach was a good idea and how it helped me to write
the code.

  • It helps with designing a class interface
    With TDD you usually write the testcases _before_ the actual code. When
    doing so you already get a feeling if the design of the interface or method
    is practical because you use the interface multiple times in your testcases.
    One of the first coding tasks was to write a class for managing (editing,
    saving etc.) project/package xml metadata. For instance a common use case
    is to add a new repository to the project’s metadata so I wrote a testcase for
    it. The first version looked something like this:

    prj = RemoteProject('foobar')
    repo = prj.add_element('repository', name='openSUSE_11.4')
    repo.add_element('arch', 'x86_64')
    repo.add_element('path', project='openSUSE:11.4', repository='standard')

    I think this doesn’t really look pythonic (but of course this is just a
    matter of taste) so finally I ended up with the following:

    prj = RemoteProject('foobar')
    repo = prj.add_repository(name='openSUSE_11.4')
    repo.add_path(project='openSUSE:11.4', repository='standard')

    (of course the add_* methods aren’t statically coded in the RemoteProject
    class – instead we use an “ElementFactory” which is returned by an overridden
    __getattr__ (for the details have a look at the code:) ))
    Without TDD I probably would have implemented the first version and
    only afterwards would have realized that I didn’t like it…
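
The __getattr__ trick mentioned above can be sketched roughly like this. This is a simplified illustration and not the osc2 implementation: the Node class is made up, and the ElementFactory shown here is just a guess at what such a factory could look like on top of the standard library’s ElementTree.

```python
import xml.etree.ElementTree as ET


class ElementFactory(object):
    """Callable which appends a child element with a fixed tag."""

    def __init__(self, parent, tag):
        self._parent = parent
        self._tag = tag

    def __call__(self, text=None, **attrib):
        elm = ET.SubElement(self._parent.element, self._tag, attrib)
        elm.text = text
        return Node(elm)


class Node(object):
    """Hypothetical wrapper which understands add_<tag>(...) calls."""

    def __init__(self, element):
        self.element = element

    def __getattr__(self, name):
        # __getattr__ is only called if the normal attribute lookup fails,
        # so "element" and real methods are not affected
        if name.startswith('add_'):
            return ElementFactory(self, name[len('add_'):])
        raise AttributeError(name)


prj = Node(ET.Element('project', name='foobar'))
repo = prj.add_repository(name='openSUSE_11.4')
repo.add_path(project='openSUSE:11.4', repository='standard')
repo.add_arch('x86_64')
print(ET.tostring(prj.element, encoding='unicode'))
```

The nice property is that no add_* method has to be written by hand: every tag name automatically gets its own method.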

  • It helps structuring the code
    Let’s consider some “bigger” method which needs quite some logic like
    the wc.package.Package class’ update method (the update method is used to
    update an osc package working copy). Before writing the testcases I started
    to think about how the update method can be structured and what parts can
    reside in their own (private) methods (probably a natural thing which has
    nothing to do with TDD). I ended up with the following rough layout:

    • calculate updateinfo: a method which calculates which files have to be
      updated (in particular it returns an object which encapsulates newly added
      filenames, deleted filenames, modified filenames, unchanged filenames etc.)
    • perform merges: a method which merges the updated remote file with the
      local file
    • perform adds: simply adds the new files to the working copy
    • perform deletes: deletes local files which don’t exist anymore in the
      remote repo

    Then I started to write some testcases for the calculate_updateinfo method
    and implemented it. Next I wrote testcases for the various different update
    scenarios and implemented the methods and so on. It is probably much
    easier to write testcases for many small methods than for a few “big” methods.
    From time to time I realized that I forgot to test some “special cases”, so
    I added a new testcase, fixed the code and ran the testsuite again. The cool
    thing is if the testsuite succeeds (that is the fix doesn’t break any of
    the existing testcases + the newly added testcase succeeds) one gains
    confidence that the fix was “correct”.
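
To make the “calculate updateinfo” step a bit more concrete, here is a small sketch. It assumes that the local and the remote state are given as plain {filename: md5} dicts; the real method works on the working copy’s own data structures and has to handle more cases, so the signature and the UpdateInfo tuple are only assumptions for this example.

```python
from collections import namedtuple

UpdateInfo = namedtuple('UpdateInfo', 'added deleted modified unchanged')


def calculate_updateinfo(local, remote):
    """Compare two {filename: md5} mappings and classify the filenames."""
    common = set(local) & set(remote)
    added = sorted(set(remote) - set(local))      # only exist remotely
    deleted = sorted(set(local) - set(remote))    # vanished from the remote
    modified = sorted(f for f in common if local[f] != remote[f])
    unchanged = sorted(f for f in common if local[f] == remote[f])
    return UpdateInfo(added, deleted, modified, unchanged)


local = {'foo.spec': 'aaa', 'foo.changes': 'bbb', 'old.patch': 'ccc'}
remote = {'foo.spec': 'aaa', 'foo.changes': 'ddd', 'new.patch': 'eee'}
uinfo = calculate_updateinfo(local, remote)
print(uinfo)
```

A small, side-effect free function like this is easy to cover with testcases: each “special case” is just another pair of dicts.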

  • It speeds up the actual coding
    My overall impression is that TDD “speeds” up the actual coding a bit. While writing
    the testcases I also thought about how it could be implemented. So when all testcases
    were done I had a rough blueprint of the method in my mind and “just” transformed
    it into code. For instance it didn’t take much time to write the initial
    version of the calculate_updateinfo method.
    But of course this doesn’t work for all methods. Stuff like the Package class’
    update method took quite some time (and thinking!) even though I already wrote
    some testcases. The main reason was the fact that the update should be
    implemented as a single “transaction” (the goal was that the working copy
    isn’t corrupted if the update was interrupted). As you can see TDD is no
    black magic approach which makes everything easier – thinking is still
  • It helps to avoid useless/unused code paths
    I just wrote the code to comply with the existing testcases – no other
    (optional) features were added. Sometimes I realized that some feature was
    missing. In this case I added another testcase and implemented the missing
    feature. So the rule was whenever a new feature was required a testcase had to
    exist (either a testcase which directly tests the modified method or a testcase
    which tests a method which implicitly calls the modified method).
  • It helps to overcome one’s weaker self
    From time to time I had to write some trivial class or method where I
    thought it wasn’t worth the effort to write testcases for it. A
    prominent example was the wc.package.UnifiedDiff class (its only purpose is
    to do a “svn diff”-like file diff). At the beginning I wanted to start coding
    without writing testcases because I thought it was enough to test its base class
    (that’s the place where the interesting things happen and the rest is just
    Luckily I abandoned this idea and wrote the testcases. It turned out that it
    was a good idea because this “trivial” visualization I had in mind was more
    complicated than I initially thought;)
    What I learned from this example is that it is most likely better to write
    testcases because the class/method might evolve and might get more complicated.

Finally here’s a small statistic about osc2’s current code coverage (generated
with python-nose’s nosetests):

Name                Stmts   Miss  Cover
osc                     1      0   100%
osc.build              79      4    95%
osc.core               19      2    89%
osc.httprequest       180     19    89%
osc.oscargs           145      1    99%
osc.remote            278     11    96%
osc.source             68      2    97%
osc.util                1      0   100%
osc.util.io            85      8    91%
osc.util.xml           23      0   100%
osc.wc                  1      0   100%
osc.wc.base           173     20    88%
osc.wc.convert         71      6    92%
osc.wc.package        792     39    95%
osc.wc.project        397     20    95%
osc.wc.util           387     28    93%

(line numbers and non osc2 modules are removed)

As a conclusion I would say that using the TDD approach was a good idea and
helped a lot. So you might want to give it a try too – it probably won’t harm:)

Last but not least I want to thank my mentors Sascha Peilicke and
Marcus ‘darix’ Rueckert for their time and tremendous help (meetings,
suggestions, interesting links etc.) during the GSoC. Thanks!

[gsoc] osc code cleanup – summary of week 11

August 8th, 2011 by


here’s a small summary of the 11th (coding) week. This week I spent
most of my time working on the wc code.


  • project wc: added commit and update methods
  • lots of wc code refactoring


  • project wc: commit only specific files for a package instead of the
    complete package (the package wc class already supports this)
    (use case: osc ci pkg1/file pkg1/foo pkg2/bar pkg3)
  • convert old working copies to the new format
  • package wc: update: add support to specify stuff like “expand”, “linkrev”
  • project wc: add a revert method (to restore a package wc with state ‘!’)
  • project/package wc: support diff
  • package wc: implement a pull method (does the same as “osc pull”)


[gsoc] osc code cleanup – summary of week 10

August 1st, 2011 by


here’s a small summary of the 10th (coding) week. This week I spent
most of my time working on the package working copy class’
commit method. This also involved some code refactoring.
Basically commit and update are handled as “transactions” and all
relevant transaction data is stored in a single xml file
(pkg/.osc/_transaction/state). This way an update/commit is more or
less an atomic operation.
For more information have a look at [1] (class XMLTransactionState
+ subclasses and commit/update method).
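
The basic idea of such a transaction can be sketched as follows. This is a heavily simplified illustration and not the XMLTransactionState code from [1]: the xml layout (a “transaction” element with a “phase” attribute and “file” children), the “data” directory and all function names are assumptions made for this example.

```python
import os
import shutil
import tempfile
import xml.etree.ElementTree as ET


def trans_paths(wc_dir):
    """Return (transaction dir, state file, staging dir) of a wc."""
    trans_dir = os.path.join(wc_dir, '.osc', '_transaction')
    return (trans_dir, os.path.join(trans_dir, 'state'),
            os.path.join(trans_dir, 'data'))


def write_state(state, phase, filenames):
    """Write the state file via a temporary file and a rename."""
    root = ET.Element('transaction', phase=phase)
    for name in filenames:
        ET.SubElement(root, 'file', name=name)
    ET.ElementTree(root).write(state + '.new')
    os.replace(state + '.new', state)


def finish(wc_dir):
    """Finish or roll back a pending transaction (if there is one)."""
    trans_dir, state, data_dir = trans_paths(wc_dir)
    if not os.path.exists(state):
        shutil.rmtree(trans_dir, ignore_errors=True)
        return False
    root = ET.parse(state).getroot()
    if root.get('phase') == 'commit':
        # all files were staged: move the ones which are still pending
        for elm in root.findall('file'):
            src = os.path.join(data_dir, elm.get('name'))
            if os.path.exists(src):
                os.replace(src, os.path.join(wc_dir, elm.get('name')))
    # phase "prepare": the wc wasn't touched, just drop the staged data
    shutil.rmtree(trans_dir)
    return True


def update(wc_dir, new_files):
    """new_files is a {filename: bytes} dict with the new file contents."""
    trans_dir, state, data_dir = trans_paths(wc_dir)
    os.makedirs(data_dir)
    write_state(state, 'prepare', sorted(new_files))
    for name, content in new_files.items():
        with open(os.path.join(data_dir, name), 'wb') as f:
            f.write(content)
    write_state(state, 'commit', sorted(new_files))
    finish(wc_dir)


wc = tempfile.mkdtemp()
update(wc, {'foo.spec': b'Name: foo\n'})
# simulate an update which was interrupted after the files were staged
trans_dir, state, data_dir = trans_paths(wc)
os.makedirs(data_dir)
with open(os.path.join(data_dir, 'bar.changes'), 'wb') as f:
    f.write(b'- initial version\n')
write_state(state, 'commit', ['bar.changes'])
resumed = finish(wc)
print(sorted(os.listdir(wc)), resumed)
```

Since the state file always says how far the operation got, a later call can either throw the staged data away or move the remaining files into place.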


  • add update and commit methods for the project class


[1] https://gitorious.org/osc2/osc2/blobs/master/osc/wc/package.py

[gsoc] osc code cleanup – summary of week 8

July 17th, 2011 by


here’s a small summary of the 8th (coding) week. This week I spent
most of my time rewriting the working copy code.

  • added support to add and delete packages
  • added some “abstractions” for the tracking file format:
    currently packages and files are tracked in an xml file
  • thought about the package update algorithm. Basically
    it’ll work like this (very simplified version):
    – perform update in a tmpdir (phase 1)
    – if the tmp update finished, copy/rename all files to
    the wc (phase 2)
    If the update is interrupted in phase 1 the wc wasn’t touched
    at all and nothing should be broken.
    If the update is interrupted in phase 2 the wc is _inconsistent_
    but a subsequent “update” call can resume the update and everything
    should be consistent afterwards (in this case only files are


  • implement update + commit algorithm

If everything works as expected most parts of working copy code
cleanup should be finished after this week.


[gsoc] osc code cleanup – summary of week 7

July 10th, 2011 by


here’s a small summary of the 7th (coding) week. This week I spent
most of my time on the project and package classes which manage osc’s
working copies.


  • basic working copy layout
  • checks to detect broken/corrupt working copies
  • locking support (in order to lock a working copy (for instance when
    doing a commit or an update))
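
One common way to implement such a lock is a lock file which is created exclusively. The following sketch only illustrates this technique, it is not the osc2 locking code: the wc_lock function, the WCLockError exception and the “.osc/_lock” filename are made up for the example.

```python
import os
import tempfile
from contextlib import contextmanager


class WCLockError(Exception):
    """Raised if the working copy is already locked."""


@contextmanager
def wc_lock(wc_dir):
    path = os.path.join(wc_dir, '.osc', '_lock')
    try:
        # O_EXCL lets the call fail if the lock file already exists
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise WCLockError('working copy is locked: %s' % wc_dir)
    os.close(fd)
    try:
        yield path
    finally:
        os.unlink(path)  # always release the lock again


wc = tempfile.mkdtemp()
os.mkdir(os.path.join(wc, '.osc'))
with wc_lock(wc) as lock:
    held = os.path.exists(lock)
    try:
        with wc_lock(wc):
            nested = True
    except WCLockError:
        nested = False  # a second lock attempt has to fail
released = not os.path.exists(lock)
print(held, nested, released)
```

A lock file that is left behind by a crashed process has to be removed by hand in this sketch; a real implementation needs a strategy for such stale locks.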


  • add “core” methods like update, commit, diff etc.
  • (auto-) repair methods (to fix broken working copies)


[gsoc] osc code cleanup – summary of week 6

July 4th, 2011 by


here’s a small summary of the 6th (coding) week. Unfortunately I had
to spend more time on university stuff than I expected – that’s why
I didn’t finish the complete todo for this week.
I did some code restructuring and started to work on the class for the
source route.
– rewrite the project and package working copy classes:

  • the new working copy format will be incompatible with the current
  • the basic layout will look like this:
    —> .osc/ (stores prj _and_ pkg metadata)
    —> pkg1
    #       |
    #        —> <files>

    —> pkgN
    #       |
    #        —> <files>
    So all metadata is stored in the prj/.osc dir instead of prj/pkg/.osc
    The advantage is that we can support a complete package
    “restore” (without the need to download the package again):
    cd prj; rm -r pkg; osc revert/restore pkg;
    (that’s possible because the metadata is stored in the prj/.osc
  • to convert old project/package working copies to the new format
    the “osc repairwc” command can be used (at least that’s the plan)

Feedback is always welcome.


[gsoc] osc code cleanup – summary of week 5

June 26th, 2011 by


here’s a small summary of the 5th (coding) week. I’ve spent most
of the time with the url-like argument parser (more information can be
found here and here). Additionally I cleaned up/reworked the
remote file classes (now we have: RORemoteFile and RWRemoteFile).
I also added an AbstractHTTPResponse and HTTPError class to the httprequest
module (the main purpose of the AbstractHTTPResponse is to encapsulate a
“concrete” http response (for instance it can be used as a wrapper around
urllib(2)’s addinfourl class)).
TODO for this week:

  • write a search module in order to find packages, projects, requests etc.
  • maybe we also need something like a source module (mainly to access the
    /source route)
  • think about working copy code cleanup/internal restructuring