[opensuse-buildservice] Thoughts about using test driven development for my gsoc project

11 Sep 2011

Hi,

as you might know I participated in the GSoC this year. When the coding period
started my mentors and I decided to use the "test driven development" (TDD)
approach to develop the python obs library. In the following I'll summarize
why I think using this approach was a good idea and how it helped me to write
the code.

* It helps with designing a class interface
  With TDD you usually write the testcases _before_ the actual code. When
  doing so you already get a feeling if the design of the interface or method
  is practical because you use the interface multiple times in your testcases.
  Example:
  One of the first coding tasks was to write a class for managing (editing,
  saving etc.) project/package xml metadata. For instance, a common use case
  is to add a new repository to the project's metadata, so I wrote a testcase
  for it. The first version looked something like this:

  prj = RemoteProject('foobar')
  repo = prj.add_element('repository', name='openSUSE_11.4')
  repo.add_element('arch', 'x86_64')
  repo.add_element('path', project='openSUSE:11.4', repository='standard')

  I think this doesn't really look pythonic (but of course this is just a
  matter of taste) so finally I ended up with the following:

  prj = RemoteProject('foobar')
  repo = prj.add_repository(name='openSUSE_11.4')
  repo.add_arch('x86_64')
  repo.add_path(project='openSUSE:11.4', repository='standard')

  (of course the add_* methods aren't statically coded in the RemoteProject
  class – instead we use an "ElementFactory" which is returned by an overridden
  __getattr__ (for the details have a look at the code :) ))
  Without TDD I probably would have implemented the first version and
  realized only afterwards that I didn’t like it...
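
To illustrate the __getattr__ idea, here is a minimal sketch. It is not the
actual osc2 code: the Node base class, the ElementFactory constructor
arguments and the use of xml.etree.ElementTree are assumptions made for this
example only.

```python
import xml.etree.ElementTree as ET


class ElementFactory:
    """Callable that appends a child element with a fixed tag to a node."""

    def __init__(self, parent, tag):
        self._parent = parent
        self._tag = tag

    def __call__(self, text=None, **attrs):
        child = ET.SubElement(self._parent.element, self._tag, attrs)
        if text is not None:
            child.text = text
        return Node(child)


class Node:
    """Wraps an XML element and resolves add_<tag> lookups dynamically."""

    def __init__(self, element):
        self.element = element

    def __getattr__(self, name):
        # __getattr__ is only consulted when the normal lookup fails,
        # so real attributes and methods are not affected.
        if name.startswith('add_'):
            return ElementFactory(self, name[len('add_'):])
        raise AttributeError(name)


class RemoteProject(Node):
    """Hypothetical stand-in: only builds the XML, no remote access."""

    def __init__(self, name):
        super().__init__(ET.Element('project', name=name))


prj = RemoteProject('foobar')
repo = prj.add_repository(name='openSUSE_11.4')
repo.add_arch('x86_64')
repo.add_path(project='openSUSE:11.4', repository='standard')
print(ET.tostring(prj.element, encoding='unicode'))
```

The point of the design is that no add_* method has to be written by hand:
any add_<tag> lookup yields a factory for an element with that tag.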

* It helps structuring the code
  Let's consider some "bigger" method which needs quite some logic like
  the wc.package.Package class' update method (the update method is used to
  update an osc package working copy). Before writing the testcases I started
  to think about how the update method can be structured and what parts can
  reside in their own (private) methods (probably a natural thing which has
  nothing to do with TDD). I ended up with the following rough layout:
      o calculate updateinfo: a method which calculates which files have to be
        updated (in particular it returns an object which encapsulates newly
        added filenames, deleted filenames, modified filenames, unchanged
        filenames etc.)
      o perform merges: a method which merges the updated remote file with the
        local file
      o perform adds: simply adds the new files to the working copy
      o perform deletes: deletes local files which don’t exist anymore in the
        remote repo

  Then I started to write some testcases for the calculate_updateinfo method
  and implemented it. Next I wrote testcases for the various different update
  scenarios and implemented the methods and so on. It is probably much
  easier to write testcases for many small methods than for a few "big"
  methods.
  From time to time I realized that I had forgotten to test some "special
  cases", so I added a new testcase, fixed the code and ran the testsuite
  again. The cool thing is that if the testsuite succeeds (that is, the fix
  doesn’t break any of the existing testcases and the newly added testcase
  succeeds), one gains confidence that the fix was "correct".
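
The "calculate updateinfo" step from the layout above can be sketched together
with its testcases roughly like this. This is a simplified illustration, not
the osc2 implementation: the UpdateInfo type, the function signature and the
{filename: md5} dictionaries are assumptions, and the real object tracks more
states than the four shown here.

```python
import unittest
from collections import namedtuple

# Hypothetical result type for this sketch.
UpdateInfo = namedtuple('UpdateInfo', 'added deleted modified unchanged')


def calculate_updateinfo(local, remote):
    """Classify files by comparing two {filename: md5} mappings."""
    added = sorted(f for f in remote if f not in local)
    deleted = sorted(f for f in local if f not in remote)
    common = sorted(f for f in local if f in remote)
    modified = [f for f in common if local[f] != remote[f]]
    unchanged = [f for f in common if local[f] == remote[f]]
    return UpdateInfo(added, deleted, modified, unchanged)


class TestCalculateUpdateInfo(unittest.TestCase):
    def test_mixed_changes(self):
        local = {'foo': 'aaa', 'bar': 'bbb', 'old': 'ccc'}
        remote = {'foo': 'aaa', 'bar': 'xxx', 'new': 'ddd'}
        uinfo = calculate_updateinfo(local, remote)
        self.assertEqual(uinfo.added, ['new'])
        self.assertEqual(uinfo.deleted, ['old'])
        self.assertEqual(uinfo.modified, ['bar'])
        self.assertEqual(uinfo.unchanged, ['foo'])

    def test_nothing_to_update(self):
        # A "special case" of the kind that is easy to forget at first.
        files = {'foo': 'aaa'}
        uinfo = calculate_updateinfo(files, dict(files))
        self.assertEqual(uinfo, UpdateInfo([], [], [], ['foo']))
```

Saved to a file, the testcases can be run with "python -m unittest <file>".
Because the method only computes a result and touches no files, each scenario
fits into a few lines of test code.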

* It speeds up the actual coding
  My overall impression is that TDD "speeds" up the actual coding a bit.
  While writing the testcases I also thought about how the code could be
  implemented. So when all testcases were done I had a rough blueprint of the
  method in my mind and "just" transformed it into code. For instance it
  didn’t take much time to write the initial version of the
  calculate_updateinfo method.
  But of course this doesn’t work for all methods. Stuff like the Package
  class' update method took quite some time (and thinking!) even though I
  had already written some testcases. The main reason was that the update
  should be implemented as a single "transaction" (the goal was that the
  working copy isn't corrupted if the update is interrupted). As you can see
  TDD is no black magic approach which makes everything easier – thinking is
  still required :)
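
The post does not describe how the update "transaction" is implemented. One
common way to make a multi-step update resumable is a small journal file that
lists the work still to be done. The sketch below shows only that general
idea; the UpdateJournal class, the '.update_journal' file name and the
update() helper are all made up for illustration.

```python
import json
import os


class UpdateJournal:
    """Records which files of an update are still pending."""

    def __init__(self, wc_dir):
        # '.update_journal' is a made-up file name for this sketch.
        self._path = os.path.join(wc_dir, '.update_journal')

    def _write(self, pending):
        # Write to a temporary file first and then swap it in with
        # os.replace, so an interrupted write does not leave a
        # half-written journal behind.
        tmp = self._path + '.tmp'
        with open(tmp, 'w') as f:
            json.dump(pending, f)
        os.replace(tmp, self._path)

    def active(self):
        return os.path.exists(self._path)

    def begin(self, filenames):
        self._write(list(filenames))

    def pending(self):
        with open(self._path) as f:
            return json.load(f)

    def done(self, filename):
        self._write([f for f in self.pending() if f != filename])

    def commit(self):
        os.remove(self._path)


def update(wc_dir, filenames, apply_change):
    """Call apply_change(filename) per file, resuming an interrupted run.

    apply_change should be safe to repeat, because a crash between the
    change and the journal write causes that file to be processed again.
    """
    journal = UpdateJournal(wc_dir)
    if not journal.active():
        journal.begin(filenames)
    for filename in journal.pending():
        apply_change(filename)
        journal.done(filename)
    journal.commit()
```

If update() is interrupted, the journal still exists, and the next call
continues with the files that were not finished instead of starting over.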

* It helps to avoid useless/unused code paths
  I just wrote the code to comply with the existing testcases – no other
  (optional) features were added. Sometimes I realized that some feature was
  missing. In this case I added another testcase and implemented the missing
  feature. So the rule was whenever a new feature was required a testcase had
  to exist (either a testcase which directly tests the modified method or a
  testcase which tests a method which implicitly calls the modified method).

* It helps to overcome one's weaker self
  From time to time I had to write some trivial class or method where I
  thought it wasn’t worth the effort to write testcases for it. A prominent
  example was the wc.package.UnifiedDiff class (its only purpose is to do a
  "svn diff"-like file diff). At the beginning I wanted to start coding
  without writing testcases because I thought it was enough to test its base
  class (that’s the place where the interesting things happen and the rest is
  just "presentation/visualization").
  Luckily I abandoned this idea and wrote the testcases. It turned out that it
  was a good idea because this "trivial" visualization I had in mind was more
  complicated than I initially thought;)
  What I learned from this example is that it is most likely better to write
  testcases because the class/method might evolve and might get more
  complicated.

Finally here's a small statistic about osc2's current code coverage (generated
with python-nose's nosetests):

Name                Stmts   Miss  Cover
---------------------------------------
osc                     1      0   100%
osc.build              79      4    95%
osc.core               19      2    89%
osc.httprequest       180     19    89%
osc.oscargs           145      1    99%
osc.remote            278     11    96%
osc.source             68      2    97%
osc.util                1      0   100%
osc.util.io            85      8    91%
osc.util.xml           23      0   100%
osc.wc                  1      0   100%
osc.wc.base           173     20    88%
osc.wc.convert         71      6    92%
osc.wc.package        792     39    95%
osc.wc.project        397     20    95%
osc.wc.util           387     28    93%

(line numbers and non-osc2 modules are removed)

As a conclusion I would say that using the TDD approach was a good idea and
helped a lot. So you might want to give it a try too – it probably won't hurt :)

Last but not least I want to thank my mentors Sascha Peilicke and
Marcus 'darix' Rueckert for their time and tremendous help (meetings,
suggestions, interesting links etc.) during the GSoC. Thanks!

Marcus

Marcus Hüwe
