Hi,
as you might know, I participated in GSoC this year. When the coding period
started, my mentors and I decided to use the “test driven development” (TDD)
approach to develop the Python OBS library. Below I summarize why I think
this approach was a good idea and how it helped me write the code.
- It helps with designing a class interface
With TDD you usually write the testcases _before_ the actual code. Doing so
gives you an early feeling for whether the design of an interface or method
is practical, because you use the interface several times in your testcases.
Example:
One of the first coding tasks was to write a class for managing (editing,
saving etc.) project/package xml metadata. A common use case is adding a new
repository to the project’s metadata, so I wrote a testcase for it. The first
version looked something like this:
prj = RemoteProject('foobar')
repo = prj.add_element('repository', name='openSUSE_11.4')
repo.add_element('arch', 'x86_64')
repo.add_element('path', project='openSUSE:11.4', repository='standard')
I think this doesn’t really look pythonic (but of course this is just a
matter of taste), so I finally ended up with the following:
prj = RemoteProject('foobar')
repo = prj.add_repository(name='openSUSE_11.4')
repo.add_arch('x86_64')
repo.add_path(project='openSUSE:11.4', repository='standard')
(Of course the add_* methods aren’t statically coded in the RemoteProject
class; instead we use an “ElementFactory” which is returned by an overridden
__getattr__. For the details have a look at the code :) )
Without TDD I probably would have implemented the first version and realized
only afterwards that I didn’t like it…
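To illustrate the idea behind the add_* methods, here is a minimal sketch of
a factory returned from an overridden __getattr__. It is not the actual osc2
code: the Node class, the use of xml.etree.ElementTree and the constructor
signatures are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET


class ElementFactory:
    """Callable that appends a child element with a fixed tag to a parent."""

    def __init__(self, parent, tag):
        self._parent = parent
        self._tag = tag

    def __call__(self, text=None, **attrs):
        child = ET.SubElement(self._parent.xml, self._tag, attrs)
        if text is not None:
            child.text = text
        return Node(child)


class Node:
    """Hypothetical wrapper that resolves add_<tag> calls dynamically."""

    def __init__(self, xml):
        self.xml = xml

    def __getattr__(self, name):
        # Only invoked when the normal attribute lookup fails.
        if name.startswith('add_') and len(name) > 4:
            return ElementFactory(self, name[4:])
        raise AttributeError(name)


class RemoteProject(Node):
    def __init__(self, name):
        super().__init__(ET.Element('project', name=name))


prj = RemoteProject('foobar')
repo = prj.add_repository(name='openSUSE_11.4')
repo.add_arch('x86_64')
repo.add_path(project='openSUSE:11.4', repository='standard')
```

Because the tag name is taken from the attribute name, no add_* method has to
be written by hand; the price is that a typo such as add_repositry silently
creates an element with the wrong tag.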
- It helps structuring the code
Let’s consider a “bigger” method which needs quite some logic, like the
update method of the wc.package.Package class (it updates an osc package
working copy). Before writing the testcases I started to think about how the
update method could be structured and which parts could live in their own
(private) methods (probably a natural thing which has nothing to do with
TDD). I ended up with the following rough layout:
- calculate updateinfo: a method which calculates which files have to be
updated (in particular it returns an object which encapsulates newly added
filenames, deleted filenames, modified filenames, unchanged filenames etc.)
- perform merges: a method which merges the updated remote file with the
local file
- perform adds: simply adds the new files to the working copy
- perform deletes: deletes local files which don’t exist anymore in the
remote repo
Then I wrote some testcases for the calculate_updateinfo method and
implemented it. Next I wrote testcases for the various update scenarios and
implemented the remaining methods, and so on. It is probably much easier to
write testcases for many small methods than for a few “big” methods.
From time to time I realized that I had forgotten to test some “special
case”, so I added a new testcase, fixed the code and ran the testsuite again.
The cool thing is that if the testsuite succeeds (that is, the fix doesn’t
break any existing testcase and the newly added testcase succeeds), one gains
confidence that the fix was “correct”.
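As a rough illustration of how small such a method and its testcases can be,
here is a sketch of a calculate_updateinfo-like function. The UpdateInfo
type, the function signature and the {filename: md5} input format are
assumptions for this example and cover only the four categories named above;
the real method may differ.

```python
import unittest
from collections import namedtuple

# Hypothetical result type; the fields mirror the categories listed above.
UpdateInfo = namedtuple('UpdateInfo', 'added deleted modified unchanged')


def calculate_updateinfo(local, remote):
    """Compare two {filename: md5} mappings.

    local describes the files tracked by the working copy, remote the
    files of the remote package.
    """
    added = sorted(f for f in remote if f not in local)
    deleted = sorted(f for f in local if f not in remote)
    common = [f for f in local if f in remote]
    modified = sorted(f for f in common if local[f] != remote[f])
    unchanged = sorted(f for f in common if local[f] == remote[f])
    return UpdateInfo(added, deleted, modified, unchanged)


class TestCalculateUpdateinfo(unittest.TestCase):
    def test_all_categories(self):
        local = {'foo': 'aaa', 'bar': 'bbb', 'old': 'ccc'}
        remote = {'foo': 'aaa', 'bar': 'xyz', 'new': 'ddd'}
        uinfo = calculate_updateinfo(local, remote)
        self.assertEqual(uinfo.added, ['new'])
        self.assertEqual(uinfo.deleted, ['old'])
        self.assertEqual(uinfo.modified, ['bar'])
        self.assertEqual(uinfo.unchanged, ['foo'])

    def test_nothing_to_update(self):
        files = {'foo': 'aaa'}
        uinfo = calculate_updateinfo(files, dict(files))
        self.assertEqual(uinfo, UpdateInfo([], [], [], ['foo']))
```

Saved as a module, the testcases can be run with python -m unittest. A
forgotten “special case” then simply becomes one more test_* method.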
- It speeds up the actual coding
My overall impression is that TDD “speeds” up the actual coding a bit. While
writing the testcases I also thought about how the method could be
implemented. So when all testcases were done I had a rough blueprint of the
method in my mind and “just” transformed it into code. For instance it didn’t
take much time to write the initial version of the calculate_updateinfo
method.
But of course this doesn’t work for all methods. The Package class’ update
method, for example, took quite some time (and thinking!) even though I had
already written some testcases. The main reason was that the update should be
implemented as a single “transaction” (the goal was that the working copy
isn’t corrupted if the update is interrupted). As you can see, TDD is no
black magic which makes everything easier; thinking is still required :)
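To make the “transaction” idea more concrete, here is one possible way to get
an interruptible update: stage the new files, record the plan in a journal,
and only then touch the working copy, so that a later run can finish an
interrupted update. This is a sketch of a general journal technique, not
necessarily how osc2 does it; the class name, the file names and the
in-memory file contents are assumptions for the example.

```python
import json
import os
import tempfile


class UpdateTransaction:
    """Journal-based update: plan first, then apply; resumable after a crash."""

    def __init__(self, wc_dir):
        self.wc_dir = wc_dir
        # Hypothetical names for the bookkeeping files.
        self.journal = os.path.join(wc_dir, '.update_journal')
        self.stage = os.path.join(wc_dir, '.update_stage')

    def pending(self):
        """True if an earlier update was started but not finished."""
        return os.path.exists(self.journal)

    def begin(self, writes, deletes):
        """Stage new file contents and record the plan.

        writes maps filenames to new contents, deletes lists filenames to
        remove. No tracked file is touched in this phase.
        """
        os.makedirs(self.stage, exist_ok=True)
        for name, data in writes.items():
            with open(os.path.join(self.stage, name), 'w') as f:
                f.write(data)
        plan = {'writes': sorted(writes), 'deletes': sorted(deletes)}
        tmp = self.journal + '.tmp'
        with open(tmp, 'w') as f:
            json.dump(plan, f)
        # Rename in one step, so the journal is either complete or absent.
        os.replace(tmp, self.journal)

    def commit(self):
        """Apply the recorded plan; safe to call again after an interruption."""
        with open(self.journal) as f:
            plan = json.load(f)
        for name in plan['writes']:
            staged = os.path.join(self.stage, name)
            # A missing staged file means an earlier attempt already moved it.
            if os.path.exists(staged):
                os.replace(staged, os.path.join(self.wc_dir, name))
        for name in plan['deletes']:
            target = os.path.join(self.wc_dir, name)
            if os.path.exists(target):
                os.remove(target)
        if os.path.isdir(self.stage):
            os.rmdir(self.stage)
        os.remove(self.journal)


# Usage: simulate an update that is interrupted after the first file.
wc = tempfile.mkdtemp()
for name, data in {'foo': 'old foo\n', 'obsolete': 'x\n'}.items():
    with open(os.path.join(wc, name), 'w') as f:
        f.write(data)

txn = UpdateTransaction(wc)
txn.begin({'foo': 'new foo\n', 'bar': 'new bar\n'}, ['obsolete'])
# "Crash" after one step: only bar has been moved into place so far.
os.replace(os.path.join(txn.stage, 'bar'), os.path.join(wc, 'bar'))

# On the next run the leftover journal tells us to finish the update.
if txn.pending():
    txn.commit()
```

Between begin and the end of commit the working copy can be in a mixed
state, but the journal always says how to complete the update. A real
implementation would also have to care about flushing data to disk and
about local modifications, which the sketch ignores.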
- It helps to avoid useless/unused code paths
I only wrote the code needed to satisfy the existing testcases; no other
(optional) features were added. Sometimes I realized that some feature was
missing. In this case I added another testcase and implemented the missing
feature. So the rule was: whenever a new feature was required, a testcase had
to exist (either a testcase which directly tests the modified method or a
testcase which tests a method which implicitly calls the modified method).
- It helps to overcome one’s weaker self
From time to time I had to write some trivial class or method for which I
thought it wasn’t worth the effort to write testcases. A prominent example
was the wc.package.UnifiedDiff class (its only purpose is to do a “svn
diff”-like file diff). At the beginning I wanted to start coding without
writing testcases because I thought it was enough to test its base class
(that’s the place where the interesting things happen and the rest is just
“presentation/visualization”).
Luckily I abandoned this idea and wrote the testcases. It turned out that it
was a good idea because this “trivial” visualization I had in mind was more
complicated than I initially thought;)
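To give an idea of what testcases for such a “presentation” layer can look
like, here is a small sketch built on Python’s difflib. It is not the
UnifiedDiff class from osc2; the function name, its signature and the labels
in the header lines are assumptions for this example.

```python
import difflib
import unittest


def unified_diff(filename, old_text, new_text):
    """Return a unified diff of two versions of a file as one string."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=filename + ' (original)',
        tofile=filename + ' (working copy)',
    )
    return ''.join(lines)


class TestUnifiedDiff(unittest.TestCase):
    def test_unchanged_file_has_empty_diff(self):
        self.assertEqual(unified_diff('foo', 'a\nb\n', 'a\nb\n'), '')

    def test_modified_line(self):
        diff = unified_diff('foo', 'a\nb\nc\n', 'a\nB\nc\n')
        self.assertTrue(diff.startswith('--- foo (original)\n'))
        self.assertIn('+++ foo (working copy)\n', diff)
        self.assertIn('-b\n', diff)
        self.assertIn('+B\n', diff)
```

Even here the edge cases show up quickly: difflib, for example, does not emit
a “\ No newline at end of file” marker, so files without a trailing newline
need extra handling before the output looks like that of “svn diff”.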
What I learned from this example is that it is most likely better to write
testcases because the class/method might evolve and might get more complicated.
Finally here’s a small statistic about osc2’s current code coverage (generated
with python-nose’s nosetests):
Name               Stmts   Miss  Cover
--------------------------------------
osc                    1      0   100%
osc.build             79      4    95%
osc.core              19      2    89%
osc.httprequest      180     19    89%
osc.oscargs          145      1    99%
osc.remote           278     11    96%
osc.source            68      2    97%
osc.util               1      0   100%
osc.util.io           85      8    91%
osc.util.xml          23      0   100%
osc.wc                 1      0   100%
osc.wc.base          173     20    88%
osc.wc.convert        71      6    92%
osc.wc.package       792     39    95%
osc.wc.project       397     20    95%
osc.wc.util          387     28    93%
(line numbers and non-osc2 modules are removed)
In conclusion I would say that using the TDD approach was a good idea and
helped a lot. So you might want to give it a try too; it probably won’t hurt :)
Last but not least I want to thank my mentors Sascha Peilicke and
Marcus ‘darix’ Rueckert for their time and tremendous help (meetings,
suggestions, interesting links etc.) during the GSoC. Thanks!