Merging SVN Repositories Explained

October 30th, 2010

Adding files to an SVN server usually takes seconds. Combining several independent SVN repositories, however, is not trivial, especially if you want to preserve the history.

The doc team had three different, independent repositories on BerliOS (opensuse-ha-doc, opensuse-docmaker, and opensuse-lfl), all holding separate information. This was a bit silly, so my task was to consolidate them into opensuse-doc while keeping all history.

An SVN history is nice because you can see what you or others have done with the files. Losing the history is the same as starting from scratch, which can sometimes be a good idea. In our case, the doc team wanted to preserve the history as we use it very often. So I had to find a way to merge different SVN repositories into one. I came up with a solution consisting of the following steps:

  1. Create a test repository
  2. Create the dump files
  3. Filter the dump files and extract trunk only
  4. Rename trunk content
  5. Edit the dump files
  6. Load dump files into test repository

Several tools are needed: svnadmin and svndumpfilter are already available from the subversion package. Additionally, Darix recommended svndumptool from devel:tools:scm:svn (thanks Darix!), a very convenient tool when dealing with SVN dumps.
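Before starting, it can help to check that all three tools are actually installed. The following is a minimal sketch, not part of the original procedure; the function name missing_tools is made up for illustration:

```shell
# Print the name of every tool from the argument list that is not on PATH.
# An empty output means everything is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example call for the tools used in this article:
# missing_tools svnadmin svndumpfilter svndumptool
```

If the call prints svndumptool, install it first; the other two come with the subversion package.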

Apart from this, all three SVN repositories consist of the usual trunk, branches, and tags directories.

Step 1: Creating Test Repository

Before I got my hands dirty, I needed a test repository. Luckily, the subversion package contains everything I need. In my case, I used the hotcopy command to create this test repository. I logged in to BerliOS and made a hotcopy of my target SVN repository:

# ssh svn.berlios.de
# svnadmin hotcopy /svnroot/repos/opensuse-doc opensuse-doc

I did the same for my other repositories as well, just for safety reasons:

# svnadmin hotcopy /svnroot/repos/opensuse-lfl opensuse-lfl
# svnadmin hotcopy /svnroot/repos/opensuse-docmaker opensuse-docmaker
# svnadmin hotcopy /svnroot/repos/opensuse-ha-doc opensuse-ha-doc

Later, I used the opensuse-doc hotcopy to test my modifications before applying them to the production SVN repository. This way I could make sure everything was correct. Of course, you have to be sure that nobody commits to your repositories in the meantime; any commits made after the hotcopy or dump are not included.
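One way to make sure nobody commits during the migration is a temporary pre-commit hook that rejects every commit. This is only a sketch under assumptions: it is not what was done here, it requires write access to the repository's hooks directory, and the function name freeze_repo is hypothetical. Subversion runs hooks/pre-commit for normal commits and aborts the commit when the hook exits with a non-zero status; svnadmin load does not run hooks unless explicitly asked to.

```shell
# Install a pre-commit hook that rejects all commits.
# $1 = path to the repository on the server (hypothetical helper)
freeze_repo() {
    hook="$1/hooks/pre-commit"
    # Never overwrite a hook that is already in place.
    if [ -e "$hook" ]; then
        echo "refusing to overwrite existing $hook" >&2
        return 1
    fi
    cat > "$hook" <<'EOF'
#!/bin/sh
echo "Repository is frozen for migration; please try again later." >&2
exit 1
EOF
    chmod +x "$hook"
}

# Example call; delete the hook file again once the migration is done:
# freeze_repo /svnroot/repos/opensuse-lfl
```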

Step 2: Creating Dump Files

Dump files contain everything that is inside a repository, including SVN properties. However, they don’t contain the repository’s configuration or hooks. If you want to keep them, you’ll have to save them manually.
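Saving the configuration and hooks manually can be as simple as archiving the conf and hooks subdirectories of the repository. A minimal sketch, assuming the standard repository layout; the function name and the archive name are made up for illustration:

```shell
# Archive the conf/ and hooks/ directories of a repository.
# $1 = repository path, $2 = tarball to create (hypothetical helper)
backup_repo_extras() {
    tar -czf "$2" -C "$1" conf hooks
}

# Example call:
# backup_repo_extras /svnroot/repos/opensuse-lfl opensuse-lfl-extras.tar.gz
```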

It’s very easy to create a dump file. In BerliOS, everything lives under /svnroot/repos/PROJECT:

# svnadmin dump /svnroot/repos/opensuse-lfl > opensuse-lfl.dump
# svnadmin dump /svnroot/repos/opensuse-docmaker > opensuse-docmaker.dump
# svnadmin dump /svnroot/repos/opensuse-ha-doc > opensuse-ha-doc.dump

The above lines create a dump file for each of my SVN repositories. As the BerliOS server doesn’t have the svndumptool command, I had to copy them to my own machine:

toms@earth:~ > scp shell.berlios.de:~/opensuse-*.dump .

Step 3: Extracting trunk

In most SVN repositories, tags and branches contain information that was at some point copied (for tags) or copied and later modified (for branches). I haven’t found a satisfying way to take these two directories into account with svndumptool. So I decided to concentrate on trunk and adjust tags and branches manually later. This makes dealing with the dump files easier.

To extract only parts of a dump file, the svndumpfilter command is the right tool. With the needed options and subcommands, I used:

toms@earth:~ > svndumpfilter --renumber-revs --drop-empty-revs \
    include trunk < opensuse-docmaker.dump > docmaker-trunk.dump
toms@earth:~ > svndumpfilter --renumber-revs --drop-empty-revs \
    include trunk < opensuse-ha-doc.dump | \
    svndumpfilter exclude trunk/package > ha-doc-trunk.dump
toms@earth:~ > svndumpfilter --renumber-revs --drop-empty-revs \
    include trunk < opensuse-lfl.dump > lfl-trunk.dump

The commands are not exactly the same, as the SVN repositories are slightly different. For example, the opensuse-ha-doc repo contained a trunk/package directory which I didn’t want. Therefore you see the exclude subcommand of svndumpfilter. The resulting dump files contain only trunk, nothing else.
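To double-check that a filtered dump really contains only trunk, you can list the node paths recorded in it. The following is a minimal sketch, not part of the original procedure. The function name paths_outside_trunk is made up, and it assumes that lines starting with "Node-path: " only occur in node headers (a file whose content happens to contain such a line would show up as a false positive):

```shell
# Print every node path in a dump file that is neither trunk nor below trunk.
# An empty output means the dump contains only trunk.
# $1 = dump file (hypothetical helper)
paths_outside_trunk() {
    # -a makes grep treat the dump as text even if it contains binary data
    LC_ALL=C grep -a '^Node-path: ' "$1" \
        | sed 's/^Node-path: //' \
        | grep -v -e '^trunk$' -e '^trunk/' \
        | sort -u
}

# Example call:
# paths_outside_trunk ha-doc-trunk.dump
```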

Step 4: Renaming trunk Directory

The last step created dump files which contain only the trunk directory. These dump files could already be loaded into the target SVN repository. However, this is only partly successful: after loading, you have to rename and move subdirectories around, which is inconvenient. To avoid this cumbersome task, I used the svndumptool command, which does the job directly on the dump file!

For example, the former docmaker repository has to appear under trunk/tools/docmaker, not directly under trunk. This can be done with the merge subcommand of svndumptool:

toms@earth:~ > svndumptool merge -o docmaker-trunk-rename.dump \
   -i docmaker-trunk.dump \
   -s '^trunk' 'trunk/tools/docmaker' \
   -d trunk/tools/docmaker

The -d option creates the trunk/tools/docmaker directory in case it doesn’t exist in your target. The -i and -o options name the input and output dump files, and -s takes a regular expression matching the source path (^trunk) and its replacement (trunk/tools/docmaker). The other repositories are similar:

toms@earth:~ > svndumptool merge -o ha-doc-trunk-rename.dump  \
  -i ha-doc-trunk.dump \
  -d trunk/documents/ha/ \
  -s '^trunk/books/en' 'trunk/documents/ha/en' \
  -s '^trunk/books' 'trunk/documents'
toms@earth:~ > svndumptool merge -o lfl-trunk-rename.dump  \
  -i lfl-trunk.dump \
  -d trunk/documents/lfl/ \
  -s '^trunk/books/en' 'trunk/documents/lfl/en' \
  -s '^trunk/books' 'trunk/documents'

After this step I had three files named *-trunk-rename.dump which contain the renamed directories.
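When several -s rules are combined, it is easy to lose track of where a path ends up. The sketch below previews the two ha-doc rules with sed on a plain list of paths. It is only an approximation: sed's regular expressions are not identical to the ones svndumptool uses, but for simple prefixes like these they should behave the same. The function name preview_ha_rename is made up for illustration.

```shell
# Apply the two ha-doc rename rules to paths read from stdin.
# sed applies both expressions in order, the more specific one first.
preview_ha_rename() {
    sed -e 's|^trunk/books/en|trunk/documents/ha/en|' \
        -e 's|^trunk/books|trunk/documents|'
}

# Example: preview the renames for all paths in the filtered dump
# grep -a '^Node-path: ' ha-doc-trunk.dump | sed 's/^Node-path: //' \
#     | sort -u | preview_ha_rename
```

Paths that match neither rule, such as trunk itself, pass through unchanged.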

Step 5: Editing Dump File

This step is a bit tricky and I haven’t found a better solution yet. Probably this can also be done with cat and sed magic, but I preferred vi. The task is to list the contents of the dump files and decide which directory entries need to be deleted. This step is absolutely necessary to avoid SVN errors (something like “directory/file already exists”) when loading the dump file into the repository.

  1. Get an overview of the contents:
    toms@earth:~ > svndumptool ls ha-doc-trunk-rename.dump
    /trunk
    ...
    toms@earth:~ > svndumptool ls docmaker-trunk-rename.dump
    /trunk
    /trunk/documents

    Usually this is /trunk and probably some other directories (depending on the structure).

  2. Modify the dump file(s) and remove any directories which already exist in the target SVN repository.
    The renamed dump files from the last step each contain a single trunk “node” which has to be removed. The same applies to trunk/documents. Both already exist on the BerliOS server. Manually edit the dump files and remove the following lines:

    Node-path: trunk
    Node-action: add
    Node-kind: dir
    Prop-content-length: 10
    Content-length: 10
    PROPS-END
    ...
    Node-path: trunk/documents
    Node-action: add
    Node-kind: dir
    Prop-content-length: 10
    Content-length: 10
      [[ Some other content could be available here ]]
    PROPS-END
  3. Save the dump file

Be careful! It is very important not to destroy any structure in the dump file!

I’ve checked the result with “svndumptool ls DUMPFILE”. No lonely trunk directory should be shown anymore. To be on the safe side, I’ve checked the two files with svndumptool check -A DUMPFILE, too.
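For the curious, the manual edit from this step could probably be scripted. The following awk sketch is untested against real repositories and is not what I used. It assumes a dump that awk can process line by line (binary file contents may confuse some awk implementations), and it only drops a record if it is a plain directory add without properties (Prop-content-length: 10) and without a copy source. The function name drop_dir_add is made up. Run it on a copy and verify the result with svndumptool check.

```shell
# Drop the "add directory" record for one path from a dump file.
# $1 = node path (e.g. trunk), $2 = input dump, $3 = output dump
drop_dir_add() {
    LC_ALL=C awk -v target="Node-path: $1" '
        # Print the buffered header lines unchanged.
        function emit(   i) { for (i = 1; i <= n; i++) print buf[i]; n = 0 }

        BEGIN { state = 0; n = 0 }

        # state 0: copy lines through until the target node header shows up
        state == 0 {
            if ($0 == target) { state = 1; n = 0; dir = add = bare = copy = 0; buf[++n] = $0 }
            else print
            next
        }
        # state 1: buffer the headers of the candidate record
        state == 1 {
            buf[++n] = $0
            if ($0 == "Node-kind: dir") dir = 1
            else if ($0 == "Node-action: add") add = 1
            else if ($0 == "Prop-content-length: 10") bare = 1
            else if ($0 ~ /^Node-copyfrom-/) copy = 1
            else if ($0 == "") {
                # A blank line ends the headers: drop only plain directory adds.
                if (dir && add && bare && !copy) state = 2
                else { emit(); state = 0 }
            }
            next
        }
        # state 2: the record body must be a lone PROPS-END line
        state == 2 {
            if ($0 == "PROPS-END") n = 0
            else { emit(); print }
            state = 0
            next
        }
        END { emit() }
    ' "$2" > "$3"
}

# Example calls, one pass per directory that already exists in the target:
# drop_dir_add trunk ha-doc-trunk-rename.dump step1.dump
# drop_dir_add trunk/documents step1.dump ha-doc-trunk-edited.dump
```

Records that carry properties or a copy source are passed through untouched, so those still need a look in the editor.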

Step 6: Loading Dump Files into Test Repository

The last step created the dump files with the correct structure. As this was done on my own machine, I had to copy the files back to BerliOS (remember to use the shell.berlios.de server):

toms@earth:~ > scp *-rename.dump  shell.berlios.de:

It’s almost done! I logged in to BerliOS and loaded the dump files into my hotcopy test repository (see step 1):

# svnadmin load opensuse-doc  < docmaker-trunk-rename.dump
# svnadmin load opensuse-doc  < ha-doc-trunk-rename.dump
# svnadmin load opensuse-doc  < lfl-trunk-rename.dump

After playing around a bit, everything was successful. So I repeated the load, but replaced the opensuse-doc hotcopy with the production path /svnroot/repos/opensuse-doc.
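Since the same load has to be run twice (first against the hotcopy, then against the real repository), a small wrapper that stops at the first failure can save some typing. A minimal sketch, not what I actually ran; the function name load_all and the SVNADMIN override variable are made up for illustration:

```shell
# Load several dump files into one repository, stopping at the first error.
# $1 = target repository, remaining arguments = dump files in load order
load_all() {
    repo=$1
    shift
    for dump in "$@"; do
        echo "Loading $dump into $repo" >&2
        # SVNADMIN can point to another binary; defaults to svnadmin on PATH
        "${SVNADMIN:-svnadmin}" load "$repo" < "$dump" || {
            echo "Loading $dump failed; stopping" >&2
            return 1
        }
    done
}

# Example call against the hotcopy from step 1:
# load_all opensuse-doc docmaker-trunk-rename.dump \
#     ha-doc-trunk-rename.dump lfl-trunk-rename.dump
```

Stopping early matters here because each load adds revisions on top of the previous one; a fresh hotcopy is the easiest way to start over after a failed attempt.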

The remaining pieces are the tags and branches directories. As the revisions have been renumbered, the revision numbers in the original repositories and in opensuse-doc are not the same anymore. Dates, however, are preserved, so tags can only be recreated reliably by date. For this reason, it is convenient to still have the original repositories around to look up the exact date and log message:

$ svn copy -r "{2010-10-20T10:00:00}" -m "..." \
   $BERLIOSURL/trunk/tools/docmaker \
   $BERLIOSURL/tags/tools/docmaker/obs

The branches are a bit different. Maybe the history can be preserved with the same method as with trunk, but I haven’t tried it. Our three repositories had no meaningful information in branches, or it wasn’t worth the effort.

Conclusion

The steps above showed how to merge different SVN repositories into one while keeping the history. Unfortunately, this task is harder than it could be. Although svndumpfilter is very easy to use, it is also very limited. Especially when you have to modify the dump file itself, svndumpfilter won’t rescue you. It would be very good if svndumpfilter could be extended to work with dump files in the same way as svndumptool. That would be fun. 🙂
