SVN Importer Tips for converting to Subversion

Intro to the SVN Importer

I’ve previously posted about the SVN Importer tool here and hoped at some point to follow up on my experiences converting from specific version control tools. Well, after a StarTeam conversion project last year that was easily an order of magnitude larger than any other conversion project I’ve ever done, I think I’m fairly well qualified to write on the topic. I had previously done some small conversions using StarTeam 2005 (aka version 11), but for this project the customer was using StarTeam 2009 (aka version 12.5). Oh, and when I say this effort was big, I mean REALLY big: the largest project had almost 20 million file revisions and the whole system had around 50 million file revisions.

Groundwork

The first thing I noticed in doing other, smaller conversions using the SVN Importer is that StarTeam's command line interface (CLI) lacks certain critical functions that these sorts of conversions require. Because of this, the SVN Importer developers, out of necessity I believe, chose to use the StarTeam API to perform the conversion to SVN. This requires that you have the StarTeam SDK installed on your conversion machine. Also, if you are converting very large projects (greater than 1 million file revisions) as I was, you’ll need a 64-bit version of the SDK. While I was able to track this down for StarTeam 2009, I don’t believe one exists for earlier versions. You’ll also need to make sure that the correct version of the StarTeam API jar file is in the classpath of the importer and that the Lib directory of the StarTeam SDK is included in your PATH environment variable.

Once I actually got my conversions running with SVN Importer, things went well when converting the trunk of a project, but I encountered the following error any time I tried to convert branches (aka derived views in StarTeam):

INFO historyLogger:84 - EXCEPTION CAUGHT: org.polarion.svnimporter.svnprovider.SvnException: Unknown branch:

Since I was familiar with the inner workings of SVN Importer and the source was freely available, I worked to debug this issue and was able to find a simple coding error that was easily corrected.  As I recall it was because the code in question was using the wrong method, with the wrong return type, to get the branch name.

Later on, I encountered another problem where the same file would be added twice in the same SVN revision in the output dump files. When attempting to load these dumps into an SVN repository, I would see the error message ‘Invalid change ordering: new node revision ID without delete.’ After some detective work I determined that the same file was being added to a revision multiple times when there were multiple StarTeam labels (equivalent to SVN tags) for the same set of changes. I made a small adjustment to the model for StarTeam to check whether a file already exists in a revision before trying to add it, and this resolved the issue.
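To illustrate the kind of check described above, here is a minimal sketch. It is not the actual SVN Importer code: the class RevisionSketch and its methods are hypothetical names invented for this example, and the real model classes will look different. The idea is only that a revision remembers which paths it has already added, so a second label pointing at the same change does not produce a second "add".

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for one SVN revision in the importer's model.
// The names here are assumptions for illustration, not SVN Importer's API.
final class RevisionSketch {
    // Paths that already have an "add" action in this revision.
    private final Set<String> addedPaths = new LinkedHashSet<>();

    /**
     * Records an "add" for the path unless this revision already has one.
     * Returns true if the add was recorded, false if it was a duplicate.
     */
    boolean addFileOnce(String path) {
        return addedPaths.add(path);
    }

    /** Number of distinct paths added in this revision. */
    int addCount() {
        return addedPaths.size();
    }

    public static void main(String[] args) {
        RevisionSketch revision = new RevisionSketch();
        // Two labels covering the same change both try to add the same file.
        String[] labels = {"LABEL_A", "LABEL_B"};
        for (String label : labels) {
            boolean recorded = revision.addFileOnce("src/Main.java");
            System.out.println(label + " recorded add: " + recorded);
        }
        System.out.println("adds in revision: " + revision.addCount());
    }
}
```

Using a set keeps the duplicate check cheap even when a revision touches many files, which matters at the scale of millions of file revisions.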

Besides these more significant problems, there were a few things I wanted to improve about how the conversion process worked. To start, the converter was performing duplicate checkouts for each file revision, which added a good deal of extra time to the conversion process. In addition, because the conversions I was doing were on very large repositories, over the course of a longer conversion certain StarTeam operations could fail for various reasons (for example, network and/or server flakiness), and the converter was written in such a way that a failure on any StarTeam operation would cause the whole conversion to fail. To mitigate this issue, I wrapped each call to StarTeam in some logic to retry the operation if there was an error. Once all these changes were made, I was ready to tear through these projects … or perhaps crawl is a better way to describe it!
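The retry logic can be sketched roughly as follows. This is an assumption about the general shape of such a wrapper, not the code I actually added to SVN Importer: the Retry class, the withRetry method, and the attempt count and delay are all hypothetical, and the StarTeam call itself is represented by an ordinary Callable.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; not part of SVN Importer or the StarTeam SDK.
final class Retry {
    private Retry() {}

    /**
     * Runs the operation, trying at most maxAttempts times in total.
     * Waits delayMillis * attemptNumber between attempts (linear backoff)
     * and rethrows the last failure if every attempt fails.
     */
    static <T> T withRetry(Callable<T> operation, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (InterruptedException e) {
                // Do not retry if the thread was asked to stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis * attempt);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky operation: fails twice, then succeeds.
        String content = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("simulated server hiccup");
            }
            return "file contents";
        }, 5, 10L);
        System.out.println(content + " after " + calls[0] + " attempts");
    }
}
```

With a wrapper like this, a transient failure costs a short pause instead of a restart of a conversion that may already have been running for days.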

Make it go

If you have ever done a version control history migration, you know that these migrations can take a long time to run as the process checks out every version of every file and constructs the new repository.  When we ran smaller tests we found the performance to be a bit slow, but nothing prepared us for the projects with millions of file revisions.

As we moved to larger and larger projects, not only did the time requirements swell, but also the hardware requirements.  While projects with tens (or even hundreds) of thousands of revisions were achievable with 8 GB RAM, we found that this was not enough RAM for projects with millions of file revisions.  This could be very frustrating because the conversions could sometimes run for over a day before erroring out and when they did there was no way to recover the conversion; you had to start all over from the beginning.  When even 16 GB was not enough for the very largest project (consisting of roughly 18 million file revisions), I even had doubts that increasing our RAM up to 32 GB would be sufficient.  Fortunately, once at 32 GB of RAM we never had to worry about RAM again.

In all, the conversion process for this largest project took almost 2 weeks (!) to complete its processing, and almost as long to validate. The validation portion of a conversion is probably the part most often overlooked; it is mostly simple to do, but still necessary. The process of loading very large SVN repositories takes nearly as long as the conversion process itself. One issue that we encountered on this project was actually a limit on the filesystem inodes for ext3. While this was simple enough to handle, I’m glad we did the validation load to test everything before moving on to the load of the production SVN system.

All in all, this StarTeam to SVN conversion effort took roughly 3 months and was not without its share of challenges but was ultimately worth the effort for the customer.  There really is no substitute for this sort of migration.  In most cases, without a migration like this, companies that need this data available will keep an older VCS running for years, with all the associated costs, in order to stay in compliance with their internal policies or external regulations.

If you’d like to know more about the code changes made to SVN Importer, here’s the situation.  I have made all of these updates available to Polarion, but as of now I don’t have an idea when these changes will be made publicly available through their SVN repository.  If you have questions about StarTeam conversions or the code changes I made, respond in the comments and I can give more detail and possibly find another way to share my changes.


Quinn Bailey

ALM solution consultant

4 thoughts on “SVN Importer – converting from Borland StarTeam”

Geb · July 14, 2014 at 8:07 am

I am going through same process and got the “EXCEPTION CAUGHT: org.polarion.svnimporter.svnprovider.SvnException: Unknown branch” exception. It would be great to get your changes as a patch or a jar or simply code change pointers. I don’t seem to find any updates on the polarion site and I am not as familiar with the inner workings of the tool as you are.

Quinn Bailey · August 1, 2014 at 10:17 am

Geb,

This error was pretty easy to fix, a one line change to the source if I recall correctly. Let me look into the code and see if I can dig up this change.

Quinn

Quinn Bailey · August 1, 2014 at 10:49 am

Geb,

OK, so the fix needs to be made on line 139 of STTransform.java:

svnModel.addFileCopyToTag(revision.getAbsolutePath(), tagName, revision.getBranch().getName(), revision.getAbsolutePath(), oldRevno);

Instead that line should read:

svnModel.addFileCopyToTag(revision.getAbsolutePath(), tagName, revision.getBranch().getBranchName(), revision.getAbsolutePath(), oldRevno);

The getName method apparently returns an empty string for the trunk or root view, whereas the addFileCopyToTag method expects all branches to be named, so the getBranchName method is needed instead.

Hope that helps and thanks for visiting!

Cheri Furno · January 26, 2016 at 2:20 pm

It appears that the changes you detail above only address one issue of the several you have listed. Is there a way to get the rest of the changes?

We are addressing this at our company this year and any additional information would be greatly appreciated.
