If you’ve
kept a series of historical snapshots of your work in folders, fs2svn
can help you upgrade to a full-fledged Subversion version
control system.
fs2svn goes through all the folders under a given parent folder (in
filesystem order) and creates a Subversion revision for each one,
backdated to the most recent file’s last modified date. The
log message is set to the folder name.
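As a rough illustration of the walk described above, here is a minimal sketch in modern Python 3. It is not fs2svn's actual code: the function names `latest_mtime` and `plan_revisions` are made up for this example, and sorting by name is an assumption standing in for "filesystem order".

```python
import os


def latest_mtime(folder):
    """Return the newest modification time (seconds since epoch) of any file under folder."""
    newest = None
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            mtime = os.path.getmtime(os.path.join(dirpath, name))
            if newest is None or mtime > newest:
                newest = mtime
    return newest  # None if the folder holds no files


def plan_revisions(parent):
    """Return (log_message, timestamp) pairs, one per revision folder.

    Assumption: folders are visited in sorted name order; the real tool
    uses whatever order the filesystem reports.
    """
    plan = []
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        if os.path.isdir(path):
            # The folder name doubles as the log message.
            plan.append((name, latest_mtime(path)))
    return plan
```

Each pair corresponds to one revision: the folder name becomes the log message and the timestamp becomes the revision date.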
Additions, changes, and deletions between one folder and the next
are all recorded in the repository.
The input format is very simple. It only covers the
mainline trunk, not any tags or branches (though
tags for major versions could be manually created later, if
your folder names carry enough information).
The format is so simple it could be used as a common
intermediary. If you wanted to migrate a mainline trunk from
some exotic version control system to Subversion, you could
write a script to export it to regular
folders, then use this script to import the result into
Subversion.
Deltas
As an optional feature, you can mark any subfolders as "deltas" using a configurable
naming convention. Files missing from delta folders won't be
marked as deleted in the repository. The idea is that a delta folder
holds only the files that changed since the previous snapshot. (If it turns
out that a file hasn't changed, it isn't committed a second time, delta folder or not.)
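To make the difference between full and delta folders concrete, here is a small sketch in modern Python 3. It models a snapshot as a plain dictionary of path to content, which is a simplification for illustration only; `diff_snapshots` and `apply_snapshot` are hypothetical names, not functions from fs2svn.

```python
def diff_snapshots(previous, current, ignore_deletes=False):
    """Compare two {relative_path: content} snapshots.

    Returns (added, changed, deleted) as sorted lists of paths. When
    ignore_deletes is true (a delta folder), files missing from current
    are assumed to be unchanged and are not reported as deleted.
    """
    added = sorted(p for p in current if p not in previous)
    changed = sorted(p for p in current
                     if p in previous and current[p] != previous[p])
    deleted = [] if ignore_deletes else sorted(
        p for p in previous if p not in current)
    return added, changed, deleted


def apply_snapshot(previous, current, ignore_deletes=False):
    """Return the repository state after committing current on top of previous."""
    if ignore_deletes:
        merged = dict(previous)  # keep files the delta folder left out
        merged.update(current)
        return merged
    return dict(current)  # a full folder replaces the whole tree
```

Note that a file present in both snapshots with identical content lands in none of the three lists, which matches the rule that unchanged files aren't committed again.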
Subfolders of delta folders may be regular full folders or delta
folders, depending on the same naming convention. Whatever you
add to the name to indicate a delta will be stripped off
before the folder hits the repository.
You can configure the naming convention with one or more
--ignore-deletes-in command line options. See the examples below.
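Here is one way the naming convention could work, sketched in modern Python 3. The three patterns are the ones used in the sample command line in this document; the rest is assumption. In particular, I'm assuming a pattern has to match the whole folder name and that its first group is the name with the delta marker stripped off. `classify_folder` is a hypothetical helper, not part of fs2svn.

```python
import re

# Patterns copied from the sample command line in this document.
DELTA_PATTERNS = [r"(.*?) *delta", r"(.*?) *part", r"from +(.*)"]


def classify_folder(name, patterns=DELTA_PATTERNS):
    """Return (repository_name, is_delta) for a folder name.

    Assumption: a pattern must match the whole name, and its first
    group is the name with the delta marker stripped off.
    """
    for pattern in patterns:
        match = re.fullmatch(pattern, name)
        if match:
            return match.group(1), True
    return name, False  # no pattern matched: a regular full folder
```

Under these assumptions, a subfolder named "images delta" would be treated as a delta folder and stored as "images", while a plain "images" folder would be treated as a full folder.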
Folder Renaming: Shorthand Folders
You may find your project has a lot of pieces, and so
you'd like to create top-level folders within your project
trunk for each one. Unfortunately, in the archive folders
you're using to create the repository, they may not have the right names,
or the names are inconsistent. Or, since most of the changes just involve
one particular piece, you didn't bother creating a folder for it
on each revision. Changes to that main piece were usually stored
loose in the revision's folder. But now you'd like them to go into
a designated subfolder in the repository.
Shorthand folders let you rename
folders at the root level of the project as fs2svn reads them.
If a revision folder contains none of the shorthand folders, everything
in it goes under a default folder. All of this is configurable.
Shorthand folders deserve an example.
Say you have this:
- My Project Archive
  - 2005-01-05 backup
  - 2005-01-08 - exported db - delta
  - 2005-02-12 - fixed typo - delta
  - 2005-02-18 - fixed datatype - delta
  - etc.
You want the repository to have a "www" (originally "wwwroot") folder, a "scripts"
folder, and a "db" folder (originally "sql work" or "database"). But
in many cases, the source folders don't contain any of those
folders, and instead have loose files that really belong in "www". So make
"www" the default folder, "wwwroot" a shorthand folder that maps to "www",
"sql work" a shorthand folder that maps to "db", "database" a shorthand
folder that maps to "db", and "scripts" a shorthand folder (with no renaming).
The repository will look like this:
Revisions 1 and 2 (created automatically)
- add /branches
- add /tags
- add /trunk
Revision 3 (date: 2005-01-05, log: "2005-01-05 backup")
- add /trunk/www/index.html
- add /trunk/www/about.html
- add /trunk/scripts/helper.py
Revision 4 (date: 2005-01-08, log: "2005-01-08 - exported db - delta")
- add /trunk/db/db structure.sql
Revision 5 (date: 2005-02-12, log: "2005-02-12 - fixed typo - delta")
- change /trunk/www/about.html
Revision 6 (date: 2005-02-18, log: "2005-02-18 - fixed datatype - delta")
- change /trunk/db/db structure.sql
Implementation
fs2svn depends on cvs2svn, a tool for
converting CVS repositories to Subversion. When setting out to
write it, I considered creating my own Subversion dumpfile
writer from scratch, but then decided to use the one the
cvs2svn team had already written and tested.
Unfortunately, cvs2svn wasn't written to be pulled apart.
Its SVNRevision class depends on CVSRevision, which in turn
depends on everything else.
So fs2svn, in order to use any cvs2svn functionality, has
to inject its own replacement CVSRevision class into cvs2svn.
Instead of calling out to the CVS command-line tools, the new
class reads directly from the filesystem.
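The general technique looks something like the sketch below, written in modern Python 3. To be clear, this is not cvs2svn's real code or API: the module `legacy`, the classes `OriginalRevision` and `FilesystemRevision`, and the `convert` function are all stand-ins invented to show how rebinding a class on a module changes what the library's own code ends up using.

```python
import types

# Stand-in for the third-party module; here it plays the role of cvs2svn.
legacy = types.ModuleType("legacy_converter")


class OriginalRevision:
    """Plays the role of the class that shells out to an external tool."""

    def __init__(self, path):
        self.path = path

    def read(self):
        raise RuntimeError("would call an external command-line tool")


def convert(path):
    """Library code that looks the class up on the module at call time."""
    return legacy.Revision(path).read()


legacy.Revision = OriginalRevision
legacy.convert = convert


class FilesystemRevision:
    """Replacement with the same interface that reads straight from disk."""

    def __init__(self, path):
        self.path = path

    def read(self):
        with open(self.path, "rb") as handle:
            return handle.read()


# The injection: rebind the name that the library code resolves.
legacy.Revision = FilesystemRevision
```

After the last line, `legacy.convert(path)` returns the file's bytes instead of raising, even though `convert` itself was never edited. The trick only works when the library looks the class up by name at call time.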
A disadvantage of this approach is that fs2svn is now
rather tightly coupled to cvs2svn, and may have to be updated
along with cvs2svn. It's open to debate whether this cost is
worth the perks, which include many command-line options from
cvs2svn that work for free in fs2svn. (Unfortunately,
command-line parsing was the one area where I had to copy some of
cvs2svn's source, rather than just importing it.)
Sample Command Lines
This command line generates a dumpfile suitable for svnadmin load,
fills in MIME types from your Apache mime.types file, and suppresses
native line ending conversion. (The Apache mime.types location shown is
correct for Mac OS X, at least.)
python fs2svn.py --dumpfile=../svndumpfile.txt --dump-only --username=$USER --svnadmin=/usr/local/bin/svnadmin --no-default-eol --keywords-off --mime-types=/etc/httpd/mime.types --exclude="[.].*" --exclude="[.]DS_Store" --exclude="_vti_cnf" --ignore-deletes-in="(.*?) *delta" --ignore-deletes-in="(.*?) *part" --ignore-deletes-in="from +(.*)" --shorthand-folders=shorthand-folders.txt ../folder-with-many-revision-subfolders
Making a dumpfile is often useful: you may want to perform further
processing before importing it into a repository. Beware of using a text
editor for that, because some, including BBEdit, aren't binary-safe and
will silently normalize line endings, destroying binary files. If you
don't need the dumpfile, you can skip straight to the repository:
python fs2svn.py -s ../myrepository --fs-type=fsfs --username=$USER ... (continue as before)
More on Shorthand Folders (Folder Renaming)
If there's a shorthand folder config present, then there's another layer
of structure inside the revision folders that fs2svn recognizes.
For each revision folder, either
- All the children are shorthand folders, or
- None of them are, and the default shorthand folder will be assumed as a parent
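The all-or-none rule can be sketched like this in modern Python 3. `repository_paths` is a hypothetical helper, not fs2svn's own code, and it makes one assumption the rule doesn't spell out: a revision folder with a mix of shorthand and other children is treated as having no shorthand folders at all.

```python
def repository_paths(children, mapping, default):
    """Map a revision folder's direct children to paths under /trunk.

    children: names of the revision folder's direct children.
    mapping:  {filesystem_name: repository_folder} for shorthand folders.
    default:  repository folder used when no shorthand folders are present.
    """
    if children and all(name in mapping for name in children):
        # Every child is a shorthand folder: rename each one.
        return {name: "/trunk/" + mapping[name] for name in children}
    # Otherwise treat everything as loose content of the default folder.
    return {name: "/trunk/" + default + "/" + name for name in children}
```

For example, with "wwwroot" mapped to "www" and "www" as the default, a revision folder holding only "wwwroot" puts its contents in /trunk/www, and a revision folder holding a loose "about.html" puts that file at /trunk/www/about.html.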
Sample ShorthandFolderMapper file (set with --shorthand-folders=)
# Format is dir-name-in-filesystem:repository path (under trunk)
#
# If there's no colon, the name and path are taken to be the same.
# If the first part is empty (line starts with a colon), the second part
# specifies a default repository path. If a revision dir's subfolders don't
# match the other shorthand folders, it's assumed that all its contents were
# meant to be put under this default parent.
#
# Example: ":www" means that revision folders containing no shorthand subfolders
# will have their contents placed in /trunk/www/.
# "mssql:db" means that revision folders containing a direct "mssql" child
# will have that subfolder's contents placed in /trunk/db/.
# "programs-shared" means that programs-shared is recognized as a shorthand
# folder but the name is unchanged in the repository.
#
:www
wwwroot:www
www
programs-shared
mssql:db
db:db
wwwroot dev:www
wwwroot myproject:www
myproject:www
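For reference, the file format described in the comments above is simple enough to parse in a few lines. This is a sketch in modern Python 3, not the parser fs2svn actually uses; `parse_shorthand_config` is a made-up name, and stripping surrounding whitespace and splitting on the first colon are my assumptions.

```python
def parse_shorthand_config(text):
    """Parse shorthand-folder config text into (mapping, default).

    mapping maps a folder name in the filesystem to a repository path
    under trunk; default is the path named by a line starting with ":".
    """
    mapping = {}
    default = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, colon, target = line.partition(":")
        if not colon:
            mapping[name] = name  # no colon: name and path are the same
        elif not name:
            default = target  # leading colon: default repository path
        else:
            mapping[name] = target
    return mapping, default
```

Fed the sample file above, this would yield "www" as the default and a mapping in which "wwwroot", "www", "wwwroot dev", "wwwroot myproject", and "myproject" all point at "www".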
Installation
Download cvs2svn and its dependencies. fs2svn has been
tested with Python 2.3 and higher and with cvs2svn 1.2.1. Notably, you
may have to install the Python bindings for BerkeleyDB, which
cvs2svn really does depend on.
Next, name your downloaded copy of fs2svn "fs2svn.py", and
put it in the same folder as cvs2svn.py. (Or move cvs2svn.py
to the same folder as fs2svn.py. Either way works.)
Next, open a command prompt and try one of the commands
above (with the path to your actual archive folder).
Look for messages about revisions that delete files. These
are often clues that folders aren't being matched up properly
between revisions. (This tends to lead to a file being removed
in one revision and added back subsequently.) The error could
be in the revision with the deletion, or the revision where
the file was originally added, so search back through the log
for the filename. If folder names were mismatched between
revisions, fix them and run the command again. (If you've used
the command-line options to create a repository, rather than
--dump-only, you'll need to delete the new repository first.)
fs2svn doesn't change your archive folders, so you can keep
running the command until the repository looks right.
Download
[View/Download fs2svn.py] (26K) Version 1.0, released 2005-06-23.