cvs2git2svn

After discovering Ohloh, cleaning up and publishing repositories of yore seemed like a good idea. One of them was established back in the CVS newbie days, and contained lots of external binaries, which is not the kind of thing you want under version control. Having used CVS, Subversion and Git (in that order), there was only one choice for the cleanup: interactive rebase with Git. Also, the software was created while at CERN, so it should continue to be hosted there. They had started a Subversion service in the meantime, so it was time to upgrade as well.

These instructions should work for any CERN project, and can easily be adapted to any other repository. The usual warnings apply: YMMV and RTFM.

  1. Set some variables to avoid typing: svn_repo=Repository_name
    svn_user=User_with_edit_access
  2. Install the tools: sudo apt-get install cvs2svn git-core git-svn
  3. Create the cvs2git working directory: cvs2git_wd=$(mktemp -dt cvs2git.XXXXXXXXXX)
  4. Copy the contents of the repository (not a working copy) to the working directory: scp -r $svn_user@lxplus.cern.ch:/afs/cern.ch/project/svn/reps/${svn_repo}/* $cvs2git_wd. Don’t worry if /hooks is not copied – You don’t need it. If you don’t have filesystem access to the repository, you can try cvssuck. Be warned: It’s really slow.
  5. Set cvs2git global options:
    1. zcat /usr/share/doc/cvs2svn/examples/cvs2git-example.options.gz > $cvs2git_wd/cvs2git.options
    2. Modify at least ctx.username and author_transforms in $cvs2git_wd/cvs2git.options.
  6. Make the new Git repository: git_wd=$(mktemp -dt git.XXXXXXXXXX) && git init $git_wd
  7. Convert to Git (repeat for each module):
    1. Modify run_options.set_project in $cvs2git_wd/cvs2git.options
    2. Create Git import files: cd $cvs2git_wd && cvs2git --options=cvs2git.options. If you get any warnings or errors you might have to change the options again.
    3. Import to Git: cd $git_wd && cat $cvs2git_wd/cvs2svn-tmp/git-blob.dat $cvs2git_wd/cvs2svn-tmp/git-dump.dat | git fast-import
  8. Make a backup in case the rest goes hairy.
  9. If you need to (which was kind of the point of this exercise), do an interactive rebase from the first commit: git rebase -i $(git log --format=%H | tail -1).
  10. git-svn needs at least one commit to be in the Subversion repository: svn_wd=$(mktemp -dt svn.XXXXXXXXXX) && svn co --username $svn_user svn+ssh://${svn_user}@svn.cern.ch/reps/${svn_repo} $svn_wd && cd $svn_wd && touch .temp && svn add .temp && svn ci -m "git-svn dummy commit"
  11. Convert to Subversion:
    1. Prepare git-svn repository: git2svn_wd=$(mktemp -dt git2svn.XXXXXXXXXX) && git svn clone --username $svn_user svn+ssh://${svn_user}@svn.cern.ch/reps/${svn_repo} $git2svn_wd && cd $git2svn_wd
    2. Get Git commits: git fetch $git_wd
    3. Apply Git commits as master branch: git branch tmp $(cut -b-40 .git/FETCH_HEAD) && git tag -am "Last fetch" last tmp && first_commit=$(git log --format=%H | tail -1) && git checkout $first_commit . && git commit -C $first_commit
    4. Apply Git commits: git rebase master tmp && git branch -M tmp master
    5. Check if this works: git svn dcommit --rmdir --find-copies-harder --dry-run
    6. If it does, you’re good to go: git svn dcommit --rmdir --find-copies-harder

If the last step fails, the easiest way to continue is just to remove all commits from the Subversion repository, fix the Git repository, and restart at step 10.
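
Step 8 doesn’t say how to make the backup. Here is a minimal sketch, assuming a plain recursive copy of the Git working directory is good enough. The function name backup_repo and the ".backup.<timestamp>" suffix are made up for this example:

```shell
#!/bin/sh
# backup_repo DIR: copy DIR (including its .git directory) to a timestamped
# sibling directory and print the path of the copy.
# Hypothetical helper, not part of any of the tools used above.
backup_repo()
{
    backup_source="${1%/}"
    backup_target="${backup_source}.backup.$(date +%Y%m%d%H%M%S)"
    # -a keeps permissions, timestamps and symlinks intact
    cp -a "$backup_source" "$backup_target" || return 1
    echo "$backup_target"
}

# Usage (not run here): backup_repo "$git_wd"
```

If the rebase goes wrong, the copy can simply be moved back in place of $git_wd.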

N-way Git synchronization with extra cheese

Index

  1. Background
  2. Converting Subversion to Git
  3. Generate and version .gitignore files
  4. Git via proxy
  5. Setting up pull everywhere
  6. References

Background

I’ve got a desktop and server behind a router with a dynamic IP address at home, a desktop at work, and a laptop that floats around. I’d very much like to have the same settings on all of them, and to be able to synchronize them as easily as possible. I’ve been using Subversion for this, but recent trouble with symlinks, plus a concern that storing the revision history centrally (even with backups now and then) is a Bad Move in the long term, made me look for something else. So when I had to start using Git at work, and realized that it could solve both problems (at least in theory), I tried figuring out how to do this. After lots of tries followed by rm -rf settings/, I think I’ve got a working setup. Of course, I don’t guarantee that any of this will work for you.

Converting Subversion to Git

Install the necessary software:
sudo apt-get install git-svn

Copy the following code into a file named svn2git.sh, and run it as documented below.

svn2git.sh

#!/bin/sh
#
# NAME
#    svn2git.sh - Convert a Subversion repository to Git
#
# SYNOPSIS
#    svn2git.sh [options] <Subversion URL> 
#
# OPTIONS
#    --authors=path  Authors file
#    -v,--verbose    Verbose output
#
# EXAMPLE
#    /path/to/svn2git.sh https://example.org/foo
#
#    Create authors file for repository
#
#    /path/to/svn2git.sh -v --authors=authors.txt https://example.org/foo
#
#    Get Subversion repository to ./foo.git
#
# DESCRIPTION
#    Two-part script to migrate from Subversion to Git. First it tries to get
#    a list of the Subversion authors, so it can be formatted to fit the Git
#    commit structure. When running with the authors file, it will fetch the
#    entire Subversion revision history.
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2009 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Output error message with optional error code
error()
{
    if [ -z "$2" ]
    then
        error_code=$EX_UNKNOWN
    else
        error_code=$2
    fi
    echo "$1" >&2
    exit $error_code
}

usage()
{
    error "Usage: ${cmdname} [-v|--verbose] [--authors=path] <Subversion URL>" $EX_USAGE
}

verbose_echo()
{
    if [ $verbose ]
    then
        echo "$*"
    fi
}

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
directory_exists()
{
    if [ ! -d "$1" ]
    then
        error "No such directory '${1}'
$2" $EX_NO_SUCH_DIR
    fi
}

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
executable_exists()
{
    if [ ! -x "$1" ]
    then
        error "No such executable '${1}'
$2" $EX_NO_SUCH_EXEC
    fi
}

PATH="/usr/bin:/bin"
cmdname=`basename $0`
directory=$PWD

# Exit codes from /usr/include/sysexits.h, as recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0           # successful termination
EX_USAGE=64       # command line usage error
EX_DATAERR=65     # data format error
EX_NOINPUT=66     # cannot open input
EX_NOUSER=67      # addressee unknown
EX_NOHOST=68      # host name unknown
EX_UNAVAILABLE=69 # service unavailable
EX_SOFTWARE=70    # internal software error
EX_OSERR=71       # system error (e.g., can't fork)
EX_OSFILE=72      # critical OS file missing
EX_CANTCREAT=73   # can't create (user) output file
EX_IOERR=74       # input/output error
EX_TEMPFAIL=75    # temp failure; user is invited to retry
EX_PROTOCOL=76    # remote error in protocol
EX_NOPERM=77      # permission denied
EX_CONFIG=78      # configuration error

# Custom errors
EX_UNKNOWN=1
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

# Process parameters
until [ $# -eq 0 ]
do
    case $1 in
        -v|--verbose)
            verbose=1
            shift
            ;;
        --authors=*)
            authors_file=${directory}/$(echo "$1" | cut -c11-)
            shift
            ;;
        *)
            if [ -z "$svn_url" ]
            then
                svn_url=$1
                shift
            else
                # Unknown parameter
                usage
            fi
            ;;
    esac
done

if [ -z "$svn_url" ]
then
    # No Subversion URL provided
    usage
fi

repository_name=`basename $svn_url`

verbose_echo "Running $cmdname at `date`."

# Preliminary checks
directory_exists "$directory"
executable_exists "/usr/bin/git"
git svn --version > /dev/null 2>&1 || error "git-svn is not installed" $EX_NO_SUCH_EXEC
executable_exists "/usr/bin/svn"

verbose_echo "Source repository: '${svn_url}'"

if [ -z "$authors_file" ]
then
    # Get authors file
    authors_file="${directory}/${repository_name}-authors.txt"
    if [ -e "$authors_file" ]
    then
        error "Authors file '${authors_file}' already exists"
    fi
    verbose_echo "Authors file: ${authors_file}"

    svn log --quiet "${svn_url}" | grep '^r.*' | cut -d ' ' -f 3- | cut -d '|' -f 1 | sort | uniq > "${authors_file}"

    author="$(head -n 1 "$authors_file")"
    echo "Please modify ${authors_file} to a format like"
    echo "${author}= Full Name <${author}@example.org>"
    echo "and rerun $cmdname with --authors=${authors_file}"
else
    if [ ! -e "$authors_file" ]
    then
        error "Authors file '${authors_file}' doesn't exist"
    fi

    git_target="${directory}/${repository_name}.git"
    if [ -e "$git_target" ]
    then
        error "Target repository '${git_target}' already exists"
    fi
    verbose_echo "Target repository: '${git_target}'"

    # Clone
    git svn clone --no-metadata --authors-file="${authors_file}" --revision 1:1 "$svn_url" "$git_target" || error "Clone failed"

    # Fetch
    cd "$git_target"
    batch_start=2
    revisions=$(svn info "$svn_url" | grep '^Revision:' | awk '{print $2}')
    while [ $batch_start -le $revisions ]
    do
        batch_end=$(expr $batch_start + 990)
        if [ $batch_end -gt $revisions ]
        then
            batch_end=$revisions
        fi

        verbose_echo "Fetching revisions $batch_start through $batch_end"
        git svn fetch --authors-file="${authors_file}" --revision $batch_start:$batch_end || error "Fetch failed"
        
        batch_start=$(expr $batch_end + 1)
    done

    git rebase git-svn

    verbose_echo "Applying svn:ignore properties"
    git svn show-ignore >> .git/info/exclude

    verbose_echo "Removing references to Subversion"
    git config --remove-section svn-remote.svn
    rm --recursive --force .git/svn/
fi

verbose_echo "Cleaning up."
cd "$directory"

verbose_echo "${cmdname} completed at `date`."
exit $EX_OK
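
The first run leaves you with a bare list of user names to edit by hand. As a rough sketch of how that edit could be given a head start, the hypothetical function below turns `svn log --quiet` output into lines in the "user = Full Name <email>" format. The full name and the example.org address are placeholders that still have to be corrected manually:

```shell
#!/bin/sh
# authors_template: read `svn log --quiet` output on standard input and print
# one placeholder authors-file line per distinct committer.
# Hypothetical helper; the name and the example.org domain are assumptions.
authors_template()
{
    grep '^r[0-9]' |               # keep only the revision lines
        cut -d '|' -f 2 |          # second field is the user name
        sed 's/^ *//; s/ *$//' |   # trim surrounding spaces
        sort -u |
        while IFS= read -r author
        do
            printf '%s = %s <%s@example.org>\n' "$author" "$author" "$author"
        done
}

# Usage (not run here):
# svn log --quiet "$svn_url" | authors_template > authors.txt
```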

Now make sure you do a directory diff between the old Subversion and the new Git repositories to see if it succeeded.
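
One way to do that comparison, sketched under the assumption that you have a Subversion working copy and a Git clone side by side and that your diff supports -x (GNU diff does). The function name compare_trees is made up for this example:

```shell
#!/bin/sh
# compare_trees SVN_WORKING_COPY GIT_CLONE
# Succeeds only if both trees hold the same files with the same contents,
# ignoring the version control metadata directories.
# Hypothetical helper, assuming a diff that understands -q and -x.
compare_trees()
{
    diff -r -q -x .svn -x .git "$1" "$2"
}

# Usage (not run here):
# compare_trees ~/old-svn-checkout ~/foo.git && echo "Trees are identical"
```

Any file that differs or exists on only one side is listed, and the exit code is non-zero in that case.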

Now you can get this on other machines using
git clone --origin example ssh://example.org/~/settings

Generate and version .gitignore files

This is an optional step in case you would like to version the old svn:ignore properties as .gitignore files:

exclude2gitignore.sh

#!/bin/sh
#
# NAME
#    exclude2gitignore.sh - Convert $GIT_DIR/info/exclude to corresponding
#    .gitignore files
#
# SYNOPSIS
#    exclude2gitignore.sh [options] /path/to/repository
#
# OPTIONS
#    -v,--verbose    Verbose output
#
# EXAMPLE
#    /path/to/exclude2gitignore.sh ~/foo
#
#    Create .gitignore files for the Git repository in ~/foo
#
# DESCRIPTION
#    Based on the format generated by `git-svn show-ignore`, where non-comment
#    lines indicate ignored files. Will try to put the .gitignore as close as
#    possible to the ignored file(s).
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2009 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Output error message with optional error code
error()
{
    if [ -z "$2" ]
    then
        error_code=$EX_UNKNOWN
    else
        error_code=$2
    fi
    echo "$1" >&2
    exit $error_code
}

usage()
{
    error "Usage: ${cmdname} [-v|--verbose] /path/to/repository" $EX_USAGE
}

verbose_echo()
{
    if [ $verbose ]
    then
        echo "$*"
    fi
}

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
directory_exists()
{
    if [ ! -d "$1" ]
    then
        error "No such directory '${1}'
$2" $EX_NO_SUCH_DIR
    fi
}

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
executable_exists()
{
    if [ ! -x "$1" ]
    then
        error "No such executable '${1}'
$2" $EX_NO_SUCH_EXEC
    fi
}

PATH="/usr/bin:/bin"
cmdname=`basename $0`
directory=$PWD

# Exit codes from /usr/include/sysexits.h, as recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0           # successful termination
EX_USAGE=64       # command line usage error
EX_DATAERR=65     # data format error
EX_NOINPUT=66     # cannot open input
EX_NOUSER=67      # addressee unknown
EX_NOHOST=68      # host name unknown
EX_UNAVAILABLE=69 # service unavailable
EX_SOFTWARE=70    # internal software error
EX_OSERR=71       # system error (e.g., can't fork)
EX_OSFILE=72      # critical OS file missing
EX_CANTCREAT=73   # can't create (user) output file
EX_IOERR=74       # input/output error
EX_TEMPFAIL=75    # temp failure; user is invited to retry
EX_PROTOCOL=76    # remote error in protocol
EX_NOPERM=77      # permission denied
EX_CONFIG=78      # configuration error

# Custom errors
EX_UNKNOWN=1
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

# Process parameters
until [ $# -eq 0 ]
do
    case $1 in
        -v|--verbose)
            verbose=1
            shift
            ;;
        *)
            if [ -z "$repository" ]
            then
                repository="${1%\/}"
                shift
            else
                # Unknown parameter
                usage
            fi
            ;;
    esac
done

verbose_echo "Running $cmdname at `date`."

directory_exists "$repository"

grep '^/' "${repository}/.git/info/exclude" | while read line
do
    ignore_path="${repository}${line}"
    verbose_echo "Starting with $ignore_path"
    ignore_name="$ignore_path"

    # Strip globs in path
    ignore_path=`dirname "$ignore_path"`
    while [ ! -e "$ignore_path" ]
    do
        ignore_path=`dirname "$ignore_path"`
    done

    # Remove path from file name (need +2 to include the end slash and to
    # compensate for 1-based indexing)
    name_length=$(expr length "$ignore_name")
    path_length=$(expr length "$ignore_path" + 2)
    ignore_name=$(expr substr "$ignore_name" $path_length $name_length)

    # Complete .gitignore path
    ignore_path="${ignore_path}/.gitignore"

    verbose_echo "$ignore_name >> $ignore_path"
    echo "$ignore_name" >> "$ignore_path"
done

verbose_echo "Cleaning up."
cd "$directory"

verbose_echo "${cmdname} completed at `date`."
exit $EX_OK
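
The core of the script is splitting each exclude line into a directory (where the .gitignore goes) and a pattern (what goes into it). Here is a simplified sketch of that split using only shell parameter expansion. It assumes that only the last path component contains a glob, which the script above does not assume, and the function name split_ignore is made up:

```shell
#!/bin/sh
# split_ignore LINE: print the directory part and the pattern part of an
# anchored exclude line such as "/src/lib/*.o", separated by a tab.
# Hypothetical helper; assumes globs occur only in the last path component.
split_ignore()
{
    ignore_dir="${1%/*}"       # everything before the last slash
    ignore_pattern="${1##*/}"  # everything after the last slash
    # An empty directory part means the repository root
    printf '%s\t%s\n' "${ignore_dir:-/}" "$ignore_pattern"
}

# Usage (not run here): split_ignore '/src/lib/*.o'
```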

Git via proxy

One of the machines involved is behind a gateway machine at work, so I had to add the following to ~/.ssh/config:

Host work
     ProxyCommand ssh -q gateway.example.org nc %h %p
     HostName work-pc.example.org

With this, it’s possible to refer to just “work”, and SSH commands (even via Git) will take care of connecting via the proxy.

Setting up pull everywhere

The main idea here is to set up Git “remotes” pointing to all the other machines.

To be able to get the updates from the repository in ~/settings on home-pc.example.net, simply run the following on all machines (except, of course, the home machine):
git remote add home ssh://home-pc.example.net/~/settings

To be able to get the updates from the “work” host specified with a proxy above, just use “work” for the host name:
git remote add work ssh://work/~/settings

To be able to pull from a machine which changes IP address, you could set up a DynDNS account and use one of their recommended update scripts to be able to refer to your machine using a single DNS name.

After cloning one of the copies on all of your hosts, you should be able to do the following to get all the changes from the repositories:
git remote update && git pull
If this doesn’t work, you might have more luck fetching each repository individually, and then rebasing to it:
git fetch home && git rebase home/master

To keep a backup on a separate machine, just do a
git clone --origin example ssh://example.org/~/settings
there and set up pushing defaults on the other machines using
git config push.default matching
git remote add backup ssh://backup.example.org/~/settings

Then you can just git push backup master to back up the local master branch.
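
With more than a couple of machines, typing the git remote add lines gets tedious. A minimal sketch that only prints the commands, so you can check them before running anything. It assumes every machine keeps its repository in ~/settings, as above, and print_remote_commands is a made-up name:

```shell
#!/bin/sh
# print_remote_commands NAME=HOST...: print one "git remote add" command per
# argument, assuming every host keeps the repository in ~/settings.
# Hypothetical helper; nothing is executed, the commands are only printed.
print_remote_commands()
{
    for pair in "$@"
    do
        remote_name="${pair%%=*}"  # text before the first "="
        remote_host="${pair#*=}"   # text after the first "="
        printf 'git remote add %s ssh://%s/~/settings\n' "$remote_name" "$remote_host"
    done
}

# Usage (not run here):
# print_remote_commands home=home-pc.example.net work=work backup=backup.example.org
```

Pipe the output to sh inside the repository once it looks right.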

References

Subversion server using HTTPS on Ubuntu Hardy setup

Yay, it’s up and running! And here are the steps to do it, mostly copied directly from the shell as I ran them. In any case, it may or may not work for you, so make sure you check with the proper documentation if anything fails.

By the way: Back up old repositories if you have any!

  1. Install the software:
    sudo apt-get install apache2 libapache2-svn openssl ssl-cert subversion
    
  2. Create directory for server certificates:
    sudo mkdir /etc/apache2/certs
    
  3. Create password-free SSL certificate (remember what you put as “Host Name” for the next step):
    sudo /usr/sbin/make-ssl-cert /usr/share/ssl-cert/ssleay.cnf /etc/apache2/certs/server.pem
    
  4. Add the Apache certificate settings to /etc/apache2/httpd.conf (use the “Host Name” value from the previous step instead of “example.org” to avoid a warning in /var/log/apache2/error.log):
    ServerName example.org
    SSLEngine on
    SSLCertificateFile /etc/apache2/certs/server.pem
  5. Enable Apache SSL module (necessary for HTTPS):
    sudo a2enmod ssl
    
  6. Create directory for Subversion repository files:
    sudo mkdir /var/lib/svn
    
  7. If you have any old repositories, copy them to /var/lib/svn/, and make sure the Apache user can read & write them:
    sudo chown -R www-data:www-data /var/lib/svn/
    
  8. Create Apache’s Subversion password file with one user (replace username with one of your choice):
    sudo htpasswd -c /etc/apache2/dav_svn.passwd username
    
  9. Uncomment the following lines in /etc/apache2/mods-available/dav_svn.conf to point Apache to your repositories:
    <Location /svn>
      DAV svn
      SVNParentPath /var/lib/svn
      AuthType Basic
      AuthName "Subversion Repository"
      AuthUserFile /etc/apache2/dav_svn.passwd
      Require valid-user
    </Location>
    
  10. Disable the default site (it clashes with the SSL settings somehow):
    sudo a2dissite default
    
  11. Restart Apache:
    sudo /etc/init.d/apache2 restart
    
  12. Test (replace repository_name with an existing repository name):
    svn co https://localhost/svn/repository_name
    

Subversion checkout script

Here’s a simple script to check out all subversion repositories on a remote host. It requires that you have SSH access on the host, to be able to fetch the repository names (otherwise you can hardcode them in $repositories). You’ll also need to have Perl installed.

How to use:

  1. Download checkout-all-svn.sh.
  2. chmod u+x path/to/checkout-all-svn.sh
  3. path/to/checkout-all-svn.sh -r http://example.org/svn/

Some features:

  • Works with plain /bin/sh, so it should work on any Linux / BSD distribution.
  • Works with repository names with spaces, but not yet with unusual characters.

checkout-all-svn.sh

#!/bin/sh
#
# $Id: checkout-all-svn.sh 387 2008-06-07 20:36:08Z vengmark $
#
# NAME
#    checkout-all-svn.sh - Check out all Subversion repositories.
#
# SYNOPSIS
#    checkout-all-svn.sh [options]
#
# OPTIONS
#    -v     Verbose output
#    -p     Target host SSH port
#    -u     Target host SSH user name
#    -d     Subversion repository directory on target host
#    -r     URL part before the repository name,
#           e.g. https://example.org/svn/
#
# EXAMPLE
#    ./checkout-all-svn.sh -v -p 1234 -u ssh-admin -d /var/lib/svn -r
#    https://example.com/reps/
#
# DESCRIPTION
#    Gets all your subversion repositories. If they are already
#    present, they will be updated.
#
#    The current (or specified with -u) user must have SSH access to the remote
#    host. To circumvent this you can specify the repository names in
#    $repositories, separated by newlines.
#
#    To avoid having to type your password several times, you can setup SSH
#    keys - See e.g. https://help.ubuntu.com/community/SSHHowto
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2008 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Init
ifs_original="$IFS" # Reset when done

PATH="/usr/bin:/bin"
cmdname=`basename $0`
directory=`dirname $0`
target_dir=~

# Remote host
target_port=22
target_user=`whoami`

# Subversion
svn_root="/var/lib/svn"

#Error messages
errm_unknown="Unknown error in $cmdname" #Code 1
#Code 2 is reserved: http://www.faqs.org/docs/abs/HTML/exitcodes.html
usage="Usage: ${cmdname} [-v] [-p port] [-u user] [-d svn_directory] -r repositories_url" #Code 3

# Process parameters
until [ $# -eq 0 ]
do
	case $1 in
		-v)
			verbose=1
			shift
			;;
		-p)
			if [ -z "$2" ]
			then
				echo "$usage" >&2
				exit 3
			fi
			target_port=$2
			shift 2
			;;
		-u)
			if [ -z "$2" ]
			then
				echo "$usage" >&2
				exit 3
			fi
			target_user=$2
			shift 2
			;;
		-d)
			if [ -z "$2" ]
			then
				echo "$usage" >&2
				exit 3
			fi
			svn_root=$2
			shift 2
			;;
		-r)
			if [ -z "$2" ]
			then
				echo "$usage" >&2
				exit 3
			fi
			base_url=$2
			shift 2
			;;
		*)
			#Unknown parameter
			if [ $verbose ]
			then
				echo "Unknown parameter: $1" >&2
			fi
			echo "$usage" >&2
			exit 3
			;;
	esac
done

if [ -z "$base_url" ]
then
	echo "$usage" >&2
	exit 3
fi
base_url=`echo $base_url | sed "s/ /%20/g"` # To avoid problems with spaces

target_host=$base_url
target_host=${target_host#*//}
target_host=${target_host%%/*}
if [ $verbose ]
then
	echo "Target host: ${target_host}"
fi

repositories=`ssh -p ${target_port} ${target_user}@${target_host} $(if [ $verbose ]; then echo '-v'; else echo '-q'; fi;) -x "ls '${svn_root}'"`
error=$?
if [ $error -ne 0 ]
then
	echo "Failed to get repository names. Error code $error" >&2
	exit 1
fi
if [ $verbose ]
then
	printf 'Repositories:\n%s\n' "${repositories}"
fi

IFS="
" # Make sure paths with spaces don't make any trouble when looping

# Concatenate URLs
rep_urls=""
for repository in $repositories
do
	repository=`echo $repository | sed "s/ /%20/g"`
	rep_urls="${rep_urls} ${base_url}${repository}"
done

IFS="$ifs_original"

svn co $rep_urls --non-interactive -r 'HEAD' `if [ ! $verbose ]; then echo '--quiet'; fi;` "${target_dir}"

# End
exit 0
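
The two string tricks the script relies on, getting the host name out of the URL and escaping spaces, are easy to try in isolation. A small sketch, assuming the URL carries no user name or port (the script makes the same assumption). The function names are made up for this example:

```shell
#!/bin/sh
# host_from_url URL: print the host part of URL, e.g. "example.org" for
# "https://example.org/svn/". Hypothetical helper; assumes no user@ or :port.
host_from_url()
{
    url_rest="${1#*//}"              # drop the scheme and the two slashes
    printf '%s\n' "${url_rest%%/*}"  # drop the first slash and what follows
}

# encode_spaces STRING: replace every space with %20.
encode_spaces()
{
    printf '%s\n' "$1" | sed 's/ /%20/g'
}

# Usage (not run here):
# host_from_url 'https://example.org/svn/'
# encode_spaces 'my repository'
```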

Subversion backup script

Following up on the good work of Jean-Francois Roy, here’s my slightly extended version of his script to back up all Subversion repositories to a remote host.

How to use:

  1. Download backup-all-svn.sh
  2. chmod u+x path/to/backup-all-svn.sh
  3. ./backup-all-svn.sh -h target_host (can also set target port and user name)

Some features:

  • Works with plain /bin/sh, so it should work on any Linux / BSD distribution.
  • Works with repository names with spaces and non-ASCII characters.
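
The handling of awkward names comes down to looping over a glob and quoting the variable everywhere it is used. A minimal sketch of just that loop, with the made-up function name list_repositories and an arbitrary bracketed output format:

```shell
#!/bin/sh
# list_repositories DIR: print the name of every entry in DIR in brackets,
# one per line. The glob expands to one word per entry, so names with spaces
# or non-ASCII characters stay intact as long as "$repository" is quoted.
# Hypothetical helper for illustration only.
list_repositories()
{
    (
        cd "$1" || exit 1
        for repository in *
        do
            # In an empty directory the glob is left unexpanded: skip it
            [ -e "$repository" ] || continue
            printf '[%s]\n' "$repository"
        done
    )
}

# Usage (not run here): list_repositories /var/lib/svn
```

The subshell keeps the cd from affecting the caller.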

backup-all-svn.sh

#!/bin/sh
#
# $Id: backup-all-svn.sh 387 2008-06-07 20:36:08Z vengmark $
#
# NAME
#    backup-all-svn.sh - Backup all Subversion repositories
#
# SYNOPSIS
#    backup-all-svn.sh [options]
#
# OPTIONS
#    -v     Verbose output
#    -h     Target host name (mandatory)
#    -p     Target host port
#    -u     Target host user name
#
# EXAMPLE
#    ./backup-all-svn.sh -v -h example.com -p 1234 -u johndoe
#
# DESCRIPTION
#    Backs up all your Subversion repositories to a remote machine.
#
#    The current user must have access to the subversion repositories.
#    To work around this, you should `sudo adduser <username> <svn-group>`
#    and `sudo chmod -R g+w /path/to/repos`.
#
#    To avoid having to type your password several times, you can setup SSH
#    keys - See e.g. https://help.ubuntu.com/community/SSHHowto
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2005 Jean-Francois Roy
#    Copyright (C) 2008 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Init
ifs_original="$IFS" # Reset when done

PATH="/usr/bin:/bin"
cmdname=`basename $0`
directory=`dirname $0`

# Remote host
target_dir=".svn-backup/`date +%Y-%m-%d`"
target_port=22
target_user=`whoami`

# Subversion
svn_root="/var/lib/svn"
svn_install_root="/usr/bin"

# Error messages from /usr/include/sysexits.h, recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0
EX_USAGE=64
EX_CANT_CREATE=73

# Custom errors
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

usage_error()
{
	echo "Usage: ${cmdname} [-v] -h host [-p port] [-u user]" >&2
	exit $EX_USAGE
}

# Process parameters
until [ $# -eq 0 ]
do
	case $1 in
		-v)
			verbose=1
			shift
			;;
		-h)
			if [ -z "$2" ]
			then
				usage_error
			fi
			target_host=$2
			shift 2
			;;
		-p)
			if [ -z "$2" ]
			then
				usage_error
			fi
			target_port=$2
			shift 2
			;;
		-u)
			if [ -z "$2" ]
			then
				usage_error
			fi
			target_user=$2
			shift 2
			;;
		*)
			#Unknown parameter
			usage_error
			;;
	esac
done

if [ -z "${target_host}" ]
then
	usage_error
fi

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
check_directory()
{
	if [ ! -d "$1" ]
	then
		echo "No such directory: '${1}'" >&2
		echo $2 >&2
		exit $EX_NO_SUCH_DIR
	fi
}

check_directory $svn_root "Please change \$svn_root to point to the directory where your Subversion repositories are."

check_directory $svn_install_root "Please change \$svn_install_root to point to the directory where Subversion is installed."

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
check_executable()
{
	if [ ! -x "$1" ]
	then
		echo "No such executable: '${1}'" >&2
		echo $2 >&2
		exit $EX_NO_SUCH_EXEC
	fi
}

svn_install_missing="Please change \$svn_install_root to point to the directory where Subversion is installed."
check_executable "${svn_install_root}/svnlook" "$svn_install_missing"
check_executable "${svn_install_root}/svnadmin" "$svn_install_missing"

# Create the temporary folder
temp_dir=`mktemp -t -d ${cmdname}.XXXXXXXXXX` || exit $?

verbose_echo()
{
	if [ $verbose ]
	then
		echo "$*"
	fi
}

# Announce that we're running
verbose_echo "Running $cmdname at `date`."

# Create target directory
ssh -p ${target_port} ${target_user}@${target_host} "mkdir -p \"${target_dir}\"" || exit $?

# Loop over repositories
cd "${svn_root}"
IFS="
" # Make sure paths with spaces don't make any trouble when looping
for repository in *
do
	# Get the last revision
	revision=`${svn_install_root}/svnlook youngest "${repository}"`
	verbose_echo "Backing up repository \"${repository}\" revision ${revision}."

	# Make sure the repo is OK
	verbose_echo "Recovering the repository."
	${svn_install_root}/svnadmin recover --wait "${repository}" > /dev/null

	# Did the recover operation fail?
	if [ $? -ne 0 ]
	then
		echo "Backup failed because recovery failed." >&2
		break
	fi

	# Hotcopy
	verbose_echo "Hot-copying the repository."
	${svn_install_root}/svnadmin hotcopy --clean-logs "${repository}" "$temp_dir/${repository}"

	# Did the hotcopy fail?
	if [ $? -ne 0 ]
	then
		echo "Backup failed because hotcopy failed." >&2
		rm -Rf "$temp_dir"
		break
	fi

	# Compress the hotcopy
	verbose_echo "Compressing the repository in a tar.bz2 archive."
	archive="${repository}-r${revision}.tar.bz2"
	tar -cjpf "$temp_dir/${archive}" -C "$temp_dir" "${repository}"

	# Send it over
	verbose_echo "Copying repository archive to remote host."
	scp -P ${target_port} "$temp_dir/${archive}" "${target_user}@${target_host}:\"${target_dir}/${archive}\""

done

verbose_echo "Cleaning up."
rm -Rf "$temp_dir"
IFS="$ifs_original"

# End
verbose_echo "${cmdname} completed at `date`."

exit $EX_OK

Recursive symbolic link creation script

This might be useful if you keep your settings (.bashrc, .subversion/config, etc.) outside the home directory, like in a Subversion repository. Just copy the following script to the repository directory corresponding to $HOME and run
/path/to/make-links.sh
to create symlinks from ~ to each file in the repository, recursively.

If you don’t want to use meld to see differences, just change the diff variable near the top of the code.

Edit 2: Now actually does something useful if the target file already exists or the target directory does not exist.
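
The basic idea can be sketched in a few lines. This is a simplified stand-in, not the script itself: it skips existing targets instead of asking what to do, assumes file names without newlines, and uses the made-up function name link_tree:

```shell
#!/bin/sh
# link_tree SOURCE TARGET: for every regular file below SOURCE, create a
# symlink at the same relative path below TARGET. Existing targets are
# reported and skipped rather than overwritten.
# Hypothetical helper; assumes file names do not contain newlines.
link_tree()
{
    # Absolute source path, so the symlinks work from anywhere
    link_source=$(cd "$1" && pwd) || return 1
    link_target="$2"
    find "$link_source" -type f | while IFS= read -r file
    do
        relative=${file#"$link_source"/}
        destination="${link_target}/${relative}"
        if [ -e "$destination" ] || [ -L "$destination" ]
        then
            echo "Skipping existing ${destination}" >&2
            continue
        fi
        mkdir -p "$(dirname "$destination")"
        ln -s "$file" "$destination"
    done
}

# Usage (not run here): link_tree ~/settings/user ~
```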

make-links.sh

#!/bin/sh
#
# $Id: make-links.sh 1870 2009-11-10 08:56:35Z vengmark $
#
# NAME
#    make-links.sh - Make symlinks to all user settings in repository
#
# SYNOPSIS
#    make-links.sh [options]
#
# OPTIONS
#    -v,--verbose    Verbose output
#    -d              Specify the source and target directories for the symlinks
#
# EXAMPLE
#    /path/to/make-links.sh -d ~/settings/user ~
#
#    Create links in the home directory based on files in ~/settings/user
#
#    /path/to/make-links.sh -v
#
#    Create links in / based on files in the directory of make-links.sh
#
# DESCRIPTION
#    If the file in the source directory doesn't exist in the target directory,
#    a symlink is created directly.
#    If the file exists, or the target directory does not exist, the user is
#    given options to continue.
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2008-2009 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Output error message with optional error code
error()
{
    if [ -z "$2" ]
    then
        error_code=$EX_UNKNOWN
    else
        error_code=$2
    fi
    echo "$1" >&2
    exit $error_code
}

usage()
{
    error "Usage: ${cmdname} [-v|--verbose] [-d source target]" $EX_USAGE
}

verbose_echo()
{
    if [ -n "$verbose" ]
    then
        echo "$*"
    fi
}

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
directory_exists()
{
    if [ ! -d "$1" ]
    then
        error "No such directory '${1}'
$2" $EX_NO_SUCH_DIR
    fi
}

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
executable_exists()
{
    if [ ! -x "$1" ]
    then
        error "No such executable '${1}'
$2" $EX_NO_SUCH_EXEC
    fi
}

ifs_original="$IFS" # Reset when done
IFS="
" # Make sure paths with spaces don't make any trouble when looping

PATH="/usr/bin:/bin"
cmdname=$(basename "$0")
directory=$(dirname "$(readlink -f "$0")")

diff=/usr/bin/meld
source_base="${directory}/home/$(whoami)/"
target_base="$HOME"

# Exit codes from /usr/include/sysexits.h, as recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0           # successful termination
EX_USAGE=64       # command line usage error
EX_DATAERR=65     # data format error
EX_NOINPUT=66     # cannot open input
EX_NOUSER=67      # addressee unknown
EX_NOHOST=68      # host name unknown
EX_UNAVAILABLE=69 # service unavailable
EX_SOFTWARE=70    # internal software error
EX_OSERR=71       # system error (e.g., can't fork)
EX_OSFILE=72      # critical OS file missing
EX_CANTCREAT=73   # can't create (user) output file
EX_IOERR=74       # input/output error
EX_TEMPFAIL=75    # temp failure; user is invited to retry
EX_PROTOCOL=76    # remote error in protocol
EX_NOPERM=77      # permission denied
EX_CONFIG=78      # configuration error

# Custom errors
EX_UNKNOWN=1
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

# Process parameters
until [ $# -eq 0 ]
do
    case $1 in
        -v|--verbose)
            verbose=1
            shift
            ;;
        -d)
            # Both a source and a target directory are required
            if [ -z "$2" ] || [ -z "$3" ]
            then
                usage
            fi
            source_base=$2
            target_base=$3
            shift 3
            ;;
        *)
            # Unknown parameter
            usage
            ;;
    esac
done

verbose_echo "Running $cmdname at `date`."

# Make sure the directory paths don't end with a slash
source_base="${source_base%\/}"
target_base="${target_base%\/}"

# Preliminary checks
directory_exists "$source_base"
directory_exists "$target_base"

verbose_echo "Source directory: '${source_base}'"
verbose_echo "Target directory: '${target_base}'"

# Find files excluding .svn directories
for source_path in `find "$source_base" -mindepth 1 -type f | grep -vF "/.svn/"`
do
    target_path="${target_base}${source_path#${source_base}}"
    target_dir="$(dirname "${target_path}")"
    unset replace_file
    unset create_dir

    verbose_echo ""
    verbose_echo "Source file: \"${source_path}\"."
    verbose_echo "Target file: \"${target_path}\"."

    # Trivial case
    if [ -L "$target_path" ]
    then
        verbose_echo "\"${target_path}\" is already a symlink; skipping."
        continue
    fi

    # File exists
    if [ -f "$target_path" ]
    then
        # Make sure we skip or replace in the end
        while ! expr "$replace_file" : "^[SsRr]$" > /dev/null
        do
            echo "\"${target_path}\" is a proper file. What do you want to do?"
            read -p "[S]kip, [D]iff, [R]eplace: " replace_file
            if expr "$replace_file" : "^[Dd]$" > /dev/null
            then
                verbose_echo "Diffing \"${source_path}\" and \"${target_path}\""
                $diff "$source_path" "$target_path"
            fi
        done

        if expr "$replace_file" : "^[Rr]$" > /dev/null
        then
            verbose_echo "Removing ${target_path}"
            rm "$target_path"
        else
            continue
        fi
    fi

    # Not a proper file
    if [ -e "$target_path" ]
    then
        echo "\"${target_path}\" exists but is not a file; skipping."
        continue
    fi

    # Directory missing; might have to create it
    if [ ! -e "$target_dir" ]
    then
        while ! expr "$create_dir" : "^[SsCc]$" > /dev/null
        do
            echo "\"${target_dir}\" doesn't exist. What do you want to do?"
            read -p "[S]kip or [C]reate: " create_dir
        done

        if expr "$create_dir" : "^[Cc]$" > /dev/null
        then
            mkdir -p "$target_dir"
        else
            continue
        fi
    fi

    echo "Creating symlink at \"${target_path}\"."
    ln -s "$source_path" "$target_path"
done

verbose_echo "Cleaning up."
IFS="$ifs_original"

verbose_echo "${cmdname} completed at `date`."
exit $EX_OK
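
To see what the last steps of the loop do without touching a real home directory, the same commands can be tried in a scratch directory. This is a minimal sketch and not part of make-links.sh: the sandbox layout and the link_status variable are assumptions made up for the example.

```shell
#!/bin/sh
# Sandbox with a fake repository and a fake home directory
sandbox=$(mktemp -d)
mkdir -p "$sandbox/repo/.subversion" "$sandbox/home"
echo "example" > "$sandbox/repo/.subversion/config"

source_path="$sandbox/repo/.subversion/config"
target_path="$sandbox/home/.subversion/config"

# Same steps as the end of the loop: create the missing directory, then link
mkdir -p "$(dirname "$target_path")"
ln -s "$source_path" "$target_path"

# A second run of the script would hit the "already a symlink" case
if [ -L "$target_path" ]
then
    link_status="symlink"
else
    link_status="missing"
fi
echo "$target_path: $link_status"
```

Remove the sandbox directory afterwards with rm -Rf "$sandbox". The real script can be pointed at similar scratch directories with the -d option.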