N-way Git synchronization with extra cheese

Index

  1. Background
  2. Converting Subversion to Git
  3. Generate and version .gitignore files
  4. Git via proxy
  5. Setting up pull everywhere
  6. References

Background

I’ve got a desktop and server behind a router with a dynamic IP address at home, a desktop at work, and a laptop that floats around. I’d very much like to have the same settings on all of them, and to be able to synchronize them as easily as possible. I’ve been using Subversion for this, but I’ve had recent trouble with symlinks, and I’m concerned that storing the revision history centrally (even with backups now and then) is a Bad Move in the long term. So when I had to start using Git at work and realized that it could solve both problems (at least in theory), I set out to figure out how to do this. After lots of tries, each followed by rm -rf settings/, I think I’ve got a working setup. Of course, I can’t guarantee that any of this will work for you.

Converting Subversion to Git

Install the necessary software:
sudo apt-get install git-svn subversion

Copy the following code into a file named svn2git.sh, and run it as documented below.

svn2git.sh

#!/bin/sh
#
# NAME
#    svn2git.sh - Convert a Subversion repository to Git
#
# SYNOPSIS
#    svn2git.sh [options] <Subversion URL> 
#
# OPTIONS
#    --authors=path  Authors file
#    -v,--verbose    Verbose output
#
# EXAMPLE
#    /path/to/svn2git.sh https://example.org/foo
#
#    Create authors file for repository
#
#    /path/to/svn2git.sh -v --authors=authors.txt https://example.org/foo
#
#    Get Subversion repository to ./foo.git
#
# DESCRIPTION
#    Two-part script to migrate from Subversion to Git. First it tries to get
#    a list of the Subversion authors, so it can be formatted to fit the Git
#    commit structure. When running with the authors file, it will fetch the
#    entire Subversion revision history.
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2009 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Output error message with optional error code
error()
{
    if [ -z "$2" ]
    then
        error_code=$EX_UNKNOWN
    else
        error_code=$2
    fi
    echo "$1" >&2
    exit $error_code
}

usage()
{
    error "Usage: ${cmdname} [-v|--verbose] [--authors=path] <Subversion URL>" $EX_USAGE
}

verbose_echo()
{
    if [ $verbose ]
    then
        echo "$*"
    fi
}

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
directory_exists()
{
    if [ ! -d $1 ]
    then
        error "No such directory '${1}'
$2" $EX_NO_SUCH_DIR
    fi
}

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
executable_exists()
{
    if [ ! -x "$1" ]
    then
        error "No such executable '${1}'
$2" $EX_NO_SUCH_EXEC
    fi
}

PATH="/usr/bin:/bin"
cmdname=$(basename "$0")
directory=$PWD

# Exit codes from /usr/include/sysexits.h, as recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0           # successful termination
EX_USAGE=64       # command line usage error
EX_DATAERR=65     # data format error
EX_NOINPUT=66     # cannot open input
EX_NOUSER=67      # addressee unknown
EX_NOHOST=68      # host name unknown
EX_UNAVAILABLE=69 # service unavailable
EX_SOFTWARE=70    # internal software error
EX_OSERR=71       # system error (e.g., can't fork)
EX_OSFILE=72      # critical OS file missing
EX_CANTCREAT=73   # can't create (user) output file
EX_IOERR=74       # input/output error
EX_TEMPFAIL=75    # temp failure; user is invited to retry
EX_PROTOCOL=76    # remote error in protocol
EX_NOPERM=77      # permission denied
EX_CONFIG=78      # configuration error

# Custom errors
EX_UNKNOWN=1
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

# Process parameters
until [ $# -eq 0 ]
do
    case $1 in
        -v|--verbose)
            verbose=1
            shift
            ;;
        --authors=*)
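            authors_file=${1#--authors=}
            case $authors_file in
                /*)
                    # Absolute path: use as given
                    ;;
                *)
                    # Relative path: resolve against the starting directory
                    authors_file="${directory}/${authors_file}"
                    ;;
            esac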
            shift
            ;;
        *)
            if [ -z "$svn_url" ]
            then
                svn_url=$1
                shift
            else
                # Unknown parameter
                usage
            fi
            ;;
    esac
done

if [ -z "$svn_url" ]
then
    # No Subversion URL provided
    usage
fi

repository_name=$(basename "$svn_url")

verbose_echo "Running $cmdname at `date`."

# Preliminary checks
directory_exists "$directory"
executable_exists "/usr/bin/git"
git svn --version > /dev/null 2>&1 || error "git-svn is not installed" $EX_NO_SUCH_EXEC
executable_exists "/usr/bin/svn"

verbose_echo "Source repository: '${svn_url}'"

if [ -z "$authors_file" ]
then
    # Get authors file
    authors_file="${directory}/${repository_name}-authors.txt"
    if [ -e "$authors_file" ]
    then
        error "Authors file '${authors_file}' already exists"
    fi
    verbose_echo "Authors file: ${authors_file}"

    svn log --quiet "${svn_url}" | grep '^r.*' | cut -d ' ' -f 3- | cut -d '|' -f 1 | sort | uniq > "${authors_file}"

    # Strip the trailing space left over from the cut commands above
    author=$(head -n 1 "$authors_file" | sed 's/ *$//')
    echo "Please modify ${authors_file} to a format like"
    echo "${author} = Full Name <${author}@example.org>"
    echo "and rerun $cmdname with --authors=${authors_file}"
else
    if [ ! -e "$authors_file" ]
    then
        error "Authors file '${authors_file}' doesn't exist"
    fi

    git_target="${directory}/${repository_name}.git"
    if [ -e "$git_target" ]
    then
        error "Target repository '${git_target}' already exists"
    fi
    verbose_echo "Target repository: '${git_target}'"

    # Clone
    git svn clone --no-metadata --authors-file="${authors_file}" --revision 1:1 "$svn_url" "$git_target" || error "Clone failed"

    # Fetch
    cd "$git_target"
    batch_start=2
    revisions=$(svn info "$svn_url" | grep '^Revision:' | awk '{print $2}')
    while [ $batch_start -le $revisions ]
    do
        batch_end=$((batch_start + 990))
        if [ $batch_end -gt $revisions ]
        then
            batch_end=$revisions
        fi

        verbose_echo "Fetching revisions $batch_start through $batch_end"
        git svn fetch --authors-file="${authors_file}" --revision $batch_start:$batch_end || error "Fetch failed"

        batch_start=$((batch_end + 1))
    done

    git rebase git-svn

    verbose_echo "Applying svn:ignore properties"
    git svn show-ignore >> .git/info/exclude

    verbose_echo "Removing references to Subversion"
    git config --remove-section svn-remote.svn
    rm --recursive --force .git/svn/
fi

verbose_echo "Cleaning up."
cd "$directory"

verbose_echo "${cmdname} completed at `date`."
exit $EX_OK
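
The first run of the script leaves you with one Subversion login per line, which you then have to edit by hand. As a rough sketch of that editing step, the helper below pre-fills each line in the format git-svn expects. Note that authors_template is a name I made up, and that both the example.org domain and the reuse of the login as the full name are placeholders you would still need to correct.

```shell
#!/bin/sh
# Sketch: turn a list of Subversion logins (one per line) into authors-file
# lines of the form "login = Full Name <email>".
# Assumptions: "example.org" and the login doubling as the full name are
# placeholders only; edit the result before using it.
authors_template()
{
    # Trim surrounding blanks, drop empty lines, then repeat the login in
    # each of the three positions
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
        -e 's/.*/& = & <&@example.org>/' "$@"
}

# Example (hypothetical file names):
#   authors_template foo-authors.txt > foo-authors-filled.txt
```

The output is meant as a starting point for manual editing, not as a finished authors file.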

Now make sure you do a directory diff between the old Subversion and the new Git repositories to see if it succeeded.
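
One way to do that directory diff is sketched below. It assumes GNU diff (for --exclude), and the URL and paths in the example comments are hypothetical; compare_trees is just an illustrative name.

```shell
#!/bin/sh
# Sketch: compare two checked-out trees while ignoring version control
# metadata. Both directories are supplied by the caller.
# Assumption: GNU diff, which provides the --exclude option.
compare_trees()
{
    # -r recurses; --exclude skips the .git and .svn bookkeeping directories
    diff -r --exclude=.git --exclude=.svn "$1" "$2"
}

# Example (hypothetical URL and paths):
#   svn export https://example.org/foo /tmp/foo-svn
#   git clone foo.git /tmp/foo-git
#   compare_trees /tmp/foo-svn /tmp/foo-git && echo "Trees are identical"
```

Keep in mind that Git does not track empty directories, so any empty directory in the Subversion tree will be reported as a difference even when the conversion went fine.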

You can then get the repository on other machines using
git clone --origin example ssh://example.org/~/settings

Generate and version .gitignore files

This is an optional step in case you would like to version the old svn:ignore properties as .gitignore files:

exclude2gitignore.sh

#!/bin/sh
#
# NAME
#    exclude2gitignore.sh - Convert $GIT_DIR/info/exclude to corresponding
#    .gitignore files
#
# SYNOPSIS
#    exclude2gitignore.sh [options] /path/to/repository
#
# OPTIONS
#    -v,--verbose    Verbose output
#
# EXAMPLE
#    /path/to/exclude2gitignore.sh ~/foo
#
#    Create .gitignore files for the Git repository in ~/foo
#
# DESCRIPTION
#    Based on the format generated by `git-svn show-ignore`, where non-comment
#    lines indicate ignored files. Will try to put the .gitignore as close as
#    possible to the ignored file(s).
#
# BUGS
#    Email bugs to victor dot engmark at gmail dot com. Please include the
#    output of running this script in verbose mode (-v).
#
# COPYRIGHT AND LICENSE
#    Copyright (C) 2009 Victor Engmark
#
#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
################################################################################

# Output error message with optional error code
error()
{
    if [ -z "$2" ]
    then
        error_code=$EX_UNKNOWN
    else
        error_code=$2
    fi
    echo "$1" >&2
    exit $error_code
}

usage()
{
    error "Usage: ${cmdname} [-v|--verbose] /path/to/repository" $EX_USAGE
}

verbose_echo()
{
    if [ $verbose ]
    then
        echo "$*"
    fi
}

# Use for mandatory directory checks
# $1 is the directory path
# $2 is the (optional) error message
directory_exists()
{
    if [ ! -d "$1" ]
    then
        error "No such directory '${1}'
$2" $EX_NO_SUCH_DIR
    fi
}

# Make sure an executable is available
# $1 is the path to the executable
# $2 is the (optional) error message
executable_exists()
{
    if [ ! -x "$1" ]
    then
        error "No such executable '${1}'
$2" $EX_NO_SUCH_EXEC
    fi
}

PATH="/usr/bin:/bin"
cmdname=$(basename "$0")
directory=$PWD

# Exit codes from /usr/include/sysexits.h, as recommended by
# http://www.faqs.org/docs/abs/HTML/exitcodes.html
EX_OK=0           # successful termination
EX_USAGE=64       # command line usage error
EX_DATAERR=65     # data format error
EX_NOINPUT=66     # cannot open input
EX_NOUSER=67      # addressee unknown
EX_NOHOST=68      # host name unknown
EX_UNAVAILABLE=69 # service unavailable
EX_SOFTWARE=70    # internal software error
EX_OSERR=71       # system error (e.g., can't fork)
EX_OSFILE=72      # critical OS file missing
EX_CANTCREAT=73   # can't create (user) output file
EX_IOERR=74       # input/output error
EX_TEMPFAIL=75    # temp failure; user is invited to retry
EX_PROTOCOL=76    # remote error in protocol
EX_NOPERM=77      # permission denied
EX_CONFIG=78      # configuration error

# Custom errors
EX_UNKNOWN=1
EX_NO_SUCH_DIR=91
EX_NO_SUCH_EXEC=92

# Process parameters
until [ $# -eq 0 ]
do
    case $1 in
        -v|--verbose)
            verbose=1
            shift
            ;;
        *)
            if [ -z "$repository" ]
            then
                repository="${1%\/}"
                shift
            else
                # Unknown parameter
                usage
            fi
            ;;
    esac
done

verbose_echo "Running $cmdname at `date`."

directory_exists "$repository"

grep '^/' "${repository}/.git/info/exclude" | while IFS= read -r line
do
    ignore_path="${repository}${line}"
    verbose_echo "Starting with $ignore_path"
    ignore_name="$ignore_path"

    # Strip globs in path
    ignore_path=`dirname "$ignore_path"`
    while [ ! -e "$ignore_path" ]
    do
        ignore_path=`dirname "$ignore_path"`
    done

    # Remove the path, and the slash that follows it, from the file name
    ignore_name=${ignore_name#"$ignore_path"/}

    # Complete .gitignore path
    ignore_path="${ignore_path}/.gitignore"

    verbose_echo "$ignore_name >> $ignore_path"
    echo "$ignore_name" >> "$ignore_path"
done

verbose_echo "Cleaning up."
cd "$directory"

verbose_echo "${cmdname} completed at `date`."
exit $EX_OK
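
To make the path handling in the loop easier to follow, here is the same idea as a small stand-alone function that only prints what would be written where. It is a sketch: split_exclude_line is a hypothetical helper name, and it uses POSIX parameter expansion for the prefix removal.

```shell
#!/bin/sh
# Sketch of the path-splitting step: given a repository directory and one
# anchored exclude line, print "<.gitignore path>: <pattern>".
# split_exclude_line is a hypothetical helper, not part of the script above.
split_exclude_line()
{
    full_path="${1}${2}"

    # Walk up until we reach a directory that actually exists
    target_dir=$(dirname "$full_path")
    while [ ! -e "$target_dir" ]
    do
        target_dir=$(dirname "$target_dir")
    done

    # Whatever follows "<target_dir>/" is the pattern to write
    pattern=${full_path#"$target_dir"/}
    echo "${target_dir}/.gitignore: ${pattern}"
}

# Example: split_exclude_line ~/foo '/src/*.o'
```

If ~/foo/src exists, the pattern *.o ends up in ~/foo/src/.gitignore; if it does not, the pattern src/*.o goes into ~/foo/.gitignore instead.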

Git via proxy

One of the machines involved is behind a gateway machine at work, so I had to add the following to ~/.ssh/config:

Host work
     ProxyCommand ssh -q gateway.example.org nc %h %p
     HostName work-pc.example.org

With this, it’s possible to refer to just “work”, and SSH commands (even via Git) will take care of connecting via the proxy.
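
To check that the alias resolves the way you expect without actually connecting, something like the following might help. It is a sketch that assumes an OpenSSH version with the -G option (6.8 or later, as far as I know); resolved_hostname is a made-up name, and the optional second argument is an alternative configuration file.

```shell
#!/bin/sh
# Sketch: print the host name that ssh would really connect to for an alias.
# Assumption: OpenSSH with "ssh -G", which prints the resolved configuration
# without opening a connection.
# $1 is the alias, $2 is an optional configuration file
resolved_hostname()
{
    if [ -n "$2" ]
    then
        ssh -F "$2" -G "$1"
    else
        ssh -G "$1"
    fi | awk '$1 == "hostname" { print $2 }'
}

# Example: resolved_hostname work
```

With the configuration above, this should print work-pc.example.org rather than the alias itself.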

Setting up pull everywhere

The main idea here is to set up Git “remotes” pointing to all the other machines.

To get updates from the repository in ~/settings on home-pc.example.net, run the following on all machines (except, of course, the home machine itself):
git remote add home ssh://home-pc.example.net/~/settings

To be able to get the updates from the “work” host specified with a proxy above, just use “work” for the host name:
git remote add work ssh://work/~/settings
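
Since the same remotes have to be added on several machines, it can be handy to make this step safe to rerun. The sketch below adds a remote only when it is not configured yet; add_remote_if_missing is a hypothetical helper name, and it has to be run inside the settings repository.

```shell
#!/bin/sh
# Sketch: add a remote only if it is not configured yet, so the same setup
# commands can be rerun on every machine. Run inside the repository.
# $1 is the remote name, $2 is the URL
add_remote_if_missing()
{
    if git config --get "remote.${1}.url" > /dev/null
    then
        echo "Remote '${1}' already exists"
    else
        git remote add "$1" "$2"
    fi
}

# Example with the host names used in this post:
#   add_remote_if_missing home ssh://home-pc.example.net/~/settings
#   add_remote_if_missing work ssh://work/~/settings
```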

To pull from a machine whose IP address changes, you could set up a DynDNS account and run one of their recommended update scripts, so that the machine can always be reached under a single DNS name.

After cloning one of the copies on all of your hosts, you should be able to do the following to get all the changes from the repositories:
git remote update && git pull
If this doesn’t work, you might have more luck fetching each repository individually, and then rebasing to it:
git fetch home && git rebase home/master
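
Fetching each repository individually can also be wrapped in a loop, which has the advantage of carrying on when one machine (say, the laptop) is offline. This is only a sketch, and fetch_each_remote is a name I made up; run it inside the repository.

```shell
#!/bin/sh
# Sketch: fetch every configured remote in turn and keep going when one of
# them is unreachable. Run inside the repository.
# fetch_each_remote is a hypothetical helper name.
fetch_each_remote()
{
    failed=""
    for remote in $(git remote)
    do
        if git fetch --quiet "$remote"
        then
            echo "Fetched ${remote}"
        else
            echo "Could not fetch ${remote}" >&2
            failed="${failed} ${remote}"
        fi
    done

    # Succeed only if every remote could be fetched
    [ -z "$failed" ]
}

# Example: fetch_each_remote; git rebase home/master
```

The function returns a non-zero status if any remote failed, so it can be combined with && when you only want to continue after a complete fetch.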

To keep a backup on a separate machine, run
git clone --origin example ssh://example.org/~/settings
there, and set up push defaults on the other machines using
git config push.default matching
git remote add backup ssh://backup.example.org/~/settings

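Then you can run git push backup master to back up the local master branch. Recent Git versions refuse by default to push to the checked-out branch of a non-bare repository, so the backup may need to be a bare clone (git clone --bare).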
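
As a small convenience, the push can be wrapped so that it always sends whichever branch is currently checked out. The sketch below assumes that the backup repository is bare (created with git clone --bare or git init --bare), because Git by default refuses pushes to the checked-out branch of a non-bare repository; backup_current_branch is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: push the branch that is currently checked out to a named remote.
# Assumption: the remote repository is bare, so the push is not refused.
# $1 is the remote name (defaults to "backup")
backup_current_branch()
{
    # Fails on a detached HEAD, in which case there is no branch to push
    branch=$(git symbolic-ref --short HEAD) || return 1
    git push --quiet "${1:-backup}" "$branch"
}

# Example: backup_current_branch backup
```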

References
