Merging multiple git repositories
Today I had to merge a bunch of existing git repositories into a new monorepo. I had the following requirements:
- The entire history of each repository had to stay intact. This includes the requirement that git log -- somefile works without needing any workarounds.
- Each repository $repo should be merged into the subdirectory $repo of the monorepo.
- Being able to merge later changes from the original repositories was not a requirement, because they were going to be deleted.
The first two requirements meant that we could use neither subtree merges – as those break file history – nor ordinary merges – as those would put the contents of $repo in the root, not a subdirectory.
The solution is to use git filter-branch to rewrite the paths in the repositories, and only then merge the repositories into the monorepo. The final script looked like this:
new_repo=/tmp/scratch/new_repo
base=git@bitbucket.org:customer
projects=(proj1 proj2)

mkdir -p "$new_repo"
cd "$new_repo"
git init
# An empty root commit gives every project something to merge into.
git commit --allow-empty -m "Initial commit"

for prj in "${projects[@]}"; do
  git remote add "$prj" "$base/$prj"
  git fetch "$prj"
  # Rewrite every commit so that all paths live under $prj/.
  git filter-branch -f --index-filter \
    'git ls-files -s | sed "s%\t\"*%&'"$prj"'/%" |
     GIT_INDEX_FILE=$GIT_INDEX_FILE.new git update-index --index-info &&
     mv "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE"' "$prj/master"
  git merge -m "Merge $prj" "$prj/master"
  git remote rm "$prj"
done
Part of that script is based on an answer on Stack Overflow.
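The index filter is the least obvious part of the script, so here is a minimal sketch of what its sed expression does. It runs on a simulated git ls-files -s listing instead of a real index. The hashes and file names are made up for illustration, and a literal tab stands in for \t, which only GNU sed understands.

```shell
# Simulated `git ls-files -s` output: "<mode> <sha> <stage><TAB><path>".
# The hashes and paths are placeholders, not real objects.
prj=proj1
tab=$(printf '\t')
listing=$(printf '100644 %s 0\t%s\n' \
  1111111111111111111111111111111111111111 'README.md' \
  2222222222222222222222222222222222222222 '"caf\303\251.txt"')

# Match the tab plus any opening quote, then append "$prj/" right after it.
# Git quotes unusual paths, so the prefix has to land inside the quotes.
rewritten=$(printf '%s\n' "$listing" | sed "s%${tab}\"*%&$prj/%")

printf '%s\n' "$rewritten"
```

Every path gains the proj1/ prefix, and for the quoted path the prefix ends up after the opening quote. In the real script this rewritten listing is fed to git update-index --index-info to build the new index for each commit.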
It should be easy enough to adapt this script for different needs (merging repositories from multiple sources, into different subdirectories and so on).
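As a rough sketch of one such adaptation, the project list could carry a target subdirectory and a clone URL per entry. The "subdir=url" format, the variable names and the URLs below are all hypothetical. The snippet only prints the plan; the git commands from the loop above would go where the comment indicates.

```shell
# Hypothetical mapping, one "<target subdirectory>=<clone URL>" per line.
# Both URLs are placeholders.
sources='libs/core=git@bitbucket.org:customer/core
apps/web=https://example.com/other/web.git'

plan=$(printf '%s\n' "$sources" | while IFS= read -r entry; do
  subdir=${entry%%=*}                               # text before the first "="
  url=${entry#*=}                                   # text after the first "="
  remote=$(printf '%s' "$subdir" | tr '/' '-')      # flat name for the remote
  # The real loop body would go here: git remote add "$remote" "$url",
  # git fetch, filter-branch with "$subdir" as the prefix, then merge.
  printf '%s %s %s\n' "$remote" "$url" "$subdir"
done)

printf '%s\n' "$plan"
```

Because the sed expression in the script uses % as its delimiter, a subdirectory containing slashes can be used as the prefix without extra escaping.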
Update: Beginning with Git 2.9, git merge refuses to merge unrelated histories unless the --allow-unrelated-histories flag is used, so on newer versions the git merge call in the script needs that flag.
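To see the flag in action outside the script, here is a small throwaway demo, assuming Git 2.9 or later. It creates two repositories without a common ancestor in a temporary directory and merges one into the other. The directory layout, the file name and the demo identity are made up for illustration.

```shell
# Throwaway demo: two repositories that share no history.
work=$(mktemp -d)
# Set an identity via the environment so the demo does not rely on global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/mono"
git -C "$work/mono" commit -q --allow-empty -m "Initial commit"

git init -q "$work/proj1"
echo hello > "$work/proj1/file.txt"
git -C "$work/proj1" add file.txt
git -C "$work/proj1" commit -q -m "Add file"

cd "$work/mono"
git fetch -q "$work/proj1" HEAD
# Without the flag, Git 2.9+ refuses to merge here.
git merge -q --allow-unrelated-histories -m "Merge proj1" FETCH_HEAD

# A merge commit is listed as itself plus its two parents.
merged_words=$(git rev-list --parents -n 1 HEAD | wc -w)
```

In the script, the same flag would go on the git merge -m "Merge $prj" "$prj/master" line.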