Diffstat (limited to 'docs/git/stop-using-git-pull-to-deploy.md')
| -rw-r--r-- | docs/git/stop-using-git-pull-to-deploy.md | 98 |
1 files changed, 98 insertions, 0 deletions
diff --git a/docs/git/stop-using-git-pull-to-deploy.md b/docs/git/stop-using-git-pull-to-deploy.md
new file mode 100644
index 0000000..078c95b
--- /dev/null
+++ b/docs/git/stop-using-git-pull-to-deploy.md
@@ -0,0 +1,98 @@
+# Stop using `git pull` for deployment!
+
+## The problem
+
+* You have a Git repository containing your project.
+* You want to “deploy” that code when it changes.
+* You'd rather not download the entire project from scratch for each
+  deployment.
+
+## The antipattern
+
+“I know, I'll use `git pull` in my deployment script!”
+
+Stop doing this. Stop teaching other people to do this. It's wrong, and it
+will eventually lead to deploying something you didn't want.
+
+Deployment should be based on predictable, known versions of your code.
+Ideally, every deployable version has a tag (and you deploy exactly that tag),
+but even less formal processes, where you deploy a branch tip, should still be
+deploying exactly the code designated for release. `git pull`, however, can
+introduce new commits.
+
+`git pull` is a two-step process:
+
+1. Fetch the current branch's designated upstream remote, to obtain all of the
+   remote's new commits.
+2. Merge the current branch's designated upstream branch into the current
+   branch.
+
+The merge commit means the actual deployed tree might _not_ be identical to
+the intended deployment tree. Local changes (intentional or otherwise) will be
+preserved (and merged) into the deployment, for example; once this happens,
+the actual deployed commit will _never_ match the intended commit.
+
+`git pull` will approximate the right thing “by accident”: if the current
+local branch (generally `master`) for people using `git pull` is always clean,
+and always tracks the desired deployment branch, then `git pull` will update
+to the intended commit exactly. This is pretty fragile, though; many git
+commands can cause the local branch to diverge from its upstream branch, and
+once that happens, `git pull` will always create new commits. You can patch
+around the fragility a bit using the `--ff-only` option, but that only tells
+you when your deployment environment has diverged and doesn't fix it.
+
+## The right pattern
+
+Quoting [Sitaram Chamarty](http://gitolite.com/the-list-and-irc/deploy.html):
+
+> Here's what we expect from a deployment tool. Note the rule numbers --
+> we'll be referring to some of them simply by number later.
+>
+> 1. All files in the branch being deployed should be copied to the
+>    deployment directory.
+>
+> 2. Files that were deleted in the git repo since the last deployment
+>    should get deleted from the deployment directory.
+>
+> 3. Any changes to tracked files in the deployment directory after the
+>    last deployment should be ignored when following rules 1 and 2.
+>
+>    However, sometimes you might want to detect such changes and abort if
+>    you found any.
+>
+> 4. Untracked files in the deploy directory should be left alone.
+>
+>    Again, some people might want to detect this and abort the deployment.
+
+Sitaram's own documentation talks about how to accomplish these when
+“deploying” straight out of a bare repository. That's unwise (not to mention
+impractical) in most cases; deployment should use a dedicated clone of the
+canonical repository.
+
+I also disagree with point 3, preferring to keep deployment-related changes
+outside of tracked files. This makes it much easier to argue that the changes
+introduced to configure the project for deployment do not introduce new bugs
+or other surprise features.
+
+My deployment process, given a dedicated clone at `$DEPLOY_TREE`, is as
+follows:
+
+    cd "${DEPLOY_TREE}"
+    git fetch --all
+    git checkout --force "${TARGET}"
+    # Following two lines only required if you use submodules
+    git submodule sync
+    git submodule update --init --recursive
+    # Follow with actual deployment steps (run fabric/capistrano/make/etc)
+
+`$TARGET` is either a tag name (`v1.2.1`) or a remote branch name
+(`origin/master`), but could also be a commit hash or anything else Git
+recognizes as a revision. This will detach the head of the `$DEPLOY_TREE`
+repository, which is fine as no new changes should be authored in this
+repository (so the local branches are irrelevant). The warning Git emits when
+`HEAD` becomes detached is unimportant in this case.
+
+The tracked contents of `$DEPLOY_TREE` will end up identical to the desired
+commit, discarding local changes. The pattern above is very similar to what
+most continuous integration servers use when building from Git repositories,
+for much the same reason.
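Reviewer note, not part of the commit: the fetch-then-checkout sequence described in the document could be wrapped in a small shell function along the following lines. This is only a sketch under stated assumptions. The function name `deploy_checkout` and its two positional arguments (the deployment clone and the target revision) are hypothetical names chosen for illustration. Resolving the target before checking it out, warning about modified tracked files, and comparing `HEAD` afterwards are additions that the document does not prescribe. Submodule handling is left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): deploy_checkout TREE TARGET
# TREE is a dedicated deployment clone; TARGET is a tag, a remote branch
# such as origin/master, or a commit hash.
deploy_checkout() {
    tree=$1
    target=$2

    # Update remote-tracking refs; no local branch is merged or moved.
    git -C "$tree" fetch --all --quiet || return 1

    # Resolve the target to a commit ID first so a typo fails early.
    want=$(git -C "$tree" rev-parse --verify --quiet "${target}^{commit}") || {
        echo "unknown target: $target" >&2
        return 1
    }

    # Assumed policy: warn (rather than abort) when tracked files were edited.
    if [ -n "$(git -C "$tree" status --porcelain --untracked-files=no)" ]; then
        echo "warning: discarding local changes to tracked files in $tree" >&2
    fi

    # Detach HEAD at exactly that commit, overwriting tracked files.
    git -C "$tree" checkout --force --quiet --detach "$want" || return 1

    # Succeed only if the checked-out commit is the requested one.
    [ "$(git -C "$tree" rev-parse HEAD)" = "$want" ]
}
```

A call such as `deploy_checkout "$DEPLOY_TREE" v1.2.1` would then be followed by the actual deployment steps. Passing `--detach` with an already resolved commit ID makes the detached `HEAD` explicit, which matches the document's point that local branches in the deployment clone are irrelevant.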
