| author | Owen Jacobson <owen@grimoire.ca> | 2020-01-28 20:49:17 -0500 |
|---|---|---|
| committer | Owen Jacobson <owen@grimoire.ca> | 2020-01-28 23:23:18 -0500 |
| commit | 0d6f58c54a7af6c8b4e6cd98663eb36ec4e3accc (patch) | |
| tree | a2af4dc93f09a920b0ca375c1adde6d8f64eb6be /wiki/git | |
| parent | acf6f5d3bfa748e2f8810ab0fe807f82efcf3eb6 (diff) | |
Editorial pass & migration to mkdocs.
There's a lot in grimoire.ca that I either no longer stand behind or feel pretty weird about having out there.
Diffstat (limited to 'wiki/git')
| -rw-r--r-- | wiki/git/config.md | 58 |
| -rw-r--r-- | wiki/git/detached-sigs.md | 298 |
| -rw-r--r-- | wiki/git/integrate.md | 41 |
| -rw-r--r-- | wiki/git/pull-request-workflow.md | 101 |
| -rw-r--r-- | wiki/git/scratch.md | 55 |
| -rw-r--r-- | wiki/git/stop-using-git-pull-to-deploy.md | 98 |
| -rw-r--r-- | wiki/git/survival.md | 81 |
| -rw-r--r-- | wiki/git/theory-and-practice/index.md | 42 |
| -rw-r--r-- | wiki/git/theory-and-practice/objects.md | 125 |
| -rw-r--r-- | wiki/git/theory-and-practice/refs-and-names.md | 94 |
10 files changed, 0 insertions, 993 deletions
diff --git a/wiki/git/config.md b/wiki/git/config.md deleted file mode 100644 index 9ee058b..0000000 --- a/wiki/git/config.md +++ /dev/null @@ -1,58 +0,0 @@ -# git-config Settings You Want - -Git comes with some fairly [lkml](http://www.tux.org/lkml/)-specific -configuration defaults. You should fix this. All of the items below can be set -either for your entire login account (`git config --global`) or for a specific -repository (`git config`). - -Full documentation is under `git help config`, unless otherwise stated. - -* `git config user.name 'Your Full Name'` and `git config user.email - 'your-email@example.com'`, obviously. - -* `git config push.default simple` - the default behaviour (called `matching`) - of an unqualified `git push` is to identify pairs of branches by name and - push all matches from your local repository to the remote. Given that - branches have explicit “upstream” configuration identifying which, if any, - branch in which, if any, remote they're associated with, this is dumb. The - `simple` mode pushes the current branch to its upstream remote, if and only - if the local branch name and the remote branch name match _and_ the local - branch tracks the remote branch. Requires Git 1.8 or later; will be the - default in Git 2.0. (For older versions of Git, use `upstream` instead, - which does not require that branch names match.) - -* `git config merge.defaultToUpstream true` - causes an unqualified `git - merge` to merge the current branch's configured upstream branch, rather than - being an error. (`git rebase` always has this behaviour. Consistent!) You - should still merge thoughtfully. - -* `git config rebase.autosquash true` - causes `git rebase -i` to parse magic - comments created by `git commit --squash=some-hash` and `git commit - --fixup=some-hash` and reorder the commit list before presenting it for - further editing. 
See the descriptions of “squash” and “fixup” in `git help - rebase` for details; autosquash makes amending commits other than the most - recent easier and less error-prone. - -* `git config branch.autosetupmerge always` - newly-created branches whose - start point is a branch (`git checkout master -b some-feature`, `git branch - some-feature origin/develop`, and so on) will be configured to have the - start point branch as their upstream. By default (with `true` rather than - `always`) this only happens when the start point is a remote-tracking - branch. - -* `git config rerere.enabled true` - enable “reuse recorded resolution.” The - `git help rerere` docs explain it pretty well, but the short version is that - git can record how you resolve conflicts during a “test” merge and reuse the - same approach when resolving the same conflict later, in a “real” merge. - -## For advanced users - -A few things are nice when you're getting started, but become annoying when -you no longer need them. - -* `git config advice.detachedHead false` - if you already understand the difference - between having a branch checked out and having a commit checked out, and - already understand what “detached head” means, the warning on every `git - checkout ...some detached thing...` isn't helping anyone. This is also - useful in repositories used for deployment, where specific commits (from tags, - for example) are regularly checked out. diff --git a/wiki/git/detached-sigs.md b/wiki/git/detached-sigs.md deleted file mode 100644 index b94013c..0000000 --- a/wiki/git/detached-sigs.md +++ /dev/null @@ -1,298 +0,0 @@ -# Notes Towards Detached Signatures in Git - -Git supports a limited form of object authentication: specific object -categories in Git's internal model can have [GPG](/gpg/terrible) signatures -embedded in them, allowing the authorship of the objects to be verified using -[GPG](/gpg/cool)'s underlying trust model.
Tag signatures can be used to -verify the authenticity and integrity of the _snapshot associated with a -tag_, and the authenticity of the tag itself, filling a niche broadly similar -to code signing in binary distribution systems. Commit signatures can be used -to verify the authenticity of the _snapshot associated with the commit_, and -the authorship of the commit itself. (Conventionally, commit signatures are -assumed to also authenticate either the entire line of history leading to a -commit, or the diff between the commit and its first parent, or both.) - -Git's existing system has some tradeoffs. - -* Signatures are embedded within the objects they sign. The signature is part - of the object's identity; since Git is content-addressed, this means that - an object can neither be retroactively signed nor retroactively stripped of - its signature without modifying the object's identity. Git's distributed - model means that these sorts of identity changes are both complicated and - easily detected. - -* Commit signatures are second-class citizens. They're a relatively recent - addition to the Git suite, and both the implementation and the social - conventions around them continue to evolve. - -* Only some objects can be signed. While Git has relatively weak rules about - workflow, the signature system assumes you're using one of Git's more - widespread workflows by limiting your options to at most one signature, and - by restricting signatures to tags and commits (leaving out blobs, trees, - and refs). - -I believe it would be useful from an authentication standpoint to add -"detached" signatures to Git, to allow users to make these tradeoffs -differently if desired. These signatures would be stored as separate (blob) -objects in a dedicated `refs` namespace, supporting retroactive signatures, -multiple signatures for a given object, "policy" signatures, and -authentication of arbitrary objects. 
- -The following notes are partially guided by Git's one existing "detached -metadata" facility, `git notes`. Similarities are intentional; divergences -will be noted where appropriate. Detached signatures are meant to -interoperate with existing Git workflow as much as possible: in particular, -they can be fetched and pushed like any other bit of Git metadata. - -A detached signature cryptographically binds three facts together into an -assertion whose authenticity can be checked by anyone with access to the -signatory's keys: - -1. An object (in the Git sense; a commit, tag, tree, or blob), -2. A policy label, and -3. A signatory (a person or agent making the assertion). - -These assertions can be published separately from or in tandem with the -objects they apply to. - -## Policies - -Taking a hint from Monotone, every signature includes a "policy" identifying -how the signature is meant to be interpreted. Policies are arbitrary strings; -their meaning is entirely defined by tooling and convention, not by this -draft. - -This draft uses a single policy, `author`, for its examples. A signature -under the `author` policy implies that the signatory had a hand in the -authorship of the designated object. (This is compatible with existing -interpretations of signed tags and commits.) (Authorship under this model is -strictly self-attested: you can claim authorship of anything, and you cannot -assert anyone else's authorship.) - -The Monotone documentation suggests a number of other useful policies related -to testing and release status, automated build results, and numerous other -factors. Use your imagination. - -## What's In A Signature - -Detached signatures cover the disk representation of an object, as given by - - git cat-file <TYPE> <SHA1> - -For most of Git's object types, this means that the signed content is plain -text. 
For `tree` objects, the signed content is the awful binary -representation of the tree, _not_ the pretty representation given by `git -ls-tree` or `git show`. - -Detached signatures include the "policy" identifier in the signed content, to -prevent others from tampering with policy choices via `refs` hackery. (This -will make more sense momentarily.) The policy identifier is prepended to the -signed content, terminated by a zero byte (as with Git's own type -identifiers, but without a length field as length checks are performed by -signing and again when the signature is stored in Git). - -To generate the _complete_ signable version of an object, use something -equivalent to the following shell snippet: - - # generate-signable POLICY TYPE SHA1 - function generate-signable() { - printf '%s\0' "$1" - git cat-file "$2" "$3" - } - -(In the process of writing this, I discovered how hard it is to get Unix's -C-derived shell tools to emit a zero byte.) - -## Signature Storage and Naming - -We assume that a userid will sign an object at most once. - -Each signature is stored in an independent blob object in the repository it -applies to. The signature object (described above) is stored in Git, and its -hash recorded in `refs/signatures/<POLICY>/<SUBJECT SHA1>/<SIGNER KEY -FINGERPRINT>`. - - # sign POLICY TYPE SHA1 FINGERPRINT - function sign() { - local SIG_HASH=$( - generate-signable "$@" | - gpg --batch --no-tty --sign -u "$4" | - git hash-object --stdin -w -t blob - ) - # point the signature ref at the newly written signature blob - git update-ref "refs/signatures/$1/$3/$4" "$SIG_HASH" - } - -Stored signatures always use the complete fingerprint to identify keys, to -minimize the risk of colliding key IDs while avoiding the need to store full -keys in the `refs` naming hierarchy. - -The policy name can be reliably extracted from the ref, as the trailing part -has a fixed length (in both path segments and bytes) and each ref begins with -a fixed, constant prefix `refs/signatures/`.
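The fixed layout of signature refs makes them easy to split mechanically. As a minimal sketch (the function name `parse_signature_ref` is hypothetical, not part of the draft or of Git), Bash parameter expansion can peel the fingerprint and subject hash off the end of the ref and treat whatever remains after the prefix as the policy:

```shell
# parse_signature_ref REF   (hypothetical helper, for illustration only)
# Prints "POLICY SHA1 FINGERPRINT" on one line, or returns 1 if REF does not
# look like refs/signatures/<POLICY>/<SHA1>/<FINGERPRINT>.
parse_signature_ref() {
    local ref=$1 prefix='refs/signatures/'
    case $ref in
        "$prefix"*/*/*) ;;          # prefix plus at least three segments
        *) return 1 ;;
    esac
    local rest=${ref#"$prefix"}     # drop the constant prefix
    local fingerprint=${rest##*/}   # last path segment
    rest=${rest%/*}
    local sha1=${rest##*/}          # second-to-last path segment
    local policy=${rest%/*}         # everything else; may itself contain "/"
    [ -n "$policy" ] && [ -n "$sha1" ] && [ -n "$fingerprint" ] || return 1
    printf '%s %s %s\n' "$policy" "$sha1" "$fingerprint"
}
```

Working from the end of the ref rather than the start is a deliberate choice: it relies only on the two trailing segments being present, so a policy label containing slashes still parses.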
- -## Signature Verification - -Given a signature ref as described above, we can verify and authenticate the -signature and bind it to the associated object and policy by performing the -following checks: - -1. Pick apart the ref into policy, SHA1, and key fingerprint parts. -2. Reconstruct the signed body as above, using the policy name extracted from - the ref. -3. Retrieve the signature from the ref and combine it with the object itself. -4. Verify that the policy in the stored signature matches the policy in the - ref. -5. Verify the signature with GPG: - - # verify-gpg POLICY TYPE SHA1 FINGERPRINT - verify-gpg() { - { - git cat-file "$2" "$3" - # signatures are stored as blobs, so name the type explicitly - git cat-file blob "refs/signatures/$1/$3/$4" - } | gpg --batch --no-tty --verify - } - -6. Verify the key fingerprint of the signing key matches the key fingerprint - in the ref itself. - -The specific rules for verifying the signature in GPG are left up to the user -to define; for example, some sites may want to auto-retrieve keys and use a -web of trust from some known roots to determine which keys are trusted, while -others may wish to maintain a specific, known keyring containing all signing -keys for each policy, and skip the web of trust entirely. This can be -accomplished via `git-config`, given some work, and via `gpg.conf`. - -## Distributing Signatures - -Since each signature is stored in a separate ref, and since signatures are -_not_ expected to be amended once published, the following refspec can be -used with `git fetch` and `git push` to distribute signatures: - - refs/signatures/*:refs/signatures/* - -Note the lack of a `+` decoration; we explicitly do not want to auto-replace -modified signatures, normally; explicit user action should be required.
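One way to use that refspec without changing a remote's stored configuration is to pass it explicitly on each fetch or push. The sketch below is illustrative only: the helper names and the default remote name `origin` are assumptions, not part of the draft.

```shell
# The refspec from the text, copied verbatim and without a leading "+".
SIGNATURE_REFSPEC='refs/signatures/*:refs/signatures/*'

# fetch_signatures [REMOTE]: hypothetical helper; REMOTE defaults to "origin".
fetch_signatures() {
    git fetch "${1:-origin}" "$SIGNATURE_REFSPEC"
}

# push_signatures [REMOTE]: hypothetical helper; REMOTE defaults to "origin".
push_signatures() {
    git push "${1:-origin}" "$SIGNATURE_REFSPEC"
}
```

Passing the refspec on the command line keeps signature transfer an explicit, separate action, so the behaviour of an unqualified `git fetch` or `git push` stays untouched.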
- -## Workflow Notes - -There are two verification workflows for signatures: "static" verification, -where the repository itself already contains all the refs and objects needed -for signature verification, and "pre-receive" verification, where an object -and its associated signature may be uploaded at the same time. - -_It is impractical to verify signatures on the fly from an `update` hook_. -Only `pre-receive` hooks can usefully accept or reject ref changes depending -on whether the push contains a signature for the pushed objects. (Git does -not provide a good mechanism for ensuring that signature objects are pushed -before their subjects.) Correctly verifying object signatures during -`pre-receive` regardless of ref order is far too complicated to summarize -here. - -## Attacks - -### Lies of Omission - -It's trivial to hide signatures by deleting the signature refs. Similarly, -anyone with access to a repository can delete any or all detached signatures -from it without otherwise invalidating the signed objects. - -Since signatures are mostly static, sites following the recommended no-force -policy for signature publication should only be affected if relatively recent -signatures are deleted. Older signatures should be available in one or more -of the repository users' local repositories; once created, a signature can be -legitimately obtained from anywhere, not only from the original signatory. - -The signature naming protocol is designed to resist most other forms of -assertion tampering, but straight-up omission is hard to prevent. - -### Unwarranted Certification - -The `policy` system allows any signatory to assert any policy.
While -centralized signature distribution points such as "release" repositories can -make meaningful decisions about which signatures they choose to accept, -publish, and propagate, there's no way to determine after the fact whether a -policy assertion was obtained from a legitimate source or a malicious one -with no grounds for asserting the policy. - -For example, I could, right now, sign an `all-tests-pass` policy assertion -for the Linux kernel. While there's no chance on Earth that the LKML team -would propagate that assertion, if I can convince you to fetch signatures -from my repository, you will fetch my bogus assertion. If `all-tests-pass` is -a meaningful policy assertion for the Linux kernel, then you will have very -few options besides believing that I assert that all tests have passed. - -### Ambiguous Policy - -This is an ongoing problem with crypto policy systems and user interfaces -generally, but this design does _nothing_ to ensure that policies are -interpreted uniformly by all participants in a repository. In particular, -there's no mechanism described for distributing either prose or programmatic -policy definitions and checks. All policy information is out of band. - -Git already has ambiguity problems around commit signing: there are multiple -ways to interpret a signature on a commit: - -1. I assert that this snapshot and commit message were authored as described - in this commit's metadata. (In this interpretation, the signature's - authenticity guarantees do _not_ transitively apply to parents.) - -2. I assert that this snapshot and commit message were authored as described - in this commit's metadata, based on exactly the parent commits described. - (In this interpretation, the signature's authenticity guarantees _do_ - transitively apply to parents. This is the interpretation favoured by XXX - LINK HERE XXX.) - -3. I assert that this _diff_ and commit message were authored as described in - this commit's metadata.
(No assertions about the _snapshot_ are made - whatsoever, and assertions about parentage are barely sensical at all. - This meshes with widespread, diff-oriented policies.) - -### Grafts and Replacements - -Git permits post-hoc replacement of arbitrary objects via both the grafts -system (via an untracked, non-distributed file in `.git`, though some -repositories distribute graft lists for end-users to manually apply) and the -replacements system (via `refs/replace/<SHA1>`, which can optionally be -fetched or pushed). The interaction between these two systems and signature -verification needs to be _very_ closely considered; I've not yet done so. - -Cases of note: - -* Neither signature nor subject replaced - the "normal" case -* Signature not replaced, subject replaced (by graft, by replacement, by both) -* Signature replaced, subject not replaced -* Both signature and subject replaced - -It's tempting to outright disable `git replace` during signing and -verification, but this will have surprising effects when signing a ref-ish -instead of a bare hash. Since this is the _normal_ case, I think this merits -more thought. (I'm also not aware of a way to disable grafts without -modifying `.git`, and having the two replacement mechanisms treated -differently may be dangerous.) - -### No Signed Refs - -I mentioned early in this draft that Git's existing signing system doesn't -support signing refs themselves; since refs are an important piece of Git's -workflow ecosystem, this may be a major omission. Unfortunately, this -proposal doesn't address that. - -## Possible Refinements - -* Monotone's certificate system is key+value based, rather than label-based. - This might be useful; while small pools of related values can be asserted - using mutually exclusive policy labels (whose mutual exclusion is a matter - of local interpretation), larger pools of related values rapidly become - impractical under the proposed system. 
- - For example, this proposal would be inappropriate for directly asserting - third-party authorship; the asserted author would have to appear in the - policy name itself, exposing the user to a potentially very large number of - similar policy labels. - -* Ref signing via a manifest (a tree constellation whose paths are ref names - and whose blobs sign the refs' values). Consider cribbing DNSSEC here for - things like lightweight absence assertions, too. - -* Describe how this should interact with commit-duplicating and - commit-rewriting workflows. diff --git a/wiki/git/integrate.md b/wiki/git/integrate.md deleted file mode 100644 index 801ddd5..0000000 --- a/wiki/git/integrate.md +++ /dev/null @@ -1,41 +0,0 @@ -# Integrating with Git: A Field Guide - -Pretty much everything you might want to do to a Git repository when writing -tooling or integrations should be done by shelling out to one `git` command or -another. - -## Finding Git's trees - -Git commands can be invoked from locations other than the root of the work -tree or git directory. You can find either of those by invoking `git -rev-parse`. - -To find the absolute path to the root of the work tree: - - git rev-parse --show-toplevel - -This will output the absolute path to the root of the work tree on standard -output, followed by a newline. Since the work tree's absolute path can contain -whitespace (including newlines), you should assume every byte of output save -the final newline is part of the path, and if you're using this in a shell -script, quote defensively. - -To find the relative path from the current working directory: - - git rev-parse --show-cdup - -This will output the relative path to the root of the work tree on standard -output, followed by a newline. - -For bare repositories, both commands will output nothing and exit with a zero -status. (Surprise!) 
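Those caveats can be folded into a small wrapper. This is a sketch rather than a vetted library function: the name `worktree_root` is made up for illustration, and it treats empty output as a failure so that a bare repository is reported the same way as "not in a work tree at all".

```shell
# worktree_root: print the absolute path of the work tree root.
# Returns non-zero outside a work tree, including in a bare repository
# where git may print nothing at all.
worktree_root() {
    local top
    # Declare first, assign second, so git's exit status is not masked by `local`.
    top=$(git rev-parse --show-toplevel 2>/dev/null) || return 1
    [ -n "$top" ] || return 1
    printf '%s\n' "$top"
}
```

A caller would typically write `root=$(worktree_root) || exit 1` and then quote `"$root"` everywhere it is used. One limitation to be aware of: command substitution strips trailing newlines, so a work tree whose path ends in a newline would not survive this wrapper intact.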
- -To find *a* path to the root of the git directory: - - git rev-parse --git-dir - -This will output either the relative or the absolute path to the git -directory, followed by a newline. - -All three of these commands will exit with non-zero status when run outside of -a work tree or git directory. Check for it. diff --git a/wiki/git/pull-request-workflow.md b/wiki/git/pull-request-workflow.md deleted file mode 100644 index 700eeb6..0000000 --- a/wiki/git/pull-request-workflow.md +++ /dev/null @@ -1,101 +0,0 @@ -# Life With Pull Requests - -I've been party to a number of discussions with folks contributing to -pull-request-based projects on Github (and other hosts, but mostly Github). -Because of Git's innate flexibility, there are lots of ways to work with pull -requests. Here's mine. - -I use a couple of naming conventions here that are not stock `git`: - -origin -: The repository to which you _publish_ proposed changes - -upstream -: The repository from which you receive ongoing development, and which will - receive your changes. - -## One-time setup - -Do these things once, when starting out on a project. Keep the results around -for later. - -I'll be referring to the original project repository as `upstream` and -pretending its push URL is `UPSTREAM-URL` below. In real life, the URL will -often be something like `git@github.com:someguy/project.git`. - -### Fork the project - -Use the repo manager's forking tool to create a copy of the project in your -own namespace. This generally creates your copy with a bunch of useless tat; -feel free to ignore all of this, as the only purpose of this copy is to -provide somewhere for _you_ to publish _your_ changes. - -We'll be calling this repository `origin` later. Assume it has a URL, which -I'll abbreviate `ORIGIN-URL`, for `git push` to use. - -(You can leave this step for later, but if you know you're going to do it, why -not get it out of the way?) 
- -### Clone the project and configure it - -You'll need a clone locally to do work in. Create one from `origin`: - - git clone ORIGIN-URL some-local-name - -While you're here, `cd` into it and add the original project as a remote: - - cd some-local-name - git remote add upstream UPSTREAM-URL - -## Feature process - -Do these things for each feature you work on. To switch features, just use -`git checkout my-feature`. - -### Create a new feature branch locally - -We use `upstream`'s `master` branch here, so that your feature includes all of -`upstream`'s state initially. We also need to make sure our local cache of -`upstream`'s state is correct: - - git fetch upstream - git checkout upstream/master -b my-feature - -### Do work - -If you need my help here, stop now. - -### Integrate upstream changes - -If you find yourself needing something that's been added upstream, use -_rebase_ to integrate it to avoid littering your feature branch with -“meaningless” merge commits. - - git checkout my-feature - git fetch upstream - git rebase upstream/master - -### Publish your branch - -When you're “done,” publish your branch to your personal repository: - - git push origin my-feature - -Then visit your copy in your repo manager's web UI and create a pull request -for `my-feature`. - -### Integrating feedback - -Very likely, your proposed changes will need work. If you use history-editing -to integrate feedback, you will need to use `--force` when updating the -branch: - - git push --force origin my-feature - -This is safe provided two things are true: - -1. **The branch has not yet been merged to the upstream repo.** -2. You are only force-pushing to your fork, not to the upstream repo. - -Generally, no other users will have work based on your pull request, so -force-pushing history won't cause problems. 
diff --git a/wiki/git/scratch.md b/wiki/git/scratch.md deleted file mode 100644 index a26c98f..0000000 --- a/wiki/git/scratch.md +++ /dev/null @@ -1,55 +0,0 @@ -# Git Is Not Magic - -I'm bored. Let's make a git repository out of whole cloth. - -Git repos are stored in .git: - - fakegit$ mkdir .git - -They have a “symbolic ref” (symbolic refs are text files, see [`man -git-symbolic-ref`](http://jk.gs/git-symbolic-ref.html)) named `HEAD`, pointing -to the currently checked-out branch. Let's use `master`. Branches are refs -under `refs/heads` (see [`man git-branch`](http://jk.gs/git-branch.html)): - - fakegit ((unknown))$ echo 'ref: refs/heads/master' > .git/HEAD - -They have an object database and a refs database, both of which are simple -directories (see [`man -gitrepository-layout`](http://jk.gs/gitrepository-layout.html) and [`man -gitrevisions`](http://jk.gs/gitrevisions.html)). Let's also enable the reflog, -because it's a great safety net if you use history-editing tools in git: - - fakegit ((ref: re...))$ mkdir .git/refs .git/objects .git/logs - fakegit (master #)$ - -Now `__git_ps1`, at least, is convinced that we have a working git repository. -Does it work? - - fakegit (master #)$ echo 'Hello, world!' > hello.txt - fakegit (master #)$ git add hello.txt - fakegit (master #)$ git commit -m 'Initial commit' - [master (root-commit) 975307b] Initial commit - 1 file changed, 1 insertion(+) - create mode 100644 hello.txt - - fakegit (master)$ git log - commit 975307ba0485bff92e295e3379a952aff013c688 - Author: Owen Jacobson <owen.jacobson@grimoire.ca> - Date: Wed Feb 6 10:07:07 2013 -0500 - - Initial commit - -[Eeyup](https://www.youtube.com/watch?v=3VwVpaWUu30). - ------ - -Should you do this? **Of course not.** Anywhere you could run these commands, -you could instead run `git init` or `git clone`, which set up a number of -other structures, including `.git/config` and any unusual permissions options.
-The key part here is that a directory's identity as “a git repository” is -entirely a function of its contents, not of having been blessed into being by -`git` itself. - -You can infer a lot from this: for example, you can infer that it's “safe” to -move git repositories around using FS tools, or to back them up with the same -tools, for example. This is not as obvious to everyone as you might hope; people diff --git a/wiki/git/stop-using-git-pull-to-deploy.md b/wiki/git/stop-using-git-pull-to-deploy.md deleted file mode 100644 index 078c95b..0000000 --- a/wiki/git/stop-using-git-pull-to-deploy.md +++ /dev/null @@ -1,98 +0,0 @@ -# Stop using `git pull` for deployment! - -## The problem - -* You have a Git repository containing your project. -* You want to “deploy” that code when it changes. -* You'd rather not download the entire project from scratch for each - deployment. - -## The antipattern - -“I know, I'll use `git pull` in my deployment script!” - -Stop doing this. Stop teaching other people to do this. It's wrong, and it -will eventually lead to deploying something you didn't want. - -Deployment should be based on predictable, known versions of your code. -Ideally, every deployable version has a tag (and you deploy exactly that tag), -but even less formal processes, where you deploy a branch tip, should still be -deploying exactly the code designated for release. `git pull`, however, can -introduce new commits. - -`git pull` is a two-step process: - -1. Fetch the current branch's designated upstream remote, to obtain all of the - remote's new commits. -2. Merge the current branch's designated upstream branch into the current - branch. - -The merge commit means the actual deployed tree might _not_ be identical to -the intended deployment tree. Local changes (intentional or otherwise) will be -preserved (and merged) into the deployment, for example; once this happens, -the actual deployed commit will _never_ match the intended commit. 
- -`git pull` will approximate the right thing “by accident”: if the current -local branch (generally `master`) for people using `git pull` is always clean, -and always tracks the desired deployment branch, then `git pull` will update -to the intended commit exactly. This is pretty fragile, though; many git -commands can cause the local branch to diverge from its upstream branch, and -once that happens, `git pull` will always create new commits. You can patch -around the fragility a bit using the `--ff-only` option, but that only tells -you when your deployment environment has diverged and doesn't fix it. - -## The right pattern - -Quoting [Sitaram Chamarty](http://gitolite.com/the-list-and-irc/deploy.html): - -> Here's what we expect from a deployment tool. Note the rule numbers -- -> we'll be referring to some of them simply by number later. -> -> 1. All files in the branch being deployed should be copied to the -> deployment directory. -> -> 2. Files that were deleted in the git repo since the last deployment -> should get deleted from the deployment directory. -> -> 3. Any changes to tracked files in the deployment directory after the -> last deployment should be ignored when following rules 1 and 2. -> -> However, sometimes you might want to detect such changes and abort if -> you found any. -> -> 4. Untracked files in the deploy directory should be left alone. -> -> Again, some people might want to detect this and abort the deployment. - -Sitaram's own documentation talks about how to accomplish these when -“deploying” straight out of a bare repository. That's unwise (not to mention -impractical) in most cases; deployment should use a dedicated clone of the -canonical repository. - -I also disagree with point 3, preferring to keep deployment-related changes -outside of tracked files. This makes it much easier to argue that the changes -introduced to configure the project for deployment do not introduce new bugs -or other surprise features. 
- -My deployment process, given a dedicated clone at `$DEPLOY_TREE`, is as -follows: - - cd "${DEPLOY_TREE}" - git fetch --all - git checkout --force "${TARGET}" - # Following two lines only required if you use submodules - git submodule sync - git submodule update --init --recursive - # Follow with actual deployment steps (run fabric/capistrano/make/etc) - -`$TARGET` is either a tag name (`v1.2.1`) or a remote branch name -(`origin/master`), but could also be a commit hash or anything else Git -recognizes as a revision. This will detach the head of the `$DEPLOY_TREE` -repository, which is fine as no new changes should be authored in this -repository (so the local branches are irrelevant). The warning Git emits when -`HEAD` becomes detached is unimportant in this case. - -The tracked contents of `$DEPLOY_TREE` will end up identical to the desired -commit, discarding local changes. The pattern above is very similar to what -most continuous integration servers use when building from Git repositories, -for much the same reason. diff --git a/wiki/git/survival.md b/wiki/git/survival.md deleted file mode 100644 index 60d1b62..0000000 --- a/wiki/git/survival.md +++ /dev/null @@ -1,81 +0,0 @@ -# Git Survival Guide - -I think the `git` UI is pretty awful, and encourages using Git in ways that -will screw you. Here are a few things I've picked up that have saved my bacon. - -* You will inevitably need to understand Git's “internals” to make use of it - as an SCM tool. Accept this early. If you think your SCM tool should not - expose you to so much plumbing, [don't](http://mercurial.selenic.com) - [use](http://bazaar.canonical.com) [Git](http://subversion.apache.org). - * Git weenies will claim that this plumbing is what gives Git all of its - extra power. This is true; it gives Git the power to get you out of - situations you wouldn't be in without Git. -* `git log --graph --decorate --oneline --color --all` -* Run `git fetch` habitually. 
Stale remote-tracking branches lead to sadness. -* `git push` and `git pull` are **not symmetric**. `git push`'s - opposite operation is `git fetch`. (`git pull` is equivalent to `git fetch` - followed by `git merge`, more or less). -* [Git configuration values don't always have the best defaults](config). -* The upstream branch of `foo` is `foo@{u}`. The upstream branch of your - checked-out branch is `HEAD@{u}` or `@{u}`. This is documented in `git help - revisions`. -* You probably don't want to use a merge operation (such as `git pull`) to - integrate upstream changes into topic branches. The resulting history can be - very confusing to follow, especially if you integrate upstream changes - frequently. - * You can leave topic branches “real” relatively safely. You can do - a test merge to see if they still work cleanly post-integration without - actually integrating upstream into the branch permanently. - * You can use `git rebase` or `git pull --rebase` to transplant your - branch to a new, more recent starting point that includes the changes - you want to integrate. This makes the upstream changes a permanent part - of your branch, just like `git merge` or `git pull` would, but generates - an easier-to-follow history. Conflict resolution will happen as normal. -* Example test merge, using `origin/master` as the upstream branch and `foo` - as the candidate for integration: - - git fetch origin - git checkout origin/master -b test-merge-foo - git merge foo - # run tests, examine files - git diff origin/master..HEAD - - To discard the test merge, delete the branch after checking out some other - branch: - - git checkout foo - git branch -D test-merge-foo - - You can combine this with `git rerere` to save time resolving conflicts in - a later “real,” permanent merge. 
- -* You can use `git checkout -p` to build new, tidy commits out of a branch - laden with “wip” commits: - - git fetch - git checkout $(git merge-base origin/master foo) -b foo-cleaner-history - git checkout -p foo -- paths/to/files - # pick out changes from the presented patch that form a coherent commit - # repeat 'git checkout -p foo --' steps for related files to build up - # the new commit - git commit - # repeat 'git checkout -p foo --' and 'git commit' steps until no diffs remain - - * Gotcha: `git checkout -p` will do nothing for files that are being - created. Use `git checkout`, instead, and edit the file if necessary. - Thanks, Git. - * Gotcha: The new, clean branch must diverge from its upstream branch - (`origin/master`, in the example above) at exactly the same point, or - the diffs presented by `git checkout -p foo` will include chunks that - revert changes on the upstream branch since the “dirty” branch was - created. The easiest way to find this point is with `git merge-base`. - -## Useful Resources - -That is, resources that can help you solve problems or understand things, not -resources that reiterate the man pages for you. - -* Sitaram Chamarty's [git concepts - simplified](http://sitaramc.github.com/gcs/) -* Tv's [Git for Computer - Scientists](http://eagain.net/articles/git-for-computer-scientists) diff --git a/wiki/git/theory-and-practice/index.md b/wiki/git/theory-and-practice/index.md deleted file mode 100644 index f257b12..0000000 --- a/wiki/git/theory-and-practice/index.md +++ /dev/null @@ -1,42 +0,0 @@ -# Git Internals 101 - -Yeah, yeah, another article about “how Git works.” There are tons of these -already. Personally, I'm fond of Sitaram Chamarty's [fantastic series of -articles](http://gitolite.com/master-toc.html) explaining Git from both ends, -and of [Git for Computer -Scientists](http://eagain.net/articles/git-for-computer-scientists/). Maybe -you'd rather read those.
- -This page was inspired by very specific, recurring issues I've run into while -helping people use Git. I think Git's “porcelain” layer -- its user interface --- is terrible, and does a bad job of insulating non-expert users from Git's -internals. While I'd love to fix that (and I do contribute to discussions on -that front, too), we still have the `git(1)` UI right now and people still get -into trouble with it right now. - -Git follows the New Jersey approach laid out in Richard Gabriel's [The Rise of -“Worse is Better”](http://www.dreamsongs.com/RiseOfWorseIsBetter.html): given -the choice between a simple implementation and a simple interface, Git chooses -the simple implementation almost everywhere. This internal simplicity can give -users the leverage to fix the problems that its horrible user interface leads -them into, so these pages will focus on explaining the simple parts and giving -users the tools to examine them. - -Throughout these articles, I've written “Git does X” a lot. Git is -_incredibly_ configurable; read that as “Git does X _by default_.” I'll try to -call out relevant configuration options as I go, where it doesn't interrupt -the flow of knowledge. - -* [Objects](objects) -* [Refs and Names](refs-and-names) - -By the way, if you think you're just going to follow the -[many](http://git-scm.com/documentation) -[excellent](http://www.atlassian.com/git/tutorial) -[git](http://try.github.io/levels/1/challenges/1) -[tutorials](https://www.kernel.org/pub/software/scm/git/docs/gittutorial.html) -out there and that you won't need this knowledge, well, you will. You can -either learn it during a quiet time, when you can think and experiment, or you -can learn it when something's gone wrong, and everyone's shouting at each -other. Git's high-level interface doesn't do much to keep you on the sensible -path, and you will eventually need to fix something. 
diff --git a/wiki/git/theory-and-practice/objects.md b/wiki/git/theory-and-practice/objects.md deleted file mode 100644 index 6bf975a..0000000 --- a/wiki/git/theory-and-practice/objects.md +++ /dev/null @@ -1,125 +0,0 @@ -# Objects - -Git's basest level is a storage and naming system for things Git calls -“objects.” These objects hold the bulk of the data about files and projects -tracked by Git: file contents, directory trees, commits, and so on. Every -object is identified by a SHA-1 hash, which is derived from its contents. - -SHA-1 hashes are obnoxiously long, so Git allows you to substitute any unique -prefix of a SHA-1 hash, so long as it's at least four characters long. If the -hash `0b43b9e3e64793f5a222a644ed5ab074d8fa1024` is present in your repository, -then Git commands will understand `0b43`, `0b43b9`, and other patterns to all -refer to the same object, so long as no other object has the same SHA-1 -prefix. - -## Blobs - -The contents of every file that's ever been stored in a Git repository are -stored as `blob` objects. These objects are very simple: they contain the file -contents, byte for byte. - -## Trees - -File contents (and trees, and Other Things we'll get to later) are tied -together into a directory structure by `tree` objects. These objects contain a -list of records, with one child per record. Each record contains a permissions -field corresponding to the POSIX permissions mask of the object, a type, a -SHA-1 for another object, and a name.
- -A directory containing only files might be represented as the tree - - 100644 blob 511542ad6c97b28d720c697f7535897195de3318 config.md - 100644 blob 801ddd5ae10d6282bbf36ccefdd0b052972aa8e2 integrate.md - 100644 blob 61d28155862607c3d5d049e18c5a6903dba1f85e scratch.md - 100644 blob d7a79c144c22775239600b332bfa120775bab341 survival.md - -while a directory with subdirectories would also have some `tree` children: - - 040000 tree f57ef2457a551b193779e21a50fb380880574f43 12factor - 040000 tree 844697ce99e1ef962657ce7132460ad7a38b7584 authnz - 100644 blob 54795f9b774547d554f5068985bbc6df7b128832 cool-urls-can-change.md - 040000 tree fc3f39eb5d1a655374385870b8be56b202be7dd8 dev - 040000 tree 22cbfb2c1d7b07432ea7706c36b0d6295563c69c devops - 040000 tree 0b3e63b4f32c0c3acfbcf6ba28d54af4c2f0d594 git - 040000 tree 5914fdcbd34e00e23e52ba8e8bdeba0902941d3f java - 040000 tree 346f71a637a4f8933dc754fef02515a8809369c4 mysql - 100644 blob b70520badbb8de6a74b84788a7fefe64a432c56d packaging-ideas.md - 040000 tree 73ed6572345a368d20271ec5a3ffc2464ac8d270 people - -## Commits - -Blobs and trees are sufficient to store arbitrary directory trees in Git, and -you could use them that way, but Git is mostly used as a revision-tracking -system. Revisions and their history are represented by `commit` objects, which contain: - -* The SHA-1 hash of the root `tree` object of the commit, -* Zero or more SHA-1 hashes for parent commits, -* The name and email address of the commit's “author,” -* The name and email address of the commit's “committer,” -* Timestamps representing when the commit was authored and committed, and -* A commit message. - -Commit objects' parent references form a directed acyclic graph; the subgraph -reachable from a specific commit is that commit's _history_. - -When working with Git's user interface, commit parents are given in a -predictable order determined by the `git checkout` and `git merge` commands. 
- -## Tags - -Git's revision-tracking system supports “tags,” which are stable names for -specific configurations. It also, uniquely, supports a concept called an -“annotated tag,” represented by the `tag` object type. These annotated tag -objects contain - -* The type and SHA-1 hash of another object, -* The name and email address of the person who created the tag, -* A timestamp representing the moment the tag was created, and -* A tag message. - -## Anonymity - -There's a general theme to Git's object types: no object knows its own name. -Every object only has a name in the context of some containing object, or in -the context of [Git's refs mechanism](refs-and-names), which I'll get to -shortly. This means that the same `blob` object can be reused for multiple -files (or, more probably, the same file in multiple commits), if they happen -to have the same contents. - -This also applies to tag objects, even though their role is part of a system -for providing stable, meaningful names for commits. - -## Examining objects - -* `git cat-file <type> <sha1>`: decodes the object `<sha1>` and prints its - contents to stdout. This prints the object's contents in their raw form, - which is less than useful for `tree` objects. - -* `git cat-file -p <sha1>`: decodes the object `<sha1>` and pretty-prints it. - This pretty-printing stays close to the underlying disk format; it's most - useful for decoding `tree` objects. - -* `git show <sha1>`: decodes the object `<sha1>` and formats its contents to - stdout. For blobs, this is identical to what `git cat-file blob` would do, - but for trees, commits, and tags, the output is reformatted to be more - readable.
- -## Storage - -Objects are stored in two places in Git: as “loose objects,” and in “pack -files.” Newly-created objects are initially loose objects, for ease of -manipulation; transferring objects to another repository or running certain -administrative commands can cause them to be placed in pack files for faster -transfer and for smaller storage. - -Loose objects are stored directly on the filesystem, in the Git repository's -`objects` directory. Git takes a two-character prefix off of each object's -SHA-1 hash, and uses that to pick a subdirectory of `objects` to store the -object in. The remainder of the hash forms the filename. Loose objects are -compressed with zlib, to conserve space, but the resulting directory tree can -still be quite large. - -Packed objects are stored together in packed files, which live in the -repository's `objects/pack` directory. These packed files are both compressed -and delta-encoded, allowing groups of similar objects to be stored very -compactly. diff --git a/wiki/git/theory-and-practice/refs-and-names.md b/wiki/git/theory-and-practice/refs-and-names.md deleted file mode 100644 index 025ae88..0000000 --- a/wiki/git/theory-and-practice/refs-and-names.md +++ /dev/null @@ -1,94 +0,0 @@ -# Refs and Names - -Git's [object system](objects) stores most of the data for projects tracked in -Git, but only provides SHA-1 hashes. This is basically useless if you want to -make practical use of Git, so Git also has a naming mechanism called “refs” -that provide human-meaningful names for objects. - -There are two kinds of refs: - -* “Normal” refs, which are names that resolve directly to SHA-1 hashes. These - are the vast majority of refs in most repositories. - -* “Symbolic” refs, which are names that resolve to other refs. In most - repositories, only a few of these appear. (Circular references are possible - with symbolic refs. Git will refuse to resolve these.) - -Anywhere you could use a SHA-1, you can use a ref instead. 
Git interprets them -identically, after resolving the ref down to the SHA-1. - -## Namespaces - -Every operation in Git that uses a name of some sort, including branching -(branch names), tagging (tag names), fetching (remote-tracking branch names), -and pushing (many kinds of name), expands those names to refs, using a -namespace convention. The following namespaces are common: - -* `refs/heads/NAME`: branches. The branch name is the ref name with - `refs/heads/` removed. Names generally point to commits. - -* `refs/remotes/REMOTE/NAME`: “remote-tracking” branches. These are maintained - in tandem by `git remote` and `git fetch`, to cache the state of other - repositories. Names generally point to commits. - -* `refs/tags/NAME`: tags. The tag name is the ref name with `refs/tags/` - removed. Names generally point to commits or tag objects. - -* `refs/bisect/STATE`: `git bisect` markers for known-good and known-bad - revisions, from which the rest of the bisect state can be derived. - -There are also a few special refs directly in the `refs/` namespace, most -notably: - -* `refs/stash`: The most recent stash entry, as maintained by `git stash`. - (Other stash entries are maintained by a separate system.) Names generally - point to commits. - -Tools can invent new refs for their own purposes, or manipulate existing refs; -the convention is that tools that use refs (which is, as I said, most of them) -respect the state of the ref as if they'd created that state themselves, -rather than sanity-checking the ref before using it. - -## Special refs - -There are a handful of special refs used by Git commands for their own -operation. These refs do _not_ begin with `refs/`: - -* `HEAD`: the “current” commit for most operations. This is set when checking - out a commit, and many revision-related commands default to `HEAD` if not - given a revision to operate on.
`HEAD` can either be a symbolic ref - (pointing to a branch ref) or a normal ref (pointing directly to a commit), - and is very frequently a symbolic ref. - -* `MERGE_HEAD`: during a merge, `MERGE_HEAD` resolves to the commit whose - history is being merged. - -* `ORIG_HEAD`: set by operations that change `HEAD` in potentially destructive - ways by resolving `HEAD` before making the change. - -* `CHERRY_PICK_HEAD` is set during `git cherry-pick` to the commit whose - changes are being copied. - -* `FETCH_HEAD` is set by the forms of `git fetch` that fetch a single ref, and - points to the commit the fetched ref pointed to. - -## Examining and manipulating refs - -The `git show-ref` command will list the refs in namespaces under `refs` in -your repository, printing the SHA-1 hashes they resolve to. Pass `--head` to -also include `HEAD`. - -The following commands can be used to manipulate refs directly: - -* `git update-ref <ref> <sha1>` forcibly sets `<ref>` to the passed `<sha1>`. - -* `git update-ref -d <ref>` deletes a ref. - -* `git symbolic-ref <ref>` prints the target of `<ref>`, if `<ref>` is a - symbolic ref. (It will fail with an error message for normal refs.) - -* `git symbolic-ref <ref> <target>` forcibly makes `<ref>` a symbolic ref - pointing to `<target>`. - -Additionally, you can see what ref a given name resolves to using `git -rev-parse --symbolic-full-name <name>` or `git show-ref <name>`. |
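
The ref commands described in the deleted `refs-and-names.md` can be tried out safely in a throwaway repository. What follows is a minimal sketch, not part of the original page: the temporary directory, the inline commit identity, and the `refs/scratch/example` ref name are all assumptions made up for illustration. It shows that `HEAD` is normally a symbolic ref, that `HEAD` and its target resolve to the same SHA-1, and that a ref can be created and deleted by hand with `git update-ref`.

```shell
#!/bin/sh
# Sketch only: inspect and manipulate refs in a scratch repository.
set -eu

# Throwaway repository (assumption: mktemp -d is available).
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Identity is passed inline so the sketch does not rely on global config.
git -c user.name='Example' -c user.email='example@example.com' \
    -c commit.gpgsign=false \
    commit --quiet --allow-empty -m 'initial commit'

# HEAD is a symbolic ref here; this prints the branch ref it points to.
head_target=$(git symbolic-ref HEAD)
full_name=$(git rev-parse --symbolic-full-name HEAD)

# Resolving HEAD and resolving its target give the same SHA-1.
head_sha=$(git rev-parse HEAD)
branch_sha=$(git rev-parse "$head_target")

# Create a ref by hand under a made-up namespace, then list it.
git update-ref refs/scratch/example "$head_sha"
git show-ref refs/scratch/example
scratch_sha=$(git rev-parse refs/scratch/example)

# Delete it again; 'show-ref --verify' exits non-zero for a missing ref.
git update-ref -d refs/scratch/example
if git show-ref --verify --quiet refs/scratch/example; then
    scratch_exists=yes
else
    scratch_exists=no
fi
```

The name of the initial branch depends on the local Git configuration, so the sketch only relies on `HEAD` pointing somewhere under `refs/heads/`.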
