From 76aed6ef732de38d82245b3d674f70bab30221e5 Mon Sep 17 00:00:00 2001
From: Owen Jacobson
Date: Fri, 3 Jul 2015 22:31:49 -0400
Subject: Fuck it, serve the files directly.

---
 .html/git/_list.html | 109 +++++++
 .html/git/config.html | 151 +++++++++
 .html/git/detached-sigs.html | 359 ++++++++++++++++++++++
 .html/git/index.html | 109 +++++++
 .html/git/integrate.html | 118 +++++++
 .html/git/pull-request-workflow.html | 163 ++++++++++
 .html/git/scratch.html | 134 ++++++++
 .html/git/stop-using-git-pull-to-deploy.html | 178 +++++++++++
 .html/git/survival.html | 174 +++++++++++
 .html/git/theory-and-practice/_list.html | 96 ++++++
 .html/git/theory-and-practice/index.html | 126 ++++++++
 .html/git/theory-and-practice/objects.html | 202 ++++++++++++
 .html/git/theory-and-practice/refs-and-names.html | 199 ++++++++++++
 13 files changed, 2118 insertions(+)
 create mode 100644 .html/git/_list.html
 create mode 100644 .html/git/config.html
 create mode 100644 .html/git/detached-sigs.html
 create mode 100644 .html/git/index.html
 create mode 100644 .html/git/integrate.html
 create mode 100644 .html/git/pull-request-workflow.html
 create mode 100644 .html/git/scratch.html
 create mode 100644 .html/git/stop-using-git-pull-to-deploy.html
 create mode 100644 .html/git/survival.html
 create mode 100644 .html/git/theory-and-practice/_list.html
 create mode 100644 .html/git/theory-and-practice/index.html
 create mode 100644 .html/git/theory-and-practice/objects.html
 create mode 100644 .html/git/theory-and-practice/refs-and-names.html

(limited to '.html/git')

diff --git a/.html/git/_list.html b/.html/git/_list.html
new file mode 100644
index 0000000..59ee1d4
--- /dev/null
+++ b/.html/git/_list.html
@@ -0,0 +1,109 @@
+ + + + + The Codex » + ls /git + + + + + + + + +
+ + + + + + + + + + + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/config.html b/.html/git/config.html
new file mode 100644
index 0000000..c21c4f5
--- /dev/null
+++ b/.html/git/config.html
@@ -0,0 +1,151 @@
+ + + + + The Codex » + git-config Settings You Want + + + + + + + + +
+ + + + + +
+

git-config Settings You Want

+

Git comes with some fairly lkml-specific +configuration defaults. You should fix this. All of the items below can be set +either for your entire login account (git config --global) or for a specific +repository (git config).

+

Full documentation is under git help config, unless otherwise stated.

+
    +
  • +

    git config user.name 'Your Full Name' and git config user.email + 'your-email@example.com', obviously.

    +
  • +
  • +

    git config push.default simple - the default behaviour before Git 2.0 (called matching) of an unqualified git push is to identify pairs of branches by name and push all matches from your local repository to the remote. Given that branches have explicit “upstream” configuration identifying which, if any, branch in which, if any, remote they're associated with, this is dumb. The simple mode pushes the current branch to its upstream remote, if and only if the local branch name and the remote branch name match and the local branch tracks the remote branch. Requires Git 1.7.11 or later; it is the default as of Git 2.0. (For older versions of Git, use upstream instead, which does not require that branch names match.)

    +
  • +
  • +

    git config merge.defaultToUpstream true - causes an unqualified git + merge to merge the current branch's configured upstream branch, rather than + being an error. (git rebase always has this behaviour. Consistent!) You + should still merge thoughtfully.

    +
  • +
  • +

    git config rebase.autosquash true - causes git rebase -i to parse magic + comments created by git commit --squash=some-hash and git commit + --fixup=some-hash and reorder the commit list before presenting it for + further editing. See the descriptions of “squash” and “fixup” in git help + rebase for details; autosquash makes amending commits other than the most + recent easier and less error-prone.

    +
  • +
  • +

    git config branch.autosetupmerge always - newly-created branches whose + start point is a branch (git checkout master -b some-feature, git branch + some-feature origin/develop, and so on) will be configured to have the + start point branch as their upstream. By default (with true rather than + always) this only happens when the start point is a remote-tracking + branch.

    +
  • +
  • +

    git config rerere.enabled true - enable “reuse recorded resolution.” The + git help rerere docs explain it pretty well, but the short version is that + git can record how you resolve conflicts during a “test” merge and reuse the + same approach when resolving the same conflict later, in a “real” merge.

    +
  • +
+
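As a rough illustration of the rebase.autosquash item above, the following sketch folds the currently staged changes into an earlier commit without opening an editor. The helper name fixup_into is hypothetical (it is not part of Git), and the sketch assumes the work tree is clean apart from the staged changes and that the target commit has a parent.

```shell
# fixup_into COMMIT (hypothetical helper, not part of Git)
# Records the staged changes as a "fixup!" commit for COMMIT, then lets an
# autosquash rebase fold that commit into place.
fixup_into() {
    local target
    target=$(git rev-parse --verify --quiet "$1^{commit}") || return
    git commit --quiet --fixup="$target" || return
    # GIT_SEQUENCE_EDITOR=true accepts the reordered todo list unchanged.
    # Assumes COMMIT is not a root commit, since "$target^" must exist.
    GIT_SEQUENCE_EDITOR=true git rebase --quiet -i --autosquash "$target^"
}
```

With rebase.autosquash set to true the explicit --autosquash flag is redundant; it is spelled out here so the sketch behaves the same without the setting.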

For advanced users

+

A few things are nice when you're getting started, but become annoying when +you no longer need them.

+
    +
  • git config advice.detachedHead false - if you already understand the difference between having a branch checked out and having a commit checked out, and already understand what “detached head” means, the warning on every git checkout ...some detached thing... isn't helping anyone. This is also useful in repositories used for deployment, where specific commits (from tags, for example) are regularly checked out.
  • +
+
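The settings above can be applied in one pass. This is a minimal sketch, not a canonical setup: the function name apply_recommended_settings is made up for illustration, and user.name and user.email are left out because they are personal. Any arguments are passed straight through to git config, so the caller chooses the scope.

```shell
# apply_recommended_settings [GIT-CONFIG-OPTIONS...] (hypothetical helper)
# Pass --global for the whole login account, nothing for the current
# repository, or --file PATH to write to a specific config file.
apply_recommended_settings() {
    git config "$@" push.default simple &&
    git config "$@" merge.defaultToUpstream true &&
    git config "$@" rebase.autosquash true &&
    git config "$@" branch.autosetupmerge always &&
    git config "$@" rerere.enabled true &&
    # Advanced users only: silences the detached-HEAD warning.
    git config "$@" advice.detachedHead false
}
```

For example, apply_recommended_settings --global sets everything for your account.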
+ + + +
+
+ + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/detached-sigs.html b/.html/git/detached-sigs.html
new file mode 100644
index 0000000..a3e439d
--- /dev/null
+++ b/.html/git/detached-sigs.html
@@ -0,0 +1,359 @@
+ + + + + The Codex » + Notes Towards Detached Signatures in Git + + + + + + + + +
+ + + + + +
+

Notes Towards Detached Signatures in Git

+

Git supports a limited form of object authentication: specific object +categories in Git's internal model can have GPG signatures +embedded in them, allowing the authorship of the objects to be verified using +GPG's underlying trust model. Tag signatures can be used to +verify the authenticity and integrity of the snapshot associated with a +tag, and the authenticity of the tag itself, filling a niche broadly similar +to code signing in binary distribution systems. Commit signatures can be used +to verify the authenticity of the snapshot associated with the commit, and +the authorship of the commit itself. (Conventionally, commit signatures are +assumed to also authenticate either the entire line of history leading to a +commit, or the diff between the commit and its first parent, or both.)

+

Git's existing system has some tradeoffs.

+
    +
  • +

    Signatures are embedded within the objects they sign. The signature is part + of the object's identity; since Git is content-addressed, this means that + an object can neither be retroactively signed nor retroactively stripped of + its signature without modifying the object's identity. Git's distributed + model means that these sorts of identity changes are both complicated and + easily detected.

    +
  • +
  • +

    Commit signatures are second-class citizens. They're a relatively recent + addition to the Git suite, and both the implementation and the social + conventions around them continue to evolve.

    +
  • +
  • +

    Only some objects can be signed. While Git has relatively weak rules about + workflow, the signature system assumes you're using one of Git's more + widespread workflows by limiting your options to at most one signature, and + by restricting signatures to tags and commits (leaving out blobs, trees, + and refs).

    +
  • +
+

I believe it would be useful from an authentication standpoint to add +"detached" signatures to Git, to allow users to make these tradeoffs +differently if desired. These signatures would be stored as separate (blob) +objects in a dedicated refs namespace, supporting retroactive signatures, +multiple signatures for a given object, "policy" signatures, and +authentication of arbitrary objects.

+

The following notes are partially guided by Git's one existing "detached +metadata" facility, git notes. Similarities are intentional; divergences +will be noted where appropriate. Detached signatures are meant to +interoperate with existing Git workflow as much as possible: in particular, +they can be fetched and pushed like any other bit of Git metadata.

+

A detached signature cryptographically binds three facts together into an +assertion whose authenticity can be checked by anyone with access to the +signatory's keys:

+
    +
  1. An object (in the Git sense; a commit, tag, tree, or blob),
  2. +
  3. A policy label, and
  4. +
  5. A signatory (a person or agent making the assertion).
  6. +
+

These assertions can be published separately from or in tandem with the +objects they apply to.

+

Policies

+

Taking a hint from Monotone, every signature includes a "policy" identifying +how the signature is meant to be interpreted. Policies are arbitrary strings; +their meaning is entirely defined by tooling and convention, not by this +draft.

+

This draft uses a single policy, author, for its examples. A signature +under the author policy implies that the signatory had a hand in the +authorship of the designated object. (This is compatible with existing +interpretations of signed tags and commits.) (Authorship under this model is +strictly self-attested: you can claim authorship of anything, and you cannot +assert anyone else's authorship.)

+

The Monotone documentation suggests a number of other useful policies related +to testing and release status, automated build results, and numerous other +factors. Use your imagination.

+

What's In A Signature

+

Detached signatures cover the disk representation of an object, as given by

+
git cat-file <TYPE> <SHA1>
+
+

For most of Git's object types, this means that the signed content is plain +text. For tree objects, the signed content is the awful binary +representation of the tree, not the pretty representation given by git +ls-tree or git show.

+

Detached signatures include the "policy" identifier in the signed content, to +prevent others from tampering with policy choices via refs hackery. (This +will make more sense momentarily.) The policy identifier is prepended to the +signed content, terminated by a zero byte (as with Git's own type +identifiers, but without a length field as length checks are performed by +signing and again when the signature is stored in Git).

+

To generate the complete signable version of an object, use something +equivalent to the following shell snippet:

+
# generate-signable POLICY TYPE SHA1
+function generate-signable() {
+    # Policy label followed by its terminating zero byte (printf turns \0 into NUL).
+    printf '%s\0' "$1"
+    git cat-file "$2" "$3"
+}
+
+

(In the process of writing this, I discovered how hard it is to get Unix's +C-derived shell tools to emit a zero byte.)

+

Signature Storage and Naming

+

We assume that a userid will sign an object at most once.

+

Each signature is stored in an independent blob object in the repository it +applies to. The signature object (described above) is stored in Git, and its +hash recorded in refs/signatures/<POLICY>/<SUBJECT SHA1>/<SIGNER KEY +FINGERPRINT>.

+
# sign POLICY TYPE SHA1 FINGERPRINT
+function sign() {
+    local SIG_HASH=$(
+        generate-signable "$@" |
+        gpg --batch --no-tty --sign -u "$4" |
+        git hash-object --stdin -w -t blob
+    )
+    git update-ref "refs/signatures/$1/$3/$4" "$SIG_HASH"
+}
+
+

Stored signatures always use the complete fingerprint to identify keys, to +minimize the risk of colliding key IDs while avoiding the need to store full +keys in the refs naming hierarchy.

+

The policy name can be reliably extracted from the ref, as the trailing part +has a fixed length (in both path segments and bytes) and each ref begins with +a fixed, constant prefix refs/signatures/.
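The extraction described above can be sketched in plain shell parameter expansion. The function name parse_signature_ref is hypothetical, and the length check assumes 40 hex digits for both the subject SHA1 and the key fingerprint; adjust it if your fingerprints are a different length.

```shell
# parse_signature_ref REF (hypothetical helper)
# Prints POLICY, SHA1, and FINGERPRINT on separate lines, or returns
# non-zero if REF does not look like a signature ref.
parse_signature_ref() {
    local ref=$1 rest fingerprint sha1 policy
    case $ref in
        refs/signatures/*/*/*) ;;
        *) return 1 ;;
    esac
    rest=${ref#refs/signatures/}
    # The last two path segments are fixed; whatever precedes them is the
    # policy, which may itself contain slashes.
    fingerprint=${rest##*/}
    rest=${rest%/*}
    sha1=${rest##*/}
    policy=${rest%/*}
    # Assumed lengths: 40 characters each.
    [ "${#sha1}" -eq 40 ] && [ "${#fingerprint}" -eq 40 ] || return 1
    printf '%s\n' "$policy" "$sha1" "$fingerprint"
}
```

Working from the right-hand end is what makes the policy recoverable even when it contains path separators.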

+

Signature Verification

+

Given a signature ref as described above, we can verify and authenticate the +signature and bind it to the associated object and policy by performing the +following check:

+
    +
  1. Pick apart the ref into policy, SHA1, and key fingerprint parts.
  2. +
  3. Reconstruct the signed body as above, using the policy name extracted from + the ref.
  4. +
  5. Retrieve the signature from the ref and combine it with the object itself.
  6. +
  7. Verify that the policy in the stored signature matches the policy in the + ref.
  8. +
  9. +

    Verify the signature with GPG:

    +
    # verify-gpg POLICY TYPE SHA1 FINGERPRINT
    +verify-gpg() {
    +    {
    +        generate-signable "$1" "$2" "$3"
    +        git cat-file blob "refs/signatures/$1/$3/$4"
    +    } | gpg --batch --no-tty --verify
    +}
    +
    +
  10. +
  11. +

    Verify the key fingerprint of the signing key matches the key fingerprint + in the ref itself.

    +
  12. +
+

The specific rules for verifying the signature in GPG are left up to the user +to define; for example, some sites may want to auto-retrieve keys and use a +web of trust from some known roots to determine which keys are trusted, while +others may wish to maintain a specific, known keyring containing all signing +keys for each policy, and skip the web of trust entirely. This can be +accomplished via git-config, given some work, and via gpg.conf.

+

Distributing Signatures

+

Since each signature is stored in a separate ref, and since signatures are +not expected to be amended once published, the following refspec can be +used with git fetch and git push to distribute signatures:

+
refs/signatures/*:refs/signatures/*
+
+

Note the lack of a + decoration; we explicitly do not want to auto-replace +modified signatures, normally; explicit user action should be required.
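One way to wire that refspec into an existing remote is sketched below. The helper name fetch_signatures_from is invented for this example. It only touches the fetch side, on the assumption that you would rather push signatures explicitly (git push origin 'refs/signatures/*:refs/signatures/*') than change what an unqualified git push does.

```shell
# fetch_signatures_from [REMOTE] (hypothetical helper; REMOTE defaults to origin)
# Adds the non-forcing signature refspec to REMOTE's fetch configuration.
fetch_signatures_from() {
    local remote=${1:-origin}
    local spec='refs/signatures/*:refs/signatures/*'
    # Skip the update if this exact refspec is already configured.
    if git config --get-all "remote.$remote.fetch" | grep -Fqx -- "$spec"; then
        return 0
    fi
    git config --add "remote.$remote.fetch" "$spec"
}
```

After this, a plain git fetch from that remote also retrieves new signature refs, and, because the refspec has no +, refuses to overwrite ones that have changed.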

+

Workflow Notes

+

There are two verification workflows for signatures: "static" verification, where the repository itself already contains all the refs and objects needed for signature verification, and "pre-receive" verification, where an object and its associated signature may be uploaded at the same time.

+

It is impractical to verify signatures on the fly from an update hook. +Only pre-receive hooks can usefully accept or reject ref changes depending +on whether the push contains a signature for the pushed objects. (Git does +not provide a good mechanism for ensuring that signature objects are pushed +before their subjects.) Correctly verifying object signatures during +pre-receive regardless of ref order is far too complicated to summarize +here.

+

Attacks

+

Lies of Omission

+

It's trivial to hide signatures by deleting the signature refs. Similarly, +anyone with access to a repository can delete any or all detached signatures +from it without otherwise invalidating the signed objects.

+

Since signatures are mostly static, sites following the recommended no-force policy for signature publication should only be affected if relatively recent signatures are deleted. Older signatures should be available in one or more of the repository users' local repositories; once created, a signature can be legitimately obtained from anywhere, not only from the original signatory.

+

The signature naming protocol is designed to resist most other forms of +assertion tampering, but straight-up omission is hard to prevent.

+

Unwarranted Certification

+

The policy system allows any signatory to assert any policy. While +centralized signature distribution points such as "release" repositories can +make meaningful decisions about which signatures they choose to accept, +publish, and propagate, there's no way to determine after the fact whether a +policy assertion was obtained from a legitimate source or a malicious one +with no grounds for asserting the policy.

+

For example, I could, right now, sign an all-tests-pass policy assertion +for the Linux kernel. While there's no chance on Earth that the LKML team +would propagate that assertion, if I can convince you to fetch signatures +from my repository, you will fetch my bogus assertion. If all-tests-pass is +a meaningful policy assertion for the Linux kernel, then you will have very +few options besides believing that I assert that all tests have passed.

+

Ambiguous Policy

+

This is an ongoing problem with crypto policy systems and user interfaces +generally, but this design does nothing to ensure that policies are +interpreted uniformly by all participants in a repository. In particular, +there's no mechanism described for distributing either prose or programmatic +policy definitions and checks. All policy information is out of band.

+

Git already has ambiguity problems around commit signing: there are multiple +ways to interpret a signature on a commit:

+
    +
  1. +

    I assert that this snapshot and commit message were authored as described + in this commit's metadata. (In this interpretation, the signature's + authenticity guarantees do not transitively apply to parents.)

    +
  2. +
  3. +

    I assert that this snapshot and commit message were authored as described + in this commit's metadata, based on exactly the parent commits described. + (In this interpretation, the signature's authenticity guarantees do + transitively apply to parents. This is the interpretation favoured by XXX + LINK HERE XXX.)

    +
  4. +
  5. +

    I assert that this diff and commit message were authored as described in this commit's metadata. (No assertions about the snapshot are made whatsoever, and assertions about parentage are barely sensical at all. This meshes with widespread, diff-oriented policies.)

    +
  6. +
+

Grafts and Replacements

+

Git permits post-hoc replacement of arbitrary objects via both the grafts +system (via an untracked, non-distributed file in .git, though some +repositories distribute graft lists for end-users to manually apply) and the +replacements system (via refs/replace/<SHA1>, which can optionally be +fetched or pushed). The interaction between these two systems and signature +verification needs to be very closely considered; I've not yet done so.

+

Cases of note:

+
    +
  • Neither signature nor subject replaced - the "normal" case
  • +
  • Signature not replaced, subject replaced (by graft, by replacement, by both)
  • +
  • Signature replaced, subject not replaced
  • +
  • Both signature and subject replaced
  • +
+

It's tempting to outright disable git replace during signing and +verification, but this will have surprising effects when signing a ref-ish +instead of a bare hash. Since this is the normal case, I think this merits +more thought. (I'm also not aware of a way to disable grafts without +modifying .git, and having the two replacement mechanisms treated +differently may be dangerous.)

+

No Signed Refs

+

I mentioned early in this draft that Git's existing signing system doesn't +support signing refs themselves; since refs are an important piece of Git's +workflow ecosystem, this may be a major omission. Unfortunately, this +proposal doesn't address that.

+

Possible Refinements

+
    +
  • Monotone's certificate system is key+value based, rather than label-based. + This might be useful; while small pools of related values can be asserted + using mutually exclusive policy labels (whose mutual exclusion is a matter + of local interpretation), larger pools of related values rapidly become + impractical under the proposed system.
  • +
+

For example, this proposal would be inappropriate for directly asserting + third-party authorship; the asserted author would have to appear in the + policy name itself, exposing the user to a potentially very large number of + similar policy labels.

+
    +
  • +

    Ref signing via a manifest (a tree constellation whose paths are ref names + and whose blobs sign the refs' values). Consider cribbing DNSSEC here for + things like lightweight absence assertions, too.

    +
  • +
  • +

    Describe how this should interact with commit-duplicating and + commit-rewriting workflows.

    +
  • +
+
+ + + +
+
+ + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/index.html b/.html/git/index.html
new file mode 100644
index 0000000..59ee1d4
--- /dev/null
+++ b/.html/git/index.html
@@ -0,0 +1,109 @@
+ + + + + The Codex » + ls /git + + + + + + + + +
+ + + + + + + + + + + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/integrate.html b/.html/git/integrate.html
new file mode 100644
index 0000000..828019f
--- /dev/null
+++ b/.html/git/integrate.html
@@ -0,0 +1,118 @@
+ + + + + The Codex » + Integrating with Git: A Field Guide + + + + + + + + +
+ + + + + +
+

Integrating with Git: A Field Guide

+

Pretty much everything you might want to do to a Git repository when writing +tooling or integrations should be done by shelling out to one git command or +another.

+

Finding Git's trees

+

Git commands can be invoked from locations other than the root of the work +tree or git directory. You can find either of those by invoking git +rev-parse.

+

To find the absolute path to the root of the work tree:

+
git rev-parse --show-toplevel
+
+

This will output the absolute path to the root of the work tree on standard +output, followed by a newline. Since the work tree's absolute path can contain +whitespace (including newlines), you should assume every byte of output save +the final newline is part of the path, and if you're using this in a shell +script, quote defensively.

+

To find the relative path from the current working directory to the root of the work tree:

+
git rev-parse --show-cdup
+
+

This will output the relative path to the root of the work tree on standard +output, followed by a newline.

+

For bare repositories, both commands will output nothing and exit with a zero +status. (Surprise!)

+

To find a path to the root of the git directory:

+
git rev-parse --git-dir
+
+

This will output either the relative or the absolute path to the git +directory, followed by a newline.

+

All three of these commands will exit with non-zero status when run outside of +a work tree or git directory. Check for it.
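A defensive wrapper around the first of these commands might look like the following sketch. The function name find_work_tree_root and the TOPLEVEL variable are invented for illustration. The result is stored in a variable rather than printed so that a path ending in newlines survives intact.

```shell
# find_work_tree_root (hypothetical helper)
# On success, sets TOPLEVEL to the absolute path of the work tree root.
# Returns non-zero outside a repository or when there is no work tree.
find_work_tree_root() {
    local out
    # Appending "x" stops command substitution from trimming newlines that
    # are part of the path itself.
    out=$(git rev-parse --show-toplevel && printf x) || return
    # No output at all: a bare repository, so there is no work tree.
    [ "$out" != x ] || return 1
    # Drop the sentinel and the single newline git appended.
    TOPLEVEL=${out%?x}
}
```

A caller would then write something like find_work_tree_root && cd "$TOPLEVEL", keeping the quotes.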

+
+ + + +
+
+ + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/pull-request-workflow.html b/.html/git/pull-request-workflow.html
new file mode 100644
index 0000000..1a15642
--- /dev/null
+++ b/.html/git/pull-request-workflow.html
@@ -0,0 +1,163 @@
+ + + + + The Codex » + Life With Pull Requests + + + + + + + + +
+ + + + + +
+

Life With Pull Requests

+

I've been party to a number of discussions with folks contributing to pull-request-based projects on GitHub (and other hosts, but mostly GitHub). Because of Git's innate flexibility, there are lots of ways to work with pull requests. Here's mine.

+

I use a couple of naming conventions here that are not stock git:

+
+
origin
+
The repository to which you publish proposed changes
+
upstream
+
The repository from which you receive ongoing development, and which will +receive your changes.
+
+

One-time setup

+

Do these things once, when starting out on a project. Keep the results around +for later.

+

I'll be referring to the original project repository as upstream and +pretending its push URL is UPSTREAM-URL below. In real life, the URL will +often be something like git@github.com:someguy/project.git.

+

Fork the project

+

Use the repo manager's forking tool to create a copy of the project in your +own namespace. This generally creates your copy with a bunch of useless tat; +feel free to ignore all of this, as the only purpose of this copy is to +provide somewhere for you to publish your changes.

+

We'll be calling this repository origin later. Assume it has a URL, which +I'll abbreviate ORIGIN-URL, for git push to use.

+

(You can leave this step for later, but if you know you're going to do it, why +not get it out of the way?)

+

Clone the project and configure it

+

You'll need a clone locally to do work in. Create one from origin:

+
git clone ORIGIN-URL some-local-name
+
+

While you're here, cd into it and add the original project as a remote:

+
cd some-local-name
+git remote add upstream UPSTREAM-URL
+
+

Feature process

+

Do these things for each feature you work on. To switch features, just use +git checkout my-feature.

+

Create a new feature branch locally

+

We use upstream's master branch here, so that your feature includes all of +upstream's state initially. We also need to make sure our local cache of +upstream's state is correct:

+
git fetch upstream
+git checkout upstream/master -b my-feature
+
+

Do work

+

If you need my help here, stop now.

+

Integrate upstream changes

+

If you find yourself needing something that's been added upstream, use +rebase to integrate it to avoid littering your feature branch with +“meaningless” merge commits.

+
git checkout my-feature
+git fetch upstream
+git rebase upstream/master
+
+

Publish your branch

+

When you're “done,” publish your branch to your personal repository:

+
git push origin my-feature
+
+

Then visit your copy in your repo manager's web UI and create a pull request +for my-feature.

+

Integrating feedback

+

Very likely, your proposed changes will need work. If you use history-editing +to integrate feedback, you will need to use --force when updating the +branch:

+
git push --force origin my-feature
+
+

This is safe provided two things are true:

+
    +
  1. The branch has not yet been merged to the upstream repo.
  2. +
  3. You are only force-pushing to your fork, not to the upstream repo.
  4. +
+

Generally, no other users will have work based on your pull request, so +force-pushing history won't cause problems.

+
+ + + +
+
+ + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/scratch.html b/.html/git/scratch.html
new file mode 100644
index 0000000..ff1bdff
--- /dev/null
+++ b/.html/git/scratch.html
@@ -0,0 +1,134 @@
+ + + + + The Codex » + Git Is Not Magic + + + + + + + + +
+ + + + + +
+

Git Is Not Magic

+

I'm bored. Let's make a git repository out of whole cloth.

+

Git repos are stored in .git:

+
fakegit$ mkdir .git
+
+

They have a “symbolic ref” (a text file; see man git-symbolic-ref) named HEAD, pointing to the currently checked-out branch. Let's use master. Branches are refs under refs/heads (see man git-branch):

+
fakegit ((unknown))$ echo 'ref: refs/heads/master' > .git/HEAD
+
+

They have an object database and a refs database, both of which are simple directories (see man gitrepository-layout and man gitrevisions). Let's also enable the reflog, because it's a great safety net if you use history-editing tools in git:

+
fakegit ((ref: re...))$ mkdir .git/refs .git/objects .git/logs
+fakegit (master #)$
+
+

Now __git_ps1, at least, is convinced that we have a working git repository. +Does it work?

+
fakegit (master #)$ echo 'Hello, world!' > hello.txt
+fakegit (master #)$ git add hello.txt
+fakegit (master #)$ git commit -m 'Initial commit'
+[master (root-commit) 975307b] Initial commit
+1 file changed, 1 insertion(+)
+create mode 100644 hello.txt
+
+fakegit (master)$ git log
+commit 975307ba0485bff92e295e3379a952aff013c688
+Author: Owen Jacobson <owen.jacobson@grimoire.ca>
+Date:   Wed Feb 6 10:07:07 2013 -0500
+
+    Initial commit
+
+

Eeyup.

+
+

Should you do this? Of course not. Anywhere you could run these commands, +you could instead run git init or git clone, which set up a number of +other structures, including .git/config and any unusual permissions options. +The key part here is that a directory's identity as “a git repository” is +entirely a function of its contents, not of having been blessed into being by +git itself.

+

You can infer a lot from this: for example, you can infer that it's “safe” to move git repositories around using FS tools, or to back them up with the same tools. This is not as obvious to everyone as you might hope; people

+
+ + + +
+
+ + + + + +
+ +
\ No newline at end of file
diff --git a/.html/git/stop-using-git-pull-to-deploy.html b/.html/git/stop-using-git-pull-to-deploy.html
new file mode 100644
index 0000000..a3736a0
--- /dev/null
+++ b/.html/git/stop-using-git-pull-to-deploy.html
@@ -0,0 +1,178 @@
+ + + + + The Codex » + Stop Using Git Pull To Deploy + + + + + + + + +
+ + + + + +
+

Stop using git pull for deployment!

+

The problem

+
    +
  • You have a Git repository containing your project.
  • +
  • You want to “deploy” that code when it changes.
  • +
  • You'd rather not download the entire project from scratch for each + deployment.
  • +
+

The antipattern

+

“I know, I'll use git pull in my deployment script!”

+

Stop doing this. Stop teaching other people to do this. It's wrong, and it +will eventually lead to deploying something you didn't want.

+

Deployment should be based on predictable, known versions of your code. +Ideally, every deployable version has a tag (and you deploy exactly that tag), +but even less formal processes, where you deploy a branch tip, should still be +deploying exactly the code designated for release. git pull, however, can +introduce new commits.

+

git pull is a two-step process:

+
    +
  1. Fetch the current branch's designated upstream remote, to obtain all of the + remote's new commits.
  2. +
  3. Merge the current branch's designated upstream branch into the current + branch.
  4. +
+

The merge commit means the actual deployed tree might not be identical to +the intended deployment tree. Local changes (intentional or otherwise) will be +preserved (and merged) into the deployment, for example; once this happens, +the actual deployed commit will never match the intended commit.

+

git pull will approximate the right thing “by accident”: if the current +local branch (generally master) for people using git pull is always clean, +and always tracks the desired deployment branch, then git pull will update +to the intended commit exactly. This is pretty fragile, though; many git +commands can cause the local branch to diverge from its upstream branch, and +once that happens, git pull will always create new commits. You can patch +around the fragility a bit using the --ff-only option, but that only tells +you when your deployment environment has diverged and doesn't fix it.
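To make the divergence described above visible before it causes trouble, a deployment script could count the commits that exist only on the local branch. This is a small sketch under two assumptions: the current branch has an upstream configured, and a fetch has been run recently. The function name local_only_commits is hypothetical.

```shell
# local_only_commits (hypothetical helper)
# Prints how many commits the current branch has that its upstream lacks.
# Anything other than 0 means a git pull here would create a merge commit
# instead of moving to exactly the upstream commit.
local_only_commits() {
    git rev-list --count '@{upstream}..HEAD'
}
```

Like --ff-only, this only detects the problem; it does nothing to repair the deployment tree.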

+

The right pattern

+

Quoting Sitaram Chamarty:

+
+

Here's what we expect from a deployment tool. Note the rule numbers -- +we'll be referring to some of them simply by number later.

+
    +
  1. +

    All files in the branch being deployed should be copied to the + deployment directory.

    +
  2. +
  3. +

    Files that were deleted in the git repo since the last deployment + should get deleted from the deployment directory.

    +
  4. +
  5. +

    Any changes to tracked files in the deployment directory after the + last deployment should be ignored when following rules 1 and 2.

    +

    However, sometimes you might want to detect such changes and abort if +you found any.

    +
  6. +
  7. +

    Untracked files in the deploy directory should be left alone.

    +

    Again, some people might want to detect this and abort the deployment.

    +
  8. +
+
+

Sitaram's own documentation talks about how to accomplish these when +“deploying” straight out of a bare repository. That's unwise (not to mention +impractical) in most cases; deployment should use a dedicated clone of the +canonical repository.

+

I also disagree with point 3, preferring to keep deployment-related changes +outside of tracked files. This makes it much easier to argue that the changes +introduced to configure the project for deployment do not introduce new bugs +or other surprise features.

+

My deployment process, given a dedicated clone at $DEPLOY_TREE, is as +follows:

+
cd "${DEPLOY_TREE}"
+git fetch --all
+git checkout --force "${TARGET}"
+# Following two lines only required if you use submodules
+git submodule sync
+git submodule update --init --recursive
+# Follow with actual deployment steps (run fabric/capistrano/make/etc)
+
+

$TARGET is either a tag name (v1.2.1) or a remote branch name +(origin/master), but could also be a commit hash or anything else Git +recognizes as a revision. This will detach the head of the $DEPLOY_TREE +repository, which is fine as no new changes should be authored in this +repository (so the local branches are irrelevant). The warning Git emits when +HEAD becomes detached is unimportant in this case.

+

The tracked contents of $DEPLOY_TREE will end up identical to the desired +commit, discarding local changes. The pattern above is very similar to what +most continuous integration servers use when building from Git repositories, +for much the same reason.
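If you would rather detect local changes to tracked files and abort than silently discard them, one possible variation looks like the sketch below. The function name `deploy_checkout` is hypothetical, and it assumes `DEPLOY_TREE` and `TARGET` are set by the caller as in the steps above:

```shell
# Sketch only: expects DEPLOY_TREE and TARGET to be set by the caller.
deploy_checkout() (
    # The parentheses run the body in a subshell, so the cd does not leak.
    cd "${DEPLOY_TREE}" || exit 1
    # Refuse to deploy over modified tracked files; untracked files are ignored.
    if [ -n "$(git status --porcelain --untracked-files=no)" ]; then
        echo "tracked files were modified in ${DEPLOY_TREE}; aborting" >&2
        exit 1
    fi
    git fetch --all --quiet || exit 1
    git checkout --quiet --force "${TARGET}"
)
```

Called as `DEPLOY_TREE=/srv/app TARGET=v1.2.1 deploy_checkout` (both values are placeholders), it returns non-zero without touching the tree when a tracked file has been edited in place.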

+
+ + + +
+
+ + + + + +
+ + \ No newline at end of file diff --git a/.html/git/survival.html b/.html/git/survival.html new file mode 100644 index 0000000..c1d43ac --- /dev/null +++ b/.html/git/survival.html @@ -0,0 +1,174 @@ + + + + + The Codex » + Git Survival Guide + + + + + + + + +
+ + + + + +
+

Git Survival Guide

+

I think the git UI is pretty awful, and encourages using Git in ways that +will screw you. Here are a few things I've picked up that have saved my bacon.

+
    +
  • You will inevitably need to understand Git's “internals” to make use of it + as an SCM tool. Accept this early. If you think your SCM tool should not + expose you to so much plumbing, don't + use Git.
      +
    • Git weenies will claim that this plumbing is what gives Git all of its + extra power. This is true; it gives Git the power to get you out of + situations you wouldn't be in without Git.
    • +
    +
  • +
  • git log --graph --decorate --oneline --color --all
  • +
  • Run git fetch habitually. Stale remote-tracking branches lead to sadness.
  • +
  • git push and git pull are not symmetric. git push's + opposite operation is git fetch. (git pull is equivalent to git fetch + followed by git merge, more or less).
  • +
  • Git configuration values don't always have the best defaults.
  • +
  • The upstream branch of foo is foo@{u}. The upstream branch of your + checked-out branch is HEAD@{u} or @{u}. This is documented in git help + revisions.
  • +
  • You probably don't want to use a merge operation (such as git pull) to + integrate upstream changes into topic branches. The resulting history can be + very confusing to follow, especially if you integrate upstream changes + frequently.
      +
    • You can leave topic branches “real” relatively safely. You can do + a test merge to see if they still work cleanly post-integration without + actually integrating upstream into the branch permanently.
    • +
    • You can use git rebase or git pull --rebase to transplant your + branch to a new, more recent starting point that includes the changes + you want to integrate. This makes the upstream changes a permanent part + of your branch, just like git merge or git pull would, but generates + an easier-to-follow history. Conflict resolution will happen as normal.
    • +
    +
  • +
  • +

    Example test merge, using origin/master as the upstream branch and foo + as the candidate for integration:

    +
    git fetch origin
    +git checkout origin/master -b test-merge-foo
    +git merge foo
    +# run tests, examine files
    +git diff origin/master..HEAD
    +
    +

    To discard the test merge, delete the branch after checking out some other +branch:

    +
    git checkout foo
    +git branch -D test-merge-foo
    +
    +

    You can combine this with git rerere to save time resolving conflicts in +a later “real,” permanent merge.

    +
  • +
  • +

    You can use git checkout -p to build new, tidy commits out of a branch + laden with “wip” commits:

    +
    git fetch
    +git checkout $(git merge-base origin/master foo) -b foo-cleaner-history
    +git checkout -p foo -- paths/to/files
    +# pick out changes from the presented patch that form a coherent commit
    +# repeat 'git checkout -p foo --' steps for related files to build up
    +# the new commit
    +git commit
    +# repeat 'git checkout -p foo --' and 'git commit' steps until no diffs remain
    +
    +
      +
    • Gotcha: git checkout -p will do nothing for files that are being + created. Use git checkout, instead, and edit the file if necessary. + Thanks, Git.
    • +
    • Gotcha: The new, clean branch must diverge from its upstream branch + (origin/master, in the example above) at exactly the same point, or + the diffs presented by git checkout -p foo will include chunks that + revert changes on the upstream branch since the “dirty” branch was + created. The easiest way to find this point is with git merge-base.
    • +
    +
  • +
+
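As a rough illustration of the `git fetch` and `@{u}` points in the list above, here is a sketch using two throwaway repositories. The directory names, the branch name `main`, and the file names are assumptions made for the example:

```shell
# Throwaway setup: an origin repository and a clone of it.
work="$(mktemp -d)"
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git clone -q "$work/origin" "$work/clone"

# New work lands upstream after the clone was made.
echo two >> file.txt
git commit -q -a -m "second commit"

# In the clone, fetch updates only the remote-tracking branch.
cd "$work/clone"
before="$(git rev-parse HEAD)"
git fetch -q origin
after="$(git rev-parse HEAD)"

# @{u} names the upstream of the checked-out branch.
upstream="$(git rev-parse --abbrev-ref '@{u}')"
behind="$(git rev-list --count 'HEAD..@{u}')"
echo "upstream is $upstream; local branch is $behind commit(s) behind"
```

Nothing in the working tree moves until you choose how to integrate: merge, rebase, or not at all.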

Useful Resources

+

That is, resources that can help you solve problems or understand things, not +resources that reiterate the man pages for you.

+ +
+ + + +
+
+ + + + + +
+ + \ No newline at end of file diff --git a/.html/git/theory-and-practice/_list.html b/.html/git/theory-and-practice/_list.html new file mode 100644 index 0000000..feae190 --- /dev/null +++ b/.html/git/theory-and-practice/_list.html @@ -0,0 +1,96 @@ + + + + + The Codex » + ls /git/theory-and-practice + + + + + + + + +
+ + + + + +
+

ls /git/theory-and-practice

+ + + + +
+

Pages

+ +
+ + + +
+ + + + + + + + +
+ + \ No newline at end of file diff --git a/.html/git/theory-and-practice/index.html b/.html/git/theory-and-practice/index.html new file mode 100644 index 0000000..297cbd9 --- /dev/null +++ b/.html/git/theory-and-practice/index.html @@ -0,0 +1,126 @@ + + + + + The Codex » + Git Internals 101 + + + + + + + + +
+ + + + + +
+

Git Internals 101

+

Yeah, yeah, another article about “how Git works.” There are tons of these +already. Personally, I'm fond of Sitaram Chamarty's fantastic series of +articles explaining Git from both ends, +and of Git for Computer +Scientists. Maybe +you'd rather read those.

+

This page was inspired by very specific, recurring issues I've run into while +helping people use Git. I think Git's “porcelain” layer -- its user interface +-- is terrible, and does a bad job of insulating non-expert users from Git's +internals. While I'd love to fix that (and I do contribute to discussions on +that front, too), we still have the git(1) UI right now and people still get +into trouble with it right now.

+

Git follows the New Jersey approach laid out in Richard Gabriel's The Rise of +“Worse is Better”: given +the choice between a simple implementation and a simple interface, Git chooses +the simple implementation almost everywhere. This internal simplicity can give +users the leverage to fix the problems that its horrible user interface leads +them into, so these pages will focus on explaining the simple parts and giving +users the tools to examine them.

+

Throughout these articles, I've written “Git does X” a lot. Git is +incredibly configurable; read that as “Git does X by default.” I'll try to +call out relevant configuration options as I go, where it doesn't interrupt +the flow of knowledge.

+ +

By the way, if you think you're just going to follow the +many +excellent +git +tutorials +out there and that you won't need this knowledge, well, you will. You can +either learn it during a quiet time, when you can think and experiment, or you +can learn it when something's gone wrong, and everyone's shouting at each +other. Git's high-level interface doesn't do much to keep you on the sensible +path, and you will eventually need to fix something.

+
+ + + +
+
+ + + + + +
+ + \ No newline at end of file diff --git a/.html/git/theory-and-practice/objects.html b/.html/git/theory-and-practice/objects.html new file mode 100644 index 0000000..ff6c53b --- /dev/null +++ b/.html/git/theory-and-practice/objects.html @@ -0,0 +1,202 @@ + + + + + The Codex » + Objects + + + + + + + + +
+ + + + + +
+

Objects

+

Git's lowest level is a storage and naming system for things Git calls +“objects.” These objects hold the bulk of the data about files and projects +tracked by Git: file contents, directory trees, commits, and so on. Every +object is identified by a SHA-1 hash, which is derived from its contents.

+

SHA-1 hashes are obnoxiously long, so Git allows you to substitute any unique +prefix of a SHA-1 hash, so long as it's at least four characters long. If the +hash 0b43b9e3e64793f5a222a644ed5ab074d8fa1024 is present in your repository, +then Git commands will understand 0b43, 0b43b9, and other prefixes to all +refer to the same object, so long as no other object has the same SHA-1 +prefix.
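A quick way to see prefix resolution in action, sketched against a throwaway repository (the directory, branch name `main`, and file name are placeholders):

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo hello > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"

# Take the first eight characters of the commit's hash...
full="$(git rev-parse HEAD)"
prefix="$(printf '%s' "$full" | cut -c1-8)"

# ...and let Git expand the prefix back to the full hash.
resolved="$(git rev-parse --verify "$prefix")"
echo "$prefix -> $resolved"
```

In a repository this small, eight characters is far more than needed; large repositories need longer prefixes to stay unambiguous.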

+

Blobs

+

The contents of every file that's ever been stored in a Git repository are +stored as blob objects. These objects are very simple: they contain the file +contents, byte for byte.
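You can create and read back a blob by hand, without involving a commit at all. A small sketch, assuming a throwaway repository and made-up contents:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Store some bytes as a blob; hash-object prints the new object's hash.
blob="$(printf 'hello, blob\n' | git hash-object -w --stdin)"

# Ask Git what the object is, then read its contents back.
blob_type="$(git cat-file -t "$blob")"
blob_contents="$(git cat-file blob "$blob")"
echo "$blob is a $blob_type containing: $blob_contents"
```

Note that no file name appears anywhere: the blob is only the bytes.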

+

Trees

+

File contents (and trees, and Other Things we'll get to later) are tied +together into a directory structure by tree objects. These objects contain a +list of records, with one child per record. Each record contains a permissions +field corresponding to the POSIX permissions mask of the object, a type, a +SHA-1 for another object, and a name.

+

A directory containing only files might be represented as the tree

+
100644 blob 511542ad6c97b28d720c697f7535897195de3318    config.md
+100644 blob 801ddd5ae10d6282bbf36ccefdd0b052972aa8e2    integrate.md
+100644 blob 61d28155862607c3d5d049e18c5a6903dba1f85e    scratch.md
+100644 blob d7a79c144c22775239600b332bfa120775bab341    survival.md
+
+

while a directory with subdirectories would also have some tree children:

+
040000 tree f57ef2457a551b193779e21a50fb380880574f43    12factor
+040000 tree 844697ce99e1ef962657ce7132460ad7a38b7584    authnz
+100644 blob 54795f9b774547d554f5068985bbc6df7b128832    cool-urls-can-change.md
+040000 tree fc3f39eb5d1a655374385870b8be56b202be7dd8    dev
+040000 tree 22cbfb2c1d7b07432ea7706c36b0d6295563c69c    devops
+040000 tree 0b3e63b4f32c0c3acfbcf6ba28d54af4c2f0d594    git
+040000 tree 5914fdcbd34e00e23e52ba8e8bdeba0902941d3f    java
+040000 tree 346f71a637a4f8933dc754fef02515a8809369c4    mysql
+100644 blob b70520badbb8de6a74b84788a7fefe64a432c56d    packaging-ideas.md
+040000 tree 73ed6572345a368d20271ec5a3ffc2464ac8d270    people
+
+
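Listings like the two above can be produced with `git ls-tree`. A sketch against a throwaway repository, where the file and directory names are invented for the example:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
mkdir docs
echo readme > README.md
echo guide > docs/guide.md
git add README.md docs/guide.md
git commit -q -m "add a file and a subdirectory"

# Full records for the root tree: mode, type, hash, name.
git ls-tree HEAD

# The same records without the hashes, which differ from machine to machine.
entries="$(git ls-tree HEAD | awk '{print $1, $2, $4}')"
echo "$entries"
# 100644 blob README.md
# 040000 tree docs
```

Running `git ls-tree HEAD docs/` would descend one level and list the records of the `docs` tree in the same format.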

Commits

+

Blobs and trees are sufficient to store arbitrary directory trees in Git, and +you could use them that way, but Git is mostly used as a revision-tracking +system. Revisions and their history are represented by commit objects, which contain:

+
* The SHA-1 hash of the root `tree` object of the commit,
+* Zero or more SHA-1 hashes for parent commits,
+* The name and email address of the commit's “author,”
+* The name and email address of the commit's “committer,”
+* Timestamps representing when the commit was authored and committed, and
+* A commit message.
+
+

Commit objects' parent references form a directed acyclic graph; the subgraph +reachable from a specific commit is that commit's history.

+

When working with Git's user interface, commit parents are given in a +predictable order determined by the git checkout and git merge commands.
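To see the fields and the parent ordering for yourself, here is a sketch that builds a small merge in a throwaway repository. Branch names `main` and `topic` and the file names are assumptions for the example:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"

git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main work"
main_tip="$(git rev-parse HEAD)"

git merge -q -m "merge topic into main" topic

# Shows the tree, both parents, author, committer, and message.
git cat-file -p HEAD

# The first parent is the commit that was checked out when the merge ran;
# the second parent is the commit that was merged in.
first_parent="$(git rev-parse 'HEAD^1')"
second_parent="$(git rev-parse 'HEAD^2')"
```

Here `first_parent` matches the old tip of `main` and `second_parent` matches the tip of `topic`.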

+

Tags

+

Git's revision-tracking system supports “tags,” which are stable names for +specific configurations. It also, uniquely, supports a concept called an +“annotated tag,” represented by the tag object type. These annotated tag +objects contain

+
* The type and SHA-1 hash of another object,
+* The name and email address of the person who created the tag,
+* A timestamp representing the moment the tag was created, and
+* A tag message.
+
+
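The difference between a plain tag and an annotated tag shows up as soon as you ask Git for the type of the object each one names. A sketch, with tag names and messages invented for the example:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# A plain tag is only a name for the commit.
git tag lightweight
# An annotated tag creates a tag object that in turn refers to the commit.
git tag -a -m "Release 1.0" v1.0

light_type="$(git cat-file -t lightweight)"
annotated_type="$(git cat-file -t v1.0)"
echo "lightweight -> $light_type, v1.0 -> $annotated_type"

# Peeling the annotated tag gets back to the commit it describes.
peeled="$(git rev-parse 'v1.0^{commit}')"
```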

Anonymity

+

There's a general theme to Git's object types: no object knows its own name. +Every object only has a name in the context of some containing object, or in +the context of Git's refs mechanism, which I'll get to +shortly. This means that the same blob object can be reused for multiple +files (or, more probably, the same file in multiple commits), if they happen +to have the same contents.

+

This also applies to tag objects, even though their role is part of a system +for providing stable, meaningful names for commits.
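A small sketch of blob reuse, assuming a throwaway repository and two made-up file names with identical contents:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
printf 'same contents\n' > first.txt
cp first.txt second.txt
git add first.txt second.txt
git commit -q -m "two names, one blob"

# Both records carry the same hash; only the tree knows the names.
git ls-tree HEAD

distinct_blobs="$(git ls-tree HEAD | awk '{print $3}' | sort -u | wc -l | tr -d ' ')"
echo "distinct blobs: $distinct_blobs"
```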

+

Examining objects

+
    +
  • +

    git cat-file <type> <sha1>: decodes the object <sha1> and prints its + contents to stdout. This prints the object's contents in their raw form, + which is less than useful for tree objects.

    +
  • +
  • +

    git cat-file -p <sha1>: decodes the object <sha1> and pretty-prints it. + This pretty-printing stays close to the underlying disk format; it's most + useful for decoding tree objects.

    +
  • +
  • +

    git show <sha1>: decodes the object <sha1> and formats its contents to + stdout. For blobs, this is identical to what git cat-file blob would do, + but for trees, commits, and tags, the output is reformatted to be more + readable.

    +
  • +
+
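The three commands above can be tried side by side. A sketch against a throwaway repository; the file name and contents are placeholders:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
printf 'line one\nline two\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"

# For a blob, the raw form and git show agree.
raw="$(git cat-file blob HEAD:notes.txt)"
shown="$(git show HEAD:notes.txt)"

# For a tree, the pretty-printed form is the readable one.
git cat-file -p 'HEAD^{tree}'

# For a commit, git show adds formatting and a diff.
git show HEAD
```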

Storage

+

Objects are stored in two places in Git: as “loose objects,” and in “pack +files.” Newly-created objects are initially loose objects, for ease of +manipulation; transferring objects to another repository or running certain +administrative commands can cause them to be placed in pack files for faster +transfer and for smaller storage.

+

Loose objects are stored directly on the filesystem, in the Git repository's +objects directory. Git takes a two-character prefix off of each object's +SHA-1 hash, and uses that to pick a subdirectory of objects to store the +object in. The remainder of the hash forms the filename. Loose objects are +compressed with zlib, to conserve space, but the resulting directory tree can +still be quite large.
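You can find a loose object on disk from its hash alone. A sketch, assuming a throwaway non-bare repository so that the objects directory is `.git/objects`:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Write a new blob; it starts life as a loose object.
blob="$(printf 'loose object\n' | git hash-object -w --stdin)"

# First two characters pick the subdirectory, the rest is the file name.
dir="$(printf '%s' "$blob" | cut -c1-2)"
file="$(printf '%s' "$blob" | cut -c3-)"
path=".git/objects/$dir/$file"
ls -l "$path"
```

The file is zlib-compressed, so `cat` will print noise; use `git cat-file` to read it.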

+

Packed objects are stored together in packed files, which live in the +repository's objects/pack directory. These packed files are both compressed +and delta-encoded, allowing groups of similar objects to be stored very +compactly.
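`git count-objects -v` reports how many objects are loose and how many packs exist, which makes the transition visible. A sketch in a throwaway repository, using `git repack` directly as one of the commands that can trigger packing:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# Before packing: the blob, tree, and commit are all loose.
git count-objects -v
loose_before="$(git count-objects -v | awk '/^count:/ {print $2}')"

# Pack everything and drop the now-redundant loose copies.
git repack -a -d -q

git count-objects -v
loose_after="$(git count-objects -v | awk '/^count:/ {print $2}')"
packs_after="$(git count-objects -v | awk '/^packs:/ {print $2}')"
ls .git/objects/pack
```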

+
+ + + +
+
+ + + + + +
+ + \ No newline at end of file diff --git a/.html/git/theory-and-practice/refs-and-names.html b/.html/git/theory-and-practice/refs-and-names.html new file mode 100644 index 0000000..fdc56a4 --- /dev/null +++ b/.html/git/theory-and-practice/refs-and-names.html @@ -0,0 +1,199 @@ + + + + + The Codex » + Refs and Names + + + + + + + + +
+ + + + + +
+

Refs and Names

+

Git's object system stores most of the data for projects tracked in +Git, but identifies that data only by SHA-1 hash. This is basically useless if you want to +make practical use of Git, so Git also has a naming mechanism called “refs” +that provides human-meaningful names for objects.

+

There are two kinds of refs:

+
    +
  • +

    “Normal” refs, which are names that resolve directly to SHA-1 hashes. These + are the vast majority of refs in most repositories.

    +
  • +
  • +

    “Symbolic” refs, which are names that resolve to other refs. In most + repositories, only a few of these appear. (Circular references are possible + with symbolic refs. Git will refuse to resolve these.)

    +
  • +
+

Anywhere you could use a SHA-1, you can use a ref instead. Git interprets them +identically, after resolving the ref down to the SHA-1.
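A sketch of both kinds of ref resolving to the same hash, assuming a throwaway repository whose only branch is named `main`:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"

# HEAD is a symbolic ref: it resolves to another ref...
head_target="$(git symbolic-ref HEAD)"
echo "HEAD -> $head_target"

# ...which is a normal ref that resolves to a hash.
via_head="$(git rev-parse HEAD)"
via_branch="$(git rev-parse refs/heads/main)"

# The hash and either name are interchangeable as arguments.
git log -1 --format=%s "$via_branch"
git log -1 --format=%s refs/heads/main
```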

+

Namespaces

+

Every operation in Git that uses a name of some sort, including branching +(branch names), tagging (tag names), fetching (remote-tracking branch names), +and pushing (many kinds of name), expands those names to refs, using a +namespace convention. The following namespaces are common:

+
    +
  • +

    refs/heads/NAME: branches. The branch name is the ref name with + refs/heads/ removed. Names generally point to commits.

    +
  • +
  • +

    refs/remotes/REMOTE/NAME: “remote-tracking” branches. These are maintained + in tandem by git remote and git fetch, to cache the state of other + repositories. Names generally point to commits.

    +
  • +
  • +

    refs/tags/NAME: tags. The tag name is the ref name with refs/tags/ + removed. Names generally point to commits or tag objects.

    +
  • +
  • +

    refs/bisect/STATE: git bisect markers for known-good and known-bad + revisions, from which the rest of the bisect state can be derived.

    +
  • +
+
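`git for-each-ref` lists refs by their full names, which makes the namespaces above easy to see. A sketch in a throwaway repository; the branch and tag names are made up:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
git branch topic
git tag v1.0

# Every short name expands to a full ref name in some namespace.
all_refs="$(git for-each-ref --format='%(refname)')"
echo "$all_refs"
# refs/heads/main
# refs/heads/topic
# refs/tags/v1.0

full_branch="$(git rev-parse --symbolic-full-name topic)"
full_tag="$(git rev-parse --symbolic-full-name v1.0)"
```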

There are also a few special refs directly in the refs/ namespace, most +notably:

+
    +
  • refs/stash: The most recent stash entry, as maintained by git stash. + (Older stash entries are kept in the reflog for refs/stash, not in refs of their own.) Names generally + point to commits.
  • +
+

Tools can invent new refs for their own purposes, or manipulate existing refs; +the convention is that tools that use refs (which is, as I said, most of them) +respect the state of the ref as if they'd created that state themselves, +rather than sanity-checking the ref before using it.

+

Special refs

+

There are a handful of special refs used by Git commands for their own +operation. These refs do not begin with refs/:

+
    +
  • +

    HEAD: the “current” commit for most operations. This is set when checking + out a commit, and many revision-related commands default to HEAD if not + given a revision to operate on. HEAD can either be a symbolic ref + (pointing to a branch ref) or a normal ref (pointing directly to a commit), + and is very frequently a symbolic ref.

    +
  • +
  • +

    MERGE_HEAD: during a merge, MERGE_HEAD resolves to the commit whose + history is being merged.

    +
  • +
  • +

    ORIG_HEAD: set by operations that change HEAD in potentially destructive + ways. It records the commit HEAD resolved to before the change.

    +
  • +
  • +

    CHERRY_PICK_HEAD is set during git cherry-pick to the commit whose + changes are being copied.

    +
  • +
  • +

    FETCH_HEAD is set by the forms of git fetch that fetch a single ref, and + points to the commit the fetched ref pointed to.

    +
  • +
+
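`ORIG_HEAD` is the easiest of these to observe. A sketch using `git reset --hard` in a throwaway repository, as one example of an operation that sets it:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -a -m "second commit"

# Remember where HEAD was, then move it destructively.
before_reset="$(git rev-parse HEAD)"
git reset -q --hard 'HEAD~1'

# ORIG_HEAD still resolves to the commit HEAD pointed to before the reset.
orig_head="$(git rev-parse ORIG_HEAD)"
echo "HEAD was $orig_head"
```

Running `git reset --hard ORIG_HEAD` at this point would undo the reset.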

Examining and manipulating refs

+

The git show-ref command will list the refs in namespaces under refs in +your repository, printing the SHA-1 hashes they resolve to. Pass --head to +also include HEAD.

+

The following commands can be used to manipulate refs directly:

+
    +
  • +

    git update-ref <ref> <sha1> forcibly sets <ref> to the passed <sha1>.

    +
  • +
  • +

    git update-ref -d <ref> deletes a ref.

    +
  • +
  • +

    git symbolic-ref <ref> prints the target of <ref>, if <ref> is a + symbolic ref. (It will fail with an error message for normal refs.)

    +
  • +
  • +

    git symbolic-ref <ref> <target> forcibly makes <ref> a symbolic ref + pointing to <target>.

    +
  • +
+

Additionally, you can see what ref a given name resolves to using git +rev-parse --symbolic-full-name <name> or git show-ref <name>.
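A sketch of these commands working together in a throwaway repository. The `refs/example/` namespace and the ref names under it are invented for the example; Git attaches no meaning to them:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
echo two >> file.txt
git commit -q -a -m "second commit"
first="$(git rev-parse 'HEAD~1')"

# Create a normal ref pointing at the first commit.
git update-ref refs/example/bookmark "$first"
git show-ref --verify refs/example/bookmark

# Create a symbolic ref that points at the normal ref.
git symbolic-ref refs/example/alias refs/example/bookmark
alias_target="$(git symbolic-ref refs/example/alias)"
alias_commit="$(git rev-parse refs/example/alias)"

# Clean up: remove the symbolic ref first, then the ref it pointed to.
git symbolic-ref -d refs/example/alias
git update-ref -d refs/example/bookmark
```

Deleting in that order avoids leaving a symbolic ref that points at nothing.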

+
+ + + +
+
+ + + + + +
+ + \ No newline at end of file -- cgit v1.2.3