| author | Owen Jacobson <owen@grimoire.ca> | 2020-01-28 23:34:06 -0500 |
|---|---|---|
| committer | Owen Jacobson <owen@grimoire.ca> | 2020-01-28 23:34:06 -0500 |
| commit | 34708dfa902afabf4833c25233132e56514915de (patch) | |
| tree | 0f4fd5c2c8d782885c5b821114b060e89fce1dcd /wiki/git/theory-and-practice | |
| parent | 9bf334de6a2a17371eae9bcdf342c416332350aa (diff) | |
| parent | 6a7b97b436a5a20c172e6b04bf0caa37d544fde4 (diff) | |
Switch to mkdocs.
Diffstat (limited to 'wiki/git/theory-and-practice')
| -rw-r--r-- | wiki/git/theory-and-practice/index.md | 42 |
| -rw-r--r-- | wiki/git/theory-and-practice/objects.md | 125 |
| -rw-r--r-- | wiki/git/theory-and-practice/refs-and-names.md | 94 |
3 files changed, 0 insertions, 261 deletions
diff --git a/wiki/git/theory-and-practice/index.md b/wiki/git/theory-and-practice/index.md
deleted file mode 100644
index f257b12..0000000
--- a/wiki/git/theory-and-practice/index.md
+++ /dev/null
@@ -1,42 +0,0 @@
-# Git Internals 101
-
-Yeah, yeah, another article about “how Git works.” There are tons of these
-already. Personally, I'm fond of Sitaram Chamarty's [fantastic series of
-articles](http://gitolite.com/master-toc.html) explaining Git from both ends,
-and of [Git for Computer
-Scientists](http://eagain.net/articles/git-for-computer-scientists/). Maybe
-you'd rather read those.
-
-This page was inspired by very specific, recurring issues I've run into while
-helping people use Git. I think Git's “porcelain” layer -- its user interface
--- is terrible, and does a bad job of insulating non-expert users from Git's
-internals. While I'd love to fix that (and I do contribute to discussions on
-that front, too), we still have the `git(1)` UI right now and people still get
-into trouble with it right now.
-
-Git follows the New Jersey approach laid out in Richard Gabriel's [The Rise of
-“Worse is Better”](http://www.dreamsongs.com/RiseOfWorseIsBetter.html): given
-the choice between a simple implementation and a simple interface, Git chooses
-the simple implementation almost everywhere. This internal simplicity can give
-users the leverage to fix the problems that its horrible user interface leads
-them into, so these pages will focus on explaining the simple parts and giving
-users the tools to examine them.
-
-Throughout these articles, I've written “Git does X” a lot. Git is
-_incredibly_ configurable; read that as “Git does X _by default_.” I'll try to
-call out relevant configuration options as I go, where it doesn't interrupt
-the flow of knowledge.
-
-* [Objects](objects)
-* [Refs and Names](refs-and-names)
-
-By the way, if you think you're just going to follow the
-[many](http://git-scm.com/documentation)
-[excellent](http://www.atlassian.com/git/tutorial)
-[git](http://try.github.io/levels/1/challenges/1)
-[tutorials](https://www.kernel.org/pub/software/scm/git/docs/gittutorial.html)
-out there and that you won't need this knowledge, well, you will. You can
-either learn it during a quiet time, when you can think and experiment, or you
-can learn it when something's gone wrong, and everyone's shouting at each
-other. Git's high-level interface doesn't do much to keep you on the sensible
-path, and you will eventually need to fix something.
diff --git a/wiki/git/theory-and-practice/objects.md b/wiki/git/theory-and-practice/objects.md
deleted file mode 100644
index 6bf975a..0000000
--- a/wiki/git/theory-and-practice/objects.md
+++ /dev/null
@@ -1,125 +0,0 @@
-# Objects
-
-Git's basest level is a storage and naming system for things Git calls
-“objects.” These objects hold the bulk of the data about files and projects
-tracked by Git: file contents, directory trees, commits, and so on. Every
-object is identified by a SHA-1 hash, which is derived from its contents.
-
-SHA-1 hashes are obnoxiously long, so Git allows you to substitute any unique
-prefix of a SHA-1 hash, so long as it's at least four characters long. If the
-hash `0b43b9e3e64793f5a222a644ed5ab074d8fa1024` is present in your repository,
-then Git commands will understand `0b43`, `0b43b9`, and other patterns to all
-refer to the same object, so long as no other object has the same SHA-1
-prefix.
-
-## Blobs
-
-The contents of every file that's ever been stored in a Git repository are
-stored as `blob` objects. These objects are very simple: they contain the file
-contents, byte for byte.
-
-## Trees
-
-File contents (and trees, and Other Things we'll get to later) are tied
-together into a directory structure by `tree` objects. These objects contain a
-list of records, with one child per record. Each record contains a permissions
-field corresponding to the POSIX permissions mask of the object, a type, a
-SHA-1 for another object, and a name.
-
-A directory containing only files might be represented as the tree
-
-    100644 blob 511542ad6c97b28d720c697f7535897195de3318 config.md
-    100644 blob 801ddd5ae10d6282bbf36ccefdd0b052972aa8e2 integrate.md
-    100644 blob 61d28155862607c3d5d049e18c5a6903dba1f85e scratch.md
-    100644 blob d7a79c144c22775239600b332bfa120775bab341 survival.md
-
-while a directory with subdirectories would also have some `tree` children:
-
-    040000 tree f57ef2457a551b193779e21a50fb380880574f43 12factor
-    040000 tree 844697ce99e1ef962657ce7132460ad7a38b7584 authnz
-    100644 blob 54795f9b774547d554f5068985bbc6df7b128832 cool-urls-can-change.md
-    040000 tree fc3f39eb5d1a655374385870b8be56b202be7dd8 dev
-    040000 tree 22cbfb2c1d7b07432ea7706c36b0d6295563c69c devops
-    040000 tree 0b3e63b4f32c0c3acfbcf6ba28d54af4c2f0d594 git
-    040000 tree 5914fdcbd34e00e23e52ba8e8bdeba0902941d3f java
-    040000 tree 346f71a637a4f8933dc754fef02515a8809369c4 mysql
-    100644 blob b70520badbb8de6a74b84788a7fefe64a432c56d packaging-ideas.md
-    040000 tree 73ed6572345a368d20271ec5a3ffc2464ac8d270 people
-
-## Commits
-
-Blobs and trees are sufficient to store arbitrary directory trees in Git, and
-you could use them that way, but Git is mostly used as a revision-tracking
-system. Revisions and their history are represented by `commit` objects, which contain:
-
-* The SHA-1 hash of the root `tree` object of the commit,
-* Zero or more SHA-1 hashes for parent commits,
-* The name and email address of the commit's “author,”
-* The name and email address of the commit's “committer,”
-* Timestamps representing when the commit was authored and committed, and
-* A commit message.
-
-Commit objects' parent references form a directed acyclic graph; the subgraph
-reachable from a specific commit is that commit's _history_.
-
-When working with Git's user interface, commit parents are given in a
-predictable order determined by the `git checkout` and `git merge` commands.
-
-## Tags
-
-Git's revision-tracking system supports “tags,” which are stable names for
-specific configurations. It also, uniquely, supports a concept called an
-“annotated tag,” represented by the `tag` object type. These annotated tag
-objects contain
-
-* The type and SHA-1 hash of another object,
-* The name and email address of the person who created the tag,
-* A timestamp representing the moment the tag was created, and
-* A tag message.
-
-## Anonymity
-
-There's a general theme to Git's object types: no object knows its own name.
-Every object only has a name in the context of some containing object, or in
-the context of [Git's refs mechanism](refs-and-names), which I'll get to
-shortly. This means that the same `blob` object can be reused for multiple
-files (or, more probably, the same file in multiple commits), if they happen
-to have the same contents.
-
-This also applies to tag objects, even though their role is part of a system
-for providing stable, meaningful names for commits.
-
-## Examining objects
-
-* `git cat-file <type> <sha1>`: decodes the object `<sha1>` and prints its
-  contents to stdout. This prints the object's contents in their raw form,
-  which is less than useful for `tree` objects.
-
-* `git cat-file -p <sha1>`: decodes the object `<sha1>` and pretty-prints it.
-  This pretty-printing stays close to the underlying disk format; it's most
-  useful for decoding `tree` objects.
-
-* `git show <sha1>`: decodes the object `<sha1>` and formats its contents to
-  stdout. For blobs, this is identical to what `git cat-file blob` would do,
-  but for trees, commits, and tags, the output is reformatted to be more
-  readable.
-
-## Storage
-
-Objects are stored in two places in Git: as “loose objects,” and in “pack
-files.” Newly-created objects are initially loose objects, for ease of
-manipulation; transferring objects to another repository or running certain
-administrative commands can cause them to be placed in pack files for faster
-transfer and for smaller storage.
-
-Loose objects are stored directly on the filesystem, in the Git repository's
-`objects` directory. Git takes a two-character prefix off of each object's
-SHA-1 hash, and uses that to pick a subdirectory of `objects` to store the
-object in. The remainder of the hash forms the filename. Loose objects are
-compressed with zlib, to conserve space, but the resulting directory tree can
-still be quite large.
-
-Packed objects are stored together in packed files, which live in the
-repository's `objects/pack` directory. These packed files are both compressed
-and delta-encoded, allowing groups of similar objects to be stored very
-compactly.
diff --git a/wiki/git/theory-and-practice/refs-and-names.md b/wiki/git/theory-and-practice/refs-and-names.md
deleted file mode 100644
index 025ae88..0000000
--- a/wiki/git/theory-and-practice/refs-and-names.md
+++ /dev/null
@@ -1,94 +0,0 @@
-# Refs and Names
-
-Git's [object system](objects) stores most of the data for projects tracked in
-Git, but only provides SHA-1 hashes. This is basically useless if you want to
-make practical use of Git, so Git also has a naming mechanism called “refs”
-that provide human-meaningful names for objects.
-
-There are two kinds of refs:
-
-* “Normal” refs, which are names that resolve directly to SHA-1 hashes. These
-  are the vast majority of refs in most repositories.
-
-* “Symbolic” refs, which are names that resolve to other refs. In most
-  repositories, only a few of these appear. (Circular references are possible
-  with symbolic refs. Git will refuse to resolve these.)
-
-Anywhere you could use a SHA-1, you can use a ref instead. Git interprets them
-identically, after resolving the ref down to the SHA-1.
-
-## Namespaces
-
-Every operation in Git that uses a name of some sort, including branching
-(branch names), tagging (tag names), fetching (remote-tracking branch names),
-and pushing (many kinds of name), expands those names to refs, using a
-namespace convention. The following namespaces are common:
-
-* `refs/heads/NAME`: branches. The branch name is the ref name with
-  `refs/heads/` removed. Names generally point to commits.
-
-* `refs/remotes/REMOTE/NAME`: “remote-tracking” branches. These are maintained
-  in tandem by `git remote` and `git fetch`, to cache the state of other
-  repositories. Names generally point to commits.
-
-* `refs/tags/NAME`: tags. The tag name is the ref name with `refs/tags/`
-  removed. Names generally point to commits or tag objects.
-
-* `refs/bisect/STATE`: `git bisect` markers for known-good and known-bad
-  revisions, from which the rest of the bisect state can be derived.
-
-There are also a few special refs directly in the `refs/` namespace, most
-notably:
-
-* `refs/stash`: The most recent stash entry, as maintained by `git stash`.
-  (Other stash entries are maintained by a separate system.) Names generally
-  point to commits.
-
-Tools can invent new refs for their own purposes, or manipulate existing refs;
-the convention is that tools that use refs (which is, as I said, most of them)
-respect the state of the ref as if they'd created that state themselves,
-rather than sanity-checking the ref before using it.
-
-## Special refs
-
-There are a handful of special refs used by Git commands for their own
-operation. These refs do _not_ begin with `refs/`:
-
-* `HEAD`: the “current” commit for most operations. This is set when checking
-  out a commit, and many revision-related commands default to `HEAD` if not
-  given a revision to operate on. `HEAD` can either be a symbolic ref
-  (pointing to a branch ref) or a normal ref (pointing directly to a commit),
-  and is very frequently a symbolic ref.
-
-* `MERGE_HEAD`: during a merge, `MERGE_HEAD` resolves to the commit whose
-  history is being merged.
-
-* `ORIG_HEAD`: set by operations that change `HEAD` in potentially destructive
-  ways by resolving `HEAD` before making the change.
-
-* `CHERRY_PICK_HEAD` is set during `git cherry-pick` to the commit whose
-  changes are being copied.
-
-* `FETCH_HEAD` is set by the forms of `git fetch` that fetch a single ref, and
-  points to the commit the fetched ref pointed to.
-
-## Examining and manipulating refs
-
-The `git show-ref` command will list the refs in namespaces under `refs` in
-your repository, printing the SHA-1 hashes they resolve to. Pass `--head` to
-also include `HEAD`.
-
-The following commands can be used to manipulate refs directly:
-
-* `git update-ref <ref> <sha1>` forcibly sets `<ref>` to the passed `<sha1>`.
-
-* `git update-ref -d <ref>` deletes a ref.
-
-* `git symbolic-ref <ref>` prints the target of `<ref>`, if `<ref>` is a
-  symbolic ref. (It will fail with an error message for normal refs.)
-
-* `git symbolic-ref <ref> <target>` forcibly makes `<ref>` a symbolic ref
-  pointing to `<target>`.
-
-Additionally, you can see what ref a given name resolves to using `git
-rev-parse --symbolic-full-name <name>` or `git show-ref <name>`.
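
A reader's aside on the removed `objects.md`: it says every object is identified by a SHA-1 hash derived from its contents, and that loose objects are zlib-compressed and filed under a two-character prefix directory. The Python sketch below illustrates those claims. It is not part of this commit, and the function names are hypothetical. The `<type> <size>\0` header that is hashed ahead of the contents is not described in the removed text; it is assumed here from Git's loose-object format.

```python
import hashlib
import zlib


def _with_header(obj_type: str, payload: bytes) -> bytes:
    # Assumption (not stated in objects.md): Git prepends "<type> <size>\0"
    # to the raw contents before hashing and storing the object.
    return f"{obj_type} {len(payload)}".encode("ascii") + b"\x00" + payload


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the 40-character SHA-1 name Git would give this object."""
    return hashlib.sha1(_with_header(obj_type, payload)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Two-character prefix picks the subdirectory; the rest is the filename."""
    return f"objects/{object_id[:2]}/{object_id[2:]}"


def loose_object_bytes(obj_type: str, payload: bytes) -> bytes:
    """zlib-compress header plus contents, as a loose object is stored on disk."""
    return zlib.compress(_with_header(obj_type, payload))


if __name__ == "__main__":
    empty_blob = git_object_id("blob", b"")
    print(empty_blob)  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    print(loose_object_path(empty_blob))
```

Because the name depends only on the type and contents, two files with identical bytes map to the same `blob`, which is the reuse the "Anonymity" section of the removed page describes.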
