| author | Owen Jacobson <owen@grimoire.ca> | 2023-12-18 19:41:51 -0500 |
|---|---|---|
| committer | Owen Jacobson <owen@grimoire.ca> | 2024-01-03 03:05:23 -0500 |
| commit | 5562e320736812d1ad309cfaf73383512a87858d | |
| tree | d93569bd8831f4ea5b90719a61a9d1b217e76b0f | /content/code |
| parent | 27d5717529bf0e7d5806982f1970603bad998eaf | |
Migrate to Hugo.
This is a big and somewhat complicated decision, but the crux of it is this:
The _mkdocs_ tool embeds a ton of "I am writing a manual" assumptions about document structure.
These assumptions include a single, sitewide TOC, a top nav bar as a broadly appropriate way to skip around the document, and numerous others. They serve that use case well, but that's not really what this site _is_, or how I intend it to be approached. I'm trying for something more blog-esque (and deliberately a bit haphazard).
Hugo is an experiment. This commit migrates most pages to it, but it does drop a few; this is a convenient excuse to forget items I'd prefer not to continue publishing.
Diffstat (limited to 'content/code')
| mode | path | lines |
|---|---|---|
| -rw-r--r-- | content/code/_index.md | 7 |
| -rw-r--r-- | content/code/commit-messages.md | 20 |
| -rw-r--r-- | content/code/configuring-browser-apps.md | 103 |
| -rw-r--r-- | content/code/tools-convention.md | 183 |
| -rw-r--r-- | content/code/users-rolegraph-privs.md | 95 |
5 files changed, 408 insertions, 0 deletions
diff --git a/content/code/_index.md b/content/code/_index.md
new file mode 100644
index 0000000..0b5fa42
--- /dev/null
+++ b/content/code/_index.md
@@ -0,0 +1,7 @@

---
title: Code
---

I program computers. I have done so all of my adult life, and expect to do so as long as I can string concepts together. Like many lifelong programmers, I periodically write up interesting things I've developed, collaborated on, or run across.

<!--more-->

diff --git a/content/code/commit-messages.md b/content/code/commit-messages.md
new file mode 100644
index 0000000..94b83f8
--- /dev/null
+++ b/content/code/commit-messages.md
@@ -0,0 +1,20 @@

---
title: Writing Good Commit Messages
date: 2013-02-12T09:42:01-0500
---

A style guide.

<!--more-->

Rule zero: “good” is defined by the standards of the project you're on. Have a look at what the existing messages look like, and try to emulate that before doing anything else.

Having said that, here are some principles I've found helpful and broadly applicable. (An illustrative example follows the list.)

* Treat the first line of the message as a one-sentence summary. Most SCM systems have an “overview” command that shows shortened commit messages in bulk, so making the very beginning of the message meaningful helps make those modes more useful for finding specific commits. _It's okay for this to be a “what” description_ if the rest of the message is a “why” description.

* Fill out the rest of the message with prose outlining why you made the change. Don't reiterate the contents of the change in great detail if you can avoid it: anyone who needs that can read the diff themselves, or reach out to ask for help understanding the change. A good rationale sets context for the problem being solved and addresses the ways the proposed change alters that context.

* If you use an issue tracker (and you should), include whatever issue-linking notes it supports right at the start of the message, where they'll be visible even in summarized commit logs. If your tracker has absurdly long issue-linking syntax, or doesn't support issue links in commits at all, include a short issue identifier at the front of the message and put the long part somewhere out of the way, such as on a line of its own at the end of the message.

* If you need rich commit messages (links, lists, and so on), pick one markup language and stick with it. It'll be easier to write useful commit formatters if you only have to deal with one syntax, rather than four. Personally, I use Markdown when I can, or a reduced subset of Markdown, as it's something most developers I interact with will be at least passingly familiar with.
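To make that concrete, a message following these principles (a made-up example, not from any real project) might read:

```
PROJ-1234: Cache expensive report queries

The report pages have been timing out under load. The underlying
queries are expensive but their results change rarely, so this adds a
short-lived cache in front of them instead of optimizing each query
individually.

Full profiling notes: https://tracker.example.com/PROJ-1234
```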
diff --git a/content/code/configuring-browser-apps.md b/content/code/configuring-browser-apps.md
new file mode 100644
index 0000000..6f53b6e
--- /dev/null
+++ b/content/code/configuring-browser-apps.md
@@ -0,0 +1,103 @@

---
title: Configuring Browser Apps
date: 2016-06-04T12:10:47-0400
---

I've found myself in the unexpected situation of having to write a lot of browser apps/single-page apps this year. I have some thoughts on configuration.

<!--more-->

## Why Bother

* Centralize environment-dependent facts to simplify management & testing.
* Make it easy to manage app secrets.

  [@wlonk](https://twitter.com/wlonk) adds:

  > “Secrets”? What this means in a browser app is a bit different.

  Which is unpleasantly true. In a freestanding browser app, a “secret” is only as secret as your users and their network connections choose to make it, i.e., not very secret at all. Maybe that should read “make it easy to manage app _tokens_ and _identities_,” instead.

* Keep config data & API tokens out of the app's source control.
* Integration point for external config sources (Aerobatic, Heroku, etc.).
* The forces described in [12 Factor App: Dependencies](http://12factor.net/dependencies) and, to a lesser extent, [12 Factor App: Configuration](http://12factor.net/config) apply just as well to web client apps as they do to freestanding services.

## What Gets Configured

Yes:

* Base URLs of backend services
* Tokens and client IDs for various APIs

No:

* “Environments” (sorry, Ember folks - I know Ember thought this through carefully, but whole-env configs make it easy to miss settings in prod or test, and encourage patterns like “all devs use the same backends”)

## Delivering Configuration

There are a few ways to get configuration into the app.

### Globals

```html
<head>
  <script>window.appConfig = {
    "FOO_URL": "https://foo.example.com/",
    "FOO_TOKEN": "my-super-secret-token"
  };</script>
  <script src="/your/app.js"></script>
</head>
```

* Easy to consume: it's just globals, so `window.appConfig.foo` will read them.
  * This requires some discipline to use well.
* Have to generate a script to set them.
  * This can be an inline `<script>window.appConfig = {some json}</script>` tag or a standalone config script loaded with `<script src="/config.js">`.
  * Generating config scripts sets a minimum level of complexity for the deployment process: you either need a server to generate the script at request time, or a preprocessing step at deployment time.
  * It's code generation, which is easy to do badly. I had originally proposed using `JSON.stringify` to generate a JavaScript object literal, but this fails for any config values with `</script>` in them. That may be an unlikely edge case, but that only makes it a nastier trap for administrators. (A common partial mitigation is sketched at the end of this section.)

    [There are more edge cases](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/JSON/stringify). I strongly suspect that a hazard-free implementation requires a full-blown JS source generator. I had a look at building something out of [escodegen](https://github.com/estools/escodegen) and [estemplate](https://github.com/estools/estemplate), but

    1. `escodegen`'s node version [doesn't generate browser-safe code](https://github.com/estools/escodegen/issues/298), so string literals with `</script>` or `</head>` in them still break the page, and

    2. converting JavaScript values into parse trees to feed to `estemplate` is some seriously tedious code.
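One widely-used partial mitigation - escaping the hazardous characters in the generated JSON itself - covers the `</script>` case, though it falls short of the hazard-free generator contemplated above. A sketch, with illustrative names:

```js
// Sketch: serialize config to JSON, then escape the characters that can
// break out of an inline <script> context. A partial mitigation only.
function configScriptTag(config) {
  const json = JSON.stringify(config)
    .replace(/</g, '\\u003c')       // neutralizes `</script>` inside string values
    .replace(/\u2028/g, '\\u2028')  // line separators: valid in JSON strings,
    .replace(/\u2029/g, '\\u2029'); // but hazardous in JavaScript string literals
  return `<script>window.appConfig = ${json};</script>`;
}
```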
### Data Attributes and Link Elements

```html
<head>
  <link rel="foo-url" href="https://foo.example.com/">
  <script src="/your/app.js" data-foo-token="my-super-secret-token"></script>
</head>
```

* Flat values only. This is probably a good thing in the grand scheme, since flat configurations are easier to reason about and much easier to document, but it makes namespacing trickier than it needs to be for groups of related config values (URL + token for a single service, for example).
* Have to generate the DOM to set them.
  * This is only practical given server-side templates or DOM rendering. You can't do this with bare nginx, unless you pre-generate pages at deployment time.
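Consuming these is straightforward. A minimal sketch, where the `rel` and `data-*` names match the example above and nothing is a library API:

```js
// Read config out of the DOM shown above; runs after the elements exist.
const fooUrl = document.querySelector('link[rel="foo-url"]').href;
const fooToken = document.querySelector('script[data-foo-token]').dataset.fooToken;
```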
### Config API Endpoint

```js
fetch('/config') /* responds with {"FOO_URL": …, "FOO_TOKEN": …} */
  .then(response => response.json())
  .then(config => configureServices(config)); // configureServices stands in for the app's own setup code
```

* Works even with “dumb” servers (nginx, CloudFront), as the endpoint can be a generated JSON file on disk. If you can generate files, you can generate a JSON endpoint.
* Requires an additional request to fetch the configuration, and logic for injecting config data into all the relevant configurable places in the code.
  * This request can't happen until all the app code has loaded.
  * It's very tempting to write the config to a global. This produces some hilarious race conditions.

### Cookies

See for example [clientconfig](https://github.com/henrikjoreteg/clientconfig):

```js
var config = require('clientconfig');
```

* Easy to consume given the right tools; tricky to do right from scratch.
* Requires server-side support to send the correct cookie. Some servers will allow you to generate the right cookie once and store it in a config file; others will need custom logic, which means (effectively) you need an app server.
* Cookies persist and get re-sent on subsequent requests, even if the server stops delivering config cookies. Client code has to manage the cookie lifecycle carefully (clientconfig does this automatically).
* Size limits constrain how much configuration you can do.

diff --git a/content/code/tools-convention.md b/content/code/tools-convention.md
new file mode 100644
index 0000000..12a3253
--- /dev/null
+++ b/content/code/tools-convention.md
@@ -0,0 +1,183 @@

---
title: The `tools` directory
linkTitle: Using the shell environment as a project tool
date: 2022-03-04T15:16:35-0500
---

A general and less tooling-intensive approach to automating routine project tasks.

<!--more-->

## Background

It's fairly common for the authors and maintainers of a software project to accumulate routine tasks as the project proceeds. Common examples of this:

* Building or packaging the code to prepare it for delivery.
* Running the test suite.
* Compiling project documentation.
* Deploying services to the cloud.

Many projects also sprout more esoteric, project-specific tasks - as I write this, I'm working on one where "delete an AMI from AWS" is one of the things I've had to automate.

Many software projects also end up relying on a build tool - `make`, classically, but there are myriad others. These are tools that are designed to solve specific classes of routine problems (in fact, generally the ones listed above).

I've often seen these two factors converge in projects in the form of a build tool script that contains extensive task-specific support, or, in some cases, in the form of a build tool such as `make` pressed into service as a generic task runner that does none of the work it was designed to do in the first place.

This has a couple of consequences.

First, the configuration language for these tools is invariably unique to the tool. `make` is the least prone to this, but `Makefile`s require careful understanding of both shell and make variables, Make's unique implicit rule system, and the huge number of variables and rules implied by the use of `make` in the first place.

Adding utility tasks to projects using scons, Gradle, Cargo, Maven, or any other, more modern build tool involves similar assumptions and often a very specific programming language and execution model. Even tools like Gradle, which use a general-purpose language, impose a unique dialect of that language intended to facilitate the kinds of build tasks the tool is designed around.

So, writing and understanding the utility tasks that exist requires specific skills, which are orthogonal to the skills needed to understand or implement the project itself.

Second, it creates a dependency on that tool for those tasks. They can only be executed automatically when the tool is available, and can only be executed by hand (with all that entails), or reimplemented, when the tool is absent. Often, build tools are not expected in the end-user environment. Projects end up needing different approaches for tasks that might run in both development and end-user environments versus tasks that run in only one of the two.

## The `tools` approach

To address those consequences, I've started putting routine tasks into my projects as shell scripts, in a `tools` directory.

The shell is a widely-deployed, general-purpose automation tool that handles many routine administration tasks well. For software which will be delivered to a unix-like end environment, developers can be confident that a shell will be present, and the skills to write basic shell scripts are generally already part of a unix programmer's repertoire.

These scripts follow a few principles to ensure that they remain manageable:

* Always use `set -e`, and use `set -o pipefail` if appropriate. A tool that needs to run multiple steps should abort, naturally and automatically, if any of those steps fail.

* Usually chdir to the project's root directory (generally with `cd "$(dirname "$0")/.."`). Normally, manipulating the cwd makes shell scripts brittle and hard to deploy, but in this use case, it ensures that the script can reliably manipulate the project's code. This step can be skipped if the tool doesn't do anything with the project, but it's usually worth keeping it anyways just for consistency.

* Include a block comment near the top with a usage summary:

  ```bash
  ## tools/example
  ## tools/example FILENAME
  ##
  ## Runs the examples. If FILENAME is provided, then only the
  ## example in FILENAME will actually be executed.
  ```

* Minimize the use of arguments. Ideally, use either no arguments, one argument, or an arbitrary number of arguments. Under no circumstances use options or flags.

* Minimize the use of shell constructs other than running commands. This means minimizing the use of `if`, `for`, `while`, and `set` constructs, shell functions, `$()` and backticks, complex pipelines, `eval`, reading from the environment, and so on. This is not a hard and fast rule; the underlying intent is to keep the tool script as straightforward and as easily-understood as possible.

* Where a tool needs to run multiple steps, generally break those steps out into independent tool scripts. Processes are cheap; clarity is valuable.

More generally, the intention is that tool scripts encapsulate the commands needed to perform routine tasks, not become complex software in their own right.

If you're using a project-specific shell environment (with [`direnv`](https://direnv.net) or similar), add `tools` to your `PATH` so that you can run tool scripts from anywhere in the project. Being able to type `build` and have the project build, regardless of where you're looking at that moment, is very helpful. For `direnv`'s `.envrc`, this can be done using `PATH_add tools`.
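In full, such an `.envrc` can be a single line (a sketch, assuming `direnv`'s standard library is available):

```bash
# .envrc - prepend this project's tools directory to PATH
PATH_add tools
```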
## Tradeoffs

No approach is perfect, and this approach has its own consequences:

* Tasks can't participate in dependency-based ordering on their own. They're just shell scripts. This can be troubling if a tool does something that needs to take place during ordinary builds.

* There's always the temptation to Add Features to tools scripts, and it takes steady effort and careful judgment to do so while hewing to the goals of the approach. For example, with Docker, I've had situations where I ended up with two tools with nearly-identical Docker command lines (and thus code duplication to maintain) because I preferred to avoid adding an optional debug mode to an existing tools script.

* If you're not using `direnv`, the tools directory is only easily available to your shell if your `pwd` is the project root. This is awkward, and `sh` derivatives don't provide any convenient way to "search upwards" for commands.

* Tool scripts need to take care to set their own `pwd` appropriately, which can take some thinking. What directory is appropriate depends on what the tool script needs to accomplish, so the right answer isn't always obvious. A convention of "always cd to the project root" covers most cases, but there are exceptions where the user's `pwd` or some other directory may be more appropriate.

## Examples

This site is built using two tools. `tools/build`:

```bash
#!/bin/bash -e

cd "$(dirname "$0")/.."

## tools/build
##
## Converts the content in docs/ into a deployable website in site/

exec mkdocs build
```

And `tools/publish`:

```bash
#!/bin/bash -e

cd "$(dirname "$0")/.."

## tools/publish
##
## Publishes site/ to the S3 bucket hosting grimoire.ca

exec aws s3 sync --delete site/ s3://grimoire.ca/
```

Another project I'm working on has a tool to run all CI checks - `tools/check`:

```bash
#!/bin/bash -ex

# tools/check
#
# Runs all code checks. If you're automating testing, call this rather than
# invoking a test command directly; if you're adding a test command, add it here
# or to one of the tools called from this script.

tools/check-tests
tools/check-lints
tools/check-dependencies
```

This, in turn, runs the following:

* `tools/check-tests`:

      #!/bin/bash -ex

      # tools/check-tests
      #
      # Checks that the code in this project passes incorrectness checks.

      cargo build --locked --all-targets
      cargo test

* `tools/check-lints`:

      #!/bin/bash -ex

      # tools/check-lints
      #
      # Checks that the code in this project passes style checks.

      cargo fmt -- --check
      cargo clippy -- --deny warnings

* `tools/check-dependencies`:

      #!/bin/bash -ex

      # tools/check-dependencies
      #
      # Checks that the dependencies in this project are all in use.

      cargo udeps --locked --all-targets

Yet another project uses a tool to run tests against a container:

```bash
#!/bin/bash -e

cd "$(dirname "$0")/.."

## tools/check
##
## Runs tests on the docker image.

VERSION="$(detect-version)"

py.test \
  --image-version "${VERSION}"
```

## Alternatives

* Casey Rodarmor's [`just`](https://github.com/casey/just) stores shell snippets in a project-wide `Justfile` (sketched below).

  This allows for reliable generation of project-specific help and usage information from the `Justfile`, and allows `just foo` to work from any directory in the project without project-specific shell configuration.

  On the other hand, it's another tool developers would need to install to get started on a project. `Justfile` entries can't be run without `just` (a problem `make` also has).
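For comparison, a hypothetical `Justfile` covering this site's two tools might look like this (a sketch, assuming `just`'s standard recipe syntax; the recipe names are illustrative):

```
# Hypothetical Justfile equivalent of tools/build and tools/publish.

# Convert the content in docs/ into a deployable website in site/.
build:
    mkdocs build

# Publish site/ to S3; the `publish: build` dependency runs build first.
publish: build
    aws s3 sync --delete site/ s3://grimoire.ca/
```

Note the dependency (`publish: build`) - this is exactly the dependency-based ordering that bare tool scripts give up.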
diff --git a/content/code/users-rolegraph-privs.md b/content/code/users-rolegraph-privs.md
new file mode 100644
index 0000000..b93380f
--- /dev/null
+++ b/content/code/users-rolegraph-privs.md
@@ -0,0 +1,95 @@

---
title: A Users, Roles & Privileges Scheme Using Graphs
date: 2013-02-06T12:29:39-0500
---

An SQL schema and associated queries for handling permissions when roles can nest arbitrarily.

<!--more-->

The basic elements:

* Every agent that can interact with a system is represented by a **user**.
* Every capability the system has is authorized by a distinct **privilege**.
* Each user has a list of zero or more **roles**.
  * Roles can **imply** further roles. This relationship is transitive: if role A implies role B, then a member of role A is a member of role B; if role B also implies role C, then a member of role A is also a member of role C. It helps if the resulting role graph is acyclic, but it's not necessary.
  * Roles can **grant** privileges.

A user's privileges are the union of the privileges granted by the transitive closure of their roles.

```sql
create table "user" (
    username varchar
        primary key
    -- credentials &c
);

create table role (
    name varchar
        primary key
);

create table role_member (
    role varchar
        not null
        references role,
    member varchar
        not null
        references "user",
    primary key (role, member)
);

create table role_implies (
    role varchar
        not null
        references role,
    implied_role varchar
        not null
);

create table privilege (
    privilege varchar
        primary key
);

create table role_grants (
    role varchar
        not null
        references role,
    privilege varchar
        not null
        references privilege,
    primary key (role, privilege)
);
```

If your database supports recursive CTEs, this schema can be queried in one shot, since we can have the database do all the graph-walking along roles:

```sql
with recursive user_roles (role) as (
    select
        role
    from
        role_member
    where
        member = 'SOME USERNAME'
    union
    select
        implied_role as role
    from
        user_roles
        join role_implies on
            user_roles.role = role_implies.role
)
select distinct
    role_grants.privilege as privilege
from
    user_roles
    join role_grants on
        user_roles.role = role_grants.role
order by privilege;
```

If not, you'll need to pull the entire graph into memory and manipulate it there: this schema doesn't give you any easy handles to identify only the roles transitively included in the role of interest, and querying step-by-step instead costs an IO round trip per step of the graph, burning whole milliseconds along the way. (The in-memory walk is sketched below.)

Realistic use cases should have fairly simple graphs: elemental privileges are grouped into concrete roles, which are in turn grouped into abstracted roles (by department, for example), which are in turn granted to users. If the average user is in tens of roles and has hundreds of privileges, the entire dataset fits in memory, and PostgreSQL performs well. In PostgreSQL, the above schema handles ~10k privileges and ~10k roles with randomly-generated graph relationships in around 100ms on my laptop, which is pretty slow but not intolerable. Perverse cases (interconnected total subgraphs, deeply-nested linear graphs) can take absurd time but do not reflect any likely permissions scheme.
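As referenced above, a minimal sketch of the in-memory walk. It assumes the three join tables have been fetched wholesale into arrays of row objects; the function and variable names are illustrative, not part of the schema:

```js
// Compute a user's privileges by walking the role graph in memory.
// Row shapes mirror the tables: {role, member}, {role, implied_role},
// and {role, privilege}. The `seen` set makes cycles in the graph harmless.
function privilegesFor(username, roleMembers, roleImplies, roleGrants) {
  const seen = new Set(
    roleMembers.filter(row => row.member === username).map(row => row.role)
  );
  const frontier = [...seen];
  while (frontier.length > 0) {
    const role = frontier.pop();
    for (const row of roleImplies) {
      if (row.role === role && !seen.has(row.implied_role)) {
        seen.add(row.implied_role);
        frontier.push(row.implied_role);
      }
    }
  }
  // Union of the privileges granted by any reachable role.
  return new Set(
    roleGrants.filter(row => seen.has(row.role)).map(row => row.privilege)
  );
}
```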