author Owen Jacobson <owen@grimoire.ca> 2020-01-28 20:49:17 -0500
committer Owen Jacobson <owen@grimoire.ca> 2020-01-28 23:23:18 -0500
commit 0d6f58c54a7af6c8b4e6cd98663eb36ec4e3accc
tree a2af4dc93f09a920b0ca375c1adde6d8f64eb6be /wiki/devops
parent acf6f5d3bfa748e2f8810ab0fe807f82efcf3eb6
Editorial pass & migration to mkdocs.
There's a lot in grimoire.ca that I either no longer stand behind or feel pretty weird about having out there.
Diffstat (limited to 'wiki/devops')
-rw-r--r-- wiki/devops/autodeploy.md 38
-rw-r--r-- wiki/devops/continuous-signing.md 7
-rw-r--r-- wiki/devops/glassfish-and-upstart.md 153
-rw-r--r-- wiki/devops/notes-on-bootstrapping-grimoire-dot-ca.md 71
-rw-r--r-- wiki/devops/puppet-2.7-to-3.1.md 51
-rw-r--r-- wiki/devops/self-daemonization-sucks.md 78
6 files changed, 0 insertions, 398 deletions
diff --git a/wiki/devops/autodeploy.md b/wiki/devops/autodeploy.md
deleted file mode 100644
index 801c3eb..0000000
--- a/wiki/devops/autodeploy.md
+++ /dev/null
@@ -1,38 +0,0 @@
-# Notes towards automating deployment
-
-This is mostly aimed at the hosted-apps folks; deploying packaged software for
-end users requires a slightly different approach.
-
-## Assumptions
-
-1. You have one or more _services_ to deploy. (If not, what are you doing
-here?)
-
-2. Your services are tracked in _source control_. (If not, go sort that out,
-then come back. No, seriously, _now_.)
-
-3. You will be deploying your services to one or more _environments_. An
-environment is an abstract thing: think “production,” not
-“web01.public.example.com.” (If not, where, exactly, will your service run?)
-
-4. For each service, in each environment, there are one or more _servers_ to
-host the service. These servers are functionally identical. (If not, go pave
-them and rebuild them using Puppet, Chef, CFengine, or, hell, shell scripts
-and duct tape. An environment full of one-offs is the kind of hell I wouldn't
-wish on my worst enemy.)
-
-5. For each service, in each environment, there is a canonical series of steps
-that produce a “deployed” system.
-
------
-
-1. Decide what code should be deployed. (This is a version control activity.)
-2. Get the code onto the fucking server.
-3. Decide what configuration values should be deployed. (This is also a
- version control activity, though possibly not in the same repositories as
- the code.)
-4. Get the configuration onto the fucking server.
-5. Get the code running with the configuration.
-6. Log to fucking syslog.
-7. When the machine reboots, make sure the code comes back running the same
- configuration.
diff --git a/wiki/devops/continuous-signing.md b/wiki/devops/continuous-signing.md
deleted file mode 100644
index 422ec49..0000000
--- a/wiki/devops/continuous-signing.md
+++ /dev/null
@@ -1,7 +0,0 @@
-# Code Signing on Build Servers
-
-We sign things so that we can authenticate them later, but authentication is
-largely a conscious function. Computers are bad at answering "is this real".
-
-Major signing systems (GPG, jarsigner) require presentation of credentials at
-signing time. CI servers don't generally have safe tools for this.
diff --git a/wiki/devops/glassfish-and-upstart.md b/wiki/devops/glassfish-and-upstart.md
deleted file mode 100644
index ce5d0eb..0000000
--- a/wiki/devops/glassfish-and-upstart.md
+++ /dev/null
@@ -1,153 +0,0 @@
-# Glassfish and Upstart
-
-**Warning**: the article you're about to read is largely empirical. Take
-everything in it with a grain of salt, and _verify it yourself_ before putting
-it into production. You have been warned.
-
-The following observations apply to Glassfish 3.1.2.2. Other versions probably
-act similarly, but check the docs.
-
-## `asadmin create-service`
-
-Glassfish is capable of emitting SysV init scripts for the DAS, or for any
-instance. These init scripts wrap `asadmin start-domain` and `asadmin
-start-local-instance`. However, the scripts it emits are (justifiably)
-minimalist, and it makes some very strong assumptions about the layout of your
-system's rc.d trees and about your system's choice of runlevels. The minimal
-init scripts avoid any integration with platform “enhancements” (such as
-Redhat's `/var/lock/subsys` mechanism and `condrestart` convention, or
-Debian's `start-stop-daemon` helpers) in the name of portability, and the
-assumptions it makes about runlevels and init layout are becoming
-incrementally more fragile as more distributions switch to alternate init
-systems with SysV compatibility layers.
-
-## Fork and `expect`
-
-Upstart's process tracking mechanism relies on services following one of three
-forking models, so that it can accurately track which children of PID 1 are
-associated with which services:
-
-* No `expect` stanza: The service's “main” process is expected not to fork at
- all, and to remain running. The process started by upstart is the “main”
- process.
-
-* `expect fork`: The service is expected to call `fork()` or `clone()` once.
- The process started by upstart itself is not the “main” process, but its
- first child process is.
-
-* `expect daemon`: The service is expected to call `fork()` or `clone()`
- twice. The first grandchild process of the one started by upstart itself is
- the “main” process. This corresponds to classical Unix daemons, which fork
- twice to properly dissociate themselves from the launching shell.
-
-Surprisingly, `asadmin`-launched Glassfish matches _none_ of these models, and
-using `asadmin start-domain` to launch Glassfish from Upstart is not, as far
-as I can tell, possible. It's tricky to debug why, since JVM thread creation
-floods `strace` with chaff, but I suspect that either `asadmin` or Glassfish
-itself is forking too many times.
-
-From [this mailing list
-thread](https://java.net/projects/glassfish/lists/dev/archive/2012-02/message/9),
-though, it appears to be safe to launch Glassfish directly, using `java -jar
-GLASSFISH_ROOT/modules/glassfish.jar -domain DOMAIN`. This fits nicely into
-Upstart's non-forking expect mode, but you lose the ability to pass VM
-configuration settings to Glassfish during startup. Any memory settings or
-Java environment properties you want to pass to Glassfish have to be passed to
-the `java` command manually.
-
-You also lose `asadmin`'s treatment of Glassfish's working directory. Since
-Upstart can configure the working directory, this isn't a big deal.
-
-## `SIGTERM` versus `asadmin stop-domain`
-
-Upstart always stops services by sending them a signal. While you can dictate
-which signal it uses, you cannot replace signals with another mechanism.
-Glassfish shuts down abruptly when it receives `SIGTERM` or `SIGINT`, leaving
-some ugly noise in the logs and potentially aborting any transactions and
-requests in flight. The Glassfish developers believe this is harmless and that
-the server's operation is correct, and that's probably true, but I've not
-tested its effect on outward-facing requests or on in-flight operations far
-enough to be comfortable with it.
-
-I chose to run a “clean”(er) shutdown using `asadmin stop-domain`. This fits
-nicely in Upstart's `pre-stop` step, _provided you do not use Upstart's
-`respawn` feature_. Upstart will correctly notice that Glassfish has already
-stopped after `pre-stop` finishes, but when `respawn` is enabled Upstart will
-treat this as an unexpected termination, switch goals from `stop` to
-`respawn`, and restart Glassfish.
-
-(The Upstart documentation claims that `respawn` does not apply if the tracked
-process exits during `pre-stop`. This may be true in newer versions of
-Upstart, but the version used in Ubuntu 12.04 does restart Glassfish if it
-stops during `pre-stop`.)
-
-Yes, this does make it impossible to stop Glassfish, ever, unless you set a
-respawn limit.
-
-Fortunately, you don't actually want to use `respawn` to manage availability.
-The `respawn` mode cripples your ability to manage the service “out of band”
-by forcing Upstart to restart it as a daemon every time it stops for any
-reason. This means you cannot stop a server with `SIGTERM` or `SIGKILL`; it'll
-immediately start again.
-
-## `initctl reload`
-
-It sends `SIGHUP`. This does not reload Glassfish's configuration. Deal with
-it; use `initctl restart` or `asadmin restart-domain` instead. Most of
-Glassfish's configuration can be changed on the fly with `asadmin set` or
-other commands anyways, so this is not a big limitation.
-
-## Instances
-
-Upstart supports “instances” of a service. This slots nicely into Glassfish's
-ability to host multiple domains and instances on the same physical hardware.
-I ended up with a generic `glassfish-domain.conf` Upstart configuration:
-
-    description "Glassfish DAS"
-    console log
-
-    instance $DOMAIN
-
-    setuid glassfish
-    setgid glassfish
-    umask 0022
-    chdir /opt/glassfish3
-
-    exec /usr/bin/java -jar /opt/glassfish3/glassfish/modules/glassfish.jar -domain "${DOMAIN}"
-
-    pre-stop exec /opt/glassfish3/bin/asadmin stop-domain "${DOMAIN}"
-
-Combined with a per-domain wrapper:
-
-    description "Glassfish 'example' domain"
-    console log
-
-    # Consider using runlevels here.
-    start on started networking
-    stop on deconfiguring-networking
-
-    pre-start script
-        start glassfish-domain DOMAIN=example
-    end script
-
-    post-stop script
-        stop glassfish-domain DOMAIN=example
-    end script
-
-## Possible refinements
-
-* Pull system properties and VM flags from the domain's own `domain.xml`
- correctly. It might be possible to abuse the (undocumented, unsupported, but
- helpful) `--_dry-run` argument from `asadmin start-domain` for this, or it
- might be necessary to parse `domain.xml` manually, or it may be possible to
- exploit parts of Glassfish itself for this.
-
-* The `asadmin` cwd is actually the domain's `config` dir, not the Glassfish
- installation root.
-
-* Something something something password files.
-
-* Syslog and logrotate integration would be useful. The configurations above
- spew Glassfish's startup output and stdout to
- `/var/log/upstart/glassfish-domain-FOO.log`, which may not be rotated by
- default.
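
One of the refinements listed in the Glassfish notes above is pulling VM flags out of the domain's own `domain.xml`. A minimal Python sketch of the manual-parsing route follows. It assumes the flags live in `<jvm-options>` elements nested under `<java-config>` inside a named `<config>`, which is how Glassfish 3 domains appear to be laid out; verify that against a real domain before relying on it. The function names and the sample XML are illustrative only, and any `${...}` placeholders inside options are passed through unexpanded.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; a real domain.xml is much larger.
SAMPLE_DOMAIN_XML = """
<domain>
  <configs>
    <config name="server-config">
      <java-config>
        <jvm-options>-Xmx512m</jvm-options>
        <jvm-options>-Djava.awt.headless=true</jvm-options>
      </java-config>
    </config>
  </configs>
</domain>
"""


def jvm_options(domain_xml_text, config_name="server-config"):
    """Return the JVM options listed for one <config> (assumed layout)."""
    root = ET.fromstring(domain_xml_text)
    options = []
    for config in root.iter("config"):
        if config.get("name") != config_name:
            continue
        for option in config.iter("jvm-options"):
            if option.text and option.text.strip():
                options.append(option.text.strip())
    return options


def java_command(domain, options, glassfish_root="/opt/glassfish3"):
    """Build the argv for launching a domain directly, without asadmin."""
    jar = f"{glassfish_root}/glassfish/modules/glassfish.jar"
    return ["/usr/bin/java", *options, "-jar", jar, "-domain", domain]


if __name__ == "__main__":
    print(" ".join(java_command("example", jvm_options(SAMPLE_DOMAIN_XML))))
```

The resulting argument list could feed the `exec` line of the generic Upstart job, for instance through a small wrapper script, so that memory settings stay in `domain.xml` instead of being duplicated in the job file.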
diff --git a/wiki/devops/notes-on-bootstrapping-grimoire-dot-ca.md b/wiki/devops/notes-on-bootstrapping-grimoire-dot-ca.md
deleted file mode 100644
index 36cea2c..0000000
--- a/wiki/devops/notes-on-bootstrapping-grimoire-dot-ca.md
+++ /dev/null
@@ -1,71 +0,0 @@
-# Notes on Bootstrapping This Host
-
-Presented without comment:
-
-* Package updates:
-
- apt-get update
- apt-get upgrade
-
-* Install Git:
-
- apt-get install git
-
-* Set hostname:
-
- echo 'grimoire' > /etc/hostname
- sed -i -e $'s,ubuntu,grimoire.ca\tgrimoire,' /etc/hosts
- poweroff
-
- To verify:
-
- hostname -f # => grimoire.ca
- hostname # => grimoire
-
-* Add `owen` user:
-
- adduser owen
- adduser owen sudo
-
- To verify:
-
- id owen # => uid=1000(owen) gid=1000(owen) groups=1000(owen),27(sudo)
-
-* Install Puppetlabs Repos:
-
- wget https://apt.puppetlabs.com/puppetlabs-release-pc1-trusty.deb
- dpkg -i puppetlabs-release-pc1-trusty.deb
- apt-get update
-
-* Install Puppet server:
-
- apt-get install puppetserver
- sed -i \
- -e '/^JAVA_ARGS=/ s,2g,512m,g' \
- -e '/^JAVA_ARGS=/ s, -XX:MaxPermSize=256m,,' \
- /etc/default/puppetserver
- service puppetserver start
-
-* Test Puppet agent:
-
- /opt/puppetlabs/bin/puppet agent --test --server grimoire.ca
-
- This should output the following:
-
- Info: Retrieving pluginfacts
- Info: Retrieving plugin
- Info: Caching catalog for grimoire.ca
- Info: Applying configuration version '1446415926'
- Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
- Notice: Applied catalog in 0.01 seconds
-
-* Install environment:
-
- git init --bare /root/puppet.git
- # From workstation, `git push root@grimoire.ca:puppet.git master` to populate the repo
- rm -rf /etc/puppetlabs/code/environments/production
- git clone /root/puppet.git /etc/puppetlabs/code/environments/production
-
-* Bootstrap puppet:
-
- /opt/puppetlabs/bin/puppet agent --test --server grimoire.ca
diff --git a/wiki/devops/puppet-2.7-to-3.1.md b/wiki/devops/puppet-2.7-to-3.1.md
deleted file mode 100644
index aaaf302..0000000
--- a/wiki/devops/puppet-2.7-to-3.1.md
+++ /dev/null
@@ -1,51 +0,0 @@
-# Notes on upgrading Puppet from 2.7 to 3.1
-
-## Bad
-
-* As usual, you have to upgrade the puppet master first. 2.7 agents can speak
- to 3.1 masters just fine, but 3.1 agents cannot speak to 2.7 masters.
-
-* I tried to upgrade the Puppet master using both `puppet agent` (failed when
- package upgrades shut down the puppet master) and `puppet apply` (failed for
- Ubuntu-specific reasons outlined below)
-
-* [This bug](https://projects.puppetlabs.com/issues/19308).
-
-* You more or less can't upgrade Puppet using Puppet.
-
-## Good
-
-* My 2.7 manifests worked perfectly under 3.1.
-
-* Puppet's CA and SSL certs survived intact and required no maintenance after
- the upgrade.
-
-* The Hiera integration into class parameters works as advertised and really
- does help a lot.
-
-* Once I figured out how to execute it, the upgrade was pretty smooth.
-
-* No Ruby upgrade!
-
-* Testing the upgrade in a VM sandbox meant being able to fuck up safely.
- [Vagrant](http://www.vagrantup.com) is super awesome.
-
-## Package Management Sucks
-
-Asking Puppet to upgrade Puppet went wrong on Ubuntu because of the way Puppet
-is packaged: there are three (ish) Puppet packages, and Puppet's resource
-evaluation bits try to upgrade and install one package at a time. Upgrading
-only “puppetmaster” upgraded “puppet-common” but not “puppet,” causing Apt to
-remove “puppet”; upgrading only “puppet” similarly upgraded “puppet-common”
-but not “puppetmaster,” causing Apt to remove “puppetmaster.”
-
-The Puppet aptitude provider (which I use instead of apt-get) for Package
-resources also doesn't know how to tell aptitude what to do with config files
-during upgrades. This prevented Puppet from being able to upgrade packages
-even when running standalone (via `puppet apply`).
-
-Finally, something about the switchover from Canonical's Puppet .debs to
-Puppetlabs' .debs caused aptitude to consider all three packages “broken”
-after a manual upgrade (`aptitude upgrade puppet puppetmaster`). Upgrading the
-packages a second time corrected it; this is the path I eventually took with
-my production puppetmaster and nodes.
diff --git a/wiki/devops/self-daemonization-sucks.md b/wiki/devops/self-daemonization-sucks.md
deleted file mode 100644
index b527da8..0000000
--- a/wiki/devops/self-daemonization-sucks.md
+++ /dev/null
@@ -1,78 +0,0 @@
-# Self-daemonizing code is awful
-
-The classical UNIX approach to services is to implement them as “daemons,”
-programs that run without a terminal attached and provide some service. The
-key feature of a classical daemon is that, when started, it carefully
-detaches itself from its initial environment and terminal, then continues
-running in the background.
-
-This is awful and I'm glad modern init replacements discourage it.
-
-## Process Tracking
-
-Daemons don't exist in a vacuum. Administrators and owners need to be able to
-start and stop daemons reliably, and check their status. The classic
-self-daemonization approach makes this impossible.
-
-Traditionally, daemons run as children of `init` (pid 1), even if they start
-out as children of some terminal or startup process. POSIX only provides
-deterministic APIs for processes to manage their children and their immediate
-parents; the classic daemonisation protocol hands the newly-started daemon
-process off from its original parent process, which knows how to start and
-stop it, to an unsuspecting `init`, which has no idea how this specific
-daemon is special.
-
-The standard workaround has daemons write their own PIDs to a file, but a
-file is “dead” data: it's not automatically updated if the daemon dies, and
-can linger long enough to contain the PID of some later, unrelated program.
-PID file validity checks generally suffer from subtle (or, sometimes, quite
-gross) race conditions.
-
-## Complexity
-
-The actual _code_ to correctly daemonize a process is surprisingly complex,
-given the individual interfaces' relative simplicity:
-
-* The daemon must start its own process group
-
-* The daemon must detach from its controlling terminal
-
-* The daemon should close (and may reopen) file handles inherited from its
- parent process (generally, a shell)
-
-* The daemon should ensure its working directory is predictable and
- controllable
-
-* The daemon should ensure its umask is predictable and controllable
-
-* If the daemon uses privileged resources (such as low-numbered ports), it
- should carefully manage its real, effective, and saved UIDs and GIDs
-
-* Daemons must ensure that all of the above steps happen in signal-safe ways,
- so that a daemon can be shut down sanely even if it's still starting up
-
-See [this list](http://www.freedesktop.org/software/systemd/man/daemon.html)
-for a longer version. It's worse than you think.
-
-All of this gets even more complicated if the daemon has its own child
-processes, a pattern common to network services. Naturally, a lot of daemons
-in the real world get some of these steps wrong.
-
-## The Future
-
-[Supervisord](http://supervisord.org),
-[Foreman](http://ddollar.github.io/foreman/),
-[Upstart](http://upstart.ubuntu.com),
-[Launchd](https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man1/launchctl.1.html),
-[systemd](http://www.freedesktop.org/wiki/Software/systemd/), and [daemontools](http://cr.yp.to/daemontools.html) all
-encourage services _not_ to self-daemonize by providing a sane system for
-starting the daemon with the right parent process and the right environment
-in the first place.
-
-This is a great application of
-[DRY](http://c2.com/cgi/wiki?DontRepeatYourself), as the daemon management
-code only needs to be written once (in the daemon-managing daemon) rather
-than many times over (in each individual daemon). It also makes daemon
-execution more predictable, since daemons “in production” behave more like
-they do when run attached to a developer's console during debugging or
-development.
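
The PID-file weakness described under "Process Tracking" can be sketched in a few lines of Python. This is an illustration, not any particular daemon's code: the helper names and the file name are made up, and the liveness probe is the common `kill(pid, 0)` idiom on Unix-like systems. The check can only say that some process currently holds the recorded PID, not that it is the daemon that wrote the file.

```python
import os
import tempfile


def pid_is_alive(pid):
    """Probe a PID with signal 0: nothing is delivered, only error checks run."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_looks_valid(path):
    """Naive validity check of the kind many init scripts perform."""
    try:
        with open(path) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return False
    return pid_is_alive(pid)


if __name__ == "__main__":
    # Simulate a stale PID file whose number has been reused: this script's
    # own PID stands in for "some later, unrelated program".
    with tempfile.TemporaryDirectory() as tmp:
        stale = os.path.join(tmp, "hypothetical-daemon.pid")
        with open(stale, "w") as handle:
            handle.write(f"{os.getpid()}\n")
        print(pidfile_looks_valid(stale))
```

The check passes even though the process behind the PID has nothing to do with the daemon. A supervisor that is the service's actual parent avoids the guesswork, because it learns about the child's exit directly.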