pilcrow - Run-It-Yourself web chat, maybe

	Commit message (Collapse)	Author	Age
*	Define a generic "Failed" case for app-level errors (and a few others).	Owen Jacobson	2025-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were previously exporting root causes from one layer of abstraction to the next. For example, anything that called into the database could cause an `sqlx::Error`, so anything that transitively called into that logic exported a `Database(sqlx::Error)` error variant of its own, using `From` to map errors from inner type to outer type. This had a couple of side effects. First, it required each layer of error handling to carry with it a `From` implementation unwrapping and rewrapping root causes from the next layer down. This was particularly apparent in the event and boot endpoints, which had separate error cases unique to crypto key processing errors solely because they happened to involve handling events that contained those keys. There were others, including the pervasive `Database(sqlx::Error)` error variants. Separately, none of the error variants introduced for this purpose were being used for anything other than printing to stderr. All the complexity of From impls and all the structure of the error types was being thrown away at top-level error handlers. This change replaces most of those error types with a generic `Failed` error. A `Failed` carries with it two pieces of information: a (boxed) underlying error, of any boxable `Error` type, and text meant to explain the context and cause of an error. Code which acts on errors can treat `Failed` as a catch-all case, while individually handling errors that signify important cases. Errors can be moved into or out of the `Failed` case by refactoring, as needed. The design of `Failed` is heavily motivated by [anyhow's `context` system][context] as a way for the programmer to capture immediate intention as an explanation for some underlying error. However, instead of accepting the full breadth of types that implement `Display`, a `Failed` can only carry strings as explanation. We don't need the generality at this time, and the implementation underlying it is pretty complex for what it does. [context]: https://docs.rs/anyhow/latest/anyhow/struct.Error.html#method.context This change also means that the full source chain for an error is now available to top-level error handlers, allowing more complete error messages. For example, starting `pilcrow` with an invalid network listen address produces Failed to bind to www.google.com:64209 Caused by: Can't assign requested address (os error 49) instead of the previous Error: Io(Os { code: 49, kind: AddrNotAvailable, message: "Can't assign requested address" }) which previously captured the same _cause_, but without the formatting (see previous commit) and without the _context_ (this commit). Similar improvements are available for many of the error scenarios Pilcrow is designed to give up on. When deciding which errors to use `Failed` with, I've used the heuristic that if something can fail for more than one underlying reason, and if the caller will only ever need to be able to differentiate those reasons after substantial refactoring anyways, then the reasons should collase into `Failed`. If there's either only a single underlying failure reason possible, or only errors arising out of the function body possible, then I've left error handling alone. In the process I've refactored most request-handler-level error mappings to explicitly map `Failed` to `Internal`, rather than having a catch-all mapping for all unhandled errors, to make it easier to remember to add request-level error representations when adding app-level error cases. This also includes helper traits for `Error` and `Result`, to make constructing `Failed` (and errors that include `Failed` as an alternative) easier to do, and some constants for the recurring error messages related to transaction demarcation. I'm not completely happy with the repetitive nature of those error cases, but this is the best I've arrived at so far. As errors are no longer expected to be convertible up the call stack, the `NotFound` and `Duplicate` helper traits for database errors had to change a bit. Those previously assumed that they would be used in the context of an error type implementing `From<sqlx::Error>` (or from another error type with similar characteristics), and that's not the case any more. The resulting idiom for converting a missing value into a domain error is `foo.await.optional().fail(MESSAGE)?.ok_or(DOMAIN ERROR)?`, which is rather clunky, but I've opted not to go further with it. The `Duplicate` helper is just plain gone, as it's not easily generalizable in this structure and using `match` is more tractable for me. Finally, I've changed the convention for error messages from `all lowercase messages in whatever tense i feel like at the moment` to `Sentence-case messages in the past tense`, frequently starting with `Failed to` and a short summary of the task at hand. This, as above, makes error message capitalization between Pilcrow's own messages and messages coming from other libraries/the Rust stdlib much more coherent and less jarring to read.
*	De minimis "send me a notification" implementation.	Owen Jacobson	2025-11-08
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a user clicks "send a test notification," Pilcrow delivers a push message (with a fixed payload) to all active subscriptions. The included client then displays this as a notification, using browser APIs to do so. This lets us verify that push notification works, end to end - and it appears to. The API endpoint for sending a test notification is not documented. I didn't feel it prudent to extensively document an endpoint that is intended to be temporary and whose side effects are very much subject to change. However, for posterity, the endpoint is POST /api/push/ping {} and the push message payload is ping Subscriptions with permanent delivery failures are nuked when we encounter them. Subscriptions with temporary failures cause the `ping` endpoint to return an internal server error, and are not retried. We'll likely want retry logic - including retry logic to handle server restarts - for any more serious use, but for a smoke test, giving up immediately is fine. To make the push implementation testable, `App` is now generic over it. Tests use a dummy implementation that stores sent messages in memory. This has some significant limitations, documented in the test suite, but it beats sending real notifications to nowhere in tests.
*	Use PKCS8 PEM, not raw SEC1 bytes, to store VAPID keys.	Owen Jacobson	2025-10-28
\| \| \| \| \| \|	The `web-push` crate's VAPID signing support requires a private key. The `p256` crate is more than capable of generating one, but the easiest way to get a key from a `p256::ecdsa::SigningKey` to a `web_push::PartialVapidSignature` is via PKCS #8 PEM, not via the bytes. Since we'll need it in that form anyways, store it that way, so that we don't have to decode it using `p256`, re-encode to PEM, then decode to `PartialVapidSignature`. The migration in this commit invalidates existing VAPID keys. We could include support for re-encoding them on read, but there's little point: this code is still in flux anyways, and only development deployments exist. By the time this is final, the schema will have settled.
*	Merge remote-tracking branch 'codeberg/main' into push-notify	Owen Jacobson	2025-10-28
\|\
\| *	Convert the `Events` app component into a freestanding struct.	Owen Jacobson	2025-10-28
\| \| \| \| \| \| \| \|	This one doesn't need a FromRef impl at this time, as it's only ever used in a handler that also uses other components and so will need to continue receiving `App`. However, there's little reason not to make the implementatino of the `Events` struct consistent.
* \|	Generate, store, and deliver a VAPID key.	Owen Jacobson	2025-08-30
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	VAPID is used to authenticate applications to push brokers, as part of the [Web Push] specification. It's notionally optional, but we believe [Apple requires it][apple], and in any case making it impossible to use subscription URLs without the corresponding private key available, and thus harder to impersonate the server, seems like a good security practice regardless. [Web Push]: https://developer.mozilla.org/en-US/docs/Web/API/Push_API [apple]: https://developer.apple.com/documentation/usernotifications/sending-web-push-notifications-in-web-apps-and-browsers There are several implementations of VAPID for Rust: * [web_push](https://docs.rs/web-push/latest/web_push/) includes an implementation of VAPID but requires callers to provision their own keys. We will likely use this crate for Web Push fulfilment, but we cannot use it for key generation. * [vapid](https://docs.rs/vapid/latest/vapid/) includes an implementation of VAPID key generation. It delegates to `openssl` to handle cryptographic operations. * [p256](https://docs.rs/p256/latest/p256/) implements NIST P-256 in Rust. It's maintained by the RustCrypto team, though as of this writing it is largely written by a single contributor. It isn't specifically designed for use with VAPID. I opted to use p256 for this, as I believe the RustCrypto team are the most likely to produce a correct and secure implementation, and because openssl has consistently left a bad taste in my mouth for years. Because it's a general implementation of the algorithm, I expect that it will require more work for us to adapt it for use with VAPID specifically; I'm willing to chance it and we can swap it out for the vapid crate if it sucks. This has left me with one area of uncertainty: I'm not actually sure I'm using the right parts of p256. The choice of `ecdsa::SigningKey` over `p256::SecretKey` is based on [the MDN docs] using phrases like "This value is part of a signing key pair generated by your application server, and usable with elliptic curve digital signature (ECDSA), over the P-256 curve." and on [RFC 8292]'s "The 'k' parameter includes an ECDSA public key in uncompressed form that is encoded using base64url encoding. However, we won't be able to test my implementation until we implement some other key parts of Web Push, which are out of scope of this commit. [the MDN docs]: https://developer.mozilla.org/en-US/docs/Web/API/PushSubscription/options [RFC 8292]: https://datatracker.ietf.org/doc/html/rfc8292#section-3.2 Following the design used for storing logins and users, VAPID keys are split into a non-synchronized part (consisting of the private key), whose exposure would allow others to impersonate the Pilcrow server, and a synchronized part (consisting of event coordinates and, notionally, the public key), which is non-sensitive and can be safely shared with any user. However, the public key is derived from the stored private key, rather than being stored directly, to minimize redundancy in the stored data. Following the design used for expiring stale entities, the app checks for and creates, or rotates, its VAPID key using middleware that runs before most API requests. If, at that point, the key is either absent, or more than 30 days old, it is replaced. This imposes a small tax on API request latency, which is used to fund prompt and automatic key rotation without the need for an operator-facing key management interface. VAPID keys are delivered to clients via the event stream, as laid out in `docs/api/events.md`. There are a few reasons for this, but the big one is that changing the VAPID key would immediately invalidate push subscriptions: we throw away the private key, so we wouldn't be able to publish to them any longer. Clients must replace their push subscriptions in order to resume delivery, and doing so promptly when notified that the key has changed will minimize the gap. This design is intended to allow for manual key rotation. The key can be rotated "immedately" by emptying the `vapid_key` and `vapid_signing_key` tables (which destroys the rotated kye); the server will generate a new one before it is needed, and will notify clients that the key has been invalidated. This change includes client support for tracking the current VAPID key. The client doesn't _use_ this information anywhere, yet, but it has it.
*	Rename "channel" to "conversation" within the server.	Owen Jacobson	2025-07-03
\| \| \| \| \| \|	I've split this from the schema and API changes because, frankly, it's huge. Annoyingly so. There are no semantic changes in this, it's all symbol changes, but there are a _lot_ of them because the term "channel" leaks all over everything in a service whose primary role is managing messages sent to channels (now, conversations). I found a buggy test while working on this! It's not fixed in this commit, because it felt mean to hide a real change in the middle of this much chaff.
*	Rename `login` to `user` throughout the server	Owen Jacobson	2025-03-23
\|
*	Rename the `login` module to `user`.	Owen Jacobson	2025-03-23
\|
*	Upgrade to Rust 1.85 and Rust 2024 edition.	Owen Jacobson	2025-02-20
\| \| \| \| \| \| \| \|	There are a couple of migration suggestions from `cargo fix --edition` that I have deliberately skipped, which are intended to make sure that the changes to `if let` scoping don't bite us. They don't, I'm pretty sure, and if I turn out to be wrong, I'd rather fix the scoping issues (as they arise) than use `match` (`cargo fix --edition`'s suggestion). This change also includes a bulk reformat and a clippy cleanup. NOTA BENE: As this requires a new Rust toolchain, you'll need to update Rust (`rustup update`, normally) or the server won't build. This also applies to the Debian builder Docker image; it'll need to be rebuilt (from scratch, pulling its base image again) as well.
*	Resume points are no longer optional.	Owen Jacobson	2024-10-30
\| \| \| \|	This is an inconsequential change for actual clients, since "resume from the beginning" was never a preferred mode of operation, and it simplifies some internals. It should also mean we get better query plans where `coalesce(cond, true)` was previously being used.
*	Canonicalize login and channel names.	Owen Jacobson	2024-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Canonicalization does two things: * It prevents duplicate names that differ only by case or only by normalization/encoding sequence; and * It makes certain name-based comparisons "case-insensitive" (generalizing via Unicode's case-folding rules). This change is complicated, as it means that every name now needs to be stored in two forms. Unfortunately, this is _very likely_ a breaking schema change. The migrations in this commit perform a best-effort attempt to canonicalize existing channel or login names, but it's likely any existing channels or logins with non-ASCII characters will not be canonicalize correctly. Since clients look at all channel names and all login names on boot, and since the code in this commit verifies canonicalization when reading from the database, this will effectively make the server un-usuable until any incorrectly-canonicalized values are either manually canonicalized, or removed It might be possible to do better with [the `icu` sqlite3 extension][icu], but (a) I'm not convinced of that and (b) this commit is already huge; adding database extension support would make it far larger. [icu]: https://sqlite.org/src/dir/ext/icu For some references on why it's worth storing usernames this way, see <https://www.b-list.org/weblog/2018/nov/26/case/> and the refernced talk, as well as <https://www.b-list.org/weblog/2018/feb/11/usernames/>. Bennett's treatment of this issue is, to my eye, much more readable than the referenced Unicode technical reports, and I'm inclined to trust his opinion given that he maintains a widely-used, internet-facing user registration library for Django.
*	Provide a view of logins to clients.	Owen Jacobson	2024-10-09
\|
*	Separate `/api/boot` into its own module.	Owen Jacobson	2024-10-05
\|
*	List messages per channel.	Owen Jacobson	2024-10-03
\|
*	Add endpoints for deleting channels and messages.	Owen Jacobson	2024-10-03
\| \| \| \|	It is deliberate that the expire() functions do not use them. To avoid races, the transactions must be committed before events get sent, in both cases, which makes them structurally pretty different.
*	Represent channels and messages using a split "History" and "Snapshot" model.	Owen Jacobson	2024-10-03
\| \| \| \| \| \|	This separates the code that figures out what happened to an entity from the code that represents it to a user, and makes it easier to compute a snapshot at a point in time (for things like bootstrap). It also makes the internal logic a bit easier to follow, since it's easier to tell whether you're working with a point in time or with the whole recorded history. This hefty.
*	Package up common event fields as Instant	Owen Jacobson	2024-10-02
\|
*	Retire top-level `repo`.	Owen Jacobson	2024-10-02
\| \| \| \|	This helped me discover an organizational scheme I like more.
*	First pass on reorganizing the backend.	Owen Jacobson	2024-10-02
	This is primarily renames and repackagings.