pilcrow - Run-It-Yourself web chat, maybe

	Commit message	Author	Age
*	Define a generic "Failed" case for app-level errors (and a few others).	Owen Jacobson	2025-11-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were previously exporting root causes from one layer of abstraction to the next. For example, anything that called into the database could cause an `sqlx::Error`, so anything that transitively called into that logic exported a `Database(sqlx::Error)` error variant of its own, using `From` to map errors from inner type to outer type.
This had a couple of side effects. First, it required each layer of error handling to carry with it a `From` implementation unwrapping and rewrapping root causes from the next layer down. This was particularly apparent in the event and boot endpoints, which had separate error cases unique to crypto key processing errors solely because they happened to involve handling events that contained those keys. There were others, including the pervasive `Database(sqlx::Error)` error variants.
Separately, none of the error variants introduced for this purpose were being used for anything other than printing to stderr. All the complexity of `From` impls and all the structure of the error types was being thrown away at top-level error handlers.
This change replaces most of those error types with a generic `Failed` error. A `Failed` carries with it two pieces of information: a (boxed) underlying error, of any boxable `Error` type, and text meant to explain the context and cause of an error. Code which acts on errors can treat `Failed` as a catch-all case, while individually handling errors that signify important cases. Errors can be moved into or out of the `Failed` case by refactoring, as needed.
The design of `Failed` is heavily motivated by [anyhow's `context` system][context] as a way for the programmer to capture immediate intention as an explanation for some underlying error. However, instead of accepting the full breadth of types that implement `Display`, a `Failed` can only carry strings as explanation. We don't need the generality at this time, and the implementation underlying it is pretty complex for what it does.
[context]: https://docs.rs/anyhow/latest/anyhow/struct.Error.html#method.context
This change also means that the full source chain for an error is now available to top-level error handlers, allowing more complete error messages. For example, starting `pilcrow` with an invalid network listen address produces
    Failed to bind to www.google.com:64209
    Caused by: Can't assign requested address (os error 49)
instead of the previous
    Error: Io(Os { code: 49, kind: AddrNotAvailable, message: "Can't assign requested address" })
which previously captured the same _cause_, but without the formatting (see previous commit) and without the _context_ (this commit). Similar improvements are available for many of the error scenarios Pilcrow is designed to give up on.
When deciding which errors to use `Failed` with, I've used the heuristic that if something can fail for more than one underlying reason, and if the caller will only ever need to be able to differentiate those reasons after substantial refactoring anyways, then the reasons should collapse into `Failed`. If there's either only a single underlying failure reason possible, or only errors arising out of the function body possible, then I've left error handling alone.
In the process I've refactored most request-handler-level error mappings to explicitly map `Failed` to `Internal`, rather than having a catch-all mapping for all unhandled errors, to make it easier to remember to add request-level error representations when adding app-level error cases.
This also includes helper traits for `Error` and `Result`, to make constructing `Failed` (and errors that include `Failed` as an alternative) easier to do, and some constants for the recurring error messages related to transaction demarcation. I'm not completely happy with the repetitive nature of those error cases, but this is the best I've arrived at so far.
As errors are no longer expected to be convertible up the call stack, the `NotFound` and `Duplicate` helper traits for database errors had to change a bit. Those previously assumed that they would be used in the context of an error type implementing `From<sqlx::Error>` (or from another error type with similar characteristics), and that's not the case any more. The resulting idiom for converting a missing value into a domain error is `foo.await.optional().fail(MESSAGE)?.ok_or(DOMAIN ERROR)?`, which is rather clunky, but I've opted not to go further with it. The `Duplicate` helper is just plain gone, as it's not easily generalizable in this structure and using `match` is more tractable for me.
Finally, I've changed the convention for error messages from `all lowercase messages in whatever tense i feel like at the moment` to `Sentence-case messages in the past tense`, frequently starting with `Failed to` and a short summary of the task at hand. This, as above, makes error message capitalization between Pilcrow's own messages and messages coming from other libraries/the Rust stdlib much more coherent and less jarring to read.
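For readers who want a concrete picture of the `Failed` idea described in the commit above, here is a minimal sketch using only the standard library. It is not Pilcrow's implementation: the struct layout, the field names, the `ResultExt` trait and the `report` function are assumptions made for illustration (the commit only names `Failed` and a `.fail(MESSAGE)` helper), and the real type may well be an enum case rather than a struct.

```rust
use std::error::Error;
use std::fmt;

/// Illustrative catch-all error: a string explaining the context, plus the
/// boxed underlying cause. Field names are hypothetical.
#[derive(Debug)]
pub struct Failed {
    context: String,
    source: Box<dyn Error>,
}

impl fmt::Display for Failed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Only the context is printed here; the cause is reachable via `source()`.
        f.write_str(&self.context)
    }
}

impl Error for Failed {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Hypothetical helper trait so any `Result` with a boxable error can be
/// converted with `.fail("...")`.
pub trait ResultExt<T> {
    fn fail(self, context: &str) -> Result<T, Failed>;
}

impl<T, E> ResultExt<T> for Result<T, E>
where
    E: Error + 'static,
{
    fn fail(self, context: &str) -> Result<T, Failed> {
        self.map_err(|source| Failed {
            context: context.to_owned(),
            source: Box::new(source),
        })
    }
}

/// Hypothetical top-level reporter: walks the source chain and renders each
/// cause on its own line.
pub fn report(error: &dyn Error) -> String {
    let mut out = error.to_string();
    let mut cause = error.source();
    while let Some(inner) = cause {
        out.push_str("\nCaused by: ");
        out.push_str(&inner.to_string());
        cause = inner.source();
    }
    out
}
```

With this sketch, `"not-a-port".parse::<u16>().fail("Failed to parse port")` yields a `Failed` whose `Display` output is just the context, while `report` also includes the parse error underneath it.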
*	Convert `Invites` into a freestanding component.	Owen Jacobson	2025-10-28
\|
*	Split `user` into a chat-facing entity and an authentication-facing entity.	Owen Jacobson	2025-08-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The taxonomy is now as follows:
* A _login_ is someone's identity for the purposes of authenticating to the service. Logins are not synchronized, and in fact are not published anywhere in the current API. They have a login ID, a name and a password.
* A _user_ is someone's identity for the purpose of participating in conversations. Users _are_ synchronized, as before. They have a user ID, a name, and a creation instant for the purposes of synchronization.
In practice, a user exists for every login - in fact, users' names are stored in the login table and are joined in, rather than being stored redundantly in the user table. A login ID and its corresponding user ID are always equal, and the user and login ID types support conversion and comparison to facilitate their use in this context.
Tokens are now associated with logins, not users. The currently-acting identity is passed down into app types as a login, not a user, and then resolved to a user where appropriate within the app methods.
As a side effect, the `GET /api/boot` method now returns a `login` key instead of a `user` key. The structure of the nested value is unchanged.
*	Generate tokens in memory and then store them.	Owen Jacobson	2025-08-26
\| \| \| \|	This is the leading edge of a larger storage refactoring, where repo types stop doing things like generating secrets or deciding whether to carry out an operation. To make this work, there is now a `Token` type that holds the complete state of a token, in memory.
*	Split the `user` table into an authentication portion and a chat portion.	Owen Jacobson	2025-08-26
\| \| \| \|	We'll be building separate entities around this in future commits, to better separate the authentication data (non-synchronized and indeed "not public") from the chat data (synchronized and public).
*	Factor out common authentication test verification steps into helpers.	Owen Jacobson	2025-08-26
\| \| \| \|	These checks tended to be wordy, and were prone to being done subtly differently in different locations for no good reason. Centralizing them cleans this up and makes the tests easier to follow, at the expense of making it somewhat harder to follow what the test is specifically checking.
*	Return an identity, rather than the parts of an identity, when validating an identity token.	Owen Jacobson	2025-08-25
\| \| \| \| \| \|	This is a small refactoring that's been possible for a while, and we only just noticed.
*	Remove the now-unused return value from the final stage of user creation.	Owen Jacobson	2025-08-24
\|
*	Stop returning an HTTP body from `POST /api/invite/:id`.	Owen Jacobson	2025-08-24
\| \| \| \|	As with the previous commits, the body was never actually being used.
*	Stop returning body data from `POST /api/auth/login`.	Owen Jacobson	2025-08-24
\| \| \| \|	As with `/api/setup`, the response was an ad-hoc choice, which we are not using and which constrains future development just by existing.
*	Hoist `password` out to the top level.	Owen Jacobson	2025-08-24
\| \| \| \|	Having this buried under `crate::user` makes it hard to split up the roles `user` fulfils right now. Moving it out to its own module makes it a bit tidier to reuse it in a separate, authentication-only way.
*	Rust 1.89: Add elided lifetime parameters (`'_`) where appropriate.	Owen Jacobson	2025-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rust 1.89 added a new warning:
    warning: hiding a lifetime that's elided elsewhere is confusing
     --> src/setup/repo.rs:4:14
      |
    4 |     fn setup(&mut self) -> Setup;
      |              ^^^^^^^^^     ----- the same lifetime is hidden here
      |              |
      |              the lifetime is elided here
      |
      = help: the same lifetime is referred to in inconsistent ways, making the signature confusing
    help: use `'_` for type paths
      |
    4 |     fn setup(&mut self) -> Setup<'_>;
      |                                 ++++
I don't entirely agree with the style advice here, but lifetime elision style is an evolving area in Rust and I'd rather track the Rust team's recommendations than invent my own, so I've added all of them.
*	Define ID types as specializations, rather than newtypes.	Owen Jacobson	2025-07-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is based heavily on the work done for normalized strings, in `crate::normalize`. The key realization in that module is that the logic distinguishing one kind of thing (normalized strings in that case, IDs in this case) can be packaged up as a type token, and that doing so may reduce the overall complexity. This implementation for ID also borrows heavily from the implementation for normalized strings.
It's less flexible: an ID implemented this way can't expose _less_ of `crate::id::ID`'s interface, whereas newtype wrappers can, for example. However, our code doesn't use that flexibility on purpose anywhere and we're relatively unlikely to change that. In return, the individual ID types require substantially less code - they do not, for example, need to re-implement `Display` for themselves.
I very nearly made the trait `Prefix`:
```rust
pub trait Prefix {
    const PREFIX: &'static str;
}
```
however, I think having an effectively-constant method is less surprising overall.
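The "type token" approach from the commit above might look roughly like the following sketch. Everything here is an assumption for illustration: the generic `Id<T>` type, the `Prefix` trait with a method instead of a constant, the `UserKind` and `MessageKind` tokens, and the prefix strings themselves do not come from the commit and are not Pilcrow's actual `crate::id` code.

```rust
use std::fmt;
use std::marker::PhantomData;

/// Hypothetical token trait: each kind of ID supplies its prefix through an
/// effectively-constant method.
pub trait Prefix {
    fn prefix() -> &'static str;
}

/// One generic ID type; `T` is a zero-sized token that only distinguishes
/// one kind of ID from another at compile time.
pub struct Id<T> {
    value: String,
    kind: PhantomData<T>,
}

impl<T: Prefix> Id<T> {
    /// Builds an ID from a caller-supplied suffix (a real implementation
    /// would presumably generate this part).
    pub fn new(suffix: &str) -> Self {
        Id {
            value: format!("{}{}", T::prefix(), suffix),
            kind: PhantomData,
        }
    }
}

// `Display` is written once here and shared by every specialization.
impl<T> fmt::Display for Id<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.value)
    }
}

/// Illustrative tokens and aliases; the prefixes are made up.
pub struct UserKind;
impl Prefix for UserKind {
    fn prefix() -> &'static str {
        "U"
    }
}
pub type UserId = Id<UserKind>;

pub struct MessageKind;
impl Prefix for MessageKind {
    fn prefix() -> &'static str {
        "M"
    }
}
pub type MessageId = Id<MessageKind>;
```

Each new ID kind costs a token struct, a three-line `Prefix` impl and a type alias. `UserId` and `MessageId` remain distinct types, so passing one where the other is expected is still a compile error, but neither can hide part of `Id`'s interface the way a newtype could.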
*	Handlers are _named operations_, which can be exposed via routes.	Owen Jacobson	2025-06-18
\| \| \| \| \| \|	Each domain module that exposes handlers does so through a `handlers` child module, ideally as a top-level symbol that can be plugged directly into Axum's `MethodRouter`. Modules could make exceptions to this - kill the doctrinaire inside yourself, after all - but none of the API modules that actually exist need such exceptions, and consistency is useful. The related details of request types, URL types, response types, errors, &c &c are then organized into modules under `handlers`, along with their respective tests.
*	Reorganize and consolidate HTTP routes.	Owen Jacobson	2025-06-18
\| \| \| \| \| \| \| \|	HTTP routes are now defined in a single, unified module, pulling them out of the topical modules they were formerly part of. This is intended to improve the navigability of the codebase. Previously, finding the handler corresponding to a specific endpoint required prior familiarity, though in practice you could usually guess from topic area. Now, all routes are defined in `crate::routes`. Other than changing visibility, I've avoided making changes to the handlers at the ends of those routes.
*	Remove a bunch of clippy suppressions.	Owen Jacobson	2025-05-21
\| \| \| \|	Notably, one of them was hiding a real (if unreachable) bug, by converting a "the token you have presented is not valid" scenario into an internal server error.
*	Rename `login` to `user` throughout the server	Owen Jacobson	2025-03-23
\|
*	Rename the `login` module to `user`.	Owen Jacobson	2025-03-23
\|
*	Rename `user` to `login` at the database.	Owen Jacobson	2025-03-23
\|
*	Upgrade to Rust 1.85 and Rust 2024 edition.	Owen Jacobson	2025-02-20
\| \| \| \| \| \| \| \|	There are a couple of migration suggestions from `cargo fix --edition` that I have deliberately skipped, which are intended to make sure that the changes to `if let` scoping don't bite us. They don't, I'm pretty sure, and if I turn out to be wrong, I'd rather fix the scoping issues (as they arise) than use `match` (`cargo fix --edition`'s suggestion). This change also includes a bulk reformat and a clippy cleanup. NOTA BENE: As this requires a new Rust toolchain, you'll need to update Rust (`rustup update`, normally) or the server won't build. This also applies to the Debian builder Docker image; it'll need to be rebuilt (from scratch, pulling its base image again) as well.
*	Upgrade Axum to 0.8.1.	Owen Jacobson	2025-02-19
\|
*	Create a dedicated workflow type for creating logins.	Owen Jacobson	2024-10-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nasty design corner. Logins need to be created in three places:
1. In tests, using app.logins().create(…);
2. On initial setup, using app.setup().initial(…); and
3. When accepting invites, using app.invites().accept(…).
These three places do the same thing with respect to logins, but also do a varying mix of other things. Testing is the simplest and _only_ creates a login. Initial setup and invite acceptance both issue a token for the newly-created login. Accepting an invite also invalidates the invite.
Previously, those three functions have been copy-pasted variations on a theme. Now that we have validation, the copy-paste approach is no longer tenable; it will become increasingly hard to ensure that the three functions (plus any future functions) remain in synch.
To accommodate the variations while consolidating login creation, I've added a typestate-based state machine, which is driven by method calls:
* A creation attempt begins with `let create = Create::begin()`. This always succeeds; it packages up arguments used in later steps, but does nothing else.
* A creation attempt can be validated using `let validated = create.validate()?`. This may fail. Input validation and password hashing are carried out at this stage, making it potentially expensive.
* A validated attempt can be stored in the DB, using `let stored = validated.store(&mut tx).await?`. This may fail. The login will be written to the DB; the caller is responsible for transaction demarcation, to allow other things to take place in the same transaction.
* A fully-stored attempt can be used to publish events, using `let login = stored.publish(self.events)`. This always succeeds, and unwraps the state machine to its final product (a `login::History`).
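The state machine described in the commit above can be sketched as follows. This is a simplified, synchronous stand-in rather than the real code: a `Vec` of rows plays the role of the transaction, another `Vec` plays the role of the event broadcaster, the "hash" is a placeholder string, and `Login`, `CreateError` and the argument lists are hypothetical. Only the step names `begin`, `validate`, `store` and `publish` come from the commit.

```rust
/// Final product of the workflow (the commit says the real one is a
/// `login::History`; this struct is a stand-in).
#[derive(Debug, Clone, PartialEq)]
pub struct Login {
    pub id: usize,
    pub name: String,
}

/// Hypothetical error type for the fallible steps.
#[derive(Debug, PartialEq)]
pub enum CreateError {
    InvalidName,
    Duplicate,
}

/// State 1: arguments packaged up, nothing checked yet.
pub struct Create {
    name: String,
    password: String,
}

/// State 2: input has been validated and the password processed.
pub struct Validated {
    name: String,
    password_hash: String,
}

/// State 3: the login has been written to storage.
pub struct Stored {
    login: Login,
}

impl Create {
    /// Always succeeds; only captures the arguments.
    pub fn begin(name: &str, password: &str) -> Self {
        Create { name: name.to_owned(), password: password.to_owned() }
    }

    /// May fail. Consumes `self`, so an unvalidated attempt cannot be stored.
    pub fn validate(self) -> Result<Validated, CreateError> {
        if self.name.trim().is_empty() {
            return Err(CreateError::InvalidName);
        }
        // Placeholder only: NOT a real password hash.
        let password_hash = format!("placeholder-hash-of-{}-bytes", self.password.len());
        Ok(Validated { name: self.name, password_hash })
    }
}

impl Validated {
    /// May fail. `tx` stands in for a database transaction owned by the caller.
    pub fn store(self, tx: &mut Vec<(String, String)>) -> Result<Stored, CreateError> {
        if tx.iter().any(|(name, _)| *name == self.name) {
            return Err(CreateError::Duplicate);
        }
        tx.push((self.name.clone(), self.password_hash));
        Ok(Stored { login: Login { id: tx.len(), name: self.name } })
    }
}

impl Stored {
    /// Always succeeds; publishes an event and unwraps the final product.
    pub fn publish(self, events: &mut Vec<Login>) -> Login {
        events.push(self.login.clone());
        self.login
    }
}
```

Because each step consumes the previous state by value, calling the steps out of order (for example, `store` on a `Create`) does not compile. A caller such as invite acceptance can do its own extra work between `store` and `publish` while sharing the same transaction.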
*	Restrict login names.	Owen Jacobson	2024-10-29
\| \| \| \| \| \| \| \|	There's no good reason to use an empty string as your login name, or to use one so long as to annoy others. Names beginning or ending with whitespace, or containing runs of whitespace, are also a technical problem, so they're also prohibited. This change does not implement [UTS #39], as I haven't yet fully understood how to do so. [UTS #39]: https://www.unicode.org/reports/tr39/
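As a rough illustration of the rules listed in the commit above, the check could be written along these lines. The function name and the 64-character limit are assumptions (the commit does not state the actual maximum length), and this sketch does not attempt the UTS #39 checks the commit explicitly leaves out.

```rust
/// Hypothetical upper bound; the commit does not say what the real limit is.
const MAX_NAME_CHARS: usize = 64;

/// Returns true if `name` is non-empty, not too long, has no leading or
/// trailing whitespace, and contains no run of consecutive whitespace.
pub fn is_valid_name(name: &str) -> bool {
    let length = name.chars().count();
    if length == 0 || length > MAX_NAME_CHARS {
        return false;
    }
    if name.starts_with(char::is_whitespace) || name.ends_with(char::is_whitespace) {
        return false;
    }
    // Reject two or more whitespace characters in a row.
    let mut previous_was_space = false;
    for c in name.chars() {
        let is_space = c.is_whitespace();
        if is_space && previous_was_space {
            return false;
        }
        previous_was_space = is_space;
    }
    true
}
```

Length is counted in `char`s here for simplicity. Counting grapheme clusters would be closer to what a person perceives as the length of a name, but that needs a crate outside the standard library.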
*	Invite accept error is Error	Owen Jacobson	2024-10-26
\|
*	Tests for channel, invite, setup, and message deletion events.	Owen Jacobson	2024-10-24
\| \| \| \|	This also found a bug! No live event was being emitted during invite accept. The only way to find out about invites was to reconnect.
*	Tests for accepting invites	Owen Jacobson	2024-10-24
\|
*	Tests for retrieving invites	Owen Jacobson	2024-10-24
\|
*	Remove tabs in Rust files.	Owen Jacobson	2024-10-22
\|
*	Sort out the naming of the various parts of an identity.	Owen Jacobson	2024-10-22
\| \| \| \| \| \| \| \| \|	* A `cookie::Identity` (`IdentityCookie`) is a specialized CookieJar for working with identities.
* An `Identity` is a token/login pair.
I hope for this to be a bit more legible. In service of this, `Login` is no longer extractable. You have to get an identity.
*	Canonicalize login and channel names.	Owen Jacobson	2024-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Canonicalization does two things:
* It prevents duplicate names that differ only by case or only by normalization/encoding sequence; and
* It makes certain name-based comparisons "case-insensitive" (generalizing via Unicode's case-folding rules).
This change is complicated, as it means that every name now needs to be stored in two forms. Unfortunately, this is _very likely_ a breaking schema change. The migrations in this commit perform a best-effort attempt to canonicalize existing channel or login names, but it's likely any existing channels or logins with non-ASCII characters will not be canonicalized correctly. Since clients look at all channel names and all login names on boot, and since the code in this commit verifies canonicalization when reading from the database, this will effectively make the server unusable until any incorrectly-canonicalized values are either manually canonicalized, or removed.
It might be possible to do better with [the `icu` sqlite3 extension][icu], but (a) I'm not convinced of that and (b) this commit is already huge; adding database extension support would make it far larger.
[icu]: https://sqlite.org/src/dir/ext/icu
For some references on why it's worth storing usernames this way, see <https://www.b-list.org/weblog/2018/nov/26/case/> and the referenced talk, as well as <https://www.b-list.org/weblog/2018/feb/11/usernames/>. Bennett's treatment of this issue is, to my eye, much more readable than the referenced Unicode technical reports, and I'm inclined to trust his opinion given that he maintains a widely-used, internet-facing user registration library for Django.
*	Unicode normalization on input.	Owen Jacobson	2024-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This normalizes the following values:
* login names
* passwords
* channel names
* message bodies, because why not
The goal here is to have a canonical representation of these values, so that, for example, the service does not inadvertently host two channels whose names are semantically identical but differ in the specifics of how diacritics are encoded, or two users whose names are identical.
Normalization is done on input from the wire, using Serde hooks, and when reading from the database. The `crate::nfc::String` type implements these normalizations (as well as normalizing whenever converted from a `std::string::String` generally).
This change does not cover:
* Trying to cope with passwords that were created as non-normalized strings, which are now non-verifiable as all the paths to verify passwords normalize the input.
* Trying to ensure that non-normalized data in the database compares reasonably to normalized data. Fortunately, we don't _do_ very many string comparisons (I think only login names), so this isn't a huge deal at this stage. Login names will probably have to Get Fixed later on, when we figure out how to handle case folding for login name verification.
*	Make the responses for various data creation requests more consistent.	Owen Jacobson	2024-10-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In general:
* If the client can only assume the response is immediately valid (mostly, login creation, where the client cannot monitor the event stream), then 200 Okay, with data describing the server's view of the request.
* If the client can monitor for completion by watching the event stream, then 202 Accepted, with data describing the server's view of the request.
This comes on the heels of a comment I made on Discord:
> hrm
>
> creating a login: 204 No Content, no body
> sending a message: 202 Accepted, no body
> creating a channel: 200 Okay, has a body
>
> past me, what were you on
There wasn't any principled reason for this inconsistency; it happened as the endpoints were written at different times and with different states of mind.
*	Get loaded data using `export let data`, instead of fishing around in $page.	Owen Jacobson	2024-10-17
\| \| \| \| \| \|	This is mostly a how-to-Svelte thing. I've also made the API responses for invites a bit more caller-friendly by flattening them and adding the ID field into them. The ID is redundant (the client knows it because the client has the invitation URL), but it makes presenting invitations and actioning them a bit easier.
*	Organizational pass on endpoints and routes.	Owen Jacobson	2024-10-16
\|
*	Return a distinct error when an invite username is in use.	Owen Jacobson	2024-10-11
\| \| \| \|	I've also aligned channel creation with this (it's 409 Conflict). To make server setup more distinct, it now returns 503 Service Unavailable if setup has not been completed.
*	Create APIs for inviting users.	Owen Jacobson	2024-10-11