pilcrow - Run-It-Yourself web chat, maybe

	Commit message (Collapse)	Author	Age
*	Store `Message` instances using their events.	Owen Jacobson	2025-08-26
\| \| \| \|	I found a test bug! The tests for deleting previously-deleted or previously-expired tests were using the wrong user to try to delete those messages. The tests happened to pass anyways because the message authorship check was done after the message lifecycle check. They would have no longer passed; the tests are fixed to use the sender, instead.
*	Split `user` into a chat-facing entity and an authentication-facing entity.	Owen Jacobson	2025-08-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The taxonomy is now as follows: * A _login_ is someone's identity for the purposes of authenticating to the service. Logins are not synchronized, and in fact are not published anywhere in the current API. They have a login ID, a name and a password. * A _user_ is someone's identity for the purpose of participating in conversations. Users _are_ synchronized, as before. They have a user ID, a name, and a creation instant for the purposes of synchronization. In practice, a user exists for every login - in fact, users' names are stored in the login table and are joined in, rather than being stored redundantly in the user table. A login ID and its corresponding user ID are always equal, and the user and login ID types support conversion and comparison to facilitate their use in this context. Tokens are now associated with logins, not users. The currently-acting identity is passed down into app types as a login, not a user, and then resolved to a user where appropriate within the app methods. As a side effect, the `GET /api/boot` method now returns a `login` key instead of a `user` key. The structure of the nested value is unchanged.
*	Collapse redundant "deleted_at" timestaps and "deleted" event instants.	Owen Jacobson	2025-08-24
\| \| \| \|	These were separated as there wasn't an obvious way to serialize two fields with the same _type_ with different _prefixes_. Turns out this is a common problem, and someone's written a crate for it that remaps the names for you.
*	Rust 1.89: Add elided lifetime parameters (`'_`) where appropriate.	Owen Jacobson	2025-08-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rust 1.89 added a new warning: warning: hiding a lifetime that's elided elsewhere is confusing --> src/setup/repo.rs:4:14 \| 4 \| fn setup(&mut self) -> Setup; \| ^^^^^^^^^ ----- the same lifetime is hidden here \| \| \| the lifetime is elided here \| = help: the same lifetime is referred to in inconsistent ways, making the signature confusing help: use `'_` for type paths \| 4 \| fn setup(&mut self) -> Setup<'_>; \| ++++ I don't entirely agree with the style advice here, but lifetime elision style is an evolving area in Rust and I'd rather track the Rust team's recommendations than invent my own, so I've added all of them.
*	Replace `channel` with `conversation` throughout the API.	Owen Jacobson	2025-07-03
\| \| \| \|	This is a breaking change for essentially all clients. Thankfully, there's presently just the one, so we don't need to go to much effort to accommoate that; the client is modified in this commit to adapt, users can reload their client, and life will go on.
*	Rename "channel" to "conversation" within the server.	Owen Jacobson	2025-07-03
\| \| \| \| \| \|	I've split this from the schema and API changes because, frankly, it's huge. Annoyingly so. There are no semantic changes in this, it's all symbol changes, but there are a _lot_ of them because the term "channel" leaks all over everything in a service whose primary role is managing messages sent to channels (now, conversations). I found a buggy test while working on this! It's not fixed in this commit, because it felt mean to hide a real change in the middle of this much chaff.
*	Rename "channel" to "conversation" in the database.	Owen Jacobson	2025-07-03
\| \| \| \| \| \|	I've - somewhat arbitrarily - started renaming column aliases, as well, though the corresponding Rust data model, API fields and nouns, and client code still references them as "channel" (or as derived terms). As with so many schema changes, this entails a complete rebuild of a substantial portion of the schema. sqlite3 still doesn't have very many `alter table` primitives, for renaming columns in particular.
*	Prevent sending messages to deleted channels.	Owen Jacobson	2025-07-03
\| \| \| \|	I've opted to make it clear in the error message which scenario - deleted vs. non-existant - a channel falls into. This isn't particularly consistent with the rest of the API, so we might need to review this decision later, but it's at least relatively harmless if it's mistaken. (Formally, they're both 404s, so clients that go by error code won't notice.)
*	Rename the `login` module to `user`.	Owen Jacobson	2025-03-23
\|
*	Upgrade to Rust 1.85 and Rust 2024 edition.	Owen Jacobson	2025-02-20
\| \| \| \| \| \| \| \|	There are a couple of migration suggestions from `cargo fix --edition` that I have deliberately skipped, which are intended to make sure that the changes to `if let` scoping don't bite us. They don't, I'm pretty sure, and if I turn out to be wrong, I'd rather fix the scoping issues (as they arise) than use `match` (`cargo fix --edition`'s suggestion). This change also includes a bulk reformat and a clippy cleanup. NOTA BENE: As this requires a new Rust toolchain, you'll need to update Rust (`rustup update`, normally) or the server won't build. This also applies to the Debian builder Docker image; it'll need to be rebuilt (from scratch, pulling its base image again) as well.
*	Track an index-friendly sequence range for both channels and messages.	Owen Jacobson	2024-10-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is meant to limit the amount of messages that event replay needs to examine. Previously, the query required a table scan; table scans on `message` can be quite large, and should really be avoided. The new schema allows replays to be carried out using an index scan. The premise of this change is that, for each (channel, message), there is a range of event sequence numbers that the (channel, message) may appear in. We'll notate that as `[start, end]` in the general case, but they are: * for active channels, `[ch.created_sequence, ch.created_sequence]`. * for deleted channels, `[ch.created_sequence, ch_del.deleted_sequence]`. * for active messages, `[mg.sent_sequence, mg.sent_sequence]`. * for deleted messages, `[mg.sent_seqeunce, mg_del.deleted_sequence]`. (The two "active" ranges may grow in future releases, to support things like channel renames and message editing. That won't change the logic, but those two things will need to update the new `last_sequence` field.) There are two families of operations that need to retrieve based on these ranges: * Boot creates a snapshot as of a specific `resume_at` sequence number, and thus must include any record whose `start` falls on or before `resume_at`. We can't exclude records whose `end` is also before it, as their terminal state may be one that is included in boot (eg. active channels). * Event replay needs to include any events that fall after the same `resume_at`, and thus must include events from any record whose `end` falls strictly after `resume_at`. We can't exclude records whose `start` is also strictly after `resume_at`, as we'd omit them from replay, inappropriately, if we did. This gives three interesting cases: 1. Record fully after `resume_at`: event sequence --increasing--> x-a … x … x+k … resume_at start end This record should be omitted by boot, but included for event replay. 2. Record fully before `resume_at`: event sequence --increasing--> x … x+k … x+a start end resume_at This record should be omitted for event replay, but included for boot. 3. Record overlapping `resume_at`: event sequence --increasing--> x … x+a … x+k start resume_at end This record needs to be included for both boot and event replay. However, the bounds of that range were previously stored in two tables (`channel` and `channel_deleted`, or `message` and `message_deleted`, respectively), which sqlite (indeed most SQL implementations) cannot index. This forced a table scan, leading to the program considering every possible (channel, message) during event replay. This commit adds a `last_sequence` field to channels and messages, which is set to the above values as channels and messages are operated on. This field is indexed, and queries can use it to rapidly identify relevant rows for event replay, cutting down the amount of reading needed to generate events on resume.
*	Resume points are no longer optional.	Owen Jacobson	2024-10-30
\| \| \| \|	This is an inconsequential change for actual clients, since "resume from the beginning" was never a preferred mode of operation, and it simplifies some internals. It should also mean we get better query plans where `coalesce(cond, true)` was previously being used.
*	Make sure (most) queries avoid table scans.	Owen Jacobson	2024-10-23
\| \| \| \| \| \| \| \| \| \| \| \|	I've exempted inserts (they never scan in the first place), queries on `event_sequence` (at most one row), and the coalesce()s used for event replay (for now; these are obviously a performance risk area and need addressing). Method: ``` find .sqlx -name 'query-*.json' -exec jq -r '"explain query plan " + .query + ";"' {} + > explain.sql ``` Then go query by query through the resulting file.
*	Unicode normalization on input.	Owen Jacobson	2024-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This normalizes the following values: * login names * passwords * channel names * message bodies, because why not The goal here is to have a canonical representation of these values, so that, for example, the service does not inadvertently host two channels whose names are semantically identical but differ in the specifics of how diacritics are encoded, or two users whose names are identical. Normalization is done on input from the wire, using Serde hooks, and when reading from the database. The `crate::nfc::String` type implements these normalizations (as well as normalizing whenever converted from a `std::string::String` generally). This change does not cover: * Trying to cope with passwords that were created as non-normalized strings, which are now non-verifiable as all the paths to verify passwords normalize the input. * Trying to ensure that non-normalized data in the database compares reasonably to normalized data. Fortunately, we don't _do_ very many string comparisons (I think only login names), so this isn't a huge deal at this stage. Login names will probably have to Get Fixed later on, when we figure out how to handle case folding for login name verification.
*	Switch to blanking tombstoned data with null, not empty string.	Owen Jacobson	2024-10-18
\| \| \| \| \| \| \|	This accomplishes two things: * It removes the need for an additional `channel_name_reservation` table, since `channel.name` now only contains non-null values for active channels, and * It nicely dovetails with the idea that `null` means an unknown value in SQL-land.
*	Retain deleted messages and channels temporarily, to preserve events for replay.	Owen Jacobson	2024-10-17
\| \| \| \| \| \| \| \| \| \| \| \|	Previously, when a channel (message) was deleted, `hi` would send events to all _connected_ clients to inform them of the deletion, then delete all memory of the channel (message). Any disconnected client, on reconnecting, would not receive the deletion event, and would de-synch with the service. The creation events were also immediately retconned out of the event stream, as well. With this change, `hi` keeps a record of deleted channels (messages). When replaying events, these records are used to replay the deletion event. After 7 days, the retained data is deleted, both to keep storage under control and to conform to users' expectations that deleted means gone. To match users' likely intuitions about what deletion does, deleting a channel (message) _does_ immediately delete some of its associated data. Channels' names are blanked, and messages' bodies are also blanked. When the event stream is replayed, the original channel.created (message.sent) event is "tombstoned", with an additional `deleted_at` field to inform clients. The included client does not use this field, at least yet. The migration is, once again, screamingingly complicated due to sqlite's limited ALTER TABLE … ALTER COLUMN support. This change also contains capabilities that would allow the API to return 410 Gone for deleted channels or messages, instead of 404. I did experiment with this, but it's tricky to do pervasively, especially since most app-level interfaces return an `Option<Channel>` or `Option<Message>`. Redesigning these to return either `Ok(Channel)` (`Ok(Message)`) or `Err(Error::NotFound)` or `Err(Error::Deleted)` is more work than I wanted to take on for this change, and the utility of 410 Gone responses is not obvious to me. We have other, more pressing API design warts to address.
*	Return a flat message list on boot, not nested lists by channel.	Owen Jacobson	2024-10-09
\| \| \| \|	This is a bit easier to compute, and sets us up nicely for pulling message boot out of the `/api/boot` response entirely.
*	Provide a view of logins to clients.	Owen Jacobson	2024-10-09
\|
*	Simplify channel IDs in events. Remove redundant ones.	Owen Jacobson	2024-10-09
\|
*	Separate `/api/boot` into its own module.	Owen Jacobson	2024-10-05
\|
*	Clean up naming and semantics of history accessors.	Owen Jacobson	2024-10-04
\|
*	List messages per channel.	Owen Jacobson	2024-10-03
\|
*	Add endpoints for deleting channels and messages.	Owen Jacobson	2024-10-03
\| \| \| \|	It is deliberate that the expire() functions do not use them. To avoid races, the transactions must be committed before events get sent, in both cases, which makes them structurally pretty different.
*	Represent channels and messages using a split "History" and "Snapshot" model.	Owen Jacobson	2024-10-03
	This separates the code that figures out what happened to an entity from the code that represents it to a user, and makes it easier to compute a snapshot at a point in time (for things like bootstrap). It also makes the internal logic a bit easier to follow, since it's easier to tell whether you're working with a point in time or with the whole recorded history. This hefty.