author Owen Jacobson <owen@grimoire.ca> 2024-09-24 19:51:01 -0400
committer Owen Jacobson <owen@grimoire.ca> 2024-09-25 00:44:30 -0400
commit 410f4d740cfc3d9b5a53ac237667ed03b7f19381
tree 948c3c7da7e871315080f2df331cebb37b226ae5 /docs/design.md
parent 8bb25062e5b804c27b58ae36f585f12b1c602487
Use a vector of sequence numbers, not timestamps, to restart /api/events streams.
The timestamp-based approach had some formal problems. In particular, it assumed that time always went forwards, which isn't necessarily the case:

* Alice calls `/api/channels/Cfoo` to send a message.
* The server assigns time T to the request.
* The server stalls somewhere in send() for a while, before storing and broadcasting the message. If it helps, imagine blocking on `tx.begin().await?` for a while.
* In this interval, Bob calls `/api/events?channel=Cfoo`, receives historical messages up to time U (after T), and disconnects.
* The server resumes Alice's request and finishes it.
* Bob reconnects, setting his Last-Event-Id header to timestamp U.

In this scenario, Bob never sees Alice's message unless he starts over. It wasn't in the original stream, since it wasn't broadcast while Bob was subscribed, and it's not in the new stream, since Bob's resume point is after the timestamp on Alice's message.

The new approach avoids this. Each message is assigned a _sequence number_ when it's stored. Bob can be sure that his stream included every event, since the resume point is identified by sequence number even if the server processes them out of chronological order:

* Alice calls `/api/channels/Cfoo` to send a message.
* The server assigns time T to the request.
* The server stalls somewhere in send() for a while, before storing and broadcasting.
* In this interval, Bob calls `/api/events?channel=Cfoo`, receives historical messages up to sequence Cfoo=N, and disconnects.
* The server resumes Alice's request, assigns her message sequence M (after N), and finishes it.
* Bob resumes his subscription at Cfoo=N.
* Bob receives Alice's message at Cfoo=M.

There's a natural mutual exclusion on sequence numbers, enforced by sqlite, which ensures that no two messages have the same sequence number. Since sqlite promises that transactions are serializable by default (and enforces this with a whole-DB write lock), we can be confident that sequence numbers are monotonic, as well.
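The two scenarios above can be replayed against a small in-memory model. This is only a sketch, not the code in this commit: `Store`, `Message`, `since_timestamp`, `since_sequences`, and `replay_scenario` are hypothetical names, the message table is a plain `Vec`, and sequence numbers are assumed to be per-channel (as the `Cfoo=N` notation suggests) with the resume point held as a map from channel to last-seen sequence.

```rust
use std::collections::HashMap;

/// Hypothetical message record. `sent_at` is assigned when the request
/// arrives; `sequence` is assigned only when the message is stored.
#[derive(Debug, Clone)]
struct Message {
    channel: String,
    sequence: u64,
    sent_at: u64,
    body: String,
}

/// Hypothetical in-memory stand-in for the message table.
#[derive(Default)]
struct Store {
    messages: Vec<Message>,
    next_sequence: HashMap<String, u64>,
}

impl Store {
    /// Stores a message and returns the sequence number assigned to it.
    fn store(&mut self, channel: &str, sent_at: u64, body: &str) -> u64 {
        let next = self.next_sequence.entry(channel.to_string()).or_insert(1);
        let sequence = *next;
        *next += 1;
        self.messages.push(Message {
            channel: channel.to_string(),
            sequence,
            sent_at,
            body: body.to_string(),
        });
        sequence
    }

    /// Old scheme: resume with every message timestamped after `after`.
    fn since_timestamp(&self, channel: &str, after: u64) -> Vec<&Message> {
        self.messages
            .iter()
            .filter(|m| m.channel == channel && m.sent_at > after)
            .collect()
    }

    /// New scheme: resume with every message whose sequence number is
    /// greater than the last one seen on that channel.
    fn since_sequences(&self, resume: &HashMap<String, u64>) -> Vec<&Message> {
        self.messages
            .iter()
            .filter(|m| match resume.get(&m.channel) {
                Some(&last) => m.sequence > last,
                None => false,
            })
            .collect()
    }
}

/// Replays the Alice/Bob interleaving. Returns the number of messages Bob
/// gets when resuming by timestamp, and the bodies he gets by sequence.
fn replay_scenario() -> (usize, Vec<String>) {
    let mut store = Store::default();
    store.store("Cfoo", 5, "earlier message"); // stored as Cfoo=1

    // Alice's request is assigned time T = 10, then stalls before storing.
    let alice_time = 10;

    // Bob reads history up to time U = 20 (sequence Cfoo=1) and disconnects.
    let bob_timestamp = 20;
    let bob_resume = HashMap::from([("Cfoo".to_string(), 1u64)]);

    // Alice's request resumes; her message is stored with time T, as Cfoo=2.
    store.store("Cfoo", alice_time, "hello from Alice");

    let by_timestamp = store.since_timestamp("Cfoo", bob_timestamp).len();
    let by_sequence = store
        .since_sequences(&bob_resume)
        .into_iter()
        .map(|m| m.body.clone())
        .collect();
    (by_timestamp, by_sequence)
}
```

Calling `replay_scenario()` from `main` or a test shows the difference: resuming by timestamp returns nothing, because T is before U, while resuming by sequence returns Alice's message, because her sequence number was assigned after Bob's resume point.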
This scenario is, to put it mildly, contrived and unlikely - which is what motivated me to fix it. These kinds of bugs are fiendishly hard to identify, let alone reproduce or understand.

I wonder how costly cloning a map is going to turn out to be…

A note on database migrations: sqlite3 really, truly has no `alter table … alter column` statement. The only way to modify an existing column is to add the column to a new table. If `alter column` existed, I would create the new `sequence` column in `message` in a much less roundabout way. Fortunately, these migrations assume that they are being run _offline_, so operations like "replace the whole table" are reasonable.