
libsql/core: Clean up sqlite3 on drop #293

Merged
merged 1 commit into from
Aug 17, 2023
Conversation

LucioFranco
Contributor

No description provided.

@LucioFranco LucioFranco added this pull request to the merge queue Aug 17, 2023
Merged via the queue into main with commit db893af Aug 17, 2023
@LucioFranco LucioFranco deleted the lucio/asan branch August 17, 2023 18:25
MarinPostma added a commit that referenced this pull request Oct 17, 2023
293: log compaction r=MarinPostma a=MarinPostma

# Log compaction and snapshotting

This PR enables log compaction and snapshotting.

## Motivation

The replication log follows the sqlite WAL and grows indefinitely. Fortunately, it contains a lot of duplicate data, so we can compress it by keeping only the most recent version of each page. This is what this PR does: whenever the replication log grows above some threshold, a new log is created and the old log is compacted. The swap is done atomically, so the compaction can happen in the background while we keep writing to the new log.

## Log compaction:

Log compaction is straightforward: we iterate backwards through the replication log and write the frames to the snapshot file. We keep track of which pages we have already seen, and ignore older versions of them. When the snapshot is finished, we remove the old log file.

Notice that the frames are in reverse order in the snapshot: starting with the most recent and ending with the oldest.
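The backward dedup pass can be sketched as follows. This is a minimal illustration, not the actual sqld code: `Frame` and `compact` are hypothetical names, and a real frame carries more metadata than a page number and its contents.

```rust
use std::collections::HashSet;

// Hypothetical frame type for illustration only: one version of one page.
#[derive(Clone, Debug, PartialEq)]
struct Frame {
    page_no: u32,
    data: Vec<u8>,
}

/// Walk the log from newest to oldest, keeping only the first occurrence
/// of each page (i.e. its most recent version). The result is the
/// snapshot body, ordered newest-first.
fn compact(log: &[Frame]) -> Vec<Frame> {
    let mut seen = HashSet::new();
    let mut snapshot = Vec::new();
    for frame in log.iter().rev() {
        // `insert` returns false if the page was already seen,
        // meaning a newer version of it is already in the snapshot.
        if seen.insert(frame.page_no) {
            snapshot.push(frame.clone());
        }
    }
    snapshot
}
```

For example, a log containing page 1 (v1), page 2 (v1), then page 1 (v2) compacts to just two frames: page 1 (v2) followed by page 2 (v1).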

## Snapshotting:

Whenever a replica asks for a frame that is not present in the current log (i.e., it asks for a frame index less than the log's starting frame id), the primary sends the replica an error, asking it to request a snapshot instead.

The replica receives this message and immediately requests a snapshot, sending over the frame id that got it rejected in the first place.
The primary looks for a snapshot containing the requested frame, and iterates through it until it reaches a frame index less than the requested one.

This mechanism allows us to send partial snapshots: the replica gets only the minimal set of frames required to get up to speed.
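Because the snapshot is ordered newest-first by frame index, serving a partial snapshot amounts to taking a prefix. A minimal sketch, with hypothetical names (`SnapshotFrame`, `frames_to_send`) rather than the actual sqld types:

```rust
// Illustrative snapshot frame: its replication index and the page it holds.
#[derive(Clone, Debug)]
struct SnapshotFrame {
    frame_no: u64,
    page_no: u32,
}

/// The snapshot is ordered newest-first, so the frames the replica is
/// missing form a prefix: everything with a frame index at or above the
/// frame id the replica was rejected on. We stop at the first older frame.
fn frames_to_send(snapshot: &[SnapshotFrame], requested: u64) -> Vec<SnapshotFrame> {
    snapshot
        .iter()
        .take_while(|f| f.frame_no >= requested)
        .cloned()
        .collect()
}
```

So a replica that already has everything up to frame 7 receives only the snapshot frames with indices 7 and above, not the whole snapshot.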

The replica writes the snapshot frames to a file, then `mmap`s that file to build a chained list of pages to append to the WAL.


## Future work:

This PR was getting a bit too long, so I left out some work for followup PRs:

- Even though a snapshot significantly compresses the size of the log, a new log is created that will in turn lead to a new snapshot. Those snapshots will pile up and we're back at square one. The next step is to merge snapshots into bigger snapshots and get rid of the older ones.
- Every query for a frame causes a read from the snapshot/log. To speed things up, let's add an MRU cache.
- Explore compression


Co-authored-by: ad hoc <[email protected]>