Auto merge of #13492 - ehuss:stabilize-gc-collect, r=weihanglo
Stabilize global cache data tracking.

This stabilizes the global cache last-use data tracking. This does not stabilize automatic or manual gc.

Tracking issue: #12633

## Motivation

The intent is for cargo to start collecting this data now. By the time automatic gc is stabilized, a wider range of cargo versions will have been updating the data, so users are less likely to see cache misses caused by an over-aggressive gc.

Additionally, this should give us more exposure and time to respond to any problems, such as filesystem issues.

## What is stabilized?

Cargo will now automatically create and update an SQLite database, located at `$CARGO_HOME/.global-cache`. This database tracks timestamps of the last time cargo touched an index, `.crate` file, extracted crate `src` directory, git clone, or git checkout. The schema for this database is [here](https://github.com/rust-lang/cargo/blob/a7e93479261432593cb70aea5099ed02dfd08cf5/src/cargo/core/global_cache_tracker.rs#L233-L307).

Cargo updates this file on any command that needs to touch any of those on-disk caches.
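
As a rough illustration of where this file ends up, the sketch below resolves the database path. It assumes that cargo's home falls back to `.cargo` under the user's home directory when `CARGO_HOME` is unset; `global_cache_db_path` is a hypothetical helper for this example, not a cargo API.

```rust
use std::path::PathBuf;

/// Returns where the last-use database would live, given the value of
/// `CARGO_HOME` (if set) and the user's home directory.
///
/// Assumption: when `CARGO_HOME` is unset, cargo's home is `<home>/.cargo`.
/// This helper is hypothetical and only mirrors the path named in the text.
fn global_cache_db_path(cargo_home: Option<&str>, home_dir: &str) -> PathBuf {
    let base = match cargo_home {
        // An explicit CARGO_HOME wins.
        Some(dir) => PathBuf::from(dir),
        // Otherwise fall back to the assumed default location.
        None => PathBuf::from(home_dir).join(".cargo"),
    };
    base.join(".global-cache")
}
```

A caller could pass `std::env::var("CARGO_HOME").ok().as_deref()` as the first argument and then check whether the returned path exists after running any cargo command that touches the caches.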

The testsuite for this feature is located in [`global_cache_tracker.rs`](https://github.com/rust-lang/cargo/blob/a7e93479261432593cb70aea5099ed02dfd08cf5/tests/testsuite/global_cache_tracker.rs).

## Stabilization risks

There are some risks to stabilizing, since it commits us to staying compatible with the current design.

The concerns I can think of with stabilizing:

This commits us to using the database schema in the current design.

The code is designed to support both backwards and forwards compatible extensions, so I think it should be fairly flexible. Worst case, if we need to make changes that are fundamentally incompatible, then we can switch to a different database filename or tracking approach.
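
One common way to get this kind of compatibility is an append-only list of migrations plus a stored schema version. The sketch below simulates that idea in memory. It is an illustration under stated assumptions, not cargo's actual migration code: `FakeDb` and this `migrate` function are hypothetical stand-ins, and no real SQLite connection is involved.

```rust
/// A stand-in for a database: a stored schema version plus the list of
/// statements that have been applied to it. Hypothetical, for illustration.
#[derive(Debug, Default)]
struct FakeDb {
    user_version: usize,
    applied: Vec<String>,
}

/// Applies every migration the database has not seen yet, in order.
/// Migrations are append-only, so the index of a migration never changes.
fn migrate(db: &mut FakeDb, migrations: &[&str]) {
    // Skip the migrations that were already applied by this or another version.
    for statement in migrations.iter().skip(db.user_version) {
        db.applied.push(statement.to_string());
    }
    // A database written by a newer version may already be ahead of this
    // list; never move its version backwards.
    db.user_version = db.user_version.max(migrations.len());
}
```

With this shape, an older binary that knows fewer migrations leaves a newer database untouched, and a newer binary only applies the entries the database is missing.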

There are certain kinds of errors that are ignored if cargo fails to save the tracking data (see [`is_silent_error`](https://github.com/rust-lang/cargo/blob/64ccff290fe20e2aa7c04b9c71460a7fd962ea61/src/cargo/core/global_cache_tracker.rs#L1796-L1813)).

The silent errors are only shown with `--verbose`. This should help deal with read-only filesystem mounts and other issues. Non-silent errors are always reported, but only as a warning. I don't know if that will be sufficient to avoid problems.
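
The reporting policy described above can be sketched as follows. The `SaveError` variants, the function names, and the message text are all hypothetical and chosen for this example; the real classification lives in `is_silent_error` and may cover different cases.

```rust
/// Hypothetical classification of failures while saving tracking data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SaveError {
    ReadOnlyFilesystem,
    CannotOpenDatabase,
    Other,
}

/// Errors that are expected in some environments (for example a read-only
/// mount) and should not bother the user by default.
fn is_silent(err: SaveError) -> bool {
    matches!(
        err,
        SaveError::ReadOnlyFilesystem | SaveError::CannotOpenDatabase
    )
}

/// Returns the message to print, if any. Silent errors are only reported
/// when `verbose` is set; everything else is always reported as a warning.
/// In both cases the command itself is not failed.
fn report(err: SaveError, verbose: bool) -> Option<String> {
    if is_silent(err) && !verbose {
        return None;
    }
    Some(format!("warning: failed to save last-use data: {err:?}"))
}
```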

I did a fair amount of performance testing, and there is a bench suite for this code, but we don't know whether pathological problems will show up in the real world. Tracking also incurs an overhead that every build will have to pay for.

I've done my best to ensure that this is reliable on network or unusual filesystems, but I think those are still a high-risk category. SQLite should be configured to accommodate these cases, and the extensive locking code (which has already been enabled) should help as well.

A call for public testing was announced in December at https://blog.rust-lang.org/2023/12/11/cargo-cache-cleaning.html. At this time, I don't see any issues in https://github.com/rust-lang/cargo/labels/Z-gc that should block this step.
bors committed Feb 27, 2024
2 parents bf5acf8 39863e7 commit 98f6bf3
Showing 2 changed files with 15 additions and 24 deletions.
8 changes: 1 addition & 7 deletions src/cargo/core/global_cache_tracker.rs
@@ -354,13 +354,7 @@ impl GlobalCacheTracker {
         // provide user feedback) rather than blocking inside sqlite
         // (which by default has a short timeout).
         let db_path = gctx.assert_package_cache_locked(CacheLockMode::DownloadExclusive, &db_path);
-        let mut conn = if gctx.cli_unstable().gc {
-            Connection::open(db_path)?
-        } else {
-            // To simplify things (so there aren't checks everywhere for being
-            // enabled), just process everything in memory.
-            Connection::open_in_memory()?
-        };
+        let mut conn = Connection::open(db_path)?;
         conn.pragma_update(None, "foreign_keys", true)?;
         sqlite::migrate(&mut conn, &migrations())?;
         Ok(GlobalCacheTracker {
31 changes: 14 additions & 17 deletions tests/testsuite/global_cache_tracker.rs
@@ -164,23 +164,20 @@ fn rustup_cargo() -> Execs {

 #[cargo_test]
 fn auto_gc_gated() {
-    // Requires -Zgc to both track last-use data and to run auto-gc.
+    // Requires -Zgc to run auto-gc.
     let p = basic_foo_bar_project();
     p.cargo("check")
         .env("__CARGO_TEST_LAST_USE_NOW", months_ago_unix(4))
         .run();
-    // Check that it did not create a database or delete anything.
+    // Check that it created a database.
     let gctx = GlobalContextBuilder::new().build();
-    assert!(!GlobalCacheTracker::db_path(&gctx)
+    assert!(GlobalCacheTracker::db_path(&gctx)
         .into_path_unlocked()
         .exists());
     assert_eq!(get_index_names().len(), 1);

     // Again in the future, shouldn't auto-gc.
     p.cargo("check").run();
-    assert!(!GlobalCacheTracker::db_path(&gctx)
-        .into_path_unlocked()
-        .exists());
     assert_eq!(get_index_names().len(), 1);
 }

@@ -203,7 +200,7 @@ See [..]
 fn implies_source() {
     // Checks that when a src, crate, or checkout is marked as used, the
     // corresponding index or git db also gets marked as used.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let _lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -563,7 +560,7 @@ fn auto_gc_various_commands() {
         .masquerade_as_nightly_cargo(&["gc"])
         .env("__CARGO_TEST_LAST_USE_NOW", months_ago_unix(4))
         .run();
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -647,7 +644,7 @@ fn updates_last_use_various_commands() {
         .arg("-Zgc")
         .masquerade_as_nightly_cargo(&["gc"])
         .run();
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -696,7 +693,7 @@ fn both_git_and_http_index_cleans() {
         .masquerade_as_nightly_cargo(&["gc"])
         .env("__CARGO_TEST_LAST_USE_NOW", months_ago_unix(4))
         .run();
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -821,7 +818,7 @@ fn tracks_sizes() {
         .run();

     // Check that the crate sizes are the same as on disk.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let _lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -863,7 +860,7 @@ fn tracks_sizes() {
 #[cargo_test]
 fn max_size() {
     // Checks --max-crate-size and --max-src-size with various cleaning thresholds.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();

     let test_crates = [
         // name, age, crate_size, src_size
@@ -962,7 +959,7 @@ fn max_size_untracked_crate() {
     // When a .crate file exists from an older version of cargo that did not
     // track sizes, `clean --max-crate-size` should populate the db with the
     // sizes.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let cache = paths::home().join(".cargo/registry/cache/example.com-a6c4a5adcb232b9a");
     cache.mkdir_p();
     paths::home()
@@ -1003,7 +1000,7 @@ fn max_size_untracked_prepare() -> (GlobalContext, Project) {
     let p = basic_foo_bar_project();
     p.cargo("fetch").run();
     // Pretend it was an older version that did not track last-use.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     GlobalCacheTracker::db_path(&gctx)
         .into_path_unlocked()
         .rm_rf();
@@ -1084,7 +1081,7 @@ fn max_download_size() {
     // This creates some sample crates of specific sizes, and then tries
     // deleting at various specific size thresholds that exercise different
     // edge conditions.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();

     let test_crates = [
         // name, age, crate_size, src_size
@@ -1339,7 +1336,7 @@ fn clean_syncs_missing_files() {
         .run();

     // Verify things are tracked.
-    let gctx = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let gctx = GlobalContextBuilder::new().build();
     let lock = gctx
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
@@ -1992,7 +1989,7 @@ fn forward_compatible() {
         .masquerade_as_nightly_cargo(&["gc"])
         .run();

-    let config = GlobalContextBuilder::new().unstable_flag("gc").build();
+    let config = GlobalContextBuilder::new().build();
     let lock = config
         .acquire_package_cache_lock(CacheLockMode::MutateExclusive)
         .unwrap();
