Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workspace dependencies #10306

Open
stanislav-tkach opened this issue Feb 7, 2023 · 14 comments
Open

Workspace dependencies #10306

stanislav-tkach opened this issue Feb 7, 2023 · 14 comments
Labels
A-lint Area: New lints

Comments

@stanislav-tkach
Copy link
Contributor

What it does

I'm not sure if it is a Clippy responsibility, but I think it would be nice to suggest to use "workspace dependencies" if some library is used in multiple workspace crates.

Perhaps the opposite changes can be suggested if a dependency is used only once or less than some threshold value, though this logic should probably be a separate lint. Ideally both values should be configurable.

Lint Name

workspace_dependencies

Category

pedantic

Advantage

  • It is much more convenient to change the dependency version in one place.

Drawbacks

The only disadvantage I can see is that such lint can be noisy for small workspaces, so it probably should be disabled by default.

Example

I don't think this example is needed, but nevertheless:

# [PROJECT_DIR]/Cargo.toml
[workspace]
members = ["foo", "bar"]

# [PROJECT_DIR]/foo/Cargo.toml
[package]
name = "foo"

[dependencies]
rand = "0.8.5"

# [PROJECT_DIR]/bar/Cargo.toml
[package]
name = "bar"

[dependencies]
rand = "0.8.5"

Could be written as:

# [PROJECT_DIR]/Cargo.toml
[workspace]
members = ["foo", "bar"]

[workspace.dependencies]
rand = "0.8.5"

# [PROJECT_DIR]/foo/Cargo.toml
[package]
name = "foo"

[dependencies]
rand.workspace = true

# [PROJECT_DIR]/bar/Cargo.toml
[package]
name = "bar"

[dependencies]
rand.workspace = true
@stanislav-tkach stanislav-tkach added the A-lint Area: New lints label Feb 7, 2023
@epage
Copy link

epage commented Feb 7, 2023

There are several potential strategies for workspace dependenices

  • Put everything (except maybe path dependencies) in them allowing a central place to see everything
    • path dependencies are dependent on whether you want your published version requirements to all point to the same version or nit
  • Put "this version must be shared" dependencies in them, this is context specific
  • Put nothing in them

For me, putting only duplicated dependencies in as workspace dependencies just seems like a source of churn as that is dependent on the current implementation of your code base and can change.

@stanislav-tkach
Copy link
Contributor Author

While I agree with most of arguments, I'm still not sure about this part:

For me, putting only duplicated dependencies in as workspace dependencies just seems like a source of churn as that is dependent on the current implementation of your code base and can change.

Sure it can cause some friction, but it is mostly the same as adding new and removing unused dependencies because of changes in the implementation.

Personally I'm thinking about putting all dependencies in the workspace config file, but it is still feels strange in cases if a dependency is used only once. Especially if is a wrapper of some kind and other workspace crates uses underlying dependency through this wrapper.

@epage
Copy link

epage commented Feb 8, 2023

Sure it can cause some friction, but it is mostly the same as adding new and removing unused dependencies because of changes in the implementation.

It is a couple extra steps

  • Adding a dependency that will now be duplicated will require touching three manfiests instead of one (workspace and the two packages with the duplicated dependency)
  • Similarly, removing a dependency that will no longer be duplicated will now require touching three manifests instead of one

btw rust-lang/cargo#10608 might be of interest for exploring what the cargo add and cargo rm experience might be like for automating workspace dependency behavior.

@kraktus
Copy link
Contributor

kraktus commented Feb 10, 2023

I agree with the concerns above but OTOH, updating a dependency and make sure you sync versions (especially important if you share data-structures) is much easier, and updating deps is more frequent than adding new ones, especially in a big repo.

@heaths
Copy link

heaths commented Jan 24, 2024

We have a monorepo and want to manage all dependencies centrally like we do for many other languages we support. A linter like this would be immensely helpful. Seems an EarlyLintPass check could do this: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lint/trait.EarlyLintPass.html#method.check_crate

Path dependencies should be allowed, but all version and repo dependencies should be defined in a workspace.

@epage
Copy link

epage commented Jan 24, 2024

To clarify, I'm suggesting the focus should be on centralizing all dependencies, not just duplicated. Maybe path dependencies can be a config field on top of that. I think the case for linting only duplicates is small and we shouldn't be driving people to do that by making it the only lint.

In defining a lint for workspace dependencies, we'd need to decide how to handle two dependencies with the same name, because of different version requirements. It should emit but not be machine applicable (ie no --fix support).

Also, machine applicable changes should only consolidate the source. Unsure if it should default-features = false or not. In hindsight, merging the separate dependencies, rather than the workspace having only a source, feels like a mistake (as the person who reviewed the feature going in and has had to deal with user problems with corner cases since).

@heaths
Copy link

heaths commented Jan 24, 2024

To clarify, I'm suggesting the focus should be on centralizing all dependencies, not just duplicated

Yep. That's what I want, too. When possible in our supported languages' package managers, we want central oversight and management e.g., separate CODEOWNERS entries for files/directories used for central package management - not just for security, but also to help watch for unnecessary version bumps so we maintain the lowest possible dependencies in some cases e.g., nuget.

In defining a lint for workspace dependencies, we'd need to decide how to handle two dependencies with the same name, because of different version requirements.

In our case, we don't want that but I can appreciate some might. Perhaps a linter option? I don't particularly care if it's opt-in or -out.

Could you clarify the third paragraph? I'm afraid I don't follow.

@epage
Copy link

epage commented Jan 24, 2024

In our case, we don't want that but I can appreciate some might. Perhaps a linter option? I don't particularly care if it's opt-in or -out.

More so, the lint can't suggest an automatic fix because

  • If two version requirements were meant to be the same but drifted, there isn't a clear "right" answer
  • Sometimes you have to work with multiple incompatible versions, like toml benchmarking itself against an older version

Could you clarify the third paragraph? I'm afraid I don't follow.

cargo clippy can detect when something is the matter. In the ideal scenario, clippy would provide suggested fixes for the problems it detects. You can apply this fixes with cargo clippy --fix. In the last year or so, cargo started informing people that this flag exists. These are referred to as "machine applicable". I'm talking about how clippy should suggest the dependencies be centralized which can be applied with --fix.

Now to what should or shouldn't be suggested. With workspace dependencies, we took a layered config approach where we support merging two different dependency tables. This allows you to centrally say that a feature should be enabled and all packages that reference your workspace dependency get it enabled. There are times where there are requirements that drive a feature to be enabled across your entire workspace but I feel that isn't so often the case and the implementation just happens to need the two features. Centralizing that implementation detail couples things together, causing churn and makes it harder to audit what features are enabled that don't have to be. This came up when we were considering the ability to specify which dependencies are public and which are private. For API stability, it is best to be minimal about what you declare to be public. For that reason, we decided to not make the public field allowed in a workspace dependency and regret allowing other fields.

The main reason for sharing features in a workspace dependency as a poor man's "workspace hack" (if not familiar with that term, feel free to skip this section). I say poor man's because feature unification is still dependent on what dependencies exist in each package so a workspace hack should still be used if desired.

@heaths
Copy link

heaths commented Jan 24, 2024

Interesting. A quick search yields a lot of links to https://docs.rs/cargo-hakari/latest/cargo_hakari/about/index.html. Think that sums it up? For our case, I highly doubt we'll run into that problem, but I appreciate trying to account for it in a linter. I agree we can't really fix it - and I'm wondering if we can even completely fix the general "put your dependencies in the workspace and inherit" since we only likely have access to the package manifest - but we could certainly warn on it, sort of like, "hey, this might be a mistake". People could suppress it if that's what they intended. Thoughts?

@epage
Copy link

epage commented Jan 24, 2024

I think an opt-in package-scoped lint that emits on non-workspace sources for package dependencies in an explicit [workspace] makes sense. Again, path dependencies might want to be a config value, unsure on default. Maybe also a config on applying this even when a [workspace] doesn't exists (I've at least moved to always specifying [workspace] to make diffing repos easier).

I don't know about clippy's internals but I would hope it can be machine applicable and I would suggest the following behavior:

  • If an exact-match source exists in workspace.dependencies (e.g. same version requirement), the suggestion would be to use it
  • If no related source exists in workspace.dependencies, the suggestion would be to add one and to use it
  • If compatible source exists in workspace.dependencies (e.g. compatible version requirement), a Maybe Applicable suggestion is given to use update the workspace dependency with the more restrictive version requirement and use it
  • If an incompatible source exists in workspace.dependencies, no suggestion is given

My hope/expectation is that if two workspace members have duplicated dependencies, --fix would "just work". From my understanding of the underlying logic in rustfix, the linters are run serially along with their fixes. So what would normally be emitted as two packages with "no workspace source exists" suggestions would instead have the first package apply the "no source exists" suggestion and the second package would be one of the other categories and would be fixed or emitted depending on if the suggestion is high enough confidence.

I would suggest treating workspace.dependencies with non-source information as incompatible with a package dependency with the possible exception of the non-source information matching as that gets messy to merge.

I would also suggest a workspace-scope lint that emits whenever non-source information is in workspace.dependencies. However, I don't think clippy currently runs at the workspace-scope. We've brainstormed about moving the cargo side of clippy out of the driver and into cargo-clippy which would allow workspace-level lints.

@heaths
Copy link

heaths commented Jan 26, 2024

Just because it will probably be easier to get right initially (not that it's too hard later), I'm waffling on a name. Given we want to allow workspace and package dependencies now, seems a lint name needs to reflect that workspace dependencies are required. Looking at similar lints in https://rust-lang.github.io/rust-clippy/master/index.html, maybe must_use_workspace_dependencies? Pretty long, though.

Also thinking this sounds more like a clippy::restriction than a clippy::cargo given it is meant to restrict, if denied by maintainers for a repo. https://doc.rust-lang.org/nightly/clippy/lints.html#cargo reads like correctness suggestions for publishing crates.

@epage
Copy link

epage commented Jan 26, 2024

I think clippy::mod_module_files and clippy::self_named_module_files are related in concept and so I would use them as a model for naming and group. For naming, when you say allow or deny, you are saying what policy you are allowing or denying. So I guess package_dependencies would be the name of this lint so you can say --deny package_dependencies?

(cargo as a group name seems weird to me since thats cross-cutting and clippy aims for one group per lint iirc)

@heaths
Copy link

heaths commented Jan 26, 2024

Thanks. I'll look into those. I did start implementing one as a cargo/ function only to realize that cargo metadata "flattens" dependencies and has no information whether the dependency is from the workspace or package manifest. I'd really appreciate some idiomatic advice here.

I could see if cargo maintainers would be open to exposing this information within the cargo metadata output; otherwise, when I first started looking at this as an early pass lint, I found I can get Crate info that just gives me the manifest I can process myself (and walk up the tree to find/verify a workspace Cargo.toml).

@epage
Copy link

epage commented Jan 26, 2024

No idea whats idiomatic within clippy as I'm familiar with the cargo side. I would not expect to add these implementation details to cargo metadata. Instead, if clippy is up for it, you can read the manifests directly. We provide a first-party toml parser so you can avoid going stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: New lints
Projects
None yet
Development

No branches or pull requests

4 participants