Policy for lint expansions #122759

tmandry · 2024-03-20T02:05:22Z

Background

When a lint is expanded to include many new cases, it adds significant complexity to the rollout of a toolchain in the ecosystem. In particular, certain challenges occur in large codebases. Maintainers are stuck with the choice of

Disabling the existing lint while the toolchain is updated and new cases are fixed.
Fixing cases manually, and updating the toolchain immediately so that other developers see the warnings and don't add more such code.

Both of these come with the problem of racing with other developers in a codebase who may land new code which triggers the expanded lint in a new compiler, but does not trigger the lint in an old compiler.

While it would be nice to solve this "raciness" once and for all, there are other considerations at play. Instead, we propose a set of guidelines to help decide whether an existing lint should be expanded, as well as mitigations for the above problems when expansions occur.

These guidelines are based on the lang team discussion of #121708.

cc @estebank @petrochenkov

Guidelines

Deciding on new lint vs expansion

We don't have hard-and-fast rules for when to use a new lint versus expanding an existing one, but we want to take several factors into account:

How important is fixing the issue the lint identifies, to the maintainer of the code or the health of the ecosystem?
Are the cases of the old and new lints of the same severity?
Is it possible to fix code to no longer trigger the lint without making the code depend on the new version of Rust?
Is there a clear description of the distinction between the old and new cases, such that it's easy to give a logical name to the new cases?
What fraction of the ecosystem will encounter the lint? (Crater can give a good idea of this.)
Is there an automatic machine-applicable fix?

We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided. We'd also like to avoid having a painful impact on large codebases that need to migrate their code to take the new lint into account.

When proposing or reviewing an expansion of an existing lint, please take the above factors into account. If in doubt about whether to use a new or existing lint, please consult with lang, and seek a decision via FCP.

Mitigations for lint expansions

When an existing lint is expanded to include many new cases, please consider providing at least one of:

A new lint name under the existing group, so that users may opt out of the expansion at least temporarily. This is particularly desirable if there's a clear description of the new cases being added.
A MachineApplicable fix for the lint. This fix must produce code that compiles without new lints or errors under the most recent stable compiler. This helps maintainers, particularly of large codebases, quickly fix all existing instances so they can upgrade/deny the lint, without racing with new code additions that trigger the lint. (Note, though, that a MachineApplicable fix doesn't suffice if applying the fix would make code fail to compile on the immediately prior version of Rust.)

To evaluate whether a lint produces "many new cases", try doing a crater run on the top 1000 crates. If the lint addition impacts more than 1% of the top-1000 crates on crates.io, or is producing a large number of warnings within a small number of crates in that set, treat it as producing many new cases. See below for details on how to run crater.

A crater run is not required before landing for every lint expansion. Reviewers should use their best judgment to decide if one is required.

Crater command

To measure the impact of a lint on the top 1000 crates, you can use the following crater command:

@craterbot run name=<name> start=master#<hash1>+rustflags=-D<lint_name> end=master#<hash2>+rustflags=-D<lint_name> crates=top-1000 mode=check-only p=1

See the crater docs for more information.

Lint revert policy

In general, if a lint expansion does not meet the above guidelines, and in particular especially if it has neither a new lint name nor a machine applicable fix, it should be flagged for review and likely reverted before it hits stable Rust.

The text was updated successfully, but these errors were encountered:

tmandry · 2024-03-20T02:06:00Z

As discussed in last week's lang team meeting, I've drafted a policy for the team to review.

@rustbot label I-lang-nominated

estebank · 2024-03-20T04:50:48Z

Here, we define "many new cases" as impacting more than 5% of the top-1000 crates on crates.io. This can be measured by counting the number of regressions from a crater run like the one below.

I would likely be more stringent than that. Affecting just .5% would already make me consider mitigations.

There's an additional approach we could take: annotating all lints with a metadata field for "last significantly updated". This would allow tooling to account for Rust versions when triggering any lints. Then we could make #![deny(warnings)] be coupled with an additional attribute (or Cargo.toml field) to state "I want to mark warn-by-default lints as errors up until the version I support, and later warns remain as warnings at most". (I'm aware this is a separate issue, but believe the same set of users are affected by both simultaneously.)

For the original issue, I endorse splitting the lint into a new family.

tmandry · 2024-04-02T22:14:10Z

I would likely be more stringent than that. Affecting just .5% would already make me consider mitigations.

Yeah, I pulled a number out out of thin air. Let me continue doing that here. How about: If .5% would make you consider mitigations, maybe the policy line should be 1%?

There's an additional approach we could take: annotating all lints with a metadata field for "last significantly updated". This would allow tooling to account for Rust versions when triggering any lints. Then we could make #![deny(warnings)] be coupled with an additional attribute (or Cargo.toml field) to state "I want to mark warn-by-default lints as errors up until the version I support, and later warns remain as warnings at most". (I'm aware this is a separate issue, but believe the same set of users are affected by both simultaneously.)

I like this idea too. It would work well for those releasing to their codebase from discrete Rust versions. Those releasing from nightly might not have such a good time unless we manage to include the merge date somehow.

joshtriplett · 2024-05-30T00:21:57Z

We still need to figure out where this should live (e.g. rustc-dev-guide, somewhere on forge), but 👍 for documenting guidelines for how we evaluate both new lints and expansions of existing lints.

Let's see if we have consensus on the proposal in the top post here:

@rfcbot merge

rfcbot · 2024-05-30T00:21:59Z

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

RalfJung · 2024-05-30T05:48:14Z

I don't quite understand how expanding an existing lint is any worse than adding a new (warn-by-default) lint. Reading the motivation here

Maintainers are stuck with the choice of

Disabling the existing lint while the toolchain is updated and new cases are fixed.

Fixing cases manually, and updating the toolchain immediately so that other developers see the warnings and don't add more such code.

This applies equally to the case of landing a new lint, doesn't it? If you have developers using outdated compilers, they may submit code that would trigger a lint on newer compilers. If you allow the lint, you may miss out on genuine bugs.

We have a technology that can solve issues like "automatically ensure developers submit code that adheres to a certain standard", it's called CI. Why should Rust compiler developers have to bear the cost for people not updating their compilers?

ChrisDenton · 2024-05-30T10:00:22Z

If you have a new lint you can ignore specifically that lint without ignoring the old one.

Note that also an "outdated compiler" can mean stable. Testing in CI on nightly is common and can avoid having to deal with multiple lint changes all at once.

RalfJung · 2024-05-30T10:26:31Z

Testing in CI on nightly is common and can avoid having to deal with multiple lint changes all at once.

Sure, but there's no problem with that -- unless you set -D warnings on your nightly build, but that's a Bad Idea, not something we should accommodate.

tmandry · 2024-05-31T18:37:06Z

If you have developers using outdated compilers, they may submit code that would trigger a lint on newer compilers.

Repos that are large or complex enough typically control the version their developers are using, e.g. with rust_toolchain.toml or some other mechanism, to maintain consistency between the local workflow and CI.

If you allow the lint, you may miss out on genuine bugs.

True, but that is one consideration among many to be made when managing the rollout of an updated compiler with new lints. Often it is easier to update a compiler with a new lint turned off while concurrent changes are made, then enable the new lint in a follow-up change. Consider that in large projects, there are many other potential blockers to upgrading a toolchain: Compiler bugs, latent bugs in the build exposed by changes in the compiler, behavior changes in related tools like clippy and rustfmt, and so on.

In the case of the lint discussed in #121708, an existing lint was massively expanded while catching probably zero new bugs – this was purely an aesthetic choice about redundant imports. That is an example of something that should definitely be a separate lint group under this policy. Expansion of a lint that catches UB should be given more leeway under this policy: "We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided."

We have a technology that can solve issues like "automatically ensure developers submit code that adheres to a certain standard", it's called CI. Why should Rust compiler developers have to bear the cost for people not updating their compilers?

The difficulty that motivated this policy in the first place was the pain of very large monorepos not being able to upgrade their compiler when they wanted to. Intuitions that apply to smaller repos on Github don't apply to these projects. Updating a compiler is a change in the repo itself as I said above, and there are so many other changes happening concurrently that it becomes unmanageable to do so if a high-prevalence lint is turned on as part of that change.

RalfJung · 2024-05-31T21:23:35Z

In the case of the lint discussed in #121708, an existing lint was massively expanded while catching probably zero new bugs

If you don't consider unnecessary imports bugs, then -- true. It did find a bunch of unnecessary imports in my own fairly small codebases though, so I expect it found a lot of those out there as well.

If you don't consider unnecessary imports bugs, then there should be no problem with allowing that lint in the changeset that bumps the compiler, and then later getting the lint to warn can follow whatever process you usually use for that (and it doesn't have to happen for the entire repo at once).

So I don't think I understand the problem. Since you're fine with extending the scope of lints that detect more serious problems, the policy seems to be about "soft" problems only where the issues they find are not considered bugs, and then a temporary allow seems entirely fine.

We'd like to avoid having a disproportionately painful impact on the ecosystem compared to the value provided."

You're not speaking about the entire ecosystem here, but about large monorepos specifically. For the ecosystem at large I don't think this was painful -- it was a nice cleanup. That's based on my extremely limited view, so please correct me if I am wrong -- but the arguments given so far did not apply to the ecosystem at large, but just for the specific case of big companies' internal monorepos.

traviscross · 2024-05-31T23:41:01Z

One reason that #121708 is an unfortunate¹ example is that, regardless of the noise created by the lint expansion, the lint was broader than we actually wanted it to be for other important reasons, as we later decided here in #123813.

It's unfortunate in the sense of "hard cases make bad law", though, by that citation, I mean only that the hardness of the example makes this discussion more challenging, not that the resulting policy is bad. ↩

RalfJung · 2024-06-01T05:52:14Z

Right, so for this particular lint you actually want (some of?) the new cases to be allow by default. That is unfortunate but I understand why. However that makes it an even less suited precedent/argument for this policy, so this just adds to the arguments I gave above.

nikomatsakis · 2024-06-04T14:11:44Z

I had a hard time understanding the proposed policy. I appreciate that making hard-and-fast rules are hard but I think we should give a few more examples at least of times when a new vs expanded lint would be preferred.

I also think we should be annotating lints with the compiler version in which they were introduced (or massively expanded) and letting toolchain authors say "error for all cases that are older than the last few revisions", but that's a separate feature request.

tmandry · 2024-06-05T03:25:33Z

If you don't consider unnecessary imports bugs, then there should be no problem with allowing that lint in the changeset that bumps the compiler, and then later getting the lint to warn can follow whatever process you usually use for that (and it doesn't have to happen for the entire repo at once).

This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix.

I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it.

You're not speaking about the entire ecosystem here, but about large monorepos specifically. For the ecosystem at large I don't think this was painful -- it was a nice cleanup. That's based on my extremely limited view, so please correct me if I am wrong -- but the arguments given so far did not apply to the ecosystem at large, but just for the specific case of big companies' internal monorepos.

Two things can be true at the same time: It can be both disruptive and a nice cleanup! But something that touches a high percentage of crates should be rolled out carefully.

I see what you're saying about the "ecosystem" phrasing – the phrasing was not actually my suggestion but @joshtriplett's, and it seems a bit disingenuous given that all my examples are from large monorepos – but I did feel it was warranted given the number of backlinks on #117772. That may be partly due to the issue mentioned by @traviscross.

tmandry · 2024-06-05T03:35:09Z

I had a hard time understanding the proposed policy. I appreciate that making hard-and-fast rules are hard but I think we should give a few more examples at least of times when a new vs expanded lint would be preferred.

You know, I actually had such a hard-and-fast rule as part of the initial proposal. Reading back through the revised guidelines, I think it works as motivation/rationale for a broader policy, but in the interest of making something clear and easy to follow I think I would like to go back to the simple rule that was here before (or something close to it).

When an existing lint is expanded to include many new cases, we must provide either:

A new lint name under the existing group, so that users may opt out of the expansion at least temporarily, or

A MachineApplicable fix for the lint. This fix must produce code that compiles without new lints or errors under the most recent stable compiler.

Exceptions to this policy may be made via Language Team FCP.

RalfJung · 2024-06-05T06:05:24Z

This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix.

To me this all sounds somewhat contradicting, you basically want your cake and eat it, too (based on your repo's particular preferences for which lints should be enabled when), while rustc should then bear the complexity required to achieve this.

I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it.

The lint in question seems to be a very bad example for motivating such a policy, given that it was later determined that many of the new cases should be allow anyway. I think this example should be discarded from the proposal for that reason. Is there any other lint where you have similar concerns, or is this now entirely hypothetical?

nikomatsakis · 2024-06-05T09:02:56Z

I disagree Ralf that this problem is specific to Google's requirements. I can definitely say that within Amazon existing lint policy is a serious pain point and it is something that the Rust org has been hearing from users for a long time. I still want us to keep aggressively improving and updating lints but giving users tools to manage those updates is part of how we are able to do that without unduly disrupting users' workflows. That said, I am not entirely convinced that the proposed policy is the way to do it. In my ideal world we would make "lint management " more of a first class feature and have an RFC, if for no other reason then I think this will be important to a number of users and merits a clearer workflow. But I don't want to let perfect be enemy of the good either, and I know that's a much higher bar. I am mildly worried about a proliferation of lints -- lint names and lint groups are permanent.

…

On Wed, Jun 5, 2024, at 2:05 AM, Ralf Jung wrote: > This would have meant allowing an existing very-high-precedence lint that people actually use while updating the compiler. Even if the existing lint didn't catch bugs, it was still useful because it's a standard part of people's workflow. Disabling it would mean degrading the existing user experience, and then having a lot of manual cleanup to do afterward – because there is no MachineApplicable fix. > To me this all sounds somewhat contradicting, you basically want your cake and eat it, too (based on your repo's particular preferences for which lints should be enabled when), while rustc should then bear the complexity required to achieve this. > I have also heard the argument made that unused imports are different from redundant imports in that the first can plausibly catch bugs, while the latter is always just a cleanup. But I'm not sure how much weight I would put on it. > The lint in question seems to be a very bad example for motivating such a policy, given that it was later determined that many of the new cases should be `allow` anyway. I think this example should be discarded from the proposal for that reason. Is there any other lint where you have similar concerns, or is this now entirely hypothetical? — Reply to this email directly, view it on GitHub <#122759 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AABF4ZVLS3YTYDHRO5L6UNDZF2TDXAVCNFSM6AAAAABE6UQXROVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBYHE2DKMBYGY>. You are receiving this because you were mentioned.Message ID: ***@***.***>

tmandry added A-diagnostics Area: Messages for errors, warnings, and lints T-lang Relevant to the language team, which will review and decide on the PR/issue. C-discussion Category: Discussion or questions that doesn't represent real issues. labels Mar 20, 2024

rustbot added the I-lang-nominated Nominated for discussion during a lang team meeting. label Mar 20, 2024

jieyouxu added the A-lint Area: Lints (warnings about flaws in source code) such as unused_mut. label Mar 20, 2024

workingjubilee mentioned this issue Mar 29, 2024

Consider splitting unused_import's glob re-export visibility warning into its own lint #123210

Open

Urgau mentioned this issue May 15, 2024

Implement a lint for implicit autoref of raw pointer dereference - take 2 #123239

Open

rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels May 30, 2024

ChrisDenton mentioned this issue Jun 13, 2024

dead_code lint wrongly warns about public repr(C) structs with private fields but no constructors #126169

Closed

chrisnc mentioned this issue Jul 30, 2024

Consider const fn with &self while checking liveness #128329

Closed

This was referenced Aug 21, 2024

Don't warn empty branches unreachable for now #129103

Merged

Decide whether we want MSRV-gating #129445

Open

Mark-Simulacrum mentioned this issue Nov 10, 2024

Report the unexpected_cfgs lint in external macros #132577

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Policy for lint expansions #122759

Policy for lint expansions #122759

tmandry commented Mar 20, 2024 •

edited

Loading

tmandry commented Mar 20, 2024

estebank commented Mar 20, 2024

tmandry commented Apr 2, 2024

joshtriplett commented May 30, 2024

rfcbot commented May 30, 2024 •

edited by tmandry

Loading

RalfJung commented May 30, 2024 •

edited

Loading

ChrisDenton commented May 30, 2024

RalfJung commented May 30, 2024

tmandry commented May 31, 2024

RalfJung commented May 31, 2024 •

edited

Loading

traviscross commented May 31, 2024

RalfJung commented Jun 1, 2024 via email

nikomatsakis commented Jun 4, 2024

tmandry commented Jun 5, 2024 •

edited

Loading

tmandry commented Jun 5, 2024

RalfJung commented Jun 5, 2024 •

edited

Loading

nikomatsakis commented Jun 5, 2024 via email

Policy for lint expansions #122759

Policy for lint expansions #122759

Comments

tmandry commented Mar 20, 2024 • edited Loading

Background

Guidelines

Deciding on new lint vs expansion

Mitigations for lint expansions

Crater command

Lint revert policy

tmandry commented Mar 20, 2024

estebank commented Mar 20, 2024

tmandry commented Apr 2, 2024

joshtriplett commented May 30, 2024

rfcbot commented May 30, 2024 • edited by tmandry Loading

RalfJung commented May 30, 2024 • edited Loading

ChrisDenton commented May 30, 2024

RalfJung commented May 30, 2024

tmandry commented May 31, 2024

RalfJung commented May 31, 2024 • edited Loading

traviscross commented May 31, 2024

Footnotes

RalfJung commented Jun 1, 2024 via email

nikomatsakis commented Jun 4, 2024

tmandry commented Jun 5, 2024 • edited Loading

tmandry commented Jun 5, 2024

RalfJung commented Jun 5, 2024 • edited Loading

nikomatsakis commented Jun 5, 2024 via email

tmandry commented Mar 20, 2024 •

edited

Loading

rfcbot commented May 30, 2024 •

edited by tmandry

Loading

RalfJung commented May 30, 2024 •

edited

Loading

RalfJung commented May 31, 2024 •

edited

Loading

tmandry commented Jun 5, 2024 •

edited

Loading

RalfJung commented Jun 5, 2024 •

edited

Loading