Add an LLM policy for rust-lang/rust#1040
r? @jieyouxu

rustbot has assigned @jieyouxu.
@rustbot label T-libs T-compiler T-rustdoc T-bootstrap
## Summary
[summary]: #summary

This document establishes a policy for how LLMs can be used when contributing to `rust-lang/rust`. Subtrees, submodules, and dependencies from crates.io are not in scope. Other repositories in the `rust-lang` organization are not in scope.

This policy is intended to live in [Forge](https://forge.rust-lang.org/) as a living document, not as a dead RFC. It will be linked from `CONTRIBUTING.md` in rust-lang/rust as well as from the rustc- and std-dev-guides.

## Moderation guidelines

This PR is preceded by [an enormous amount of discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/588130-project-llm-policy). Almost every conceivable angle has been discussed to death; there have been upwards of 3000 messages, not even counting discussion on GitHub. We initially doubted whether we could reach consensus at all.

Therefore, we ask to bound the scope of this PR specifically to the policy itself. In particular, we mark several topics as out of scope below. We still consider these topics to be important; we simply do not believe this is the right place to discuss them.

No comment on this PR may mention the following topics:

- Long-term social or economic impact of LLMs
- The environmental impact of LLMs
- Anything to do with the copyright status of LLM output
- Moral judgements about people who use LLMs

We have asked the moderation team to help us enforce these rules.

## Feedback guidelines

We are aware that parts of this policy will make some people very unhappy. As you are reading, we ask you to consider the following.

- Can you think of a *concrete* improvement to the policy that addresses your concern? Consider:
  - Whether your change will make the policy harder to moderate
  - Whether your change will make it harder to come to a consensus
- Does your concern need to be addressed before merging or can it be addressed in a follow-up?
- Keep in mind the cost of *not* creating a policy.

### If your concern is for yourself or for your team

- What are the *specific* parts of your workflow that will be disrupted?
  - In particular we are *only* interested in workflows involving `rust-lang/rust`. Other repositories are not affected by this policy and are therefore not in scope.
- Can you live with the disruption? Is it worth blocking the policy over?

---

Previous versions of this document were discussed on Zulip, and we have made edits in response to suggestions there.

## Motivation
[motivation]: #motivation

- Many people find LLM-generated code and writing deeply unpleasant to read or review.
- Many people find LLMs to be a significant aid to learning and discovery.
- `rust-lang/rust` is currently dealing with a deluge of low-effort "slop" PRs primarily authored by LLMs.
- Having *a* policy makes these easier to moderate, without having to take every single instance on a case-by-case basis.

This policy is *not* intended as a debate over whether LLMs are a good or bad idea, nor over the long-term impact of LLMs. It is only intended to set out the future policy of `rust-lang/rust` itself.

## Drawbacks
[drawbacks]: #drawbacks

- This bans some valid usages of LLMs. We intentionally err on the side of banning too much rather than too little in order to make the policy easy to understand and moderate.
- This intentionally does not address the moral, social, and environmental impacts of LLMs. These topics have been extensively discussed on Zulip without reaching consensus, but this policy is relevant regardless of the outcome of these discussions.
- This intentionally does not attempt to set a project-wide policy. We have attempted to come to a consensus for upwards of a month without significant progress. We are cutting our losses so we can have *something* rather than ad hoc moderation decisions.
- This intentionally does not apply to subtrees of rust-lang/rust. We don't have the same moderation issues there, so we don't have time pressure to set a policy in the same way.

## Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

- We could create a project-wide policy, rather than scoping it to `rust-lang/rust`. This has the advantage that everyone knows what the policy is everywhere, and that it's easy to make things part of the mono-repo at a later date. It has the disadvantage that we think it is nigh-impossible to get everyone to agree. There are also reasons for teams to have different policies; for example, the standard for correctness is much higher within the compiler than within Clippy.
- We could have a more strict policy that removes the [threshold of originality](https://fsfe.org/news/2025/news-20250515-01.en.html) condition. This has the advantage that our policy becomes easier to moderate and understand. It has the disadvantage that it becomes easy for people to intend to follow the policy, but be put in a position where their only choices are to either discard the PR altogether, rewrite it from scratch, or tell "white lies" about whether an LLM was involved.
- We could have a more strict policy that bans LLMs altogether. It seems unlikely we will be able to agree on this, and we believe attempting it will cause many people to leave the project.

## Prior art
[prior-art]: #prior-art

This prior art section is taken almost entirely from [Jane Lusby's summary of her research](rust-lang/leadership-council#273 (comment)), although we have taken the liberty of moving the Rust project's prior art to the top. We thank her for her help.

### Rust

- [Moderation team's spam policy](https://github.com/rust-lang/moderation-team/blob/main/policies/spam.md/#fully-or-partially-automated-contribs)
- [Compiler team's "burdensome PRs" policy](rust-lang/compiler-team#893)

### Other organizations

These are organized along a spectrum of AI friendliness, where top is least friendly, and bottom is most friendly.

- full ban
  - [postmarketOS](https://docs.postmarketos.org/policies-and-processes/development/ai-policy.html)
    - also explicitly bans encouraging others to use AI for solving problems related to postmarketOS
    - multi-point, ethics-based rationale with citations included
  - [zig](https://ziglang.org/code-of-conduct/)
    - philosophical, cites [Profession (novella)](https://en.wikipedia.org/wiki/Profession_(novella))
    - rooted in concerns around the construction and origins of original thought
  - [servo](https://book.servo.org/contributing/getting-started.html#ai-contributions)
    - more pragmatic, directly lists concerns around AI, fairly concise
  - [qemu](https://www.qemu.org/docs/master/devel/code-provenance.html#use-of-ai-content-generators)
    - pragmatic, focuses on copyright and licensing concerns
    - explicitly allows AI for exploring APIs, debugging, and other non-generative assistance; other policies do not explicitly ban this or mention it in any way
- allowed with supervision, human is ultimately responsible
  - [scipy](https://github.com/scipy/scipy/pull/24583/changes)
    - strict attribution policy including name of model
  - [llvm](https://llvm.org/docs/AIToolPolicy.html)
  - [blender](https://devtalk.blender.org/t/ai-contributions-policy/44202)
  - [linux kernel](https://kernel.org/doc/html/next/process/coding-assistants.html)
    - quite concise but otherwise seems the same as many in this category
  - [mesa](https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/docs/submittingpatches.rst)
    - framed as a contribution policy, not an AI policy; AI is listed as a tool that can be used, but the policy emphasizes the same requirement that authors must understand the code they contribute; seems to leave room for partial understanding from new contributors.
      > Understand the code you write at least well enough to be able to explain why your changes are beneficial to the project.
  - [forgejo](https://codeberg.org/forgejo/governance/src/branch/main/AIAgreement.md)
    - bans AI for review, does not explicitly require contributors to understand code generated by AI. One could interpret the "accountability for contribution lies with contributor even if AI is used" line as implying this requirement, though their version seems poorly worded imo.
  - [firefox](https://firefox-source-docs.mozilla.org/contributing/ai-coding.html)
  - [ghostty](https://github.com/ghostty-org/ghostty/blob/main/AI_POLICY.md)
    - pro-AI but views "bad users" as the source of issues with it and the only reason for what ghostty considers a "strict AI policy"
  - [fedora](https://communityblog.fedoraproject.org/council-policy-proposal-policy-on-ai-assisted-contributions/)
    - clearly inspired, and is cited by, many of the above, but is definitely framed more pro-AI than the derived policies tend to be
  - [curl](https://curl.se/dev/contribute.html#on-ai-use-in-curl)
    - does not explicitly require humans understand contributions, otherwise policy is similar to above policies
  - [linux foundation](https://www.linuxfoundation.org/legal/generative-ai)
    - encourages usage, focuses on legal liability, mentions that tooling exists to help automate managing legal liability, does not mention specific tools
- In progress
  - NixOS
    - NixOS/nixpkgs#410741

## Unresolved questions
[unresolved-questions]: #unresolved-questions

See the "Moderation guidelines" and "Drawbacks" sections for a list of topics that are out of scope.
I really like this version, and thanks a ton for working on it. Specifically:
- It doesn't try to dump entire walls of text, which is unfortunately a good way to be sure nobody reads it. Instead, it gives you concrete examples, and a guiding rule-of-thumb for uncovered scenarios, and acknowledges upfront that it surely cannot be exhaustive.
- I also like where it points out the nuance and recognizes the uncertainties.
- I like that it covers both "producers" and "consumers" (with nuance that reviewers can also technically use LLMs in ways that are frustrating to the PR authors!)
I left a few suggestions / nits, but even without them this is still a very good start IMO.
(Will not leave an explicit approval until we establish wider consensus, which likely will take the form of 4-team joint FCP.)
The links to Zulip are project-private, FWIW.
I'm aware. This PR is targeted towards Rust project members more so than the broad community.
Ok, despite all good intentions we had to lock this conversation again. To all those coming here and pointing out that this policy does not consider ethics, environmental and other related issues: we hear you. We know. We totally understand. But here again is the reasoning for not including these topics here at this time. This is not negotiable. Please take a few moments to read and fully understand that comment and consider its implications.
> #### Be honest
> Conversely, lying about whether you've used an LLM, or attempting to hide the extent of the use, is considered a [code of conduct](https://rust-lang.org/policies/code-of-conduct/) violation.
> If you are not sure where something you would like to do falls in this policy, please talk to the [moderation team](mailto:rust-mods@rust-lang.org).
> Don't try to hide it.
>
> #### Penalties
> The policies marked with a 🔨 follow the same guidelines as the code of conduct:
> Violations will first result in a warning, and repeated violations may result in a ban.
> - 🔨 Violations of the "Be honest" section
One of the stated goals of this document is to:
> Make the policy enforceable and easy to moderate.
But how do we plan to enforce the "be honest" section? How are we to judge whether someone's use of an LLM exceeded what was disclosed?
That seems an important detail. This document doesn't talk anywhere about standards of proof. Other CoC items are enforceable on visible behavior. This one is harder.
False accusations of LLM use are widespread. This policy makes each accusation into a moderation matter that must be litigated. An accused Project member can't just ignore a false accusation as an annoyance when we've decided to treat this as a bannable CoC violation. We will have raised the stakes.
Surely we don't expect the accused to prove a negative. And surely we don't intend to judge people as liars based on faulty stylistic heuristics. So what is the plan for how we adjudicate this fairly, consistently, and while lowering the burden on moderators?[^1]

[^1]: For anyone wondering why I needed to momentarily unlock the thread to post this review comment, I don't know either. But GitHub refused to let me post it while the thread was locked ("Failed to save comment: Issue is locked.").
Hrm. I'm thinking on this a bit.
This does seem like an important point. I'd worry about accusations that cannot be disproved leading to a moderation action.
What would be the mechanics of the accusation resolution?
@traviscross it's about trust.
We can assume good faith, and we cannot bullet-proof against malice, but we will do our best to tell one from the other. I think the word "experimentation" in this document is also about this and I wouldn't block on this concern.
> This policy makes each accusation into a moderation matter that must be litigated
Not exactly. This document indicates guidelines for project members and for us (mods) as well as expectations from contributions. Mods are called to help - and I am quoting the document - in case of "major violations or extractive PRs" while reviewers are asked to be helpful in case of "minor violations". I wouldn't read too much into it.
At RustWeek, Niko said something about the Rust funding initiative that I think also applies here: maybe we won't get everything right immediately, but we will have something out now and we will iterate as needed.
Note also that I have an entire "Don’t play detective" section dedicated to discussing this.
@apiraino: I'm trying to connect it, but I don't really follow how what you're saying engages with the concern I'm raising. The lines I'm commenting on above are the only part of the policy that is made coequal with other CoC items. If I accuse a Project member of harassment (a CoC item), I'd expect the mods would investigate to see whether that occurred. Similarly, if I accuse another member of lying about the extent of covered LLM use, given the policy treats this as a CoC item, I'd presume the mods would investigate to determine whether that happened.
My question isn't about trust, or good faith, or malice — those would seem to come after we determine the facts, and I'm unsure how we plan to determine facts here. My worry is that we may only have two bad options: guessing and unenforceability. I'm wondering whether (and hoping that) we have a better idea for this. We're talking about judgments that will have real effects on people. This isn't the kind of thing I'd want to leave to experimentation.
@jyn514: The "it's not your job to play detective" section mostly gives guidance to reporters (those who believe the policy has been broken). I'm asking how we expect the moderators (to whom the reports go) to adjudicate these reports. On that, the section says only, "the mod team is free to exercise their own judgment and discretion", but I don't see how this solves the epistemological problem. And asking the mod team to decide the undecidable seems in tension with the goal of making "the policy enforceable and easy to moderate."
I feel the only thing you can do here is to go with "innocent until proven guilty". My stances are very anti-LLM but I would rather have some LLM outputs slip in than have witch hunts against innocent contributors.
Reports should include as much evidence as possible, so it makes LLM usage very obvious and reduces moderation burden.
Also, I believe https://cyrneko.eu/ai-policy.html has a bunch of good ideas.
I agree that "innocent until proven guilty" is the only workable approach here. What other standard could realistically be applied?
Once failure to disclose LLM use becomes a CoC violation, the project implicitly needs a standard of proof. But allegations of undisclosed LLM use are often inherently hard to verify, so that standard pretty much has to be "beyond a reasonable doubt".
And if a case really is obvious beyond reasonable doubt, it probably does not need to become a moderation matter in the first place; the contributor can simply be pointed at the policy and asked to correct the disclosure. The difficult cases are precisely the ones where certainty is not realistically achievable.
I'm going to unlock this now that RustWeek is over and we're not all busy.
> For minor violations we recommend telling the author that we can't review the PR until it complies with the policy, with pointers to exactly what they need to do.
> For major violations or extractive PRs, we recommend closing the PR or issue.
>
> It is **not** ok to harrass a contributor for using an LLM.
Suggested change:
> It is a [code of conduct](https://rust-lang.org/policies/code-of-conduct/) violation to harass a contributor on the basis of suspected or perceived LLM use.