Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't abort CI checks on first error #12206

Closed
alice-i-cecile opened this issue Feb 29, 2024 · 2 comments · Fixed by #12242
Closed

Don't abort CI checks on first error #12206

alice-i-cecile opened this issue Feb 29, 2024 · 2 comments · Fixed by #12242
Labels
A-Build-System Related to build systems or continuous integration C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Controversial There is active debate or serious implications around merging this PR

Comments

@alice-i-cecile
Copy link
Member

alice-i-cecile commented Feb 29, 2024

What problem does this solve or what need does it fill?

When working on complex PRs like #12163, it's frustrating to fix all of the errors reported by CI, only to be met with another round of failures in a follow-up crate.

What solution would you like?

Pass the --keep-going flag to CI. See rust-lang/cargo#10496 for the tracking issue. Apparently this is now stable as of Rust 1.74: https://doc.rust-lang.org/cargo/commands/cargo-build.html#option-cargo-build---keep-going

It looks like for cargo test, we also need to pass in --no-fail-fast: https://doc.rust-lang.org/cargo/commands/cargo-test.html#options

What alternative(s) have you considered?

Always run CI repeatedly locally.

@alice-i-cecile alice-i-cecile added D-Trivial Nice and easy! A great choice to get started with Bevy A-Build-System Related to build systems or continuous integration C-Usability A targeted quality-of-life change that makes Bevy easier to use labels Feb 29, 2024
@alice-i-cecile alice-i-cecile changed the title Add keep-going flag to CI to catch more errors per run Don't abort CI checks on first error Feb 29, 2024
@mockersf mockersf added the X-Controversial There is active debate or serious implications around merging this PR label Mar 4, 2024
@mockersf
Copy link
Member

mockersf commented Mar 4, 2024

The goal of CI is to fail fast in a case of errors, and to not do extra works when possible.

Changing that would need more changes than just passing --keep-going everywhere: reducing dependency between jobs, not cancelling everything on the first job failed. Failing fast is the default everywhere and would need to add more configuration.

@alice-i-cecile alice-i-cecile removed the D-Trivial Nice and easy! A great choice to get started with Bevy label Mar 4, 2024
@alice-i-cecile
Copy link
Member Author

That makes a lot of sense: I didn't think about dependent workflows. As a reviewer, I would still prefer it if all errors were caught, but enabling this option locally will be tremendously useful as an author.

@alice-i-cecile alice-i-cecile closed this as not planned Won't fix, can't repro, duplicate, stale Mar 4, 2024
github-merge-queue bot pushed a commit that referenced this issue Mar 4, 2024
# Objective

- Make running CI locally relatively less painful by allowing
continuation after failure
- Fix #12206 

## Solution

Previously, `ci` would halt after encounting a failure (or shortly
thereafter). For instance, if you ran `cargo run -p ci` and a single
test within a CI check failed somewhere in the middle, you would never
even reach many of the other tests within the check and certainly none
of the CI checks that came afterward. This was annoying; if I am fixing
issues with CI tests, I want to just see all of the issues up-front
instead of having to rerun CI every time I fix something to see the next
error.

Furthermore, it is not infrequent (because of subtle
configuration/system differences) to encounter spurious CI failures
locally; previously, when these occurred, they would make the process of
running CI largely pointless, since you would have to push your branch
in order to see your actual CI failures from the automated testing
service.

Now, when running `ci` locally, we have a couple new tools to ameliorate
these and provide a smoother experience:
- Firstly, there is a new optional flag `--keep-going` that can be
passed while calling `ci` (e.g. `cargo run -p ci -- doc --keep-going`).
It has the following effects:
- It causes the `--keep-going` flag to be passed to the script's `cargo
doc` and `cargo check` invocations, so that they do not stop when they
encounter an error in a single module; instead, they keep going (!) and
find errors subsequently.
- It causes the `--no-fail-fast` flag to be passed to the script's
`cargo test` invocations, to similar effect.
- Finally, it causes `ci` itself not to abort after a single check
fails; instead, it will run every check that was invoked.

Thus, for instance, `cargo run -p ci -- --keep-going` will run every CI
check even if it encounters intermediate failures, and every such check
will itself be run to completion.

- Secondly, we now allow multiple ordinary arguments to be passed to
`ci`. For instance, `cargo -p ci -- doc test` now executes both the
'doc' checks and the 'test' checks. This allows the caller to run the
tests they care about with fewer invocations of `ci`.

As of this PR, automated testing will remain unchanged. 

---

## Changelog

- tools/ci/src/main.rs refactored into staging and execution steps,
since the previous control flow did not naturally support continuing
after failure.
- Added "--keep-going" flag to `ci`.
- Added support for invoking `ci` with multiple arguments.

---

## Discussion

### Design considerations

I had originally split this into separate flags that controlled:
1. whether `--keep-going`/`--no-fail-fast` would be passed to the
constituent commands
2. whether `ci` would continue after a component test failed

However, I decided to merge these two behaviors, since I think that, if
you're in the business of seeing all the errors, you'll probably want to
actually see all of the errors. One slightly annoying thing, however,
about the new behavior with `--keep-going`, is that you will sometimes
find yourself wishing that the script would pause or something, since it
tends to fill the screen with a lot of junk. I have found that sending
stdout to /dev/null helps quite a bit, but I don't think `cargo fmt` or
`cargo clippy` actually write to stderr, so you need to be cognizant of
that (and perhaps invoke the lints separately).

~~Next, I'll mention that I feel somewhat strongly that the current
behavior of `ci` for automated testing should remain the same, since its
job is more like detecting that errors exist rather than finding all of
them.~~ (I was convinced this could have value.) Orthogonally, there is
the question of whether or not we might want to make this (or something
similar) actually the default behavior and make the automated test
script invoke some optional flags — it doesn't have to type with its
hands, after all. I'm not against that, but I don't really want to rock
the boat much more with this PR, since anyone who looks at the diff
might already be a little incensed.
spectria-limina pushed a commit to spectria-limina/bevy that referenced this issue Mar 9, 2024
# Objective

- Make running CI locally relatively less painful by allowing
continuation after failure
- Fix bevyengine#12206 

## Solution

Previously, `ci` would halt after encounting a failure (or shortly
thereafter). For instance, if you ran `cargo run -p ci` and a single
test within a CI check failed somewhere in the middle, you would never
even reach many of the other tests within the check and certainly none
of the CI checks that came afterward. This was annoying; if I am fixing
issues with CI tests, I want to just see all of the issues up-front
instead of having to rerun CI every time I fix something to see the next
error.

Furthermore, it is not infrequent (because of subtle
configuration/system differences) to encounter spurious CI failures
locally; previously, when these occurred, they would make the process of
running CI largely pointless, since you would have to push your branch
in order to see your actual CI failures from the automated testing
service.

Now, when running `ci` locally, we have a couple new tools to ameliorate
these and provide a smoother experience:
- Firstly, there is a new optional flag `--keep-going` that can be
passed while calling `ci` (e.g. `cargo run -p ci -- doc --keep-going`).
It has the following effects:
- It causes the `--keep-going` flag to be passed to the script's `cargo
doc` and `cargo check` invocations, so that they do not stop when they
encounter an error in a single module; instead, they keep going (!) and
find errors subsequently.
- It causes the `--no-fail-fast` flag to be passed to the script's
`cargo test` invocations, to similar effect.
- Finally, it causes `ci` itself not to abort after a single check
fails; instead, it will run every check that was invoked.

Thus, for instance, `cargo run -p ci -- --keep-going` will run every CI
check even if it encounters intermediate failures, and every such check
will itself be run to completion.

- Secondly, we now allow multiple ordinary arguments to be passed to
`ci`. For instance, `cargo -p ci -- doc test` now executes both the
'doc' checks and the 'test' checks. This allows the caller to run the
tests they care about with fewer invocations of `ci`.

As of this PR, automated testing will remain unchanged. 

---

## Changelog

- tools/ci/src/main.rs refactored into staging and execution steps,
since the previous control flow did not naturally support continuing
after failure.
- Added "--keep-going" flag to `ci`.
- Added support for invoking `ci` with multiple arguments.

---

## Discussion

### Design considerations

I had originally split this into separate flags that controlled:
1. whether `--keep-going`/`--no-fail-fast` would be passed to the
constituent commands
2. whether `ci` would continue after a component test failed

However, I decided to merge these two behaviors, since I think that, if
you're in the business of seeing all the errors, you'll probably want to
actually see all of the errors. One slightly annoying thing, however,
about the new behavior with `--keep-going`, is that you will sometimes
find yourself wishing that the script would pause or something, since it
tends to fill the screen with a lot of junk. I have found that sending
stdout to /dev/null helps quite a bit, but I don't think `cargo fmt` or
`cargo clippy` actually write to stderr, so you need to be cognizant of
that (and perhaps invoke the lints separately).

~~Next, I'll mention that I feel somewhat strongly that the current
behavior of `ci` for automated testing should remain the same, since its
job is more like detecting that errors exist rather than finding all of
them.~~ (I was convinced this could have value.) Orthogonally, there is
the question of whether or not we might want to make this (or something
similar) actually the default behavior and make the automated test
script invoke some optional flags — it doesn't have to type with its
hands, after all. I'm not against that, but I don't really want to rock
the boat much more with this PR, since anyone who looks at the diff
might already be a little incensed.
mtsr pushed a commit to mtsr/bevy that referenced this issue Mar 15, 2024
# Objective

- Make running CI locally relatively less painful by allowing
continuation after failure
- Fix bevyengine#12206 

## Solution

Previously, `ci` would halt after encounting a failure (or shortly
thereafter). For instance, if you ran `cargo run -p ci` and a single
test within a CI check failed somewhere in the middle, you would never
even reach many of the other tests within the check and certainly none
of the CI checks that came afterward. This was annoying; if I am fixing
issues with CI tests, I want to just see all of the issues up-front
instead of having to rerun CI every time I fix something to see the next
error.

Furthermore, it is not infrequent (because of subtle
configuration/system differences) to encounter spurious CI failures
locally; previously, when these occurred, they would make the process of
running CI largely pointless, since you would have to push your branch
in order to see your actual CI failures from the automated testing
service.

Now, when running `ci` locally, we have a couple new tools to ameliorate
these and provide a smoother experience:
- Firstly, there is a new optional flag `--keep-going` that can be
passed while calling `ci` (e.g. `cargo run -p ci -- doc --keep-going`).
It has the following effects:
- It causes the `--keep-going` flag to be passed to the script's `cargo
doc` and `cargo check` invocations, so that they do not stop when they
encounter an error in a single module; instead, they keep going (!) and
find errors subsequently.
- It causes the `--no-fail-fast` flag to be passed to the script's
`cargo test` invocations, to similar effect.
- Finally, it causes `ci` itself not to abort after a single check
fails; instead, it will run every check that was invoked.

Thus, for instance, `cargo run -p ci -- --keep-going` will run every CI
check even if it encounters intermediate failures, and every such check
will itself be run to completion.

- Secondly, we now allow multiple ordinary arguments to be passed to
`ci`. For instance, `cargo -p ci -- doc test` now executes both the
'doc' checks and the 'test' checks. This allows the caller to run the
tests they care about with fewer invocations of `ci`.

As of this PR, automated testing will remain unchanged. 

---

## Changelog

- tools/ci/src/main.rs refactored into staging and execution steps,
since the previous control flow did not naturally support continuing
after failure.
- Added "--keep-going" flag to `ci`.
- Added support for invoking `ci` with multiple arguments.

---

## Discussion

### Design considerations

I had originally split this into separate flags that controlled:
1. whether `--keep-going`/`--no-fail-fast` would be passed to the
constituent commands
2. whether `ci` would continue after a component test failed

However, I decided to merge these two behaviors, since I think that, if
you're in the business of seeing all the errors, you'll probably want to
actually see all of the errors. One slightly annoying thing, however,
about the new behavior with `--keep-going`, is that you will sometimes
find yourself wishing that the script would pause or something, since it
tends to fill the screen with a lot of junk. I have found that sending
stdout to /dev/null helps quite a bit, but I don't think `cargo fmt` or
`cargo clippy` actually write to stderr, so you need to be cognizant of
that (and perhaps invoke the lints separately).

~~Next, I'll mention that I feel somewhat strongly that the current
behavior of `ci` for automated testing should remain the same, since its
job is more like detecting that errors exist rather than finding all of
them.~~ (I was convinced this could have value.) Orthogonally, there is
the question of whether or not we might want to make this (or something
similar) actually the default behavior and make the automated test
script invoke some optional flags — it doesn't have to type with its
hands, after all. I'm not against that, but I don't really want to rock
the boat much more with this PR, since anyone who looks at the diff
might already be a little incensed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Build-System Related to build systems or continuous integration C-Usability A targeted quality-of-life change that makes Bevy easier to use X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants