Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ui constant tests for big-endian platforms #106047

Merged
merged 1 commit into from
Jan 9, 2023

Conversation

uweigand
Copy link
Contributor

A number of tests under ui/const-ptr and ui/consts are currently failing on big-endian platforms as the binary encoding of some constants is hard-coded in the stderr test files.

Fix this by a combination of two types of changes:

  • Where possible (i.e. where the particular value of a constant does not affect the purpose of the test), choose constant values that have the same encoding on big- and little-endian platforms.

  • Where this is not possible, provide a normalize-stderr-test rule that transforms the printed big-endian encoding of such constants into the corresponding little-endian form.

Fixes part of #105383.

@rustbot
Copy link
Collaborator

rustbot commented Dec 22, 2022

r? @oli-obk

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 22, 2022
@RalfJung
Copy link
Member

I think I am fine with the general approach, but it seems very fragile. Without a big-endian CI runner this is bound to break again soon, in particular since there is zero documentation that even explains why we use such odd constants, or why we have these normalizing rules.

@uweigand
Copy link
Contributor Author

I think I am fine with the general approach, but it seems very fragile. Without a big-endian CI runner this is bound to break again soon, in particular since there is zero documentation that even explains why we use such odd constants, or why we have these normalizing rules.

I can certainly add some comments to why the constants are chosen - there is actually already some precedent like:

// Use `0` as constant to make behavior endianness-independent.

in ub-enum.rs.

As to CI, I agree - it would be great to have CI running on a BE system. I'd be happy provide access to a s390x machine and/or set up CI infrastructure myself. Not sure if this would be integrated into the main CI machinery or run stand-alone - whatever the Rust community prefers.

Taking a different approach - the problem is really just the "raw byte" dumps, which are provided as extra note diagnostics. Maybe it would make sense to have some more generic normalize-stderr-test pattern that just strips out these hex dumps completely? That still should leave the core purpose of those tests in place ... [ Or else, a compiler switch to turn off those raw byte notes in the first place, which the tests could use? ]

@uweigand
Copy link
Contributor Author

Taking a different approach - the problem is really just the "raw byte" dumps, which are provided as extra note diagnostics. Maybe it would make sense to have some more generic normalize-stderr-test pattern that just strips out these hex dumps completely? That still should leave the core purpose of those tests in place ...

For reference, I've now updated the patch to follow this approach, which significantly reduces the amount of changes needed to the test cases, and should make it much less fragile. This basically consists of adding:

 // Strip out raw byte dumps to make comparison endianness-independent:
 // normalize-stderr-test "([0-9a-f][0-9a-f] [0-9a-f][0-9a-f] )  *│.*" -> "HEX_DUMP"

to all affected test case file, and updating the expected outputs.

The only change apart from this is to src/test/ui/const-ptr/forbidden_slices.rs, which exposes an endian-specific difference in this diagnostic output:

       constructing invalid value at .<deref>[0]: encountered 0x11, but expected a boolean

depending on which byte of D0 is not a boolean value (0 or 1). Fixed this by choosing a value of D0 that differs from 0 or 1 in all bytes.

What do you think?

@uweigand
Copy link
Contributor Author

FYI @Mark-Simulacrum just pointed me to this alternative solution that I wasn't aware of: #104135. Either way would work for me ...

@RalfJung
Copy link
Member

Taking a different approach - the problem is really just the "raw byte" dumps, which are provided as extra note diagnostics. Maybe it would make sense to have some more generic normalize-stderr-test pattern that just strips out these hex dumps completely? That still should leave the core purpose of those tests in place ... [ Or else, a compiler switch to turn off those raw byte notes in the first place, which the tests could use? ]

We want to have tests covering that output though. There is a bunch of subtle logic for the pointer printing and things like that which needs testing.

What we could do is to disable the raw byte dump in the regular const tests (or use aggressive normalization that mostly removes the raw byte dumps), and then have little-endian-only tests that check the raw byte dumps. I'd be fine with such an approach.

But I don't like the approach this PR takes right now, since it drastically reduces the test coverage of our raw byte printing.

As to CI, I agree - it would be great to have CI running on a BE system. I'd be happy provide access to a s390x machine and/or set up CI infrastructure myself. Not sure if this would be integrated into the main CI machinery or run stand-alone - whatever the Rust community prefers.

I don't think third-party runners are a possibility. This would have to somehow run on our existing runners, e.g. via qemu. The infra team would know if we have the CI capacity for that.

FYI @Mark-Simulacrum just pointed me to this alternative solution that I wasn't aware of: #104135. Either way would work for me ...

The problem with that is -- how is a contributor supposed to be able to update those reference files? Basically nobody will be able to just --bless the big-endian files. We already do this for 32bit and 64bit and it is annoying but bearable since at least 64bit Linux hosts can easily --bless both of these.

@uweigand
Copy link
Contributor Author

What we could do is to disable the raw byte dump in the regular const tests (or use aggressive normalization that mostly removes the raw byte dumps), and then have little-endian-only tests that check the raw byte dumps. I'd be fine with such an approach.

I've tried to implement that approach by adding a new raw-bytes.rs test that collects those tests with raw byte dumps. Not sure if this matches what you were thinking of ... This can probably be improved upon; in particular, if we only care about the formatting of the dumps, raw-bytes.rs likely now contains redundant tests and could be simplified.

The problem with that is -- how is a contributor supposed to be able to update those reference files? Basically nobody will be able to just --bless the big-endian files. We already do this for 32bit and 64bit and it is annoying but bearable since at least 64bit Linux hosts can easily --bless both of these.

It seems to me it should be possible to "bless" files for another target simply via a cross-compiler. I tried a straightforward approach using --target, but this is more difficult than expected, because the infrastructure will always attempt to cross-build the std crate first - and the test cases actually do import (a few things from) std, so this might well be necessary. Unfortunately, the full std build depends on other crates like compiler_builtins and profiler_builtins, which in turn require a full C cross-toolchain, which may be nontrivial to provide on the host system.

This seems a pity, as those tests likely do not actually use anything from those special crates, in particular since they are compile-time tests anyway. I'm wondering if there couldn't be a way to enable cross-target compile-time tests using some form of a stripped-down std crate that can be built solely using the existing Rust compiler (which is already able to cross-compile)?

@RalfJung
Copy link
Member

It seems to me it should be possible to "bless" files for another target simply via a cross-compiler.

You mean for check-pass tests, since we don't have to actually link/run them? Yeah that is true in principle, but in practice our cross-build support needs a target toolchain pretty early, so a bunch of work would be required to make that happen. Also having to run test --bless more often still takes increasing amounts of time and work (unless we find a way to say "bless all output files please" without having to manually invoke x.py 3 or 4 times).

So I think for now this is a reasonable trade-off.

@@ -1,5 1,7 @@
// ignore-tidy-linelength
// stderr-per-bitwidth
Copy link
Member

@RalfJung RalfJung Dec 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check which of these files even still need stderr-per-bitwidth? I assume some of them will have bitwidth-independent output now.

You might have to also nomalize the alloc-id part of the alloc dumps for that, but that seems worth the trade-off to me.

@RalfJung
Copy link
Member

LGTM, modulo ideally even making these tests bitwidth-independent.

@oli-obk should also approve though.

A number of tests under ui/const-ptr and ui/consts are currently
failing on big-endian platforms as the binary encoding of some
constants is hard-coded in the stderr test files.  Fix this by
providing a normalize-stderr-test rule that strips out the
raw bytes hex dump, so the comparison can be done in an
endianness-independent manner.  Note that in most cases, this
means the tests are now also independent of word size, so the
32bit and 64bit cases can be re-unified.

To keep tests that verify the details of those raw bytes dumps,
a new test case raw-bytes.rs performs the tests where the hex
dumps were stripped out a second time, but only on little-
endian platforms.

In addition, src/test/ui/const-ptr/forbidden_slices.rs exposes
an endian-specific difference in this diagnostic output:
   constructing invalid value at .<deref>[0]: encountered 0x11,
   but expected a boolean
depending on which byte of D0 is not a boolean value (0 or 1).
Fixed this by choosing a value of D0 that differs from 0 or 1
in all bytes.

Fixes part of rust-lang#105383.
@uweigand
Copy link
Contributor Author

Can you check which of these files even still need stderr-per-bitwidth? I assume some of them will have bitwidth-independent output now.

I had to be a bit more thorough in removing all raw byte dumps (including those with alloc-ids), and normalize size/align values in these output lines:

  = note: the raw bytes of the constant (size: 16, align: 8) {

With that done, indeed for most of the tests, the 32bit and 64bit output files are now identical and can be re-unified. Patch updated accordingly.

The two remaining tests where stderr-per-bitwidth is still needed are const-eval/ub-enum.rs and std/alloc.rs, in both cases because of differences of this type:

<    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value at .align.0.<enum-tag>: encountered 0x00000000, but expected a valid enum tag
---
>    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ constructing invalid value at .align.0.<enum-tag>: encountered 0x0000000000000000, but expected a valid enum tag

@RalfJung
Copy link
Member

We could normalize encountered 0x0 to encountered $NULL to mitigate that?

As long as we still have some tests that ensure that we print these numbers with as many leading 0s as matches the bitwidths of the current target.

@uweigand
Copy link
Contributor Author

We could normalize encountered 0x0 to encountered $NULL to mitigate that?

It's not always exactly 0, it's sometimes other (small) integers. The difference is always the number of leading zeros though.

As long as we still have some tests that ensure that we print these numbers with as many leading 0s as matches the bitwidths of the current target.

Sure. I guess that could be done in another patch? Seems not directly related to this one.

@RalfJung
Copy link
Member

RalfJung commented Dec 27, 2022

I seem to recall we are normalizing away leading 0x00000000 somewhere already... not sure.

Sure. I guess that could be done in another patch? Seems not directly related to this one.

Sure, fair. Thanks for getting rid of some of the stderr-per-bitwidth, that already helps!

@oli-obk
Copy link
Contributor

oli-obk commented Jan 9, 2023

@bors r rollup

This is a good step, let's land it and iterate in follow ups

@bors
Copy link
Contributor

bors commented Jan 9, 2023

📌 Commit 73e7207 has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 9, 2023
JohnTitor pushed a commit to JohnTitor/rust that referenced this pull request Jan 9, 2023
…r=oli-obk

Fix ui constant tests for big-endian platforms

A number of tests under ui/const-ptr and ui/consts are currently failing on big-endian platforms as the binary encoding of some constants is hard-coded in the stderr test files.

Fix this by a combination of two types of changes:

- Where possible (i.e. where the particular value of a constant does not affect the purpose of the test), choose constant values that have the same encoding on big- and little-endian platforms.

- Where this is not possible, provide a normalize-stderr-test rule that transforms the printed big-endian encoding of such constants into the corresponding little-endian form.

Fixes part of rust-lang#105383.
JohnTitor pushed a commit to JohnTitor/rust that referenced this pull request Jan 9, 2023
…r=oli-obk

Fix ui constant tests for big-endian platforms

A number of tests under ui/const-ptr and ui/consts are currently failing on big-endian platforms as the binary encoding of some constants is hard-coded in the stderr test files.

Fix this by a combination of two types of changes:

- Where possible (i.e. where the particular value of a constant does not affect the purpose of the test), choose constant values that have the same encoding on big- and little-endian platforms.

- Where this is not possible, provide a normalize-stderr-test rule that transforms the printed big-endian encoding of such constants into the corresponding little-endian form.

Fixes part of rust-lang#105383.
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 9, 2023
…fee1-dead

Rollup of 10 pull requests

Successful merges:

 - rust-lang#105292 (Change a commit_if_ok call to probe)
 - rust-lang#105655 (Remove invalid case for mutable borrow suggestion)
 - rust-lang#106047 (Fix ui constant tests for big-endian platforms)
 - rust-lang#106061 (Enable Shadow Call Stack for Fuchsia on AArch64)
 - rust-lang#106164 (Move `check_region_obligations_and_report_errors` to `TypeErrCtxt`)
 - rust-lang#106291 (Fix incorrect suggestion for extra `&` in pattern)
 - rust-lang#106389 (Simplify some canonical type alias names)
 - rust-lang#106468 (Use FxIndexSet when updating obligation causes in `adjust_fulfillment_errors_for_expr_obligation`)
 - rust-lang#106549 (Use fmt named parameters in rustc_borrowck)
 - rust-lang#106614 (error-code docs improvements (No. 2))

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 63f2a13 into rust-lang:master Jan 9, 2023
@rustbot rustbot added this to the 1.68.0 milestone Jan 9, 2023
@uweigand uweigand deleted the s390x-test-bigendian-ui branch January 10, 2023 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants