Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][OpenMP] Master directive #660

Open
wants to merge 1,597 commits into
base: main
Choose a base branch
from
Open

[CIR][OpenMP] Master directive #660

wants to merge 1,597 commits into from

Conversation

eZWALT
Copy link
Contributor

@eZWALT eZWALT commented Jun 5, 2024

This patch introduces support for another small OpenMP directive, master. Apologies for the delay in submission; I've been occupied with exams and haven't been able to upload the completed work I had locally. Over the next few days, I'll be consistently uploading small PRs for various directives. Thank you!

lanza and others added 30 commits January 26, 2024 19:16
This is currently missing and Debug builds are failing without it.
ghstack-source-id: 855519648a4bf2dced501f96e6de1b9b164d85ad
Pull Request resolved: llvm#424
ghstack-source-id: 0706d6bb81b5b8eefb04146719b4443aedb29ab1
Pull Request resolved: llvm#427
One more step towards completing try/catch.
…rface

This is prep work for introducing cir.try_call inside cir.try scopes.
One more incremental step towards try/catch: properly use cir.try_call instead
of regular cir.call when within a cir.try region.
)

This patch adds a new `volatile` tag to the following operations to
distinguish volatile loads and stores from normal loads and stores:

- `cir.load`
- `cir.store`
- `cir.get_bitfield`
- `cir.set_bitfield`

Besides, this patch also updates the CodeGen and LLVMIR lowering code to
start emitting CIR and LLVMIR operations with volatile flag.
This is part 3 of implementing vector types and vector operations in
ClangIR, issue llvm#284.

Create new operation `cir.vec.cmp` which implements the relational
comparison operators (`== != < > <= >=`) on vector types. A new
operation was created rather than reusing `cir.cmp` because the result
is a vector of a signed intergral type, not a `bool`.

Add CodeGen and Lowering tests for vector comparisons.

Fix the floating-point comparison predicate when lowering to LLVM. To
handle NaN values correctly, the comparisons need to be ordered rather
than unordered. (Except for `!=`, which needs to be unordered.) For
example, "ueq" was changed to "oeq".
Compilation of the following test
```
void foo6(A* a1) {
  A a2 = (*a1);
}
```
fails with.
```
NYI
UNREACHABLE executed at /home/huawei/cir/repo/llvm-project/clang/lib/CIR/CodeGen/CIRGenExprAgg.cpp:175!
```
Commit adds required visitor and fixes the issue.
In the original codegen a new type is created for the base class, while
in CIR we were rewriting the type being processed (due tp misused
pointers). This PR fix this, and also makes CIR codegen even with the
original one.
This is a first PR for variable length array support. There are one (or
more :) ) ahead.

Basically, we already did lot's of preliminary job in order to land VLA
in CIR in llvm#367 llvm#346 llvm#340. So now we add initial VLA support itself.

Most of the changes are taken from the original codegen, so there is
nothing to be scary of)

I added just one test, and basically that's all we can test right now.
Later, I will add more, from the original codegen tests.
Support \_\_extension\_\_ keyword in CIRGen
This PR adds support for section("$name") attribute
Support missing zero initialization of Bools
)

Compiling the given c-code

```
void foo() {
  int      i [2][1] = { { 1 }, { 0 } };
  long int li[2][1] = { { 1 }, { 0 } };
  float    fl[2][1] = { { 1 }, { 0 } };
  double   d [2][1] = { { 1 }, { 0 } };
}
```

leads to compilation error

```
unknown element in ConstArrayAttr
UNREACHABLE executed at /home/huawei/cir/repo/van/llvm-project/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp:951!
```

PR implements conversion the cir.zero attr to dense constant and fixed
this error.
This PR introduces CIR CodeGen support for `dynamic_cast`.

The full feature set of `dynamic_cast` is not fully implemented in this
PR as it's already pretty large. This PR only include support for
downcasting and sidecasting a pointer or reference. `dynamic_cast<void
*>` is not yet implemented.
Initial support for the following builtins:
```
__builtin_expect
__builtin_expect_with_probability
__builtin_unpredictable
```
This PR supports codegen for this builtins on "-O0" compilation
pipeline.
Support for ConditionalOperator inside the if condition stmt
This patch adds CIRGen for downcasting a pointer to the complete object
through `dynamic_cast<void *>`.

Together with llvm#426 , the full functionality of `dynamic_cast` should be
supported in CIRGen after this PR merges.
This PR adds support for constant arrays with trailing zeros.

The original `CodeGen` does the following: once a constant array contain
trailing zeros, a struct with two members is generated: initialized
elements and `zeroinitializer` for the remaining part. And depending on
some conditions, `memset` or `memcpy` are emitted. In the latter case a
global const array is created.
Well, we may go this way, but it requires us to implement
[features](https://github.com/llvm/clangir/blob/main/clang/lib/CIR/CodeGen/CIRGenDecl.cpp#L182)
that are not implemented yet.

Another option is to add one more parameter to the `constArrayAttr` and
utilize it during the lowering. So far I chose this way, but if you have
any doubts, we can discuss here. So we just emit constant array as
usually and once there are trailing zeros, lower this arrray (i.e. an
attribute) as a value.

I added a couple of tests and will add more, once we agree on the
approach. So far I marked the PR as a draft one.
…nters (llvm#400)

This also adds a missing check whether the pointer returned from
`memchr` is null and changes the result to `last` in that case.
Emit the false-branch of the consteval if statement, if any.
Originally, the location associated with a function is checked to be an
`mlir::FileLineColLoc` before the function is lowered to an LLVMIR
FuncOp. However, runtime function declarations do not have such
locations. This patch further allows `mlir::UnknownLoc` to be associated
with a function.
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
The minimal bug repro:
```
#include <stdbool.h>
#include <stdint.h>
void bar() {
  bool x = true;
  uint8_t y = (uint8_t)x;
}
```
Fails on verification stage:
```
loc("repro.c":5:24): error: integer width of the output type is smaller or equal to the integer width of the input type
fatal error: error in backend: The pass manager failed to lower CIR to LLVMIR dialect!
```
The problem is that in some cases lowering from CIR emits the invalid
zext operation. PR fixes this issue by emitting the llvm.bitcast instead
of llvm.zext in such cases.
ShivaChen and others added 22 commits May 23, 2024 19:34
This commit introduces SCFPreparePass to
1) Canonicalize IV to LHS of loop comparison
For example, transfer `cir.cmp(gt, %bound, %IV)` to `cir.cmp(lt, %IV,
%bound)`. So we could use RHS as boundary and use `lt` to determine it's
an upper bound.
2) Hoist loop invariant operations in condition block out of loop.
The condition block may be generated as following which contains the
operations produced upper bound.
SCF for loop required loop boundary as input operands. So we might need
to hoist the boundary operations out of loop.
```
      cir.for : cond {
        %4 = cir.load %2 : !cir.ptr<!s32i>, !s32i
        %5 = cir.const #cir.int<100> : !s32i       <- upper bound
        %6 = cir.cmp(lt, %4, %5) : !s32i, !s32i
        %7 = cir.cast(int_to_bool, %6 : !s32i), !cir.boo
        cir.condition(%7
       } body {
```
Don't hook this up with CIRGen just yet.

While here update parsing tests to include `atomic(seq_cst)`.
…lvm#617)

The patch resolves [issue
llvm#248](llvm#248). It can be considered
a subsequent patch to [llvm#373](llvm#373),
where the case of empty strings was processed.

The new patch adds processing for non-empty strings that may contain
trailing zeros, such as:
```
char big_string[100000] = "123";
```
That is converted to
```
@big_string = #cir.const_array<"123" : !cir.array<!s8i x 3>, trailing_zeros> : !cir.array<!s8i x 100000>
```
One of the builders was adding a retun value to the CallOp when given a
void return type. The expected behavior is to not add a return value.
Two other minor fixes were added to the return value: its constraint was
replaced from variadic to optional and it was assigned a name. This
prevents function calls with multiple returns and facilitates access to
the single return value, respectively.
lowering var_arg op for ARM64 architecture. This is CIR lowering.

This PR modified LoweringPrepare CXXABI code to make
LoweringPrepareArm64CXXABI class inherit more generic
LoweringPrepareItaniumCXXABI, this way lowering var_arg would be only
meaningful for arm64 targets and for other arch its no op for now.

The ABI doc and detailed algorithm description can be found in this
official doc.

[](https://github.com/ARM-software/abi-aa/blob/617079d8a0d45bec83d351974849483cf0cc66d5/aapcs64/aapcs64.rst#appendix-variable-argument-lists)
…lvm#630)

This pr adds cir.shift lowering to MLIR passes and test files.
With C source code, we would able to update the CIR tests when needed.
One more step into fixing overall alignment requirements.
This patch adds a new CallConvLowering pass that aims to lower the
calling conventions of the functions in the module. It also includes a
new Clang command line option to enable it. Also, it is considered a
part of the lowering prepare set of passes, as it is unlikely to be used
elsewhere in the pipeline.

Since this will be dealing with ABI/Target-specific information, it
requires AST info. For this reason, it can only be executed through the
clang driver or cc1 tool for now as CIR does not encode AST info.

This pass is disabled by default and can be enabled by passing the flag
`-fclangir-call-conv-lowering`. Once this pass is more mature, it should
be enabled by default as a required step to lower to LLVM Dialect.
This PR adds the following operations for the builtin binary fp2fp
functions:

  - `cir.copysign` for `__builtin_copysign`;
  - `cir.fmax` for `__builtin_fmax`;
  - `cir.fmin` for `__builtin_fmin`;
  - `cir.fmod` for `__builtin_fmod`;
  - `cir.pow` for `__builtin_pow`.

This PR also includes CIRGen support for these new operations.
…uncOp (llvm#641)

CIRGlobalValueInterface inherits from mlir::Symbol as it should, and
GlobalOp and FuncOp now has interface mlir::Symbol through
CIRGlobalValueInterface

and this PR basically make function isDeclarationForLinker into the
CIRGlobalValueInterface interface. We also change some call sites of
isDeclaration to use CIRGlobalValueInterface when its appropriate.
…#645)

This pr adds cir.bit.clz and cir.bit.ctz lowering to MLIR passes and
test files.
I will complete the lowering of other `cir.bit` operations in subsequent
PRs.
Now that alignment computation is correct for neon, add more neon types
for load/store.
mlir::cir::ZeroInitConstOp was replaced with llvm.mlir.zero

resolves [llvm#627](llvm#627)
Close llvm#633.

This patch introduces `-fclangir-analysis-only` option to allow the
users to consume the AST to the CIR (and potential analysis passes, this
can be done by specifying `-Xclang -fclangir-lifetime-check=""` now or
some default value in following patches) and also generating the LLVM IR
by the traditional code gen path. This will be helpful to use CIR with
real world projects without worrying the correctness and completeness of
CIR CodeGen part.
This pr adds cir.bit.ffs cir.bit.parity cir.bit.clrsb cir.bit.popcount
lowering to MLIR passes and test files.
This PR adds support for  atomic `__sync_fetch_and_add`.

Basically it's a copy-pasta from the original `codegen`.
The only thing that I doubt about is what exact operation I need to
create in CIR. The first approach I used was to create `AtomicRMW`
operation in CIR. But as far as I see I can use the existing
`AtomicFetch` instead. Is it correct? or it's better to add a new op
here?
Moves all feature guarding static methods into a to a single header
file, centralizing the tracking of missing features in a common place
regardless of where it impacts the compilation pipeline. It also moves
the feature guarding logic into CIR's root include folder so that any
CIR library may use it.
Copy link
Collaborator

@seven-mile seven-mile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

clang/lib/CIR/CodeGen/CIRGenFunction.h Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenStmtOpenMP.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenStmtOpenMP.cpp Outdated Show resolved Hide resolved
@eZWALT eZWALT requested a review from seven-mile June 9, 2024 16:25
if (buildStmt(bodyStmt, /*useCurrentScope=*/true).failed())
res = mlir::failure();
});
builder.create<TerminatorOp>(getLoc(S.getSourceRange().getEnd()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's best if you return here, so you don't need to create the else arm, and just add the llvm_unreachable directly after the closing curly braces.

{
int x = 1;
y = x 1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add something after the end of the pragma scope here just to make sure you insertion points are right and it's going to be emitted?


// CHECK-LABEL: @omp_master_test
// CHECK: call i32 @__kmpc_global_thread_num
// CHECK: call i32 @__kmpc_master({{.*}},{{.*}})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these already tested as part of the openmp dialect? If so we probably don't need to test them again for each feature you add. As CIR changes we might need to keep updating this file, which is probably non-ideal if things are already tested elsewhere.

An alternative option is to add one more RUN line to clang/test/CIR/CodeGen/OpenMP/master.cpp and add a emit-llvm instead of emit-cir, you can then test @__kmpc_global_thread_num and others there directly from source code - we do similar for CIR/LLVM in clang/test/CIR/CodeGen/abstract-cond.c, you can get some inspiration there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing some of these are already being tested on MLIR and that CIR might change, fair, but then what is the purpose of this lowering tests? If there were changes in CIR then it would affect other tests not just this one, take for instance clang/test/CIR/Lowering/ternary.cir.

I could change the CodeGen ones similar to the abstract-cond.c, but does that mean that I shouldn't make more lowering tests for new features? Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing some of these are already being tested on MLIR and that CIR might change, fair, but then what is the purpose of this lowering tests?

Pre-existing tests are one thing, adding new tests is another, I'm asking you to test the LLVM output altogether with the source code, there's no good reason in this PR to add a .cir test that is testing the same thing. Note that you are only changing CIRGen, not really lowering. Btw, this isn't a strict rule, but I'd prefer to keep CIR to CIR tests to more minimal testcases because of maintainance burden. In this specific PR I don't see any reason why a CIR to CIR test improves anything.

If there were changes in CIR then it would affect other tests not just this one, take for instance clang/test/CIR/Lowering/ternary.cir.

Yes, and probably most of them, and we have to go and change it all.

I could change the CodeGen ones similar to the abstract-cond.c, but does that mean that I shouldn't make more lowering tests for new features? Thank you!

It's on a per case basis, like I said before, not sure how this CIR test helps anything here. You just need to add one more RUN line to master.cpp and check for the same stuff.

mlir::Block &block = masterOp.getRegion().emplaceBlock();
mlir::OpBuilder::InsertionGuard guardCase(builder);
builder.setInsertionPointToEnd(&block);
// Build an scope for the master region
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the equivalent of OMPBuilderCBHelpers::FinalizeOMPRegion(*this, IP); and OMPBuilderCBHelpers::EmitOMPInlinedRegionBody, do we need our own CIR version of the helpers? The skeleton is important, specially if those helpers can take different actions for things you haven't yet, we need to track those with NYI, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, I also thought that we could use those helper functions, thank you!

@bcardosolopes
Copy link
Member

Btw, whatever review you apply here, make sure you do the same on your other PR if it also applies there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet