Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CodeGen] Const structs with bitfields #412

Merged
merged 21 commits into from
Feb 5, 2024

Conversation

gitoleg
Copy link
Collaborator

@gitoleg gitoleg commented Jan 22, 2024

This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can be added more or less easily - there is one ugly thing need to be done though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to break it down to peaces. The good news is that in the same time it's a copy-pasta from the original codegen, no surprises here. Basically, the most hard place to read is ConstantAggregateBuilder::addBits copied with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done this way in the original codegen - is that the data layout is different for such structures, I mean literally another type is used. For instance, the code:

struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};

is represented in LLVM IR (with no CIR enabled) as:

%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@GV = dso_local global { i8, i8, i8, i8, i32 } ...

i.e. the global var GV is looks like a struct of single bytes (up to the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently - and this is why one additional bitcast is needed when we create a global var. Actually, there was a comment there - and I really wonder where it came from. But anyways, I don't really like this and don't see any good workaround here. Well, maybe we may add a kind of map in order to store the correspondence between types and do a bitcast more wisely. The same is true for the const structs with bitfields defined locally.

Copy link

github-actions bot commented Jan 22, 2024

✅ With the latest revision this PR passed the C/C code formatter.

@lanza lanza force-pushed the main branch 2 times, most recently from 4e069c6 to 79d4dc7 Compare January 29, 2024 23:34
@bcardosolopes
Copy link
Member

@lanza just rebased and this has the side effect of breaking GH PR workflow, making it impossible to review, apologies. Please rebase!

@gitoleg
Copy link
Collaborator Author

gitoleg commented Jan 31, 2024

@lanza just rebased and this has the side effect of breaking GH PR workflow, making it impossible to review, apologies. Please rebase!

@bcardosolopes done!

@bcardosolopes
Copy link
Member

Thanks for continuing work on this area!

i.e. the global var GV is looks like a struct of single bytes (up to the last field, which is not a btfield). And my guess is that we want to have the same behavior in CIR. So we do.

I haven't looked at the patch yet, going for some high level questions first. Is it possible to have something that looks sane in CIR out of CIRGen and only breaks down to something like this as part of LoweringPrepare?

The main problem is that we have to treat the same data differently - and this is why one additional bitcast is needed when we create a global var. Actually, there was a comment there - and I really wonder where it came from. But anyways, I don't really like this and don't see any good workaround here. Well, maybe we may add a kind of map in order to store the correspondence between types and do a bitcast more wisely. The same is true for the const structs with bitfields defined locally.

Gotcha, can you point me to the comment in question? I probably need to understand this better before giving you feedback

@gitoleg
Copy link
Collaborator Author

gitoleg commented Feb 1, 2024

Is it possible to have something that looks sane in CIR out of CIRGen and only breaks down to something like this as part of LoweringPrepare?

I will think about it

@bcardosolopes
Copy link
Member

I will think about it

It's also fine if that comes as a follow up PR too!

@gitoleg
Copy link
Collaborator Author

gitoleg commented Feb 2, 2024

@bcardosolopes
Well, so far I don't see any good way to postpone this annoying stuff until the lowering prepare phase. I'm not happy because of it either, and frankly speaking I have no idea why it's done this way in the original codegen, but anyway.

Previously we added CIR operations for bit fields, but the point is, the operations will be created after (in CIRGenExpr), i.e. we can't easily add something new to them here, in the CIRGenExprConst.
Also, I would say it will be hard to repeat the same bit manipulations in the lowering prepare, even if we could. Here we have a pretty straightforward approach - given a constant, we create a new struct type for it, and later create all the operations using this type. If we delayed it to the lowering prepare, we would go another way: check every constant, check it has bit fields, iterate over the whole program, find all get/set bitfield and get_member operations and replace them with the same ops but with using of the new type .. and the list of potentially affected operations may change in the future, if new ops will be add to CIR.

May be we could find another solution, but everything that comes to my head right now looks quite dirty.

So... what do you think?

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this seems like a case that we should care about raising only to the point where we have a use case for this, complexity doesn't see worth without one. Last remaining piece is the assert I mentioned in previous review and I'll just merge it.

@gitoleg
Copy link
Collaborator Author

gitoleg commented Feb 5, 2024

@bcardosolopes
added !

  if (VD->getTLSKind() != VarDecl::TLS_None)
    llvm_unrecheable("NYI");

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bcardosolopes bcardosolopes merged commit fdf178a into llvm:main Feb 5, 2024
6 checks passed
bcardosolopes pushed a commit that referenced this pull request Feb 14, 2024
The second part of the job started in #412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
lanza pushed a commit that referenced this pull request Mar 23, 2024
This PR adds a support for const structs with bitfields. 
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.  
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
lanza pushed a commit that referenced this pull request Mar 23, 2024
The second part of the job started in #412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
eZWALT pushed a commit to eZWALT/clangir that referenced this pull request Mar 24, 2024
This PR adds a support for const structs with bitfields. 
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.  
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
eZWALT pushed a commit to eZWALT/clangir that referenced this pull request Mar 24, 2024
The second part of the job started in llvm#412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
lanza pushed a commit that referenced this pull request Apr 29, 2024
The second part of the job started in #412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
lanza pushed a commit that referenced this pull request Apr 29, 2024
The second part of the job started in #412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
eZWALT pushed a commit to eZWALT/clangir that referenced this pull request Apr 29, 2024
This PR adds a support for const structs with bitfields. 
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.  
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
eZWALT pushed a commit to eZWALT/clangir that referenced this pull request Apr 29, 2024
The second part of the job started in llvm#412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};  
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
lanza pushed a commit that referenced this pull request Apr 29, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
lanza pushed a commit that referenced this pull request Apr 29, 2024
The second part of the job started in #412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
bruteforceboy pushed a commit to bruteforceboy/clangir that referenced this pull request Oct 2, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
bruteforceboy pushed a commit to bruteforceboy/clangir that referenced this pull request Oct 2, 2024
The second part of the job started in llvm#412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
The second part of the job started in llvm#412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
This PR adds a support for const structs with bitfields.
Now only global structs are supported, the support of the local ones can
be added more or less easily - there is one ugly thing need to be done
though)

So .. what is all about.
First of all - as usually, I'm sorry for the big PR. But it's hard to
break it down to peaces. The good news is that in the same time it's a
copy-pasta from the original codegen, no surprises here. Basically, the
most hard place to read is `ConstantAggregateBuilder::addBits` copied
with minimum of changes.

The main problem - and frankly speaking I have no idea why it's done
this way in the original codegen - is that the data layout is different
for such structures, I mean literally another type is used. For
instance, the code:
```
struct T {
  int X : 15;
  int Y : 6;
  unsigned Z : 9;
  int W;
};

struct T GV = { 1, 5, 256, -1};
```
is represented in LLVM IR (with no CIR enabled) as:

```
%struct.T = type { i32, i32 }
%struct.Inner = type { i8, i32 }

@gv = dso_local global { i8, i8, i8, i8, i32 } ...
```
i.e. the global var `GV` is looks like a struct of single bytes (up to
the last field, which is not a btfield).
And my guess is that we want to have the same behavior in CIR. So we do.

The main problem is that we have to treat the same data differently -
and this is why one additional `bitcast` is needed when we create a
global var. Actually, there was a comment there - and I really wonder
where it came from. But anyways, I don't really like this and don't see
any good workaround here. Well, maybe we may add a kind of map in order
to store the correspondence between types and do a bitcast more wisely.
The same is true for the const structs with bitfields defined locally.
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
The second part of the job started in llvm#412 , now about local structures.

As it was mentioned previously, sometimes the layout for structures with
bit fields inited with constants differ from the originally created in
`CIRRecordLayoutBuilder` and it cause `storeOp` verification fail due to
different structure type was used to allocation.
This PR fix it.
An example:
```
typedef struct {
  int a : 4;
  int b : 5;
  int c;
} D;

void bar () {
  D d = {1,2,3};
}
```

Well, I can't say I'm proud of these changes - it seems like a type
safety violation, but looks like it's the best we can do here.

The original codegen doesn't have this problem at all, there is just a
`memcpy` there, I provide LLVM IR just for reference:

```
%struct.D = type { i16, i32 }

@__const.bar.d = private unnamed_addr constant { i8, i8, i32 } { i8 33, i8 0, i32 3 }, align 4

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @bar() #0 {
entry:
  %d = alloca %struct.D, align 4
  call void @llvm.memcpy.p0.p0.i64(ptr align 4 %d, ptr align 4 @__const.bar.d, i64 8, i1 false)
  ret void
}
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants